feat: move job providers to DB and suppress job-search link when none enabled #36

Merged
gelu merged 11 commits from feature/job-providers-db-and-link-guard into main 2026-05-29 10:07:16 +00:00

What

Two linked changes to make job-search provider config DB-driven and the CTA link conditional.

Why

  • Users were receiving a "Start a job search" link in match emails even when no providers were configured, leading to confusing zero-result emails.
  • Provider config in appsettings required a full redeploy to change; DB storage allows future admin tooling.

Changes

  • New cvSearch.JobProviders table (JobProviderEntity, migration AddJobProviders) seeded with 3 disabled defaults: ejobs.ro, bestjobs.eu, linkedin.com
  • CvSearchDbContext gains JobProviders DbSet with entity configuration
  • JobTokenService.CreateTokenAsync queries DB for enabled providers; returns null (no token) when none are enabled
  • JobTokenService.TriggerStartAsync reads and snapshots enabled providers from DB (was from appsettings)
  • IJobTokenService.CreateTokenAsync → Task<string?>; CreateJobSearchTokenResponse.TokenId → string?
  • CvMatcherController guards link-building on non-null TokenId — CTA is suppressed when all providers are disabled
  • JobSearchSettings.Providers list removed; JobProviderConfig DTO retained for session snapshot format
  • CvSearchJobTask.GetProviders fallback simplified to return [] + warning (snapshot always populated from DB)
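
The null-token guard described in the list above can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the project's C# code: the `Provider` dataclass, the in-memory provider list, the `build_match_email` helper, and the link URL are all hypothetical stand-ins for the EF Core query and the controller logic.

```python
from dataclasses import dataclass
from typing import Optional
import uuid


@dataclass
class Provider:
    # Hypothetical stand-in for a row in cvSearch.JobProviders.
    name: str
    enabled: bool


def create_token(providers: list[Provider]) -> Optional[str]:
    """Return a new token id, or None when no provider is enabled."""
    if not any(p.enabled for p in providers):
        return None
    return uuid.uuid4().hex


def build_match_email(body: str, token_id: Optional[str]) -> str:
    """Append the CTA link only when a token was actually created."""
    lines = [body]
    if token_id is not None:
        # The URL shape is an assumption made purely for illustration.
        lines.append(f"Start a job search: https://example.invalid/job-search/{token_id}")
    return "\n".join(lines)


# The three seeded defaults, all disabled: no token, so no CTA.
seeded = [
    Provider("ejobs.ro", False),
    Provider("bestjobs.eu", False),
    Provider("linkedin.com", False),
]
email_without_cta = build_match_email("Your CV match results", create_token(seeded))

# Enabling a single provider is enough for the link to reappear.
seeded[0].enabled = True
email_with_cta = build_match_email("Your CV match results", create_token(seeded))
```

Returning "no token" instead of an empty or placeholder token keeps the decision in one place: the caller only needs a null check to decide whether to render the link.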

Testing

  • Full solution build: 0 errors, 0 warnings
  • Migration scaffolded and seeds 3 rows (all Enabled = 0)
  • No unit tests in this repo — manual integration test via Docker Compose

Risk Assessment

  • Breaking changes: TokenId on the response DTO is now nullable — callers must handle null (CvMatcherController updated)
  • JobSearch:Providers env vars in docker-compose are now ignored — providers must be managed in DB
  • Performance: adds one AnyAsync DB call per match request (negligible)

Closes #35

gelu added 3 commits 2026-05-29 08:47:41 +00:00
feat(cv-search-job): enrich diagnostics and add scan summary to results email
Build and Push Docker Images Staging / build (push) Successful in 24s
af3a14c7ed
Add funnel-level logging to HtmlJobSearcher (total anchors found,
stage-1 href-filter count, stage-2 keyword-filter count) and warn
when the keyword list is empty. Log the full search URL and response
size to catch silent HTTP failures or bot-block pages.

In CvSearchJobTask, log keywords and active providers at session start,
per-provider URL counts after each scrape, and every scored URL with its
verdict (ACCEPTED / rejected) at Information level.

Add a scan summary block to the results email (both non-empty and
empty-results paths) showing the CV keywords used as chips and the
comma-separated list of providers scanned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New JobProviderEntity persists provider config (name, URL template,
link filter, initial keywords, max results, display order) in the DB
instead of appsettings. Migration seeds three disabled defaults:
ejobs.ro, bestjobs.eu, and linkedin.com.

Closes #35

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row; returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from DB at session-start time, preserving the existing snapshot contract.

CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.

JobSearchSettings.Providers list removed — provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (snapshot should always be populated from DB).

Closes #35

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gelu added 1 commit 2026-05-29 08:53:23 +00:00
Provider config is no longer read from appsettings or env vars.
All three providers (ejobs.ro, bestjobs.eu, linkedin.com) are seeded
into cvSearch.JobProviders by the AddJobProviders migration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
gelu added 7 commits 2026-05-29 10:06:08 +00:00
Replace hardcoded "cvSearch" string literals with MigrationConstants.SchemaName
in the Up, InsertData, and Down methods, consistent with all other migrations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Down migration was referencing "emailApi" literal instead of MigrationConstants.SchemaName,
which would have dropped the wrong schema on rollback. Also fix stale comment in DbContext.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PDF text extraction often returns all content as a single run of text with no
newlines. The previous line-based splitter then produced one line longer than
200 chars, which was filtered out, yielding no keywords. Replace it with
word-level sampling of the first 2000 chars, splitting on whitespace and
common delimiters and skipping phone fragments, emails, and URLs.
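
A rough sketch of the word-level sampling described above, written in Python for illustration only. The delimiter set, the phone and URL patterns, the minimum word length, and the `max_keywords` cap are assumptions; the commit message does not specify them.

```python
import re

# Assumed patterns; the actual filters in the C# implementation may differ.
_SPLIT = re.compile(r"[\s,;|()\[\]]+")
_URL = re.compile(r"^(https?://|www\.)", re.IGNORECASE)
_PHONE_FRAGMENT = re.compile(r"^\+?[\d.\-]+$")


def sample_keywords(cv_text: str, max_chars: int = 2000, max_keywords: int = 12) -> list[str]:
    """Pick distinct words from the start of the CV text, skipping noise."""
    keywords: list[str] = []
    seen: set[str] = set()
    # Only the first max_chars characters are sampled, so a CV extracted
    # as one very long line still yields candidates.
    for raw in _SPLIT.split(cv_text[:max_chars]):
        word = raw.strip(".:")
        if len(word) < 2:
            continue
        if "@" in word or _URL.match(word) or _PHONE_FRAGMENT.match(word):
            continue  # email address, URL, or phone fragment
        key = word.lower()
        if key not in seen:
            seen.add(key)
            keywords.append(word)
        if len(keywords) == max_keywords:
            break
    return keywords


text = ("Jane Doe jane@example.com +40 721 000 000 https://example.com "
        "Senior Python Developer, Docker; Kubernetes")
sampled = sample_keywords(text)
```

Note that a heuristic like this still lets through non-skill words such as the candidate's name.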

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Piggybacks keyword extraction onto the existing CV-to-job LLM call —
no extra API calls. The system prompt now instructs the model to return
8-12 English job-search terms (job titles, technologies, skills, domains)
in a new `keywords` field alongside the existing score/summary fields.

Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest →
JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords
(copied at session-creation time, no RAG call needed).

Changes:
- Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity
- IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords
- JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords
  instead of fetching CV text from RAG — removes IRagApiClient dependency
- Remove heuristic ExtractKeywords method
- Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens
- Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed
  to include keywords in the JSON shape
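
The keyword flow above can be illustrated with a small Python sketch. The JSON field names other than `keywords`, the sample values, and the helper names are hypothetical; the handling of commas inside a keyword is also an assumption, since the commit only states that keywords are stored comma-separated.

```python
import json


def parse_match_response(llm_json: str) -> dict:
    """Parse the LLM reply and normalise the optional keywords field."""
    data = json.loads(llm_json)
    raw = data.get("keywords") or []
    data["keywords"] = [k.strip() for k in raw if isinstance(k, str) and k.strip()]
    return data


def to_stored(keywords: list[str]) -> str:
    """Serialise keywords for a single comma-separated column."""
    # A comma inside a keyword would corrupt the format, so replace it.
    return ",".join(k.replace(",", " ") for k in keywords)


def from_stored(stored: str) -> list[str]:
    """Read the comma-separated column back into a list."""
    return [k for k in stored.split(",") if k]


# Hypothetical LLM reply; "score" and "summary" names are assumed.
llm_json = '{"score": 0, "summary": "", "keywords": ["Backend Developer", " C# ", "Docker", ""]}'
response = parse_match_response(llm_json)
token_keywords = to_stored(response["keywords"])   # stored on the token
session_keywords = from_stored(token_keywords)     # copied at session creation
```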

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two migration files had literal schema strings that were missed in earlier passes:
- cv-search-data AddJobSearchTables: two CreateIndex calls used "cvSearch"
- rag-data InitialRagSchema: FK principalSchema used "rag"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deleted stale directories and stray .csproj files that were never added
to the solution after project renames:
- Apis/cv-search-models/  (renamed → cv-search-data)
- Apis/myai-models/       (renamed → myai-data)
- Apis/shared-models/     (empty leftover)
- Apis/cv-search-data/cv-search-models.csproj  (stray old csproj)
- Apis/myai-data/myai-models.csproj            (stray old csproj)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
string.Join("") produced no whitespace between inline-block spans,
causing keywords to visually merge in email clients that collapse margins.
Switched to string.Join(" ") and zeroed left margin on each badge so
they wrap cleanly without a gap on the first item.
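
The fix above amounts to joining the badge spans with a space instead of an empty string. A minimal Python sketch, assuming a hypothetical `render_keyword_badges` helper and illustrative inline styles (the real markup lives in the C# email template):

```python
from html import escape


def render_keyword_badges(keywords: list[str]) -> str:
    """Render keywords as inline-block badges separated by whitespace."""
    # Left margin is zero so the first badge sits flush with the text edge.
    style = "display:inline-block;margin:0 4px 4px 0;padding:2px 8px;"
    spans = [f'<span style="{style}">{escape(k)}</span>' for k in keywords]
    # Joining with " " gives email clients a break opportunity between badges.
    return " ".join(spans)


badges = render_keyword_badges(["C#", "R&D", "Docker"])
```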

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
claude approved these changes 2026-05-29 10:06:32 +00:00
gelu merged commit 5ae65642c4 into main 2026-05-29 10:07:16 +00:00
Reference: AI/myAi#36