feat: move job providers to DB and suppress job-search link when none enabled #36

Merged
gelu merged 11 commits from feature/job-providers-db-and-link-guard into main 2026-05-29 10:07:16 +00:00

11 Commits

Author SHA1 Message Date
claude 9cf3db089d fix(cv-search-job): separate keyword badges with whitespace in results email
string.Join("") produced no whitespace between inline-block spans,
causing keywords to visually merge in email clients that collapse margins.
Switched to string.Join(" ") and zeroed left margin on each badge so
they wrap cleanly without a gap on the first item.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 13:05:33 +03:00
claude e5b6f19c1a chore: remove orphaned project directories left over from renames
Deleted stale directories and stray .csproj files that were never added
to the solution after project renames:
- Apis/cv-search-models/  (renamed → cv-search-data)
- Apis/myai-models/       (renamed → myai-data)
- Apis/shared-models/     (empty leftover)
- Apis/cv-search-data/cv-search-models.csproj  (stray old csproj)
- Apis/myai-data/myai-models.csproj            (stray old csproj)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:49:01 +03:00
claude 9bedf57f39 fix(migrations): replace hardcoded schema strings with MigrationConstants.SchemaName
Two migration files had literal schema strings that were missed in earlier passes:
- cv-search-data AddJobSearchTables: two CreateIndex calls used "cvSearch"
- rag-data InitialRagSchema: FK principalSchema used "rag"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:46:41 +03:00
claude b78ede23cf feat(job-search): extract keywords from LLM match call instead of heuristics
Piggybacks keyword extraction onto the existing CV-to-job LLM call —
no extra API calls. The system prompt now instructs the model to return
8-12 English job-search terms (job titles, technologies, skills, domains)
in a new `keywords` field alongside the existing score/summary fields.

Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest →
JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords
(copied at session-creation time, no RAG call needed).

Changes:
- Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity
- IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords
- JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords
  instead of fetching CV text from RAG — removes IRagApiClient dependency
- Remove heuristic ExtractKeywords method
- Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens
- Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed
  to include keywords in the JSON shape

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:44:13 +03:00
claude a467fac35d fix(cv-matcher-api): fix keyword extraction for single-line PDF text
PDF text extraction often stores all content without newlines. The previous
line-based splitter would produce one line > 200 chars which was filtered out,
yielding empty keywords. Replace with word-level sampling of the first 2000
chars, splitting on whitespace and common delimiters, skipping phone fragments,
emails, and URLs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:29:10 +03:00
claude 25731868ee fix(email-data): replace hardcoded emailApi schema string with MigrationConstants
Down migration was referencing "emailApi" literal instead of MigrationConstants.SchemaName,
which would have dropped the wrong schema on rollback. Also fix stale comment in DbContext.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:19:58 +03:00
claude c675954f8a fix(cv-search-data): use MigrationConstants.SchemaName in AddJobProviders migration
Replace hardcoded "cvSearch" string literals with MigrationConstants.SchemaName
in the Up, InsertData, and Down methods, consistent with all other migrations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 12:15:44 +03:00
claude c8d1a21736 chore(config): remove Providers array from appsettings — now in DB
Provider config is no longer read from appsettings or env vars.
All three providers (ejobs.ro, bestjobs.eu, linkedin.com) are seeded
into cvSearch.JobProviders by the AddJobProviders migration.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 11:53:17 +03:00
claude d0d45bd2d3 feat(job-search): read providers from DB and suppress link when none enabled
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row; returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from DB at session-start time, preserving the existing snapshot contract.

CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.

JobSearchSettings.Providers list removed — provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (snapshot should always be populated from DB).

Closes #35

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 11:46:44 +03:00
claude 7c09f5a871 feat(cv-search-data): add JobProviders table to cvSearch schema
New JobProviderEntity persists provider config (name, URL template,
link filter, initial keywords, max results, display order) in the DB
instead of appsettings. Migration seeds three disabled defaults:
ejobs.ro, bestjobs.eu, and linkedin.com.

Closes #35

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 11:46:34 +03:00
claude af3a14c7ed feat(cv-search-job): enrich diagnostics and add scan summary to results email
Build and Push Docker Images Staging / build (push) Successful in 24s
Add funnel-level logging to HtmlJobSearcher (total anchors found,
stage-1 href-filter count, stage-2 keyword-filter count) and warn
when the keyword list is empty. Log the full search URL and response
size to catch silent HTTP failures or bot-block pages.

In CvSearchJobTask, log keywords and active providers at session start,
per-provider URL counts after each scrape, and every scored URL with its
verdict (ACCEPTED / rejected) at Information level.

Add a scan summary block to the results email (both non-empty and
empty-results paths) showing the CV keywords used as chips and the
comma-separated list of providers scanned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-29 11:00:04 +03:00