Closes #41
- Add RequireKeywordInAnchor per-provider flag (default true); set it to false
for ejobs.ro and bestjobs.eu so the Stage 2 anchor-text filter is skipped, since
their search URLs already filter by relevance server-side
- Update AI system prompts (en + ro) to extract concise job-board-friendly
keywords (role title + key tech, not abstract concepts) and candidate location
- Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest ->
JobSearchTokenEntity -> JobSearchSessionEntity
- Add {location} and {location-slug} substitution in HtmlJobSearcher
- Update provider SearchUrlTemplates to include location:
    ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords}
    bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords}
    linkedin.com: ?keywords={keywords}&location={location}
- Three new migrations: AddRequireKeywordInAnchorAndLocation,
ImproveKeywordsAndAddLocation, AddLocationToProviders
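
As an illustration of the {keywords}, {location}, and {location-slug} substitution described above, here is a minimal Python sketch. It is not the HtmlJobSearcher code: the function names, the URL-encoding choice, and the slug rules (ASCII, lowercase, hyphen-separated) are assumptions made for the example.

```python
import re
import unicodedata
from urllib.parse import quote_plus


def slugify(text: str) -> str:
    """Lowercase, strip diacritics, and join alphanumeric runs with hyphens.

    Assumption: providers expect ASCII, hyphen-separated slugs.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def build_search_url(template: str, keywords: str, location: str) -> str:
    """Fill a provider template; hypothetical helper, not the real API."""
    return (
        template
        .replace("{keywords}", quote_plus(keywords))
        .replace("{location-slug}", slugify(location))
        .replace("{location}", quote_plus(location))
    )


if __name__ == "__main__":
    print(build_search_url(
        "/locuri-de-munca/{location-slug}?q={keywords}",
        "python developer",
        "Cluj-Napoca",
    ))
```

Values are encoded before insertion, so a brace inside a keyword cannot be mistaken for a later placeholder.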
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ejobs.ro migrated to a Nuxt SPA, so a plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.
- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
(remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
npx; copy browser binaries to the final image; install Chromium system
libs (Ubuntu noble t64 variants)
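
The dispatch and timeout fallback described above can be sketched as follows. This is a language-neutral illustration in Python, not the FetchWithPlaywrightAsync code: the fetchers are injected callables, and RenderTimeout is a hypothetical exception standing in for a NetworkIdle timeout.

```python
from dataclasses import dataclass
from typing import Callable


class RenderTimeout(Exception):
    """Hypothetical: raised when the page never reaches network idle."""

    def __init__(self, partial_html: str):
        super().__init__("network idle timeout")
        self.partial_html = partial_html


@dataclass
class ProviderConfig:
    name: str
    use_headless_browser: bool = False


def fetch_html(
    provider: ProviderConfig,
    url: str,
    http_get: Callable[[str], str],
    render: Callable[[str], str],
) -> str:
    """Use the plain-HTTP path unless the provider opts into rendering."""
    if not provider.use_headless_browser:
        return http_get(url)
    try:
        return render(url)
    except RenderTimeout as exc:
        # Fall back to whatever was rendered so far instead of failing.
        return exc.partial_html
```

Injecting the two fetchers keeps the plain-HTTP path untouched and makes the fallback testable without a browser.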
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Piggybacks keyword extraction onto the existing CV-to-job LLM call, with
no extra API calls. The system prompt now instructs the model to return
8-12 English job-search terms (job titles, technologies, skills, domains)
in a new `keywords` field alongside the existing score/summary fields.
Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest →
JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords
(copied at session-creation time, no RAG call needed).
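
A rough Python sketch of the flow above, from the LLM JSON to the comma-separated column and back. The helper names are hypothetical, and the handling of blank entries is an assumption; the commit only states that keywords are stored comma-separated.

```python
import json
from typing import List, Optional


def keywords_from_llm(llm_json: str) -> List[str]:
    """Read the `keywords` field; a missing field yields an empty list."""
    payload = json.loads(llm_json)
    return [str(k).strip() for k in payload.get("keywords") or [] if str(k).strip()]


def to_column(keywords: List[str]) -> str:
    """Serialize for storage as one comma-separated string."""
    return ",".join(k.strip() for k in keywords if k.strip())


def from_column(value: Optional[str]) -> List[str]:
    """Parse the stored string back into a list, ignoring empty parts."""
    return [part.strip() for part in (value or "").split(",") if part.strip()]
```

One caveat of this format: a keyword that itself contains a comma would be split on the way back, so such terms would need escaping or a different separator.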
Changes:
- Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity
- IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords
- JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords
instead of fetching CV text from RAG — removes IRagApiClient dependency
- Remove heuristic ExtractKeywords method
- Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens
- Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed
to include keywords in the JSON shape
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row and returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from the DB at session-start time, preserving the existing snapshot contract.
CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.
The JobSearchSettings.Providers list is removed; provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (the snapshot should always be populated from the DB).
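
The gating described above, sketched in Python for illustration only. The Provider type, create_token, build_match_email, and the link format are hypothetical stand-ins for the real service and controller.

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Provider:
    name: str
    enabled: bool


def create_token(providers: Iterable[Provider]) -> Optional[str]:
    """Return a token id only when at least one provider is enabled."""
    if not any(p.enabled for p in providers):
        return None  # empty table or all providers disabled
    return uuid.uuid4().hex


def build_match_email(summary: str, token_id: Optional[str], base_url: str) -> str:
    """Append the job-search CTA only when a token exists."""
    lines = [summary]
    if token_id is not None:
        lines.append(f"Start a job search: {base_url}/{token_id}")
    return "\n".join(lines)
```

Returning no token, rather than raising, lets the email still go out without the link.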
Closes #35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New Apis/myai-models project: MyAiDbContext (schema myAi), TemplateEntity,
ITemplateService, DbTemplateService with 10-min in-memory cache
- Seeds EN+RO variants for all user-facing templates (match email, job search
results email, HTML status pages, AI system prompt)
- Match result email now sent in user's UI language (en/ro)
- Job search results email now respects session language
- Language propagates: MatchJobRequest -> token -> session -> email
- Add Language column to JobSearchTokens and JobSearchSessions (default 'en')
- All three Dockerfiles updated to include myai-models in build context
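
To illustrate the 10-minute in-memory template cache mentioned above, here is a small Python sketch. It is not DbTemplateService: the class name, the (key, language) cache key, and the injected loader and clock are assumptions for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachedTemplateService:
    """Caches loaded templates per (key, language) for a fixed TTL."""

    def __init__(
        self,
        loader: Callable[[str, str], str],
        ttl_seconds: float = 600.0,  # 10 minutes
        clock: Callable[[], float] = time.monotonic,
    ):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[float, str]] = {}

    def get(self, key: str, language: str) -> str:
        now = self._clock()
        hit = self._cache.get((key, language))
        if hit is not None and now < hit[0]:
            return hit[1]  # still fresh
        value = self._loader(key, language)  # e.g. a DB query
        self._cache[(key, language)] = (now + self._ttl, value)
        return value
```

With a TTL like this, an edited template row would take effect within ten minutes without a restart.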
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The frontend sends the active language code (currentLang()) with every match
request. CvMatcherService injects a language instruction into the system prompt
so the LLM returns summary, strengths, gaps, recommendations, and evidence in
the correct language. The match result cache (CvMatchResults) now includes
Language as part of the lookup key so Romanian and English results are stored
and retrieved independently. Existing cached rows default to 'en'.
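
A minimal Python sketch of a language-aware cache lookup along the lines described above. The cv_id and job_hash key components are hypothetical; the commit only states that Language is part of the lookup key and that existing rows default to 'en'.

```python
from typing import Dict, Optional, Tuple

CacheKey = Tuple[str, str, str]


def cache_key(cv_id: str, job_hash: str, language: str = "en") -> CacheKey:
    """Build the lookup key; language defaults to 'en' like legacy rows."""
    return (cv_id, job_hash, language.lower())


class MatchResultCache:
    """In-memory stand-in for the CvMatchResults lookup."""

    def __init__(self) -> None:
        self._rows: Dict[CacheKey, str] = {}

    def put(self, cv_id: str, job_hash: str, language: str, result: str) -> None:
        self._rows[cache_key(cv_id, job_hash, language)] = result

    def get(self, cv_id: str, job_hash: str, language: str = "en") -> Optional[str]:
        return self._rows.get(cache_key(cv_id, job_hash, language))
```

Because the language is part of the key, a Romanian request cannot be served an English summary cached for the same CV and job.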
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New cv-search-models shared library: EF entities + CvSearchDbContext for cvSearch schema (JobSearchTokens, JobSearchSessions, JobSearchResults tables)
- New cv-search-job worker service: polls DB for pending sessions, scrapes job boards via configurable HTML scraping, runs LLM scoring via cv-matcher-api, emails ranked results
- cv-matcher-api: JobTokenService creates one-time tokens; JobSearchController handles link clicks and creates sessions
- api: proxies job-search start endpoint, appends job search link to match result email
- CI workflow updated to build and push myai-cv-search-job:staging image
- CLAUDE.md documentation added for all affected services
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>