page-fetcher-api always uses Playwright (networkidle by default), so the
per-provider flag that chose between headless and plain HTTP is obsolete.
- Removed from JobProviderEntity, CvSearchDbContext, JobProviderConfig, JobTokenService
- HtmlJobSearcher no longer passes WaitFor (uses page-fetcher-api default)
- EF migration drops the column from cvSearch.JobProviders
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes#41
- Add RequireKeywordInAnchor per-provider flag (default true); set false for
ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their
search URL already filters by relevance server-side
- Update AI system prompts (en + ro) to extract concise job-board-friendly
keywords (role title + key tech, not abstract concepts) and candidate location
- Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest ->
JobSearchTokenEntity -> JobSearchSessionEntity
- Add {location} and {location-slug} substitution in HtmlJobSearcher
- Update provider SearchUrlTemplates to include location:
ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords}
bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords}
linkedin.com: ?keywords={keywords}&location={location}
- Three new migrations: AddRequireKeywordInAnchorAndLocation,
ImproveKeywordsAndAddLocation, AddLocationToProviders
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.
- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
(remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
npx; copy browser binaries to the final image; install Chromium system
libs (Ubuntu noble t64 variants)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row; returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from DB at session-start time, preserving the existing snapshot contract.
CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.
JobSearchSettings.Providers list removed — provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (snapshot should always be populated from DB).
Closes#35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>