ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.
- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
(remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
npx; copy browser binaries to the final image; install Chromium system
libs (Ubuntu noble t64 variants)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add funnel-level logging to HtmlJobSearcher (total anchors found,
stage-1 href-filter count, stage-2 keyword-filter count) and warn
when the keyword list is empty. Log the full search URL and response
size to catch silent HTTP failures or bot-block pages.
In CvSearchJobTask, log keywords and active providers at session start,
per-provider URL counts after each scrape, and every scored URL with its
verdict (ACCEPTED / rejected) at Information level.
Add a scan summary block to the results email (both non-empty and
empty-results paths) showing the CV keywords used as chips and the
comma-separated list of providers scanned.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update CLAUDE.md: replace incorrect 'no XML doc on internal code' rule
with the correct convention (XML doc on all public methods and
non-trivial private/protected helpers)
- Restore /// <summary> on FileDownloadController private helpers
(HandleRangeRequest, StreamRangeAsync)
- Add full XML doc to all service contracts:
ICaptchaVerifier, IEmailSender, ICvMatcherService, IJobTextExtractor,
IJobTokenService, IDocumentClassifier, IRagService, ITextChunker,
ITextExtractor, IEmailTemplateService, ITemplateService
- Add /// <summary> and /// <inheritdoc /> to all concrete service classes
and their methods: RecaptchaVerifier, EmailApiEmailSender,
SmtpEmailDispatcher, CvMatcherService, JobTextExtractor, JobTokenService,
RagService, DocumentClassifier, TextChunker, TextExtractor,
HtmlJobSearcher, CvSearchEmailSender, CvSearchJobTask,
EmailTemplateService, DbTemplateService
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- EmailController: add class summary, full SwaggerResponse/ProducesResponseType
for 400 and 500, and Description on SwaggerOperation
- ContactController: fix terse "Failed." error message to
"Could not process subscription."
- FileDownloadController: remove redundant XML <response code> tags from
the public action doc block; convert private-method /// <summary> to //
(project convention: no XML doc on internal code)
- CvMatcherService: remove two dead commented-out blocks (old email send
and BuildEmailBody helper)
- JobTokenService: comment the phone/contact-line regex filter in
ExtractKeywords
- DocumentClassifier: comment the keyword-frequency scoring approach and
the confidence formula
- TextChunker: comment the sliding-window step (chunkSize - overlap)
- CvSearchJobTask: comment the GdprConsent = true rationale and the
BuildCvFileName sanitisation logic
- HtmlJobSearcher: comment GetLeftPart(UriPartial.Path) query-strip dedup
Closes#26
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New cv-search-models shared library: EF entities + CvSearchDbContext for cvSearch schema (JobSearchTokens, JobSearchSessions, JobSearchResults tables)
- New cv-search-job worker service: polls DB for pending sessions, scrapes job boards via configurable HTML scraping, runs LLM scoring via cv-matcher-api, emails ranked results
- cv-matcher-api: JobTokenService creates one-time tokens; JobSearchController handles link clicks and creates sessions
- api: proxies job-search start endpoint, appends job search link to match result email
- CI workflow updated to build and push myai-cv-search-job:staging image
- CLAUDE.md documentation added for all affected services
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>