1. Pass session.Language to MatchJobRequest in cv-search-job so the LLM
uses the correct language-specific prompt for job titles and summaries.
2. Replace hardcoded "Manual job description" with a template-driven label
(email.match.manual-job-label) seeded in English and Romanian, so the
match email subject and Job row reflect the user's language.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds nullable JobSearchSessionId to PageFetchEntity and FetchPageRequest.
cv-search-job passes session.Id on every fetch so all Playwright page
loads for a job search session can be traced back to their session.
Includes index on JobSearchSessionId for efficient lookup.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously always stored empty string; now stores the full page text
returned by page-fetcher-api, which is already in scope at save time.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Captures client IP at job-search link-click time and threads it through to the session.
Both Email and ClientIpAddress are copied from session to each result row during processing.
Closes#47
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
page-fetcher-api always uses Playwright (networkidle by default), so the
per-provider flag that chose between headless and plain HTTP is obsolete.
- Removed from JobProviderEntity, CvSearchDbContext, JobProviderConfig, JobTokenService
- HtmlJobSearcher no longer passes WaitFor (uses page-fetcher-api default)
- EF migration drops the column from cvSearch.JobProviders
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Refit 10.1.6 signing certificate was revoked; upgraded to 11.0.1 in Directory.Packages.props
- cv-matcher-api/Dockerfile and cv-search-job/Dockerfile were missing COPY steps
for page-fetcher-api-models (added in this feature branch)
All 8 images now build cleanly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Removes the spurious Api segment to match the pattern used by all other
models projects: CvMatcher.Models.*, Rag.Models.*, PageFetcher.Models.*.
Updated all consumers: email-api, api, cv-search-job.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces page-fetcher-api, a new internal ASP.NET Core service that
centralises all web-page fetching through a single Playwright (headless
Chromium) browser instance. All fetches are persisted to the pageFetcher
SQL schema for auditing.
New projects:
- Apis/page-fetcher-api-models: FetchPageRequest, FetchPageResponse, IPageFetcherApiClient
- Apis/page-fetcher-data: PageFetchDbContext, PageFetchEntity, InitialSchema migration (schema: pageFetcher)
- Apis/page-fetcher-api: PlaywrightBrowserService (singleton), PageFetcherService, PageController
Changes to existing services:
- cv-matcher-api: JobTextExtractor now calls IPageFetcherApiClient instead of HttpClient
- cv-search-job: HtmlJobSearcher uses IPageFetcherApiClient (removes inline Playwright);
CvSearchJobTask fetches individual job pages and applies keyword pre-filter before
LLM call; passes pre-fetched JobDescription to cv-matcher-api to skip re-fetch
- common: add PageFetcherApiSettings
- docker-compose.yml, build.yml: add new service + env vars for callers
Closes#43
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After resolving relative hrefs against the base search URL, some ejobs.ro
links were producing file:/// URIs (e.g. file:///user/locuri-de-munca/...).
These were sent to cv-matcher-api and rejected with HTTP 400, causing 0 matches.
Added a scheme guard after URI resolution to skip any URL that is not
http:// or https://, preventing malformed URLs from reaching the matcher.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Show the candidate's location in the scan summary block of the results email
alongside keywords and providers, for both en and ro templates.
- CvSearchEmailSender.SendResultsAsync accepts location and passes it to BuildScanSummary
- BuildScanSummary passes {{location}} to the template (falls back to '-' when absent)
- CvSearchJobTask passes session.Location to SendResultsAsync
- New migration AddLocationToScanSummaryTemplate updates both language variants of
email.search-results.scan-summary to include a 'Location / Locație căutată' row
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes#41
- Add RequireKeywordInAnchor per-provider flag (default true); set false for
ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their
search URL already filters by relevance server-side
- Update AI system prompts (en + ro) to extract concise job-board-friendly
keywords (role title + key tech, not abstract concepts) and candidate location
- Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest ->
JobSearchTokenEntity -> JobSearchSessionEntity
- Add {location} and {location-slug} substitution in HtmlJobSearcher
- Update provider SearchUrlTemplates to include location:
ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords}
bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords}
linkedin.com: ?keywords={keywords}&location={location}
- Three new migrations: AddRequireKeywordInAnchorAndLocation,
ImproveKeywordsAndAddLocation, AddLocationToProviders
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Serilog.Settings.Configuration cannot deserialize NetworkCredential or
MailKit's SecureSocketOptions from JSON, causing an InvalidOperationException
in the binder and preventing containers from starting.
Fix: remove Email from the WriteTo JSON array entirely and wire it in code
inside ConfigureJsonSerilog using a dedicated SerilogEmail:* config section.
The sink is skipped when From/To/Host are absent, so local dev is unaffected.
Also renames the docker-compose env vars from the verbose
Serilog__WriteTo__2__Args__* prefix to the clean SerilogEmail__* prefix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
HTTP and Playwright fetch failures in HtmlJobSearcher are now logged at
Error so that Serilog's email sink triggers an alert when a job provider
is unreachable. Per-URL match failures remain at Warning (expected noise).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Serilog.Sinks.Email v4 renamed all configuration parameters from their
v2 names. The old names were silently ignored, so no error alert emails
were ever sent.
Parameter renames applied across all 6 appsettings.json and docker-compose:
fromEmail → from
toEmail → to
mailServer → host
networkCredential → credentials
enableSsl: true → connectionSecurity: StartTls
emailSubject → subject
outputTemplate → body
batchPostingLimit / period removed (v4 batching uses a separate overload)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace div-based layouts with table-based HTML throughout (max-width/border-radius/display:inline-block ignored by Outlook)
- email.match.body: width:100% table with per-cell borders and fixed 130px label column
- email.match.job-search-footer: table-based button with bgcolor attribute
- email.search-results.empty: div replaced with full-width table
- email.search-results.body: remove div wrapper around items
- Add email.search-results.scan-summary and email.search-results.item templates
- CvSearchEmailSender: remove all hardcoded HTML; render via IEmailTemplateService
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Refactoring:
- Rename EmailApiDbContext class to EmailDbContext for consistency with other DbContext naming
- Rename DbSet property from EmailTemplates to Templates
- Rename table from EmailTemplates to Templates
- Update all references in Program.cs files (email-api, api, cv-search-job)
- Update all migration files and model snapshot
- Fix cv-search-job migrations assembly name: email-api-data → email-data
This improves naming consistency across the solution.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.
- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
(remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
npx; copy browser binaries to the final image; install Chromium system
libs (Ubuntu noble t64 variants)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
string.Join("") produced no whitespace between inline-block spans,
causing keywords to visually merge in email clients that collapse margins.
Switched to string.Join(" ") and zeroed left margin on each badge so
they wrap cleanly without a gap on the first item.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Provider config is no longer read from appsettings or env vars.
All three providers (ejobs.ro, bestjobs.eu, linkedin.com) are seeded
into cvSearch.JobProviders by the AddJobProviders migration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row; returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from DB at session-start time, preserving the existing snapshot contract.
CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.
JobSearchSettings.Providers list removed — provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (snapshot should always be populated from DB).
Closes#35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add funnel-level logging to HtmlJobSearcher (total anchors found,
stage-1 href-filter count, stage-2 keyword-filter count) and warn
when the keyword list is empty. Log the full search URL and response
size to catch silent HTTP failures or bot-block pages.
In CvSearchJobTask, log keywords and active providers at session start,
per-provider URL counts after each scrape, and every scored URL with its
verdict (ACCEPTED / rejected) at Information level.
Add a scan summary block to the results email (both non-empty and
empty-results paths) showing the CV keywords used as chips and the
comma-separated list of providers scanned.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Update all Dockerfile COPY commands to reference the renamed email-data project
instead of email-api-data. This resolves Docker build failures introduced by the
email-api-data → email-data rename.
- Apis/api/Dockerfile: Update lines 8 and 20
- Apis/email-api/Dockerfile: Update lines 6 and 17
- Jobs/cv-search-job/Dockerfile: Update lines 10 and 23
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Rename project folder Apis/email-api-data → Apis/email-data
- Rename csproj file: email-api-data.csproj → email-data.csproj
- Update csproj properties: AssemblyName and RootNamespace (email-data, Email.Data)
- Update C# namespaces: EmailApi.Data → Email.Data across all email-data files
- Update project references in api.csproj and email-api.csproj
- Update migration assembly references in api/Program.cs and email-api/Program.cs
- Update cv-search-job references to use email-data project and Email.Data namespace
- Update solution file to reference new email-data project path
- Remove hardcoded schema name from SmtpEmailDispatcher, use template service instead
This maintains consistency with other data project naming convention (no service-type suffix).
All tests passing, build succeeds.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Both api and cv-search-job need to connect to email-api for sending emails.
Add EmailApi section to their appsettings.json with BaseUrl and InternalApiKey
placeholders. Environment variables from docker-compose will populate these at runtime.
Also add EmailApi credentials to docker-compose/.env:
- EmailApi__BaseUrl=http://email-api:8080
- EmailApi__InternalApiKey=<shared key>
- EmailApi__RequireApiKey=true
This ensures both services can authenticate and call the email-api service.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The cv-search-job.csproj references both email-api-data and email-api-models,
but the Dockerfile was not copying them into the build context. This caused
compilation errors about missing EmailApi namespace types.
Added COPY commands for both projects before restore and publish steps.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The explicitly added versions were conflicting with the centralized version
definitions in Directory.Packages.props. Removed all explicit versions from:
- web/web.csproj
- Jobs/cv-cleanup-job/cv-cleanup-job.csproj
- Jobs/cv-search-job/cv-search-job.csproj
NuGet will now resolve versions from Directory.Packages.props which has the
canonical version definitions for the entire solution.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Docker builds were failing because the centralized package version
management file (Directory.Packages.props) was not being copied into the
build context. This file is required for NuGet to resolve package versions
in projects that don't specify explicit versions.
Updated all Dockerfiles to copy Directory.Packages.props before running
dotnet restore:
- Apis/api/Dockerfile
- Apis/cv-matcher-api/Dockerfile
- Apis/rag-api/Dockerfile
- Jobs/cv-cleanup-job/Dockerfile
- Jobs/cv-search-job/Dockerfile
- web/Dockerfile
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update CLAUDE.md: replace incorrect 'no XML doc on internal code' rule
with the correct convention (XML doc on all public methods and
non-trivial private/protected helpers)
- Restore /// <summary> on FileDownloadController private helpers
(HandleRangeRequest, StreamRangeAsync)
- Add full XML doc to all service contracts:
ICaptchaVerifier, IEmailSender, ICvMatcherService, IJobTextExtractor,
IJobTokenService, IDocumentClassifier, IRagService, ITextChunker,
ITextExtractor, IEmailTemplateService, ITemplateService
- Add /// <summary> and /// <inheritdoc /> to all concrete service classes
and their methods: RecaptchaVerifier, EmailApiEmailSender,
SmtpEmailDispatcher, CvMatcherService, JobTextExtractor, JobTokenService,
RagService, DocumentClassifier, TextChunker, TextExtractor,
HtmlJobSearcher, CvSearchEmailSender, CvSearchJobTask,
EmailTemplateService, DbTemplateService
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- EmailController: add class summary, full SwaggerResponse/ProducesResponseType
for 400 and 500, and Description on SwaggerOperation
- ContactController: fix terse "Failed." error message to
"Could not process subscription."
- FileDownloadController: remove redundant XML <response code> tags from
the public action doc block; convert private-method /// <summary> to //
(project convention: no XML doc on internal code)
- CvMatcherService: remove two dead commented-out blocks (old email send
and BuildEmailBody helper)
- JobTokenService: comment the phone/contact-line regex filter in
ExtractKeywords
- DocumentClassifier: comment the keyword-frequency scoring approach and
the confidence formula
- TextChunker: comment the sliding-window step (chunkSize - overlap)
- CvSearchJobTask: comment the GdprConsent = true rationale and the
BuildCvFileName sanitisation logic
- HtmlJobSearcher: comment GetLeftPart(UriPartial.Path) query-strip dedup
Closes#26
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add ProjectReference to email-api-data; remove myai-data reference
- Program.cs: register EmailApiDbContext (no migrate), IEmailTemplateRepository
(scoped), IEmailTemplateService (singleton); remove MyAiDbContext +
ITemplateService registrations and their migration call
- CvSearchEmailSender: inject IEmailTemplateService; replace
_config["Contact:ToEmail"] with GetOperatorCopy("email.search-results.subject")
for operator copy logic; remove IConfiguration injection
- docker-compose: remove Contact__ToEmail from cv-search-job service block;
add Database__* env vars to email-api service (needed for EmailApiDbContext)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- CvSearchEmailSender now injects IEmailApiClient instead of IConfiguration+MailKit
- BuildBody updated to produce styled HTML job cards
- CvSearchJobTask passes only filename (not full path) to email sender
- IEmailApiClient registered in Program.cs with EmailApi:BaseUrl + InternalApiKey
- MailKit removed from cv-search-job.csproj
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New Apis/myai-models project: MyAiDbContext (schema myAi), TemplateEntity,
ITemplateService, DbTemplateService with 10-min in-memory cache
- Seeds EN+RO variants for all user-facing templates (match email, job search
results email, HTML status pages, AI system prompt)
- Match result email now sent in user's UI language (en/ro)
- Job search results email now respects session language
- Language propagates: MatchJobRequest -> token -> session -> email
- Add Language column to JobSearchTokens and JobSearchSessions (default 'en')
- All three Dockerfiles updated to include myai-models in build context
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New cv-search-models shared library: EF entities + CvSearchDbContext for cvSearch schema (JobSearchTokens, JobSearchSessions, JobSearchResults tables)
- New cv-search-job worker service: polls DB for pending sessions, scrapes job boards via configurable HTML scraping, runs LLM scoring via cv-matcher-api, emails ranked results
- cv-matcher-api: JobTokenService creates one-time tokens; JobSearchController handles link clicks and creates sessions
- api: proxies job-search start endpoint, appends job search link to match result email
- CI workflow updated to build and push myai-cv-search-job:staging image
- CLAUDE.md documentation added for all affected services
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>