The Results table had a unique constraint on (CvDocumentId, JobDocumentId) but the code
expects uniqueness on (CvDocumentId, JobDocumentId, Language). When matching the same CV
against the same job in different languages, this caused duplicate key violations.
Changes:
- Updated CvMatcherDbContext to define 3-column unique index including Language
- Generated proper EF Core migration to drop 2-column index and create 3-column index
- Updated ModelSnapshot to reflect new 3-column index definition
- Added exception handling in SaveMatchAsync to gracefully handle race conditions
  where a duplicate key violation occurs between the existence check and the insert
The migration will be automatically applied on container startup via db.Database.Migrate().
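
For illustration only, a minimal sketch of the check-then-insert pattern with a
3-column unique index, written in Python with sqlite3 rather than the project's
EF Core code. The table, column, and function names here are hypothetical; only
the idea (unique on CV, job, and language, plus catching the violation raised
when another writer wins the race) comes from the description above.

```python
import sqlite3


def open_db():
    """Create an in-memory table with a unique index on (cv_id, job_id, language)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (cv_id INTEGER, job_id INTEGER, language TEXT, score REAL)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX ix_results_cv_job_lang ON results (cv_id, job_id, language)"
    )
    return conn


def save_match(conn, cv_id, job_id, language, score):
    """Insert a result row; return False if the same key is already stored."""
    exists = conn.execute(
        "SELECT 1 FROM results WHERE cv_id = ? AND job_id = ? AND language = ?",
        (cv_id, job_id, language),
    ).fetchone()
    if exists:
        return False
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO results (cv_id, job_id, language, score) VALUES (?, ?, ?, ?)",
                (cv_id, job_id, language, score),
            )
        return True
    except sqlite3.IntegrityError:
        # Another writer inserted the same key between the check and the insert.
        return False
```

With this index, the same CV and job can be stored once per language, while a
repeated (CV, job, language) triple is rejected instead of raising to the caller.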
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The unique constraint on cvMatcher.Results was defined as (CvDocumentId, JobDocumentId)
but the code checks for (CvDocumentId, JobDocumentId, Language). This mismatch caused
duplicate key violations when matching the same CV+Job in different languages.
Update the constraint to (CvDocumentId, JobDocumentId, Language) to allow different
languages for the same CV+Job pair while preventing true duplicates.
Resolves: Duplicate key constraint violations on concurrent/repeated match requests
ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.
- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
(remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
npx; copy browser binaries to the final image; install Chromium system
libs (Ubuntu noble t64 variants)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Individual job listings on bestjobs.eu use /loc-de-munca/{slug} URLs.
The seeded JobLinkContains value /ro/locuri-de-munca/ matched only the
category navigation links (Vanzari, Inginerie, Management...), so
zero job URLs passed the stage-1 href filter and the scraper returned
nothing. Migration updates the stored record (Id=2) to /loc-de-munca/.
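
A small sketch of why the old value matched nothing useful, assuming the stage-1
filter is a plain substring test on each href. The function name and the sample
hrefs are hypothetical, shaped only after the two URL patterns named above.

```python
def filter_job_links(hrefs, link_contains):
    """Keep only hrefs that contain the configured substring."""
    return [href for href in hrefs if link_contains in href]


# Hypothetical hrefs: two category links and one individual job listing.
SAMPLE_HREFS = [
    "/ro/locuri-de-munca/vanzari",
    "/ro/locuri-de-munca/inginerie",
    "/loc-de-munca/example-job-slug",
]
```

Under that assumption, "/ro/locuri-de-munca/" selects only the two category
links, and "/loc-de-munca/" selects only the job listing.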
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deleted stale directories and stray .csproj files that were never added
to the solution after project renames:
- Apis/cv-search-models/ (renamed → cv-search-data)
- Apis/myai-models/ (renamed → myai-data)
- Apis/shared-models/ (empty leftover)
- Apis/cv-search-data/cv-search-models.csproj (stray old csproj)
- Apis/myai-data/myai-models.csproj (stray old csproj)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two migration files had literal schema strings that were missed in earlier passes:
- cv-search-data AddJobSearchTables: two CreateIndex calls used "cvSearch"
- rag-data InitialRagSchema: FK principalSchema used "rag"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Piggybacks keyword extraction onto the existing CV-to-job LLM call —
no extra API calls. The system prompt now instructs the model to return
8-12 English job-search terms (job titles, technologies, skills, domains)
in a new `keywords` field alongside the existing score/summary fields.
Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest →
JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords
(copied at session-creation time, no RAG call needed).
Changes:
- Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity
- IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords
- JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords
instead of fetching CV text from RAG — removes IRagApiClient dependency
- Remove heuristic ExtractKeywords method
- Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens
- Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed
to include keywords in the JSON shape
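
A rough sketch of the first and last steps of the keywords flow, in Python for
illustration (the project code is C#). The function names are hypothetical, and
the trimming, de-duplication, and comma replacement are assumptions of this
sketch, not confirmed behavior of JobTokenService.

```python
import json


def parse_keywords(llm_json):
    """Read the `keywords` array from the model's JSON reply; tolerate its absence."""
    data = json.loads(llm_json)
    raw = data.get("keywords") or []
    seen, result = set(), []
    for item in raw:
        if not isinstance(item, str):
            continue
        keyword = item.strip()
        # Skip blanks and case-insensitive repeats.
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            result.append(keyword)
    return result


def to_stored(keywords):
    """Join keywords into the comma-separated form kept on the token."""
    # A comma inside a keyword would corrupt the stored value, so replace it.
    return ",".join(k.replace(",", " ") for k in keywords)


def from_stored(value):
    """Split the stored comma-separated value back into a list."""
    return [k for k in value.split(",") if k] if value else []
```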
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PDF text extraction often stores all content without newlines. The previous
line-based splitter would produce one line > 200 chars which was filtered out,
yielding empty keywords. Replace with word-level sampling of the first 2000
chars, splitting on whitespace and common delimiters, skipping phone fragments,
emails, and URLs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Down migration was referencing "emailApi" literal instead of MigrationConstants.SchemaName,
which would have dropped the wrong schema on rollback. Also fix stale comment in DbContext.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded "cvSearch" string literals with MigrationConstants.SchemaName
in the Up, InsertData, and Down methods, consistent with all other migrations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Provider config is no longer read from appsettings or env vars.
All three providers (ejobs.ro, bestjobs.eu, linkedin.com) are seeded
into cvSearch.JobProviders by the AddJobProviders migration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row; returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from DB at session-start time, preserving the existing snapshot contract.
CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.
JobSearchSettings.Providers list removed — provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (snapshot should always be populated from DB).
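
The null-token guard described above can be sketched as follows, in Python for
illustration. The Provider type, the function names, the token format, and the
link path are hypothetical; only the behavior (no enabled provider means no
token, and no token means no CTA) is taken from this commit.

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    enabled: bool


def create_token(providers: Iterable[Provider]) -> Optional[str]:
    """Return a new token id, or None when no provider is enabled."""
    if not any(p.enabled for p in providers):
        return None
    return uuid.uuid4().hex


def build_match_email(token_id: Optional[str]) -> str:
    """Append the job-search CTA only when a token was created."""
    lines = ["Your match result is ready."]
    if token_id is not None:
        # Hypothetical link format, for illustration only.
        lines.append(f"Start a job search: /job-search/start?token={token_id}")
    return "\n".join(lines)
```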
Closes #35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New JobProviderEntity persists provider config (name, URL template,
link filter, initial keywords, max results, display order) in the DB
instead of appsettings. Migration seeds three disabled defaults:
ejobs.ro, bestjobs.eu, and linkedin.com.
Closes #35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix schema name references in migration Designer.cs and ModelSnapshot files.
Previously these files contained hardcoded 'emailApi' schema name instead of
using MigrationConstants.SchemaName constant. This was causing EF Core to
detect pending model changes and fail migrations.
Changes:
- 20260528100000_CreateEmailTemplates.Designer.cs: Use MigrationConstants.SchemaName
- 20260528130652_SeedEmailTemplates.Designer.cs: Use MigrationConstants.SchemaName
- EmailApiDbContextModelSnapshot.cs: Use MigrationConstants.SchemaName and updated namespace
Also updated entity namespace references from EmailApi.Data to Email.Data.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Update all Dockerfile COPY commands to reference the renamed email-data project
instead of email-api-data. This resolves Docker build failures introduced by the
email-api-data → email-data rename.
- Apis/api/Dockerfile: Update lines 8 and 20
- Apis/email-api/Dockerfile: Update lines 6 and 17
- Jobs/cv-search-job/Dockerfile: Update lines 10 and 23
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Rename project folder Apis/email-api-data → Apis/email-data
- Rename csproj file: email-api-data.csproj → email-data.csproj
- Update csproj properties: AssemblyName and RootNamespace (email-data, Email.Data)
- Update C# namespaces: EmailApi.Data → Email.Data across all email-data files
- Update project references in api.csproj and email-api.csproj
- Update migration assembly references in api/Program.cs and email-api/Program.cs
- Update cv-search-job references to use email-data project and Email.Data namespace
- Update solution file to reference new email-data project path
- Remove hardcoded schema name from SmtpEmailDispatcher, use template service instead
This keeps the name consistent with the other data projects (no service-type suffix).
All tests passing, build succeeds.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Delete cv-search-models/Data (duplicate of cv-search-data/Data)
- Delete myai-models/Data (duplicate of myai-data/Data)
- DbContext and Entities belong only in -data projects, not -models
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Update MigrationConstants.SchemaName in email-api-data from 'emailApi' to 'email'
- All migrations automatically use the new schema name via MigrationConstants reference
- Aligns with naming convention: 'email', 'rag', 'cvMatcher', 'cvSearch', 'myAi'
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Move IRagRepository, EfRagRepository, and VectorSerializer from rag-api/Data to rag-data/Repositories
- Add rag-api-models ProjectReference to rag-data.csproj for model type availability
- Delete rag-api/Data folder (no longer needed; all data access is now in rag-data)
- This aligns RAG with email-api and other services: all data code in the data project
Pattern: rag-api (API logic) → rag-data (repository, EF entities, migrations)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Each DbContext now explicitly configures its migration history table to use
the schema-qualified name pattern [schemaName].[_Migrations]:
- [cvMatcher].[_Migrations] for CvMatcherDbContext
- [emailApi].[_Migrations] for EmailApiDbContext
- [cvSearch].[_Migrations] for CvSearchDbContext
- [rag].[_Migrations] for RagDbContext
- [myAi].[_Migrations] for MyAiDbContext
This is done in OnConfiguring() by passing x => x.MigrationsHistoryTable(name, schema)
to the UseSqlServer() options builder.
Removed incorrect rename migrations that were created due to a misunderstanding
of the proper EF Core configuration approach.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Rename EmailApiDbContext MigrationTableName from '_EmailApiMigrations' to '_Migrations'
- Rename MyAiDbContext MigrationTableName from '_MyAiMigrations' to '_Migrations'
- Add migrations to rename tables in database: emailApi._EmailApiMigrations → emailApi._Migrations, myAi._MyAiMigrations → myAi._Migrations
- Aligns with naming convention used in other schemas (cvMatcher, cvSearch, rag)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Remove Seed() call from CreateEmailTemplates Up() method to prevent
duplicate key violation when applying SeedEmailTemplates migration.
The original migration was attempting to seed data during schema creation,
but data seeding is now handled by the separate SeedEmailTemplates migration
(20260528130652). Keeping both Seed() calls caused PRIMARY KEY violation on
(email.html-shell.start, *) when the second migration tried to insert
already-existing templates.
This maintains the migration order: schema creation first, then data seeding
in a separate, dedicated migration.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Create SeedEmailTemplates migration (20260528130652) with all email templates
- Add Microsoft.EntityFrameworkCore.Design to email-api.csproj for EF migrations
- Add EmailApiDbContext registration and migration support to email-api Program.cs
- Configure IEmailTemplateRepository and IEmailTemplateService in email-api
- All 14 email templates now seeded in emailApi schema (HTML shells, CV match, job search)
- Templates include proper placeholder support ({{score}}, {{count}}, {{jobLabel}}, etc.)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Both api and cv-search-job need to connect to email-api for sending emails.
Add EmailApi section to their appsettings.json with BaseUrl and InternalApiKey
placeholders. Environment variables from docker-compose will populate these at runtime.
Also add EmailApi credentials to docker-compose/.env:
- EmailApi__BaseUrl=http://email-api:8080
- EmailApi__InternalApiKey=<shared key>
- EmailApi__RequireApiKey=true
This ensures both services can authenticate and call the email-api service.
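
In .NET configuration, a double underscore in an environment variable name
stands for the section separator, so EmailApi__BaseUrl populates the BaseUrl
key of the EmailApi section. The sketch below mimics that mapping in Python,
for illustration only. The function name and the key value are placeholders,
and real .NET key matching is case-insensitive, which this sketch ignores.

```python
def env_to_config(environ):
    """Build a nested dict from variables named Section__Key."""
    config = {}
    for name, value in environ.items():
        parts = name.split("__")
        node = config
        for part in parts[:-1]:
            child = node.get(part)
            if not isinstance(child, dict):
                child = {}
                node[part] = child
            node = child
        node[parts[-1]] = value
    return config


# Example values; the API key is a placeholder, not a real secret.
EXAMPLE_ENV = {
    "EmailApi__BaseUrl": "http://email-api:8080",
    "EmailApi__InternalApiKey": "example-key",
    "EmailApi__RequireApiKey": "true",
}
```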
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The email-api service was missing its configuration file, which is required
for Serilog logging setup, database connection, and SMTP settings.
Created appsettings.json with:
- Serilog configuration for console, file, and email logging
- Database connection settings
- SMTP configuration for email sending
- Internal API key configuration
- File storage path configuration
This fixes the container crash loop caused by missing configuration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The api.csproj references both email-api-data and email-api-models, but the
Dockerfile was not copying them. This caused compilation warnings and potential
build failures.
Added COPY commands for both projects before restore and publish steps.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The email-api.csproj references email-api-data as a project dependency,
but the Dockerfile was not copying it into the build context. This caused
'Skipping project' warnings during restore/publish.
Added COPY commands for both .csproj (before restore) and source directory
(before publish) to include email-api-data in the Docker build.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Docker builds were failing because the centralized package version
management file (Directory.Packages.props) was not being copied into the
build context. This file is required for NuGet to resolve package versions
in projects that don't specify explicit versions.
Updated all Dockerfiles to copy Directory.Packages.props before running
dotnet restore:
- Apis/api/Dockerfile
- Apis/cv-matcher-api/Dockerfile
- Apis/rag-api/Dockerfile
- Jobs/cv-cleanup-job/Dockerfile
- Jobs/cv-search-job/Dockerfile
- web/Dockerfile
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- UseJsonExceptionHandler now maps InvalidOperationException to 400 (was 500),
so upstream business-rule rejections reach the browser as actionable messages.
- CvMatcherController forwards Refit 4xx bodies from cv-matcher-api instead
of swallowing them in a generic 502.
- ErrorResponse.Score removed; CaptchaController puts the score in Detail.
- Frontend extractApiError helper reads the server Error/error/title field for
4xx responses and falls back to a generic i18n string for 5xx / missing body.
- All four failure handlers (CV upload, CV match, contact form, subscribe form)
updated to use extractApiError with the correct rate-limit i18n key.
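
The extractApiError rule can be sketched as below. This is a Python
illustration, not the frontend helper itself: the function signature and the
fallback i18n key are hypothetical, and rate-limit handling is left out.

```python
GENERIC_ERROR_KEY = "errors.generic"  # hypothetical i18n key


def extract_api_error(status, body, fallback_key=GENERIC_ERROR_KEY):
    """Return the server's message for 4xx responses, else a generic i18n key."""
    if 400 <= status < 500 and isinstance(body, dict):
        # Field order follows the commit text: Error, then error, then title.
        for field in ("Error", "error", "title"):
            value = body.get(field)
            if isinstance(value, str) and value.strip():
                return value.strip()
    # 5xx responses and missing or unreadable bodies fall back to a generic string.
    return fallback_key
```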
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Update CLAUDE.md: replace incorrect 'no XML doc on internal code' rule
with the correct convention (XML doc on all public methods and
non-trivial private/protected helpers)
- Restore /// <summary> on FileDownloadController private helpers
(HandleRangeRequest, StreamRangeAsync)
- Add full XML doc to all service contracts:
ICaptchaVerifier, IEmailSender, ICvMatcherService, IJobTextExtractor,
IJobTokenService, IDocumentClassifier, IRagService, ITextChunker,
ITextExtractor, IEmailTemplateService, ITemplateService
- Add /// <summary> and /// <inheritdoc /> to all concrete service classes
and their methods: RecaptchaVerifier, EmailApiEmailSender,
SmtpEmailDispatcher, CvMatcherService, JobTextExtractor, JobTokenService,
RagService, DocumentClassifier, TextChunker, TextExtractor,
HtmlJobSearcher, CvSearchEmailSender, CvSearchJobTask,
EmailTemplateService, DbTemplateService
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- EmailController: add class summary, full SwaggerResponse/ProducesResponseType
for 400 and 500, and Description on SwaggerOperation
- ContactController: fix terse "Failed." error message to
"Could not process subscription."
- FileDownloadController: remove redundant XML <response code> tags from
the public action doc block; convert private-method /// <summary> to //
(project convention: no XML doc on internal code)
- CvMatcherService: remove two dead commented-out blocks (old email send
and BuildEmailBody helper)
- JobTokenService: comment the phone/contact-line regex filter in
ExtractKeywords
- DocumentClassifier: comment the keyword-frequency scoring approach and
the confidence formula
- TextChunker: comment the sliding-window step (chunkSize - overlap)
- CvSearchJobTask: comment the GdprConsent = true rationale and the
BuildCvFileName sanitisation logic
- HtmlJobSearcher: comment GetLeftPart(UriPartial.Path) query-strip dedup
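
For reference, a sliding window with step (chunkSize - overlap) can be sketched
as below. This is a Python illustration under assumptions, not the TextChunker
code: it chunks by character, and the function and parameter names are
hypothetical.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into windows of chunk_size that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

Requiring overlap < chunk_size keeps the step positive, so the loop always
advances.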
Closes #26
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upgrades 8 email body templates from plain text to styled HTML.
Templates: email.match.body, email.match.job-search-footer,
email.search-results.body, email.search-results.empty (en + ro each).
All use inline CSS only (Gmail-compatible). Branded #2c5282 accent.
Closes #22
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- EmailApiEmailSender calls email-api via IEmailApiClient Refit client
- HTML bodies built inline for contact/subscribe/file-download emails
- match and job-search emails use DB templates (rendered in caller)
- SmtpSettings moved from api-models to email-api (kept in Models.Settings namespace)
- MailKit removed from api.csproj
- SmtpEmailSender deleted; IEmailSender interface unchanged
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New internal service that centralises SMTP email sending.
- email-api-models: SendEmailRequest DTO, IEmailApiClient Refit interface, EmailApiSettings
- email-api: SmtpEmailDispatcher (MailKit), EmailController (POST /api/email/send),
branded HTML shell wrapper, shared-Files-volume attachment support
- Protected by X-Internal-Api-Key via UseInternalApiKeyProtection()
- No exposed Docker port — internal network only (http://email-api:8080)
Closes #22
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- New Apis/myai-models project: MyAiDbContext (schema myAi), TemplateEntity,
ITemplateService, DbTemplateService with 10-min in-memory cache
- Seeds EN+RO variants for all user-facing templates (match email, job search
results email, HTML status pages, AI system prompt)
- Match result email now sent in user's UI language (en/ro)
- Job search results email now respects session language
- Language propagates: MatchJobRequest -> token -> session -> email
- Add Language column to JobSearchTokens and JobSearchSessions (default 'en')
- All three Dockerfiles updated to include myai-models in build context
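
The 10-minute template cache can be pictured with the sketch below, in Python
for illustration. The class name, the (key, language) cache key, and the
injectable clock are assumptions of this sketch; DbTemplateService itself may
be structured differently.

```python
import time


class TemplateCache:
    """Cache template lookups per (key, language) for a fixed time-to-live."""

    def __init__(self, loader, ttl_seconds=600, clock=time.monotonic):
        self._loader = loader    # called as loader(key, language) on a miss
        self._ttl = ttl_seconds  # 600 s = 10 minutes
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # (key, language) -> (expires_at, value)

    def get(self, key, language):
        cache_key = (key, language)
        now = self._clock()
        entry = self._entries.get(cache_key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = self._loader(key, language)
        self._entries[cache_key] = (now + self._ttl, value)
        return value
```

Keying on the language as well as the template key keeps the EN and RO
variants from overwriting each other.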
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The frontend sends the active language code (currentLang()) with every match
request. CvMatcherService injects a language instruction into the system prompt
so the LLM returns summary, strengths, gaps, recommendations, and evidence in
the correct language. The match result cache (CvMatchResults) now includes
Language as part of the lookup key so Romanian and English results are stored
and retrieved independently. Existing cached rows default to 'en'.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every public action now has <summary>, <param>, and <returns> XML docs
plus matching SwaggerOperation/SwaggerResponse attributes with typed response
descriptions. Class-level summaries added to CvController, JobSearchController,
and RagController. Explanatory inline comments removed from FileDownloadController
per project conventions.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Uses GetApplicationVersion(Assembly.GetExecutingAssembly()) — the same
timestamp-based version already logged at startup and baked into the
assembly via the csproj <Version> property. Removes the minimal-API
/version endpoint from web/Program.cs and reverts the web Dockerfile
APP_VERSION build-arg (no longer needed).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>