The AI prompt now instructs the LLM to derive keywords entirely from the
candidate's CV (seniority level, primary role title, core technologies they
emphasize) rather than from the job description being matched. This ensures
the job-board search keywords used by cv-search-job represent who the
candidate actually is, not a mirror of the job they happened to match against.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Threads the caller's email and client IP through the match pipeline so
every Results row records who triggered the match and from where.
Closes#45
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes#41
- Add RequireKeywordInAnchor per-provider flag (default true); set false for
ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their
search URL already filters by relevance server-side
- Update AI system prompts (en + ro) to extract concise job-board-friendly
keywords (role title + key tech, not abstract concepts) and candidate location
- Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest ->
JobSearchTokenEntity -> JobSearchSessionEntity
- Add {location} and {location-slug} substitution in HtmlJobSearcher
- Update provider SearchUrlTemplates to include location:
ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords}
bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords}
linkedin.com: ?keywords={keywords}&location={location}
- Three new migrations: AddRequireKeywordInAnchorAndLocation,
ImproveKeywordsAndAddLocation, AddLocationToProviders
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The LLM JSON shape was missing the keywords array so res.Keywords was
always empty, causing "none detected" in job search emails. Both en/ro
prompts now include "keywords" in the required JSON shape so the LLM
extracts relevant job-search terms from the CV/job pair.
Note: the cvMatcher.CvMatchResults cache must be cleared on existing DBs
so cached responses (which lack keywords) are not served.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove html-shell entries from InitialSchema Seed() method
- Create new AddHtmlShellTemplates migration to insert html-shell templates
- Prevents duplicate key errors from having same data in two migrations
- InitialSchema seeds 32 templates (16 keys × 2 languages)
- AddHtmlShellTemplates seeds 2 html-shell templates (start, end)
- Total: 34 templates after both migrations run
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Email migration includes seed data for 14 templates (en, ro)
- CV matcher migration includes seed data for 2 AI prompts (en, ro)
- Tables are created successfully by migrations
- Issue: migrationBuilder.Sql() statements not being executed by EF Core
- Workaround needed: Current seeding approach not working automatically
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Refactored the AI prompt system to use proper language-specific prompts (en and ro) instead of a single wildcard prompt with runtime {{languageName}} placeholder substitution.
Benefits:
- Language-specific instructions optimized for each language
- Better control over LLM behavior per language
- Cleaner code without placeholder substitution
- Easier to maintain and update prompts per language
Changes:
- Updated cvMatcher InitialSchema migration to seed en and ro prompts separately
- Modified CvMatcherService to retrieve language-specific prompts directly
- Removed LanguageName() helper method (no longer needed)
- Added fallback prompts in service for safety
The English and Romanian prompts now include specific JSON examples in their respective languages, ensuring the LLM understands the expected output format for each language variant.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Added seeding of the ai.cv-match.system-prompt to the AiPrompts table. This prompt is retrieved by CvMatcherService when scoring CV-job pairs with the LLM. The {{languageName}} placeholder is substituted at runtime based on the requested language.
The prompt has a fallback in the service code, but seeding it ensures the proper version is used and avoids relying on the hardcoded fallback.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Deleted all incremental migrations and regenerated fresh InitialSchema migrations
that contain the complete, correct schema from the start:
- CvMatcher: InitialSchema with 3-column unique constraint on Results
(CvDocumentId, JobDocumentId, Language)
- Email: InitialSchema with Templates table (consolidated from EmailTemplates)
This creates a cleaner migration history and faster fresh deployments. Since there
is no production data, consolidation is safe and improves maintainability.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
When a match result is inserted concurrently between the existence check and the
database insert, log a warning to help with diagnostics while gracefully handling
the idempotent duplicate key violation.
The Results table had a unique constraint on (CvDocumentId, JobDocumentId) but the code
expects uniqueness on (CvDocumentId, JobDocumentId, Language). When matching the same CV
against the same job in different languages, this caused duplicate key violations.
Changes:
- Updated CvMatcherDbContext to define 3-column unique index including Language
- Generated proper EF Core migration to drop 2-column index and create 3-column index
- Updated ModelSnapshot to reflect new 3-column index definition
- Added exception handling in SaveMatchAsync to gracefully handle any race conditions
where duplicate key violations could occur between the existence check and insert
The migration will be automatically applied on container startup via db.Database.Migrate().
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
The unique constraint on cvMatcher.Results was defined as (CvDocumentId, JobDocumentId)
but the code checks for (CvDocumentId, JobDocumentId, Language). This mismatch caused
duplicate key violations when matching the same CV+Job in different languages.
Update the constraint to (CvDocumentId, JobDocumentId, Language) to allow different
languages for the same CV+Job pair while preventing true duplicates.
Resolves: Duplicate key constraint violations on concurrent/repeated match requests
Piggybacks keyword extraction onto the existing CV-to-job LLM call —
no extra API calls. The system prompt now instructs the model to return
8-12 English job-search terms (job titles, technologies, skills, domains)
in a new `keywords` field alongside the existing score/summary fields.
Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest →
JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords
(copied at session-creation time, no RAG call needed).
Changes:
- Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity
- IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords
- JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords
instead of fetching CV text from RAG — removes IRagApiClient dependency
- Remove heuristic ExtractKeywords method
- Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens
- Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed
to include keywords in the JSON shape
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move IRagRepository, EfRagRepository, and VectorSerializer from rag-api/Data to rag-data/Repositories
- Add rag-api-models ProjectReference to rag-data.csproj for model type availability
- Delete rag-api/Data folder (no longer needed; all data access is now in rag-data)
- This aligns RAG with email-api and other services: all data code in the data project
Pattern: rag-api (API logic) → rag-data (repository, EF entities, migrations)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Each DbContext now explicitly configures its migration history table to use
the schema-qualified name pattern [schemaName].[_Migrations]:
- [cvMatcher].[_Migrations] for CvMatcherDbContext
- [emailApi].[_Migrations] for EmailApiDbContext
- [cvSearch].[_Migrations] for CvSearchDbContext
- [rag].[_Migrations] for RagDbContext
- [myAi].[_Migrations] for MyAiDbContext
This is done via OnConfiguring() with UseSqlServer().MigrationsHistoryTable(name, schema).
Removed incorrect rename migrations that were created due to misunderstanding
of the proper EF Core configuration approach.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>