The unique constraint on cvMatcher.Results was defined as (CvDocumentId, JobDocumentId)
but the code checks for (CvDocumentId, JobDocumentId, Language). This mismatch caused
duplicate key violations when matching the same CV+Job in different languages.
Update the constraint to (CvDocumentId, JobDocumentId, Language) to allow different
languages for the same CV+Job pair while preventing true duplicates.
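The effect of widening the key can be sketched with an in-memory SQLite table. This is only an illustration in Python, not the project's EF Core migration; the table layout and the `try_insert` helper are assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-in for cvMatcher.Results, with the widened unique key.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Results (
        CvDocumentId  INTEGER NOT NULL,
        JobDocumentId INTEGER NOT NULL,
        Language      TEXT    NOT NULL,
        UNIQUE (CvDocumentId, JobDocumentId, Language)
    )
    """
)


def try_insert(cv_id, job_id, language):
    """Return True if the row was stored, False if the unique key rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO Results VALUES (?, ?, ?)", (cv_id, job_id, language)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_en = try_insert(1, 10, "en")   # new CV+Job+Language triple
first_ro = try_insert(1, 10, "ro")   # same pair, different language
repeat_en = try_insert(1, 10, "en")  # true duplicate
```

With the three-column key, the second language is accepted and only the exact repeat is rejected.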
Resolves: Duplicate key constraint violations on concurrent/repeated match requests
ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS
bundle. This change equips cv-search-job with a headless Chromium
(Playwright 1.60) so it can fully render SPA pages before extracting
job links.
- Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig,
and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag
is included in the session provider-config snapshot
- Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL
(remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true
- HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync;
plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial
content rather than failing outright
- Dockerfile: download Playwright Chromium in the SDK build stage via
npx; copy browser binaries to the final image; install Chromium system
libs (Ubuntu noble t64 variants)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Individual job listings on bestjobs.eu use /loc-de-munca/{slug} URLs.
The seeded JobLinkContains value /ro/locuri-de-munca/ matched only the
category navigation links (Vanzari, Inginerie, Management...), so
zero job URLs passed the stage-1 href filter and the scraper returned
nothing. Migration updates the stored record (Id=2) to /loc-de-munca/.
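A minimal sketch of why the old value let no job URLs through, assuming the stage-1 filter is a plain substring test on each href (the function name and the sample hrefs are hypothetical, written in Python for illustration):

```python
def stage1_filter(hrefs, job_link_contains):
    """Keep only hrefs that contain the configured JobLinkContains value."""
    return [h for h in hrefs if job_link_contains in h]


# Hypothetical hrefs: two category navigation links and one job listing.
hrefs = [
    "/ro/locuri-de-munca/vanzari",
    "/ro/locuri-de-munca/inginerie",
    "/loc-de-munca/example-job-slug",
]

old_matches = stage1_filter(hrefs, "/ro/locuri-de-munca/")  # categories only
new_matches = stage1_filter(hrefs, "/loc-de-munca/")        # job listing only
```

Note that "/loc-de-munca/" is not a substring of "/locuri-de-munca/", so the new value does not pick up the category links either.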
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
string.Join("") produced no whitespace between inline-block spans,
causing keywords to visually merge in email clients that collapse margins.
Switched to string.Join(" ") and zeroed left margin on each badge so
they wrap cleanly without a gap on the first item.
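The difference between the two separators can be sketched as follows. This is a Python illustration only; the real template is C#, and the `render_badges` helper and the exact inline style are assumptions.

```python
import html


def render_badges(keywords, separator):
    """Render each keyword as an inline-block badge with no left margin."""
    spans = [
        '<span style="display:inline-block;margin:0 6px 6px 0">'
        + html.escape(kw)
        + "</span>"
        for kw in keywords
    ]
    return separator.join(spans)


merged = render_badges(["python", "sql"], "")   # old behaviour: spans touch
spaced = render_badges(["python", "sql"], " ")  # new behaviour: space between
```

The literal space gives clients that ignore the margin something to wrap on, and the zero left margin keeps the first badge flush with the container edge.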
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Deleted stale directories and stray .csproj files that were never added
to the solution after project renames:
- Apis/cv-search-models/ (renamed → cv-search-data)
- Apis/myai-models/ (renamed → myai-data)
- Apis/shared-models/ (empty leftover)
- Apis/cv-search-data/cv-search-models.csproj (stray old csproj)
- Apis/myai-data/myai-models.csproj (stray old csproj)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two migration files had literal schema strings that were missed in earlier passes:
- cv-search-data AddJobSearchTables: two CreateIndex calls used "cvSearch"
- rag-data InitialRagSchema: FK principalSchema used "rag"
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Piggybacks keyword extraction onto the existing CV-to-job LLM call —
no extra API calls. The system prompt now instructs the model to return
8-12 English job-search terms (job titles, technologies, skills, domains)
in a new `keywords` field alongside the existing score/summary fields.
Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest →
JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords
(copied at session-creation time, no RAG call needed).
Changes:
- Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity
- IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords
- JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords
instead of fetching CV text from RAG — removes IRagApiClient dependency
- Remove heuristic ExtractKeywords method
- Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens
- Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed
to include keywords in the JSON shape
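The keyword flow described above can be sketched roughly as below. This is a Python illustration under stated assumptions: the JSON key names `score` and `summary`, and all three helper functions, are hypothetical; only the `keywords` field and the comma-separated storage come from this change.

```python
import json


def parse_match_response(raw_json):
    """Read score, summary, and keywords from the model's JSON reply."""
    data = json.loads(raw_json)
    keywords = [
        k.strip()
        for k in data.get("keywords", [])
        if isinstance(k, str) and k.strip()
    ]
    return {
        "score": data.get("score"),
        "summary": data.get("summary"),
        "keywords": keywords,
    }


def to_stored(keywords):
    """Serialize keywords the way the token column holds them."""
    return ",".join(keywords)


def from_stored(stored):
    """Restore the keyword list when the session is created."""
    return [k for k in stored.split(",") if k]


raw = '{"score": 82, "summary": "ok", "keywords": ["C#", " .NET ", "backend developer"]}'
parsed = parse_match_response(raw)
stored = to_stored(parsed["keywords"])
restored = from_stored(stored)
```

A keyword that itself contains a comma would not survive this round trip, so a real implementation would need to reject or escape such values.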
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PDF text extraction often stores all content without newlines. The previous
line-based splitter would produce a single line longer than 200 chars, which was filtered out,
yielding empty keywords. Replace with word-level sampling of the first 2000
chars, splitting on whitespace and common delimiters, skipping phone fragments,
emails, and URLs.
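A rough sketch of word-level sampling, written in Python for illustration. The regular expressions, the delimiter set, the keyword cap, and the function name are all assumptions; only the 2000-char window and the skip rules come from this change.

```python
import re

SAMPLE_CHARS = 2000  # only the start of the CV text is sampled
URL_RE = re.compile(r"^(https?://|www\.)", re.IGNORECASE)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^[+\d][\d\-.]*$")  # digits and phone punctuation only
DELIMS_RE = re.compile(r"[,;|•·()\[\]]+")


def extract_keywords(text, max_keywords=12):
    """Collect distinct words, skipping URLs, emails, and phone fragments."""
    keywords, seen = [], set()
    for chunk in text[:SAMPLE_CHARS].split():
        chunk = chunk.strip(".,;:")
        # Check URLs and emails before splitting on delimiters.
        if not chunk or URL_RE.match(chunk) or EMAIL_RE.match(chunk):
            continue
        for word in DELIMS_RE.split(chunk):
            word = word.strip(".:-")
            if len(word) < 2 or PHONE_RE.match(word):
                continue
            key = word.lower()
            if key not in seen:
                seen.add(key)
                keywords.append(word)
                if len(keywords) == max_keywords:
                    return keywords
    return keywords


# One long line with no newlines, as PDF extraction often produces.
sample = (
    "Jane Doe jane@example.com +40 721-555-010 "
    "https://example.com/cv Python, SQL; Docker (Kubernetes)"
)
result = extract_keywords(sample)
```

Because the input is never split on newlines, a CV stored as one long line still yields keywords.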
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Down migration was referencing "emailApi" literal instead of MigrationConstants.SchemaName,
which would have dropped the wrong schema on rollback. Also fix stale comment in DbContext.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded "cvSearch" string literals with MigrationConstants.SchemaName
in the Up, InsertData, and Down methods, consistent with all other migrations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Provider config is no longer read from appsettings or env vars.
All three providers (ejobs.ro, bestjobs.eu, linkedin.com) are seeded
into cvSearch.JobProviders by the AddJobProviders migration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any
enabled row; returns null (no token created) when the table is empty or
all providers are disabled. TriggerStartAsync snapshots enabled providers
from DB at session-start time, preserving the existing snapshot contract.
CvMatcherController guards link-building on a non-null TokenId so the
"Start a job search" CTA is omitted from match emails when no providers
are configured.
JobSearchSettings.Providers list removed — provider config now lives
exclusively in the DB. CvSearchJobTask.GetProviders falls back to an
empty list with a warning (snapshot should always be populated from DB).
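The null-token guard can be sketched as below, in Python for illustration. The `Provider` type, both functions, and the example.invalid link are hypothetical; the real code is spread across JobTokenService and CvMatcherController.

```python
import uuid
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    name: str
    enabled: bool


def create_token(providers: List[Provider]) -> Optional[str]:
    """Return a token id, or None when no provider is enabled."""
    if not any(p.enabled for p in providers):
        return None
    return uuid.uuid4().hex


def build_match_email(score: int, token_id: Optional[str]) -> str:
    """Include the job-search CTA only when a token was created."""
    lines = [f"Match score: {score}"]
    if token_id is not None:
        lines.append(
            f"Start a job search: https://example.invalid/job-search?token={token_id}"
        )
    return "\n".join(lines)


no_token = create_token([Provider("ejobs.ro", enabled=False)])
some_token = create_token([Provider("ejobs.ro", enabled=True)])
email_without_cta = build_match_email(80, no_token)
email_with_cta = build_match_email(80, some_token)
```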
Closes #35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New JobProviderEntity persists provider config (name, URL template,
link filter, initial keywords, max results, display order) in the DB
instead of appsettings. Migration seeds three disabled defaults:
ejobs.ro, bestjobs.eu, and linkedin.com.
Closes #35
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add funnel-level logging to HtmlJobSearcher (total anchors found,
stage-1 href-filter count, stage-2 keyword-filter count) and warn
when the keyword list is empty. Log the full search URL and response
size to catch silent HTTP failures or bot-block pages.
In CvSearchJobTask, log keywords and active providers at session start,
per-provider URL counts after each scrape, and every scored URL with its
verdict (ACCEPTED / rejected) at Information level.
Add a scan summary block to the results email (both non-empty and
empty-results paths) showing the CV keywords used as chips and the
comma-separated list of providers scanned.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix schema name references in migration Designer.cs and ModelSnapshot files.
Previously these files contained hardcoded 'emailApi' schema name instead of
using MigrationConstants.SchemaName constant. This was causing EF Core to
detect pending model changes and fail migrations.
Changes:
- 20260528100000_CreateEmailTemplates.Designer.cs: Use MigrationConstants.SchemaName
- 20260528130652_SeedEmailTemplates.Designer.cs: Use MigrationConstants.SchemaName
- EmailApiDbContextModelSnapshot.cs: Use MigrationConstants.SchemaName and updated namespace
Also updated entity namespace references from EmailApi.Data to Email.Data.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Update all Dockerfile COPY commands to reference the renamed email-data project
instead of email-api-data. This resolves Docker build failures introduced by the
email-api-data → email-data rename.
- Apis/api/Dockerfile: Update lines 8 and 20
- Apis/email-api/Dockerfile: Update lines 6 and 17
- Jobs/cv-search-job/Dockerfile: Update lines 10 and 23
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Rename project folder Apis/email-api-data → Apis/email-data
- Rename csproj file: email-api-data.csproj → email-data.csproj
- Update csproj properties: AssemblyName and RootNamespace (email-data, Email.Data)
- Update C# namespaces: EmailApi.Data → Email.Data across all email-data files
- Update project references in api.csproj and email-api.csproj
- Update migration assembly references in api/Program.cs and email-api/Program.cs
- Update cv-search-job references to use email-data project and Email.Data namespace
- Update solution file to reference new email-data project path
- Remove hardcoded schema name from SmtpEmailDispatcher, use template service instead
This keeps naming consistent with the other data projects (no service-type suffix).
All tests pass and the build succeeds.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Delete cv-search-models/Data (duplicate of cv-search-data/Data)
- Delete myai-models/Data (duplicate of myai-data/Data)
- DbContext and Entities belong only in -data projects, not -models
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Update MigrationConstants.SchemaName in email-api-data from 'emailApi' to 'email'
- All migrations automatically use the new schema name via MigrationConstants reference
- Aligns with naming convention: 'email', 'rag', 'cvMatcher', 'cvSearch', 'myAi'
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Move IRagRepository, EfRagRepository, and VectorSerializer from rag-api/Data to rag-data/Repositories
- Add rag-api-models ProjectReference to rag-data.csproj for model type availability
- Delete rag-api/Data folder (no longer needed; all data access is now in rag-data)
- This aligns RAG with email-api and other services: all data code in the data project
Pattern: rag-api (API logic) → rag-data (repository, EF entities, migrations)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Added inline comments throughout myai.css to:
- Clarify complex CSS selectors (input:not selectors, specificity explanations)
- Explain design patterns (.is-invalid error state pattern)
- Document focus and error states
- Describe layout decisions (sticky result panel, hamburger dropdown)
- Clarify responsive breakpoints and what changes at each
- Explain the relationship between CSS and JS (e.g., .is-open, .is-invalid)
Comments are strategic and concise—added to complex/non-obvious sections
without bloating the file. All CSS rules remain unchanged—purely additive
documentation.
Key sections now have better context:
- Form field selectors and state handling
- Hamburger menu responsive behavior
- Result panel sticky positioning strategy
- Responsive grid layout changes at 900px and 560px
- Cookie banner and loader overlay behavior
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Refactored legal.js from 135 → 124 lines (8% reduction) by:
- Removing local browserLang() and getLang() that are now in utils
- Simplifying to focus on page-specific injection logic
Kept legal page-specific functionality:
- Local LANG_KEY storage for page language preference
- injectTopbar() with language switcher buttons
- injectFooter() with language-aware copyright and legal links
- Event delegation for language link clicks
- DOMContentLoaded handler
Added clear JSDoc comments explaining the injection pattern and
how legal pages dynamically reuse common UI elements while supporting
language switching via event delegation.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Refactored cv-matcher.js from 257 → 213 lines (17% reduction) by:
- Removing duplicate helper functions now in utils/form-helpers.js
- Removing duplicate i18n logic now in utils/i18n.js
- Removing API loading code now in utils/api.js
Kept CV matcher-specific logic:
- CV file input change handler
- Async CV upload and match flow with two captcha tokens
- Match result rendering with score badge and lists
- escapeHtml() XSS prevention utility
- $(window).on('load') to load reCaptcha
All function calls updated to use window.MyAi.* utilities for consistency.
Added detailed JSDoc comments explaining the two-step async flow.
Updated cv-matcher/index.html to load all utilities before cv-matcher.js.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Refactored main.js from 544 → 266 lines (51% reduction) by:
- Removing duplicate functions now in utils/form-helpers.js
- Removing duplicate i18n logic now in utils/i18n.js
- Removing API loading code now in utils/api.js
- Removing cookie consent handlers now in modules/cookie-consent.js
Kept only page-specific form handlers:
- Contact form submission with reCaptcha
- Subscribe form submission with reCaptcha
- Language switcher initialization
- Footer year and version display
All calls now use window.MyAi.* utilities for consistency.
Updated index.html to load all utilities before main.js.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Create reusable utility modules to eliminate duplication across main.js,
cv-matcher.js, and legal.js:
- js/utils/form-helpers.js: showFieldError, clearFieldErrors, isValidEmail,
extractApiError — shared form validation and error handling
- js/utils/i18n.js: currentLang, t, applyLanguage, updateLegalLinks,
browserLang — shared translation and language switching
- js/utils/api.js: checkApiLive, getRecaptchaWebKey, getGoogleTagManagerId,
loadGoogleTagManager — shared API configuration loading
- js/modules/cookie-consent.js: getConsent, setConsent, initConsent,
setupConsentHandlers — cookie banner and consent management
All utilities exposed on window.MyAi namespace for use by existing pages.
Full JSDoc headers and inline comments for maintainability.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Each DbContext now explicitly configures its migration history table to use
the schema-qualified name pattern [schemaName].[_Migrations]:
- [cvMatcher].[_Migrations] for CvMatcherDbContext
- [emailApi].[_Migrations] for EmailApiDbContext
- [cvSearch].[_Migrations] for CvSearchDbContext
- [rag].[_Migrations] for RagDbContext
- [myAi].[_Migrations] for MyAiDbContext
This is done in OnConfiguring() by calling MigrationsHistoryTable(name, schema) in the UseSqlServer() options callback.
Removed incorrect rename migrations that were created due to misunderstanding
of the proper EF Core configuration approach.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Rename EmailApiDbContext MigrationTableName from '_EmailApiMigrations' to '_Migrations'
- Rename MyAiDbContext MigrationTableName from '_MyAiMigrations' to '_Migrations'
- Add migrations to rename tables in database: emailApi._EmailApiMigrations → emailApi._Migrations, myAi._MyAiMigrations → myAi._Migrations
- Aligns with naming convention used in other schemas (cvMatcher, cvSearch, rag)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Remove Seed() call from CreateEmailTemplates Up() method to prevent
duplicate key violation when applying SeedEmailTemplates migration.
The original migration was attempting to seed data during schema creation,
but data seeding is now handled by the separate SeedEmailTemplates migration
(20260528130652). Keeping both Seed() calls caused PRIMARY KEY violation on
(email.html-shell.start, *) when the second migration tried to insert
already-existing templates.
This maintains the migration order: schema creation first, then data seeding
in a separate, dedicated migration.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Create SeedEmailTemplates migration (20260528130652) with all email templates
- Add Microsoft.EntityFrameworkCore.Design to email-api.csproj for EF migrations
- Add EmailApiDbContext registration and migration support to email-api Program.cs
- Configure IEmailTemplateRepository and IEmailTemplateService in email-api
- All 14 email templates now seeded in emailApi schema (HTML shells, CV match, job search)
- Templates include proper placeholder support ({{score}}, {{count}}, {{jobLabel}}, etc.)
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Both api and cv-search-job need to connect to email-api for sending emails.
Add EmailApi section to their appsettings.json with BaseUrl and InternalApiKey
placeholders. Environment variables from docker-compose will populate these at runtime.
Also add EmailApi credentials to docker-compose/.env:
- EmailApi__BaseUrl=http://email-api:8080
- EmailApi__InternalApiKey=<shared key>
- EmailApi__RequireApiKey=true
This ensures both services can authenticate and call the email-api service.
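The .NET configuration system treats a double underscore in an environment variable name as a section separator, which is how the variables above land in the EmailApi section. A small Python sketch of that mapping (the `env_to_config` helper is hypothetical, and real .NET keys are also case-insensitive, which this ignores):

```python
def env_to_config(env):
    """Turn Section__Key environment variables into nested dictionaries."""
    config = {}
    for key, value in env.items():
        parts = key.split("__")
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return config


env = {
    "EmailApi__BaseUrl": "http://email-api:8080",
    "EmailApi__RequireApiKey": "true",
}
config = env_to_config(env)
```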
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Updated general-dev-workflow skill to stress that for Docker projects,
BOTH build success AND successful container startup are required before
considering a change complete.
Key additions:
- Docker container startup verification is mandatory, not optional
- Containers must have status 'Up', not 'Restarting' or 'Exited'
- Check container logs for errors during startup
- Missing config files, invalid env vars, DB issues only show at runtime
- Startup failures block production deployments
- Updated Phase 4 checkpoint to include container startup validation
Lesson from the email-api issue: a missing appsettings.json caused a restart loop
that produced no visible errors at build time. Such failures must be caught in
Phase 4, before code review, so they never reach production.
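The status rule above could be checked with something like the following Python sketch. It assumes status strings in the form printed by `docker ps` (for example "Up 2 minutes" or "Restarting (1) 5 seconds ago"); the container names and both helpers are hypothetical.

```python
def is_healthy_startup(status):
    """A container counts as started only if its status begins with 'Up'."""
    return status.strip().lower().startswith("up")


def failing_containers(statuses):
    """Return the names of containers that are not 'Up', sorted by name."""
    return sorted(
        name for name, status in statuses.items() if not is_healthy_startup(status)
    )


statuses = {
    "api": "Up 2 minutes",
    "email-api": "Restarting (1) 5 seconds ago",
    "web": "Exited (139) 1 minute ago",
}
failures = failing_containers(statuses)
```

Any name in `failures` still needs its container logs inspected before the change is considered complete.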
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merges refactor/web-cleanup-31 into main with all infrastructure fixes:
- Directory.Packages.props handling in all Dockerfiles
- Missing email-api-data dependencies in build configs
- Missing appsettings.json for email-api
- myai-smoke-test skill for automated end-to-end testing
- general-dev-workflow skill updates for build verification
All 7 containers building and running successfully.
Closes #33
The email-api service was missing its configuration file, which is required
for Serilog logging setup, database connection, and SMTP settings.
Created appsettings.json with:
- Serilog configuration for console, file, and email logging
- Database connection settings
- SMTP configuration for email sending
- Internal API key configuration
- File storage path configuration
This fixes the container crash loop caused by missing configuration.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updated Phase 4 (Test) to include mandatory build verification:
- dotnet build for .NET projects
- docker compose build for Docker projects
- Catch missing dependencies and configuration issues early
- Prevent build failures during code review and CI/CD
Also added tip about verifying builds before opening PR, and updated
Phase 4 checkpoint to include successful build requirement.
Lesson learned from docker build issues: catching these early saves
reviewers and CI/CD time, and prevents 'works on my machine' problems.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The email-api service was missing from the CI/CD build pipeline. Added:
- EMAIL_API_IMAGE environment variable
- Build step for email-api Dockerfile
- Push step for email-api image to registry
This ensures email-api images are built and pushed alongside other services
during the staging build workflow.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The api.csproj references both email-api-data and email-api-models, but the
Dockerfile was not copying them. This caused compilation warnings and potential
build failures.
Added COPY commands for both projects before restore and publish steps.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The cv-search-job.csproj references both email-api-data and email-api-models,
but the Dockerfile was not copying them into the build context. This caused
compilation errors about missing EmailApi namespace types.
Added COPY commands for both projects before restore and publish steps.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The email-api.csproj references email-api-data as a project dependency,
but the Dockerfile was not copying it into the build context. This caused
'Skipping project' warnings during restore/publish.
Added COPY commands for both .csproj (before restore) and source directory
(before publish) to include email-api-data in the Docker build.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The explicitly added versions were conflicting with the centralized version
definitions in Directory.Packages.props. Removed all explicit versions from:
- web/web.csproj
- Jobs/cv-cleanup-job/cv-cleanup-job.csproj
- Jobs/cv-search-job/cv-search-job.csproj
NuGet will now resolve versions from Directory.Packages.props which has the
canonical version definitions for the entire solution.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Docker builds were failing because the centralized package version
management file (Directory.Packages.props) was not being copied into the
build context. This file is required for NuGet to resolve package versions
in projects that don't specify explicit versions.
Updated all Dockerfiles to copy Directory.Packages.props before running
dotnet restore:
- Apis/api/Dockerfile
- Apis/cv-matcher-api/Dockerfile
- Apis/rag-api/Dockerfile
- Jobs/cv-cleanup-job/Dockerfile
- Jobs/cv-search-job/Dockerfile
- web/Dockerfile
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The PackageReference items were missing version specifications:
- Microsoft.VisualStudio.Azure.Containers.Tools.Targets (added 1.21.0)
- Yarp.ReverseProxy (added 2.2.0)
This fixes the NuGet error NU1015 that prevented Docker builds from
successfully restoring package dependencies.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The smoke test now correctly includes docker-compose.override.yml which
configures local image builds instead of pulling from remote registry.
This fixes the 'failed to resolve reference' error when building containers
locally.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>