cv-matcher-api — Internal CV Match Engine
Listens on internal port 8082. Reachable only from api and cv-search-job, which must authenticate with X-Internal-Api-Key.
Responsibilities
- Indexes CV PDFs into the RAG system via rag-api
- Matches a CV against a job posting URL (scrapes job HTML, scores pair with LLM)
- Manages job search tokens and sessions for the one-click job search feature
- Owns two EF DbContexts: CvMatcherDbContext (schema cvMatcher) and CvSearchDbContext (schema cvSearch)
- Runs EF migrations for both contexts on startup
Key routes
| Method | Route | Description |
|---|---|---|
| POST | /api/cv/upload | Index CV PDF into RAG |
| POST | /api/cv/match-job | Score CV against a job URL (LLM call) |
| POST | /api/cv/find-jobs | Find matching jobs from the RAG index |
| POST | /api/cv/job-search/token | Create a job search token (called by api after a match) |
| POST | /api/cv/job-search/token/{tokenId}/start | Validate token, create Pending session (called by api on link click) |
| GET | /api/health | Health check |
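
As an illustration of how a caller such as api might address these routes, here is a minimal Python sketch that builds (but does not send) a request for the token start route. It assumes X-Internal-Api-Key is sent as an HTTP request header; the host name `cv-matcher-api`, the key value, and the helper names are hypothetical and not taken from the service code.

```python
import json
import urllib.request

# Hypothetical internal host name; only the port (8082) comes from this document.
BASE_URL = "http://cv-matcher-api:8082"


def build_internal_request(path, api_key, payload=None, method="POST"):
    """Build (without sending) a request that carries the internal API key."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Assumption: the key travels as an HTTP header with this name.
    req.add_header("X-Internal-Api-Key", api_key)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


def start_job_search_request(token_id, api_key):
    """Request for POST /api/cv/job-search/token/{tokenId}/start."""
    path = f"/api/cv/job-search/token/{token_id}/start"
    return build_internal_request(path, api_key)
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Note that `urllib` normalizes header capitalization, which is harmless because HTTP header names are case-insensitive.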
Core services
- CvMatcherService: orchestrates upload + match; calls IRagApiClient and IMatcherAiClient
- JobTextExtractor: fetches a job page URL and extracts plain text
- JobTokenService: creates tokens; validates + starts job search sessions; extracts CV keywords using simple heuristics (first 5 meaningful non-empty lines of CV text, split into words)
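
The validate-then-start step of JobTokenService could look roughly like the Python sketch below. It is not the service's implementation: it assumes a token is single-use and expires `expiry_days` after creation (standing in for JobSearch:TokenExpiryDays), and the dataclass shapes and field names are hypothetical rather than the real EF entities.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class JobSearchToken:  # hypothetical shape, not the real entity
    token_id: str
    created_at: datetime
    used_at: Optional[datetime] = None


@dataclass
class JobSearchSession:  # hypothetical shape, not the real entity
    token_id: str
    status: str
    keywords: str


def start_session(token, keywords, now, expiry_days):
    """Return a Pending session, or None if the token was used or has expired."""
    if token.used_at is not None:
        return None  # assumed single-use: a second click is rejected
    if now - token.created_at > timedelta(days=expiry_days):
        return None  # assumed expiry window in days
    token.used_at = now  # mark the token as consumed
    return JobSearchSession(token.token_id, "Pending", keywords)
```

A worker such as cv-search-job would then pick up sessions whose status is Pending.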
AI providers
Configured under Ai:Provider (OpenAI or Ollama). Both providers implement IMatcherAiClient.
Default model: gpt-4o-mini. Timeout: 90 s.
Database contexts
Both contexts use the same SQL Server connection string (from Database:* settings).
- CvMatcherDbContext: schema cvMatcher; migrations in the cv-matcher-api assembly
- CvSearchDbContext: schema cvSearch; migrations in the cv-search-models assembly (MigrationsAssembly = "cv-search-models")
Keyword extraction (JobTokenService.ExtractKeywords)
No LLM call. Takes the first 5 non-empty lines of CV text that are:
- Longer than 5 characters
- Neither purely numeric nor matching a contact-line pattern
Splits into words, strips punctuation, deduplicates, returns up to 10 comma-separated keywords.
These keywords are stored in JobSearchSessionEntity.Keywords and used by cv-search-job for scraping.
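
The heuristic described above can be sketched in Python as follows. This is an illustration under assumptions, not the ExtractKeywords code: the contact-line regex, the case-insensitive deduplication, stripping only leading and trailing punctuation, and joining with a bare comma are all guesses about details this document does not specify.

```python
import re
import string

# Assumed contact-line pattern: e-mail addresses, URLs, phone-like digit runs.
CONTACT_RE = re.compile(r"@|https?://|www\.|\+?\d[\d\s().-]{6,}\d")


def extract_keywords(cv_text, max_lines=5, max_keywords=10):
    """Pick the first qualifying lines, then return deduplicated words."""
    lines = []
    for raw in cv_text.splitlines():
        line = raw.strip()
        if len(line) <= 5:
            continue  # skips empty lines and lines of 5 characters or fewer
        if line.replace(" ", "").isdigit():
            continue  # purely numeric
        if CONTACT_RE.search(line):
            continue  # looks like a contact line (assumed pattern)
        lines.append(line)
        if len(lines) == max_lines:
            break

    keywords, seen = [], set()
    for line in lines:
        for word in line.split():
            cleaned = word.strip(string.punctuation)  # edge punctuation only
            key = cleaned.lower()
            if cleaned and key not in seen:
                seen.add(key)
                keywords.append(cleaned)
    return ",".join(keywords[:max_keywords])
```

One consequence of stripping punctuation this way is that tokens such as `C#` or `.NET` lose their symbols, so a real implementation may need to special-case them.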
Settings
| Section | Notes |
|---|---|
| Database | Shared SQL Server connection |
| RagApi | BaseUrl + InternalApiKey for rag-api |
| Ai | Provider, model, timeout |
| Matcher | TopK, DeepScoreTopN, MaxJobTextChars |
| JobSearch | TokenExpiryDays, providers list (stored in session JSON) |
| InternalApi | ApiKey used by UseInternalApiKeyProtection middleware |