# cv-matcher-api: Internal CV Match Engine

Internal port 8082. Only reachable from `api` and `cv-search-job` via `X-Internal-Api-Key`.

## Responsibilities

- Indexes CV PDFs into the RAG system via `rag-api`
- Matches a CV against a job posting URL (scrapes job HTML, scores pair with LLM)
- Manages job search tokens and sessions for the one-click job search feature
- Owns two EF DbContexts: `CvMatcherDbContext` (schema `cvMatcher`) and `CvSearchDbContext` (schema `cvSearch`)
- Runs EF migrations for both contexts on startup

## Key routes

| Method | Route | Description |
|--------|-------|-------------|
| POST | `/api/cv/upload` | Index CV PDF into RAG |
| POST | `/api/cv/match-job` | Score CV against a job URL (LLM call) |
| POST | `/api/cv/find-jobs` | Find matching jobs from the RAG index |
| POST | `/api/cv/job-search/token` | Create a job search token (called by api after a match) |
| POST | `/api/cv/job-search/token/{tokenId}/start` | Validate token, create Pending session (called by api on link click) |
| GET | `/api/health` | Health check |

## Core services

- `CvMatcherService`: orchestrates upload + match; calls `IRagApiClient` and `IMatcherAiClient`
- `JobTextExtractor`: fetches a job page URL and extracts plain text
- `JobTokenService`: creates tokens; validates + starts job search sessions; extracts CV keywords using simple heuristics (first 5 meaningful non-empty lines of CV text, split into words)

## AI providers

Configured under `Ai:Provider` (`OpenAI` or `Ollama`). Both providers implement `IMatcherAiClient`.
Default model: `gpt-4o-mini`. Timeout: 90 s.

## Database contexts

Both contexts use the same SQL Server connection string (from `Database:*` settings).

- `CvMatcherDbContext`: schema `cvMatcher`; migrations in `cv-matcher-api` assembly
- `CvSearchDbContext`: schema `cvSearch`; migrations in `cv-search-models` assembly (MigrationsAssembly = "cv-search-models")

## Keyword extraction (JobTokenService.ExtractKeywords)

No LLM call. Takes the first 5 non-empty lines of CV text that are:

- Longer than 5 characters
- Not purely numeric or contact-line patterns

Splits into words, strips punctuation, deduplicates, returns up to 10 comma-separated keywords.
These keywords are stored in `JobSearchSessionEntity.Keywords` and used by `cv-search-job` for scraping.

## Settings

| Section | Notes |
|---------|-------|
| `Database` | Shared SQL Server connection |
| `RagApi` | BaseUrl + InternalApiKey for rag-api |
| `Ai` | Provider, model, timeout |
| `Matcher` | TopK, DeepScoreTopN, MaxJobTextChars |
| `JobSearch` | TokenExpiryDays, providers list (stored in session JSON) |
| `InternalApi` | ApiKey used by UseInternalApiKeyProtection middleware |
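
## Keyword extraction sketch

The keyword heuristic described under "Keyword extraction" can be illustrated with a short, language-agnostic sketch. This is not the service's code (the service is .NET); it is a Python approximation written only from the rules stated in this README. The function name `extract_keywords`, the contact-line regex, the numeric-line regex, and the case-insensitive, order-preserving deduplication are all assumptions, since the README does not specify them.

```python
import re
import string

# Hypothetical contact-line patterns (email, URL, phone-like lines).
# The README does not list the real patterns used by JobTokenService.
_CONTACT_RE = re.compile(r"@|https?://|www\.|^\+?[\d\s().-]{7,}$", re.IGNORECASE)

# Assumed definition of "purely numeric": digits plus common separators.
_NUMERIC_RE = re.compile(r"[\d\s.,/-]+")


def extract_keywords(cv_text: str, max_lines: int = 5, max_keywords: int = 10) -> str:
    """Return up to max_keywords comma-separated keywords from CV text."""
    # Step 1: keep the first max_lines lines that pass the filters.
    kept = []
    for raw in cv_text.splitlines():
        line = raw.strip()
        if len(line) <= 5:  # must be longer than 5 characters
            continue
        if _NUMERIC_RE.fullmatch(line) or _CONTACT_RE.search(line):
            continue
        kept.append(line)
        if len(kept) == max_lines:
            break

    # Step 2: split into words, strip punctuation, deduplicate, cap the count.
    seen = set()
    keywords = []
    for line in kept:
        for token in line.split():
            word = token.strip(string.punctuation)
            key = word.lower()  # assumption: dedupe ignores case
            if not word or key in seen:
                continue
            seen.add(key)
            keywords.append(word)
            if len(keywords) == max_keywords:
                return ",".join(keywords)
    return ",".join(keywords)
```

For example, under these assumptions a CV that starts with a name, an email line, a phone line, and a job title would contribute the name and title words but skip the two contact lines. Because stripping punctuation is applied at word edges, tokens such as `C#` would lose their trailing symbol in this sketch; whether the real implementation behaves the same way is not stated here.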