CV Matcher
The CV matcher is the core feature of myAi.ro. Users upload a CV PDF and either paste a single job URL/description or rely on the RAG index to find the best matches — and get a scored, structured analysis from an LLM with strengths, gaps, and recommendations.
Service Chain
Browser / web
-> api (port 8080) -- captcha, rate limiting, email, CV file cache
-> cv-matcher-api (port 8082) -- match logic, RAG orchestration, LLM scoring
-> rag-api (port 8081) -- vector indexing and semantic search
-> OpenAI / Ollama -- LLM scoring (gpt-4o-mini by default)
api is the only internet-facing service. All calls to cv-matcher-api and rag-api require the X-Internal-Api-Key header.
Flows
1 -- CV Upload
- Browser
POST /api/cv-matcher/upload(multipart PDF, GDPR consent, captcha token) apiverifies reCAPTCHA, forwards PDF tocv-matcher-api POST /api/cv/uploadcv-matcher-apicallsrag-api POST /api/rag/indexto chunk and embed the PDFrag-apireturns{ documentId, textHash, chunks, characters, cached }apicaches the PDF to{FileStorage:Path}/{documentId}.pdffor later email attachment- Returns
CvUploadResponseto the browser
If the same PDF was previously uploaded (same textHash), rag-api returns the cached document — no re-embedding cost.
2 -- Match CV to a Single Job
- Browser
POST /api/cv-matcher/match-jobwith{ cvDocumentId, jobUrl or jobDescription, email, gdprConsent, captchaToken } apiverifies reCAPTCHA, forwards tocv-matcher-api POST /api/cv/match-jobcv-matcher-api:- Fetches CV text from
rag-api GET /api/rag/document/{cvDocumentId} - Fetches and strips HTML from
jobUrlviaJobTextExtractor(or uses pastedjobDescription) - Indexes the job text into
rag-api(type = "job") - Runs a semantic search against the RAG index to find matching job chunks
- Calls
ScorePairAsync(LLM) to produce the structured match result - Caches the result in
cvMatcher.CvMatchResultsby(cvDocumentId, jobDocumentId)hash
- Fetches CV text from
api(on return):- If
emailwas provided, creates a job search token viaIJobSearchApi.CreateTokenAsync - Sends match result email with CV PDF attached and job search link included
- If
- Returns
JobMatchResponseto the browser
3 -- Find Jobs from RAG Index
- Browser
POST /api/cv-matcher/find-jobswith{ cvDocumentId, topK } cv-matcher-apifetches CV text fromrag-api- Builds a CV search profile string from the CV text
- Calls
rag-apisemantic search against indexed jobs (targetDocumentTypes: ["job"]) - Takes top
DeepScoreTopNresults (default 5), runsScorePairAsyncLLM scoring on each - Returns
FindJobsResponse { jobs: JobMatchResponse[] }
LLM Scoring (ScorePairAsync)
Called for both match-job and find-jobs. Checks the DB cache first -- if a result exists for the same (cvId, jobId) pair it is returned immediately (no AI call).
If not cached:
- Truncates CV text to 18 000 chars, job text to 14 000 chars
- Takes up to 4 RAG evidence chunks (or first 4 000 chars of job text as fallback)
- Sends
system + userprompt to the configured AI provider withtemperature = 0.2 - Expects JSON response; falls back to a safe error object if parsing fails
- Persists the raw AI chat response in
cvMatcher.CvMatcherChatCacheby a hash of(provider, model, temperature, systemPrompt, userPrompt)
Match Result Structure (JobMatchResponse)
| Field | Type | Description |
|---|---|---|
score |
int 0-100 | Overall match percentage |
summary |
string | One-paragraph narrative |
strengths |
string[] | CV aspects that match well |
gaps |
string[] | Missing or weak areas |
recommendations |
string[] | Actionable advice for the candidate |
evidence |
string[] | RAG chunks that drove the score |
cached |
bool | True if returned from DB cache |
jobDocumentId |
string? | RAG document id of the indexed job |
jobUrl |
string? | Source URL of the job |
JobTextExtractor
Extracts plain text from a job posting for the LLM prompt.
- If
jobDescription(pasted text) is provided it is used directly -- no HTTP call - Otherwise fetches
jobUrl, strips<script>,<style>, and all HTML tags, decodes HTML entities, collapses whitespace - Truncates to
MaxJobTextChars(default 60 000, minimum 4 000) - Throws
InvalidOperationExceptionif the extracted text is under 80 characters
User-agent sent: MyAi.ro CV Matcher/1.0. HTTP timeout: 25 seconds.
AI Providers
Configured under Ai:Provider (OpenAI or Ollama).
| Setting | Default | Notes |
|---|---|---|
Ai:Provider |
OpenAI |
Switch to Ollama for local/offline |
Ai:OpenAI:ChatModel |
gpt-4o-mini |
Any OpenAI chat model |
Ai:OpenAI:TimeoutSeconds |
90 |
Per-request timeout |
Ai:Ollama:BaseUrl |
http://host.docker.internal:11434 |
Local Ollama instance |
Ai:Ollama:ChatModel |
llama3.1:8b |
Any Ollama chat model |
Both providers use response_format: json_object (or Ollama format: "json") to guarantee parseable output. All AI responses are cached in the DB by content hash -- repeated identical prompts never hit the API twice.
Caching
Two layers of caching in cvMatcher schema:
| Cache | Table | Key | What's stored |
|---|---|---|---|
| AI responses | CvMatcherChatCache |
SHA256 of full prompt + model | Raw JSON string from LLM |
| Match results | CvMatchResults |
(cvDocumentId, jobDocumentId) |
Full JobMatchResponse |
The match result cache means re-matching the same CV against the same job URL is instant and free.
API Routes
api (public, port 8080)
| Method | Route | Description |
|---|---|---|
| POST | /api/cv-matcher/upload |
Upload CV PDF (multipart) |
| POST | /api/cv-matcher/match-job |
Match CV to a job URL or pasted description |
| GET | /api/cv-matcher/job-search/start?t= |
One-click job search start (token link) |
Rate limited by the cvMatcher policy: 10 requests / 10 minutes per IP.
cv-matcher-api (internal, port 8082)
| Method | Route | Description |
|---|---|---|
| POST | /api/cv/upload |
Index CV PDF into RAG |
| POST | /api/cv/match-job |
Score CV against a job URL or text |
| POST | /api/cv/find-jobs |
Find top jobs from RAG index for a CV |
| POST | /api/cv/job-search/token |
Create job search token |
| POST | /api/cv/job-search/token/{id}/start |
Validate token, create Pending session |
| GET | /api/health |
Health check |
Settings Reference
Matcher section (cv-matcher-api)
| Key | Default | Description |
|---|---|---|
TopK |
10 |
RAG search result count |
DeepScoreTopN |
5 |
How many RAG results get LLM deep scoring |
MaxJobTextChars |
60000 |
Max job text length sent to LLM |
FileStorage section (api)
| Key | Default | Description |
|---|---|---|
Path |
Files |
Directory for cached CV PDFs (relative to app root or absolute) |
Shared via bind mount with cv-cleanup-job and cv-search-job.
Match Email
Sent by api via SMTP after a successful match when email is provided.
- Subject:
MyAi.ro CV Match: {score}% -- {jobLabel} - Body: score, summary, strengths, gaps, recommendations
- Attachment: cached CV PDF from
{FileStorage:Path}/{documentId}.pdf - Footer: job search link (if token creation succeeded) — see Features/Internet-Job-Search
Sending is fire-and-forget: email failure does not affect the match result returned to the browser.
Database Schema (cvMatcher)
Managed by CvMatcherDbContext. Migrations live in Apis/cv-matcher-api/Migrations/.
dotnet ef migrations add <Name> --context CvMatcherDbContext --project Apis/cv-matcher-api