myAi/Apis/cv-matcher-api/CLAUDE.md
claude e95ed36647 refactor: restructure solution into -models/-data/-api project taxonomy
Phases 1-10 of the planned refactoring:

Phase 1: rename shared-models -> common
  - namespace Shared.Models -> Common throughout
  - remove stale AspNetCore.Http.Features 5.0 reference

Phase 2: create shared-data with abstract BaseEntity
  - BaseEntity: required string Id { get; init; } + DateTime CreatedAt { get; init; }

Phase 3: rename myai-models -> myai-data
  - namespace MyAi.Models -> MyAi.Data
  - MigrationsAssembly("myai-data")

Phase 4: rename cv-search-models -> cv-search-data
  - namespace CvSearch.Models -> CvSearch.Data
  - move JobSearchSettings to cv-matcher-api-models
  - JobSearch*Entity now inherits BaseEntity

Phase 5: extract rag-data from rag-api
  - new project: Apis/rag-data with RagDbContext + entities + migrations
  - RagDocumentEntity inherits BaseEntity; cache entities use CacheKey PK
  - fix duplicate AddHttpClient<RagAiClient>/AddScoped registrations in rag-api
  - MigrationsAssembly("rag-data")

Phase 6: extract cv-matcher-data from cv-matcher-api
  - new project: Apis/cv-matcher-data with CvMatcherDbContext + entities + migrations
  - CvMatchResultEntity inherits BaseEntity; CvMatcherChatCacheEntity uses CacheKey PK
  - MigrationsAssembly("cv-matcher-data")

Phase 7: create empty cv-cleanup-job-models and cv-search-job-models

Phase 8: update all 5 Dockerfiles for renamed/new projects

Phase 9: reorganise .sln virtual folders (Apis/Jobs/Models/Data/Helpers)
  - update root CLAUDE.md with new project taxonomy and migration commands
  - update cv-matcher-api/CLAUDE.md and cv-search-job/CLAUDE.md

Phase 10: add Directory.Packages.props for centralised NuGet versions
  - remove Version= from all PackageReference elements in active .csproj files

No database changes. No runtime behaviour changes.
All MigrationId strings in __EFMigrationsHistory are unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-27 15:26:03 +03:00


cv-matcher-api — Internal CV Match Engine

Listens on internal port 8082. Reachable only from api and cv-search-job, which authenticate with X-Internal-Api-Key.

Responsibilities

  • Indexes CV PDFs into the RAG system via rag-api
  • Matches a CV against a job posting URL (scrapes job HTML, scores pair with LLM)
  • Manages job search tokens and sessions for the one-click job search feature
  • Owns two EF DbContexts: CvMatcherDbContext (schema cvMatcher) and CvSearchDbContext (schema cvSearch)
  • Runs EF migrations for both contexts on startup

Key routes

| Method | Route | Description |
| --- | --- | --- |
| POST | /api/cv/upload | Index CV PDF into RAG |
| POST | /api/cv/match-job | Score CV against a job URL (LLM call) |
| POST | /api/cv/find-jobs | Find matching jobs from the RAG index |
| POST | /api/cv/job-search/token | Create a job search token (called by api after a match) |
| POST | /api/cv/job-search/token/{tokenId}/start | Validate token, create Pending session (called by api on link click) |
| GET | /api/health | Health check |
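
As an illustration of how a caller might address these routes, here is a minimal Python sketch that builds (but does not send) a request carrying the X-Internal-Api-Key value as an HTTP header. The host name `cv-matcher-api`, the helper `build_internal_request`, the token id `abc123`, and the key `dev-key` are assumptions made for the example; request bodies are not specified in this document, so none is shown.

```python
import json
import urllib.request

# Hypothetical base URL: the document gives the port (8082) but not the host name.
BASE_URL = "http://cv-matcher-api:8082"


def build_internal_request(route, api_key, payload=None, method="POST"):
    """Build a request for an internal route without sending it."""
    headers = {"X-Internal-Api-Key": api_key}
    data = None
    if payload is not None:
        # Assumption: bodies, where a route takes one, are JSON.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + route, data=data, headers=headers, method=method
    )


# Placeholder token id and key, for illustration only.
req = build_internal_request(
    "/api/cv/job-search/token/abc123/start", api_key="dev-key"
)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live service.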

Core services

  • CvMatcherService — orchestrates upload + match; calls IRagApiClient and IMatcherAiClient
  • JobTextExtractor — fetches a job page URL and extracts plain text
  • JobTokenService — creates tokens; validates + starts job search sessions; extracts CV keywords using simple heuristics (first 5 meaningful non-empty lines of CV text, split into words)

AI providers

Configured under Ai:Provider (OpenAI or Ollama). Both providers implement IMatcherAiClient.
Default model: gpt-4o-mini. Timeout: 90 s.
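
The provider choice and defaults above could be resolved roughly as in the following Python sketch. It is an illustration only, not the service's code: the key names `Model` and `TimeoutSeconds`, the `AiSettings` type, and the case-sensitive provider check are assumptions; only `Ai:Provider`, the two provider names, the default model, and the 90 s timeout come from this document.

```python
from dataclasses import dataclass

SUPPORTED_PROVIDERS = ("OpenAI", "Ollama")


@dataclass(frozen=True)
class AiSettings:
    provider: str
    model: str = "gpt-4o-mini"   # default model stated in this document
    timeout_seconds: int = 90    # default timeout stated in this document


def load_ai_settings(config: dict) -> AiSettings:
    """Read the Ai section and apply the documented defaults."""
    ai = config.get("Ai", {})
    provider = ai.get("Provider")
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported Ai:Provider: {provider!r}")
    return AiSettings(
        provider=provider,
        # "Model" and "TimeoutSeconds" are hypothetical key names.
        model=ai.get("Model", "gpt-4o-mini"),
        timeout_seconds=ai.get("TimeoutSeconds", 90),
    )


settings = load_ai_settings({"Ai": {"Provider": "Ollama"}})
print(settings)
```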

Database contexts

Both contexts use the same SQL Server connection string (from Database:* settings).

  • CvMatcherDbContext — schema cvMatcher; migrations in cv-matcher-data assembly (Apis/cv-matcher-data/)
  • CvSearchDbContext — schema cvSearch; migrations in cv-search-data assembly (Apis/cv-search-data/)

Keyword extraction (JobTokenService.ExtractKeywords)

No LLM call. Takes the first 5 non-empty lines of CV text that are:

  • Longer than 5 characters
  • Not purely numeric or contact-line patterns

Splits into words, strips punctuation, deduplicates, returns up to 10 comma-separated keywords.
These keywords are stored in JobSearchSessionEntity.Keywords and used by cv-search-job for scraping.
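
The heuristic described above can be sketched in Python as follows. This illustrates the steps and is not the actual JobTokenService.ExtractKeywords implementation. The contact-line regex, the case-insensitive deduplication, and the bare "," separator are assumptions, since this document does not specify them.

```python
import re
import string

# Hypothetical contact-line patterns (email, URL, phone); the real ones are not documented here.
CONTACT_PATTERN = re.compile(
    r"@|https?://|www\.|linkedin\.com|^\+?[\d\s().-]{7,}$",
    re.IGNORECASE,
)

MAX_LINES = 5        # first 5 qualifying lines
MIN_LINE_LENGTH = 5  # lines must be longer than this
MAX_KEYWORDS = 10


def is_meaningful(line: str) -> bool:
    """Keep lines longer than 5 characters that are not numeric or contact lines."""
    if len(line) <= MIN_LINE_LENGTH:
        return False
    if line.replace(" ", "").isdigit():
        return False
    return CONTACT_PATTERN.search(line) is None


def extract_keywords(cv_text: str) -> str:
    # Collect the first MAX_LINES non-empty lines that pass the filter.
    lines = []
    for raw in cv_text.splitlines():
        line = raw.strip()
        if line and is_meaningful(line):
            lines.append(line)
            if len(lines) == MAX_LINES:
                break

    # Split into words, strip surrounding punctuation, deduplicate in order.
    seen = set()
    keywords = []
    for line in lines:
        for word in line.split():
            cleaned = word.strip(string.punctuation)
            key = cleaned.lower()  # assumption: deduplication ignores case
            if not cleaned or key in seen:
                continue
            seen.add(key)
            keywords.append(cleaned)
            if len(keywords) == MAX_KEYWORDS:
                return ",".join(keywords)
    return ",".join(keywords)
```

For example, given a CV that starts with a name line, an email line, a phone line, and a title line, the email and phone lines are skipped and the keywords come from the name and title lines only.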

Settings

| Section | Notes |
| --- | --- |
| Database | Shared SQL Server connection |
| RagApi | BaseUrl + InternalApiKey for rag-api |
| Ai | Provider, model, timeout |
| Matcher | TopK, DeepScoreTopN, MaxJobTextChars |
| JobSearch | TokenExpiryDays, providers list (stored in session JSON) |
| InternalApi | ApiKey used by UseInternalApiKeyProtection middleware |