AI/myAi

Author	SHA1	Message	Date
claude	a83f6f705f	Remove UseHeadlessBrowser from JobProvider — all fetches now go via page-fetcher-api page-fetcher-api always uses Playwright (networkidle by default), so the per-provider flag that chose between headless and plain HTTP is obsolete. - Removed from JobProviderEntity, CvSearchDbContext, JobProviderConfig, JobTokenService - HtmlJobSearcher no longer passes WaitFor (uses page-fetcher-api default) - EF migration drops the column from cvSearch.JobProviders Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 18:43:42 +03:00
claude	dcfc50ff32	Fix Docker builds: upgrade Refit to 11.0.1, add page-fetcher-api-models to Dockerfiles - Refit 10.1.6 signing certificate was revoked; upgraded to 11.0.1 in Directory.Packages.props - cv-matcher-api/Dockerfile and cv-search-job/Dockerfile were missing COPY steps for page-fetcher-api-models (added in this feature branch) All 8 images now build cleanly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 18:35:41 +03:00
claude	898dd09d50	feat: add page-fetcher-api — centralised Playwright page fetcher Introduces page-fetcher-api, a new internal ASP.NET Core service that centralises all web-page fetching through a single Playwright (headless Chromium) browser instance. All fetches are persisted to the pageFetcher SQL schema for auditing. New projects: - Apis/page-fetcher-api-models: FetchPageRequest, FetchPageResponse, IPageFetcherApiClient - Apis/page-fetcher-data: PageFetchDbContext, PageFetchEntity, InitialSchema migration (schema: pageFetcher) - Apis/page-fetcher-api: PlaywrightBrowserService (singleton), PageFetcherService, PageController Changes to existing services: - cv-matcher-api: JobTextExtractor now calls IPageFetcherApiClient instead of HttpClient - cv-search-job: HtmlJobSearcher uses IPageFetcherApiClient (removes inline Playwright); CvSearchJobTask fetches individual job pages and applies keyword pre-filter before LLM call; passes pre-fetched JobDescription to cv-matcher-api to skip re-fetch - common: add PageFetcherApiSettings - docker-compose.yml, build.yml: add new service + env vars for callers Closes #43 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 17:43:56 +03:00
claude	99e5cfb76b	Fix job search: location filtering, keyword quality, anchor filter bypass Closes #41 - Add RequireKeywordInAnchor per-provider flag (default true); set false for ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their search URL already filters by relevance server-side - Update AI system prompts (en + ro) to extract concise job-board-friendly keywords (role title + key tech, not abstract concepts) and candidate location - Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest -> JobSearchTokenEntity -> JobSearchSessionEntity - Add {location} and {location-slug} substitution in HtmlJobSearcher - Update provider SearchUrlTemplates to include location: ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords} bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords} linkedin.com: ?keywords={keywords}&location={location} - Three new migrations: AddRequireKeywordInAnchorAndLocation, ImproveKeywordsAndAddLocation, AddLocationToProviders Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 15:45:45 +03:00
claude	b67e926c5f	Fix Serilog email sink: configure in code, not JSON config Serilog.Settings.Configuration cannot deserialize NetworkCredential or MailKit's SecureSocketOptions from JSON, causing an InvalidOperationException in the binder and preventing containers from starting. Fix: remove Email from the WriteTo JSON array entirely and wire it in code inside ConfigureJsonSerilog using a dedicated SerilogEmail:* config section. The sink is skipped when From/To/Host are absent, so local dev is unaffected. Also renames the docker-compose env vars from the verbose Serilog__WriteTo__2__Args__* prefix to the clean SerilogEmail__* prefix. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 22:25:26 +03:00
claude	8679bd1efd	Fix Serilog email sink config for v4 API breaking changes Serilog.Sinks.Email v4 renamed all configuration parameters from their v2 names. The old names were silently ignored, so no error alert emails were ever sent. Parameter renames applied across all 6 appsettings.json and docker-compose: fromEmail → from toEmail → to mailServer → host networkCredential → credentials enableSsl: true → connectionSecurity: StartTls emailSubject → subject outputTemplate → body batchPostingLimit / period removed (v4 batching uses a separate overload) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-01 21:57:06 +03:00
claude	b114156e9c	Return 500 errors for missing email templates and AI prompts Changed configuration error handling to throw InvalidOperationException instead of silently using fallback values. This ensures: 1. Missing email templates (critical config) → 500 error to UI 2. Missing AI prompts (critical config) → 500 error to UI 3. Clear error messages indicating config issue 4. Prompts administrators to check database seeding Services updated: - EmailTemplateService.Get() throws for missing template - CvMatcherService.ScorePairAsync() throws for missing AI prompt This prevents silent failures with degraded service quality and makes it obvious to users that the system has a configuration problem that needs fixing. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-06-01 16:58:11 +03:00
claude	64e003a639	Use language-specific AI prompts instead of wildcard substitution Refactored the AI prompt system to use proper language-specific prompts (en and ro) instead of a single wildcard prompt with runtime {{languageName}} placeholder substitution. Benefits: - Language-specific instructions optimized for each language - Better control over LLM behavior per language - Cleaner code without placeholder substitution - Easier to maintain and update prompts per language Changes: - Updated cvMatcher InitialSchema migration to seed en and ro prompts separately - Modified CvMatcherService to retrieve language-specific prompts directly - Removed LanguageName() helper method (no longer needed) - Added fallback prompts in service for safety The English and Romanian prompts now include specific JSON examples in their respective languages, ensuring the LLM understands the expected output format for each language variant. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-06-01 16:56:29 +03:00
claude	e38f40732f	feat(providers): add headless browser scraping via Playwright for SPA job sites ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS bundle. This change equips cv-search-job with a headless Chromium (Playwright 1.60) so it can fully render SPA pages before extracting job links. - Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig, and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag is included in the session provider-config snapshot - Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL (remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true - HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync; plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial content rather than failing outright - Dockerfile: download Playwright Chromium in the SDK build stage via npx; copy browser binaries to the final image; install Chromium system libs (Ubuntu noble t64 variants) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 13:42:52 +03:00
claude	b78ede23cf	feat(job-search): extract keywords from LLM match call instead of heuristics Piggybacks keyword extraction onto the existing CV-to-job LLM call — no extra API calls. The system prompt now instructs the model to return 8-12 English job-search terms (job titles, technologies, skills, domains) in a new `keywords` field alongside the existing score/summary fields. Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest → JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords (copied at session-creation time, no RAG call needed). Changes: - Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity - IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords - JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords instead of fetching CV text from RAG — removes IRagApiClient dependency - Remove heuristic ExtractKeywords method - Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens - Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed to include keywords in the JSON shape Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:44:13 +03:00
claude	a467fac35d	fix(cv-matcher-api): fix keyword extraction for single-line PDF text PDF text extraction often stores all content without newlines. The previous line-based splitter would produce one line > 200 chars which was filtered out, yielding empty keywords. Replace with word-level sampling of the first 2000 chars, splitting on whitespace and common delimiters, skipping phone fragments, emails, and URLs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:29:10 +03:00
claude	c8d1a21736	chore(config): remove Providers array from appsettings — now in DB Provider config is no longer read from appsettings or env vars. All three providers (ejobs.ro, bestjobs.eu, linkedin.com) are seeded into cvSearch.JobProviders by the AddJobProviders migration. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 11:53:17 +03:00
claude	d0d45bd2d3	feat(job-search): read providers from DB and suppress link when none enabled JobTokenService.CreateTokenAsync queries cvSearch.JobProviders for any enabled row; returns null (no token created) when the table is empty or all providers are disabled. TriggerStartAsync snapshots enabled providers from DB at session-start time, preserving the existing snapshot contract. CvMatcherController guards link-building on a non-null TokenId so the "Start a job search" CTA is omitted from match emails when no providers are configured. JobSearchSettings.Providers list removed — provider config now lives exclusively in the DB. CvSearchJobTask.GetProviders falls back to an empty list with a warning (snapshot should always be populated from DB). Closes #35 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 11:46:44 +03:00
claude	43017036fd	Move CV matcher repositories from cv-matcher-api to cv-matcher-data - Move IAiPromptsRepository, EfAiPromptsRepository to cv-matcher-data/Repositories - Move IMatcherRepository, EfMatcherRepository to cv-matcher-data/Repositories - Add cv-matcher-api-models ProjectReference to cv-matcher-data.csproj - Delete cv-matcher-api/Data folder (all data access now in cv-matcher-data) Pattern: cv-matcher-api (logic) → cv-matcher-data (repositories, EF entities, migrations) Aligns with rag-api → rag-data consolidation Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-05-29 09:44:57 +03:00
claude	0c5b85e63c	Add Directory.Packages.props copy to all Dockerfiles The Docker builds were failing because the centralized package version management file (Directory.Packages.props) was not being copied into the build context. This file is required for NuGet to resolve package versions in projects that don't specify explicit versions. Updated all Dockerfiles to copy Directory.Packages.props before running dotnet restore: - Apis/api/Dockerfile - Apis/cv-matcher-api/Dockerfile - Apis/rag-api/Dockerfile - Jobs/cv-cleanup-job/Dockerfile - Jobs/cv-search-job/Dockerfile - web/Dockerfile Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 14:13:48 +03:00
claude	16bb195cb5	Add XML doc to all service interfaces and implementations (#26) - Update CLAUDE.md: replace incorrect 'no XML doc on internal code' rule with the correct convention (XML doc on all public methods and non-trivial private/protected helpers) - Restore /// <summary> on FileDownloadController private helpers (HandleRangeRequest, StreamRangeAsync) - Add full XML doc to all service contracts: ICaptchaVerifier, IEmailSender, ICvMatcherService, IJobTextExtractor, IJobTokenService, IDocumentClassifier, IRagService, ITextChunker, ITextExtractor, IEmailTemplateService, ITemplateService - Add /// <summary> and /// <inheritdoc /> to all concrete service classes and their methods: RecaptchaVerifier, EmailApiEmailSender, SmtpEmailDispatcher, CvMatcherService, JobTextExtractor, JobTokenService, RagService, DocumentClassifier, TextChunker, TextExtractor, HtmlJobSearcher, CvSearchEmailSender, CvSearchJobTask, EmailTemplateService, DbTemplateService Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 09:17:42 +03:00
claude	4ee4a59b5e	Improve comments and Swagger annotations across services (#26) - EmailController: add class summary, full SwaggerResponse/ProducesResponseType for 400 and 500, and Description on SwaggerOperation - ContactController: fix terse "Failed." error message to "Could not process subscription." - FileDownloadController: remove redundant XML <response code> tags from the public action doc block; convert private-method /// <summary> to // (project convention: no XML doc on internal code) - CvMatcherService: remove two dead commented-out blocks (old email send and BuildEmailBody helper) - JobTokenService: comment the phone/contact-line regex filter in ExtractKeywords - DocumentClassifier: comment the keyword-frequency scoring approach and the confidence formula - TextChunker: comment the sliding-window step (chunkSize - overlap) - CvSearchJobTask: comment the GdprConsent = true rationale and the BuildCvFileName sanitisation logic - HtmlJobSearcher: comment GetLeftPart(UriPartial.Path) query-strip dedup Closes #26 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 09:07:23 +03:00
claude	a1c145e861	feat(cv-matcher): add AiPrompts table; remove MyAiDbContext dependency cv-matcher-data: - Add AiPromptEntity (Key, Language, Value, Description, UpdatedAt) - Add AiPrompts DbSet to CvMatcherDbContext with composite PK - Migration AddAiPrompts: create cvMatcher.AiPrompts table and seed ai.cv-match.system-prompt (language "*") with the current prompt value cv-matcher-api: - Add IAiPromptsRepository / EfAiPromptsRepository under Data/Repositories/ - CvMatcherService: inject IAiPromptsRepository; replace _templates.Render(...) with async DB lookup + simple string replacement - Program.cs: register IAiPromptsRepository (scoped); remove MyAiDbContext, ITemplateService/DbTemplateService registrations and MyAiDbContext migration call - Remove myai-data ProjectReference Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 08:48:39 +03:00
claude	e95ed36647	refactor: restructure solution into -models/-data/-api project taxonomy Phases 1-10 of the planned refactoring: Phase 1: rename shared-models -> common - namespace Shared.Models -> Common throughout - remove stale AspNetCore.Http.Features 5.0 reference Phase 2: create shared-data with abstract BaseEntity - BaseEntity: required string Id { get; init; } + DateTime CreatedAt { get; init; } Phase 3: rename myai-models -> myai-data - namespace MyAi.Models -> MyAi.Data - MigrationsAssembly("myai-data") Phase 4: rename cv-search-models -> cv-search-data - namespace CvSearch.Models -> CvSearch.Data - move JobSearchSettings to cv-matcher-api-models - JobSearch*Entity now inherits BaseEntity Phase 5: extract rag-data from rag-api - new project: Apis/rag-data with RagDbContext + entities + migrations - RagDocumentEntity inherits BaseEntity; cache entities use CacheKey PK - fix duplicate AddHttpClient<RagAiClient>/AddScoped registrations in rag-api - MigrationsAssembly("rag-data") Phase 6: extract cv-matcher-data from cv-matcher-api - new project: Apis/cv-matcher-data with CvMatcherDbContext + entities + migrations - CvMatchResultEntity inherits BaseEntity; CvMatcherChatCacheEntity uses CacheKey PK - MigrationsAssembly("cv-matcher-data") Phase 7: create empty cv-cleanup-job-models and cv-search-job-models Phase 8: update all 5 Dockerfiles for renamed/new projects Phase 9: reorganise .sln virtual folders (Apis/Jobs/Models/Data/Helpers) - update root CLAUDE.md with new project taxonomy and migration commands - update cv-matcher-api/CLAUDE.md and cv-search-job/CLAUDE.md Phase 10: add Directory.Packages.props for centralised NuGet versions - remove Version= from all PackageReference elements in active .csproj files No database changes. No runtime behaviour changes. All MigrationId strings in __EFMigrationsHistory are unaffected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-27 15:26:03 +03:00
claude	fc6fe7a78b	feat: DB-backed localized templates + language-aware emails - New Apis/myai-models project: MyAiDbContext (schema myAi), TemplateEntity, ITemplateService, DbTemplateService with 10-min in-memory cache - Seeds EN+RO variants for all user-facing templates (match email, job search results email, HTML status pages, AI system prompt) - Match result email now sent in user's UI language (en/ro) - Job search results email now respects session language - Language propagates: MatchJobRequest -> token -> session -> email - Add Language column to JobSearchTokens and JobSearchSessions (default 'en') - All three Dockerfiles updated to include myai-models in build context Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-24 18:06:44 +03:00
claude	b6878e3b45	Respect UI language in match result — LLM responds in user's selected language The frontend sends the active language code (currentLang()) with every match request. CvMatcherService injects a language instruction into the system prompt so the LLM returns summary, strengths, gaps, recommendations, and evidence in the correct language. The match result cache (CvMatchResults) now includes Language as part of the lookup key so Romanian and English results are stored and retrieved independently. Existing cached rows default to 'en'. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-24 17:04:21 +03:00
claude	1fcf1e1470	Add complete XML doc and Swagger annotations to all controller endpoints Every public action now has <summary>, <param>, and <returns> XML docs plus matching SwaggerOperation/SwaggerResponse attributes with typed response descriptions. Class-level summaries added to CvController, JobSearchController, and RagController. Explanatory inline comments removed from FileDownloadController per project conventions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 20:47:47 +03:00
claude	a4c128fdf4	Fix cv-matcher-api Dockerfile: add cv-search-models to build context dotnet restore failed in CI because cv-search-models.csproj was added as a ProjectReference but not copied into the Docker build context. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 18:17:58 +03:00
claude	6293fa89e3	Add internet job search feature (cv-search-job) - New cv-search-models shared library: EF entities + CvSearchDbContext for cvSearch schema (JobSearchTokens, JobSearchSessions, JobSearchResults tables) - New cv-search-job worker service: polls DB for pending sessions, scrapes job boards via configurable HTML scraping, runs LLM scoring via cv-matcher-api, emails ranked results - cv-matcher-api: JobTokenService creates one-time tokens; JobSearchController handles link clicks and creates sessions - api: proxies job-search start endpoint, appends job search link to match result email - CI workflow updated to build and push myai-cv-search-job:staging image - CLAUDE.md documentation added for all affected services Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-22 17:56:23 +03:00
claude	75bc9509c5	Changes	2026-05-14 14:12:29 +03:00
claude	92278ae375	Update Dockerfile paths and project references to reflect new directory structure under 'Apis' and 'Jobs'	2026-05-14 13:56:45 +03:00
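
Note on commit 4ee4a59b5e: it mentions that TextChunker advances its sliding window by chunkSize - overlap. The commit list does not show the implementation, so the following is only a minimal Python sketch of that stepping rule, not the repository's C# code. The function name chunk_text, the character-based windowing, and the early stop at the end of the text are assumptions made for illustration.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters.

    Consecutive windows share `overlap` characters, so the window start
    advances by chunk_size - overlap on every iteration.
    (Hypothetical helper; character-based chunking is an assumption.)
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # the sliding-window step named in the commit
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk is emitted that only repeats overlap content.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks
```

With chunk_size=4 and overlap=1 the step is 3, so each chunk begins with the last character of the previous one. An overlap equal to or larger than the chunk size is rejected because the window would never advance.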

26 Commits
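
Note on commit 99e5cfb76b: it adds {location} and {location-slug} substitution next to {keywords} in the provider SearchUrlTemplates. The sketch below shows one way such a template could be filled, in Python rather than the repository's C#. The names slugify and build_search_url are hypothetical, and the slug rule (strip diacritics, lowercase, hyphen-join) and the URL encoding are assumptions, since the commit message does not specify them.

```python
import unicodedata
from urllib.parse import quote_plus


def slugify(location: str) -> str:
    """Assumed slug rule: strip diacritics, lowercase, join words with hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", location)
        .encode("ascii", "ignore")  # drops the combining marks left by NFKD
        .decode("ascii")
    )
    return "-".join(ascii_text.lower().split())


def build_search_url(template: str, keywords: list[str], location: str) -> str:
    """Fill the {keywords}, {location} and {location-slug} placeholders."""
    return (
        template
        .replace("{keywords}", quote_plus(" ".join(keywords)))
        .replace("{location-slug}", slugify(location))
        .replace("{location}", quote_plus(location))
    )


# Template path taken from the ejobs.ro entry in the commit message;
# the keywords and location are made-up sample values.
url = build_search_url(
    "/locuri-de-munca/{location-slug}?q={keywords}",
    ["python", "developer"],
    "Cluj-Napoca",
)
print(url)  # /locuri-de-munca/cluj-napoca?q=python+developer
```

Keeping the slug and the query-encoded form as separate placeholders lets one provider put the location in the URL path (ejobs.ro, bestjobs.eu) while another passes it as a query parameter (linkedin.com), as the templates in the commit message do.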