
AI/myAi

Author	SHA1	Message	Date
claude	99e5cfb76b	Fix job search: location filtering, keyword quality, anchor filter bypass Closes #41 - Add RequireKeywordInAnchor per-provider flag (default true); set false for ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their search URL already filters by relevance server-side - Update AI system prompts (en + ro) to extract concise job-board-friendly keywords (role title + key tech, not abstract concepts) and candidate location - Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest -> JobSearchTokenEntity -> JobSearchSessionEntity - Add {location} and {location-slug} substitution in HtmlJobSearcher - Update provider SearchUrlTemplates to include location: ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords} bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords} linkedin.com: ?keywords={keywords}&location={location} - Three new migrations: AddRequireKeywordInAnchorAndLocation, ImproveKeywordsAndAddLocation, AddLocationToProviders Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 15:45:45 +03:00
claude	e38f40732f	feat(providers): add headless browser scraping via Playwright for SPA job sites ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS bundle. This change equips cv-search-job with a headless Chromium (Playwright 1.60) so it can fully render SPA pages before extracting job links. - Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig, and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag is included in the session provider-config snapshot - Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL (remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true - HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync; plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial content rather than failing outright - Dockerfile: download Playwright Chromium in the SDK build stage via npx; copy browser binaries to the final image; install Chromium system libs (Ubuntu noble t64 variants) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 13:42:52 +03:00
claude	b78ede23cf	feat(job-search): extract keywords from LLM match call instead of heuristics Piggybacks keyword extraction onto the existing CV-to-job LLM call — no extra API calls. The system prompt now instructs the model to return 8-12 English job-search terms (job titles, technologies, skills, domains) in a new `keywords` field alongside the existing score/summary fields. Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest → JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords (copied at session-creation time, no RAG call needed). Changes: - Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity - IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords - JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords instead of fetching CV text from RAG — removes IRagApiClient dependency - Remove heuristic ExtractKeywords method - Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens - Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed to include keywords in the JSON shape Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:44:13 +03:00
claude	7c09f5a871	feat(cv-search-data): add JobProviders table to cvSearch schema New JobProviderEntity persists provider config (name, URL template, link filter, initial keywords, max results, display order) in the DB instead of appsettings. Migration seeds three disabled defaults: ejobs.ro, bestjobs.eu, and linkedin.com. Closes #35 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 11:46:34 +03:00
claude	487924e345	Move RAG repository from rag-api to rag-data — consolidate data layer ownership - Move IRagRepository, EfRagRepository, and VectorSerializer from rag-api/Data to rag-data/Repositories - Add rag-api-models ProjectReference to rag-data.csproj for model type availability - Delete rag-api/Data folder (no longer needed; all data access is now in rag-data) - This aligns RAG with email-api and other services: all data code in the data project Pattern: rag-api (API logic) → rag-data (repository, EF entities, migrations) Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-05-29 09:38:25 +03:00
claude	57e8cb3f4b	fix: Configure EF Core migration history tables with schema-qualified names Each DbContext now explicitly configures its migration history table to use the schema-qualified name pattern [schemaName].[_Migrations]: - [cvMatcher].[_Migrations] for CvMatcherDbContext - [emailApi].[_Migrations] for EmailApiDbContext - [cvSearch].[_Migrations] for CvSearchDbContext - [rag].[_Migrations] for RagDbContext - [myAi].[_Migrations] for MyAiDbContext This is done via OnConfiguring() with UseSqlServer().MigrationsHistoryTable(name, schema). Removed incorrect rename migrations that were created due to misunderstanding of the proper EF Core configuration approach. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-05-29 08:37:23 +03:00
claude	e95ed36647	refactor: restructure solution into -models/-data/-api project taxonomy Phases 1-10 of the planned refactoring: Phase 1: rename shared-models -> common - namespace Shared.Models -> Common throughout - remove stale AspNetCore.Http.Features 5.0 reference Phase 2: create shared-data with abstract BaseEntity - BaseEntity: required string Id { get; init; } + DateTime CreatedAt { get; init; } Phase 3: rename myai-models -> myai-data - namespace MyAi.Models -> MyAi.Data - MigrationsAssembly("myai-data") Phase 4: rename cv-search-models -> cv-search-data - namespace CvSearch.Models -> CvSearch.Data - move JobSearchSettings to cv-matcher-api-models - JobSearch*Entity now inherits BaseEntity Phase 5: extract rag-data from rag-api - new project: Apis/rag-data with RagDbContext + entities + migrations - RagDocumentEntity inherits BaseEntity; cache entities use CacheKey PK - fix duplicate AddHttpClient<RagAiClient>/AddScoped registrations in rag-api - MigrationsAssembly("rag-data") Phase 6: extract cv-matcher-data from cv-matcher-api - new project: Apis/cv-matcher-data with CvMatcherDbContext + entities + migrations - CvMatchResultEntity inherits BaseEntity; CvMatcherChatCacheEntity uses CacheKey PK - MigrationsAssembly("cv-matcher-data") Phase 7: create empty cv-cleanup-job-models and cv-search-job-models Phase 8: update all 5 Dockerfiles for renamed/new projects Phase 9: reorganise .sln virtual folders (Apis/Jobs/Models/Data/Helpers) - update root CLAUDE.md with new project taxonomy and migration commands - update cv-matcher-api/CLAUDE.md and cv-search-job/CLAUDE.md Phase 10: add Directory.Packages.props for centralised NuGet versions - remove Version= from all PackageReference elements in active .csproj files No database changes. No runtime behaviour changes. All MigrationId strings in __EFMigrationsHistory are unaffected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-27 15:26:03 +03:00

7 Commits
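
Commit 99e5cfb76b describes {keywords}, {location} and {location-slug} substitution in the provider SearchUrlTemplates. The following is a minimal Python sketch of how such a substitution could work; it is not the repository's HtmlJobSearcher code (which is .NET). The function names slugify and build_search_url are hypothetical, and both the URL-encoding choice and the slug rules (lowercase, diacritics stripped, hyphen-joined) are assumptions made for illustration.

```python
import re
import unicodedata
from urllib.parse import quote_plus


def slugify(text: str) -> str:
    """Hypothetical slug rule: lowercase, strip diacritics, hyphen-join words."""
    # NFKD splits accented letters into a base letter plus combining marks,
    # so dropping non-ASCII bytes keeps the base letter ("ș" -> "s").
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def build_search_url(template: str, keywords: list[str], location: str) -> str:
    """Fill the three placeholders named in the commit message.

    Assumption: keywords are joined with spaces and form-encoded; the real
    implementation may join or encode them differently.
    """
    values = {
        "{keywords}": quote_plus(" ".join(keywords)),
        "{location}": quote_plus(location),
        "{location-slug}": slugify(location),
    }
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


if __name__ == "__main__":
    # Path templates copied from the commit message; inputs are made up.
    print(build_search_url(
        "/locuri-de-munca/{location-slug}?q={keywords}",
        ["backend developer", "C#"],
        "București",
    ))
    print(build_search_url(
        "?keywords={keywords}&location={location}",
        ["devops"],
        "Cluj-Napoca",
    ))
```

Keeping the slug separate from the query-encoded value matters because the two placeholders land in different parts of the URL: {location-slug} is a path segment, where a provider may expect a normalized form, while {location} is a query value that only needs encoding.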