AI/myAi

Author	SHA1	Message	Date
claude	473c36d65f	Store match-time ClientIpAddress on cvSearch.JobSearchTokens Captures the IP when the user submits the CV match form and stores it on the token, giving a full audit trail: token holds the match-site IP, session holds the email link-click IP. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 19:20:02 +03:00
claude	d56729de42	Add Email and ClientIpAddress audit fields to cvSearch.JobSearchSessions and JobSearchResults Captures client IP at job-search link-click time and threads it through to the session. Both Email and ClientIpAddress are copied from session to each result row during processing. Closes #47 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 19:11:50 +03:00
claude	a83f6f705f	Remove UseHeadlessBrowser from JobProvider — all fetches now go via page-fetcher-api page-fetcher-api always uses Playwright (networkidle by default), so the per-provider flag that chose between headless and plain HTTP is obsolete. - Removed from JobProviderEntity, CvSearchDbContext, JobProviderConfig, JobTokenService - HtmlJobSearcher no longer passes WaitFor (uses page-fetcher-api default) - EF migration drops the column from cvSearch.JobProviders Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 18:43:42 +03:00
claude	99e5cfb76b	Fix job search: location filtering, keyword quality, anchor filter bypass Closes #41 - Add RequireKeywordInAnchor per-provider flag (default true); set false for ejobs.ro and bestjobs.eu so Stage 2 anchor-text filter is skipped — their search URL already filters by relevance server-side - Update AI system prompts (en + ro) to extract concise job-board-friendly keywords (role title + key tech, not abstract concepts) and candidate location - Propagate location through JobMatchResponse -> CreateJobSearchTokenRequest -> JobSearchTokenEntity -> JobSearchSessionEntity - Add {location} and {location-slug} substitution in HtmlJobSearcher - Update provider SearchUrlTemplates to include location: ejobs.ro: /locuri-de-munca/{location-slug}?q={keywords} bestjobs.eu: /ro/locuri-de-munca-in-{location-slug}?keywords={keywords} linkedin.com: ?keywords={keywords}&location={location} - Three new migrations: AddRequireKeywordInAnchorAndLocation, ImproveKeywordsAndAddLocation, AddLocationToProviders Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-08 15:45:45 +03:00
claude	e38f40732f	feat(providers): add headless browser scraping via Playwright for SPA job sites ejobs.ro migrated to a Nuxt SPA - plain HTTP GET returns only the JS bundle. This change equips cv-search-job with a headless Chromium (Playwright 1.60) so it can fully render SPA pages before extracting job links. - Add UseHeadlessBrowser flag to JobProviderEntity, JobProviderConfig, and CvSearchDbContext; map it in JobTokenService.ToConfig so the flag is included in the session provider-config snapshot - Migration: add UseHeadlessBrowser column; fix ejobs.ro search URL (remove /user/ prefix that caused 404) and set UseHeadlessBrowser=true - HtmlJobSearcher: detect flag and dispatch to FetchWithPlaywrightAsync; plain-HTTP path is unchanged; NetworkIdle timeout falls back to partial content rather than failing outright - Dockerfile: download Playwright Chromium in the SDK build stage via npx; copy browser binaries to the final image; install Chromium system libs (Ubuntu noble t64 variants)	Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 13:42:52 +03:00
claude	209325ace5	fix(providers): correct bestjobs.eu job link filter pattern Individual job listings on bestjobs.eu use /loc-de-munca/{slug} URLs. The seeded JobLinkContains value /ro/locuri-de-munca/ matched only the category navigation links (Vanzari, Inginerie, Management...), so zero job URLs passed the stage-1 href filter and the scraper returned nothing. Migration updates the stored record (Id=2) to /loc-de-munca/. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 13:16:35 +03:00
claude	9bedf57f39	fix(migrations): replace hardcoded schema strings with MigrationConstants.SchemaName Two migration files had literal schema strings that were missed in earlier passes: - cv-search-data AddJobSearchTables: two CreateIndex calls used "cvSearch" - rag-data InitialRagSchema: FK principalSchema used "rag" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:46:41 +03:00
claude	b78ede23cf	feat(job-search): extract keywords from LLM match call instead of heuristics Piggybacks keyword extraction onto the existing CV-to-job LLM call — no extra API calls. The system prompt now instructs the model to return 8-12 English job-search terms (job titles, technologies, skills, domains) in a new `keywords` field alongside the existing score/summary fields. Keywords flow: LLM JSON → JobMatchResponse.Keywords → CreateJobSearchTokenRequest → JobSearchTokenEntity.Keywords (stored comma-separated) → JobSearchSessionEntity.Keywords (copied at session-creation time, no RAG call needed). Changes: - Add Keywords to JobMatchResponse, CreateJobSearchTokenRequest, JobSearchTokenEntity - IJobTokenService.CreateTokenAsync now accepts IReadOnlyList<string> keywords - JobTokenService: store keywords on token; TriggerStartAsync reads token.Keywords instead of fetching CV text from RAG — removes IRagApiClient dependency - Remove heuristic ExtractKeywords method - Migration AddKeywordsToJobSearchTokens: adds Keywords column to cvSearch.JobSearchTokens - Migration UpdateCvMatchSystemPromptKeywords: updates ai.cv-match.system-prompt seed to include keywords in the JSON shape Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:44:13 +03:00
claude	c675954f8a	fix(cv-search-data): use MigrationConstants.SchemaName in AddJobProviders migration Replace hardcoded "cvSearch" string literals with MigrationConstants.SchemaName in the Up, InsertData, and Down methods, consistent with all other migrations. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 12:15:44 +03:00
claude	7c09f5a871	feat(cv-search-data): add JobProviders table to cvSearch schema New JobProviderEntity persists provider config (name, URL template, link filter, initial keywords, max results, display order) in the DB instead of appsettings. Migration seeds three disabled defaults: ejobs.ro, bestjobs.eu, and linkedin.com. Closes #35 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 11:46:34 +03:00
claude	487924e345	Move RAG repository from rag-api to rag-data — consolidate data layer ownership - Move IRagRepository, EfRagRepository, and VectorSerializer from rag-api/Data to rag-data/Repositories - Add rag-api-models ProjectReference to rag-data.csproj for model type availability - Delete rag-api/Data folder (no longer needed; all data access is now in rag-data) - This aligns RAG with email-api and other services: all data code in the data project Pattern: rag-api (API logic) → rag-data (repository, EF entities, migrations) Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>	2026-05-29 09:38:25 +03:00
claude	e95ed36647	refactor: restructure solution into -models/-data/-api project taxonomy Phases 1-10 of the planned refactoring: Phase 1: rename shared-models -> common - namespace Shared.Models -> Common throughout - remove stale AspNetCore.Http.Features 5.0 reference Phase 2: create shared-data with abstract BaseEntity - BaseEntity: required string Id { get; init; } + DateTime CreatedAt { get; init; } Phase 3: rename myai-models -> myai-data - namespace MyAi.Models -> MyAi.Data - MigrationsAssembly("myai-data") Phase 4: rename cv-search-models -> cv-search-data - namespace CvSearch.Models -> CvSearch.Data - move JobSearchSettings to cv-matcher-api-models - JobSearch*Entity now inherits BaseEntity Phase 5: extract rag-data from rag-api - new project: Apis/rag-data with RagDbContext + entities + migrations - RagDocumentEntity inherits BaseEntity; cache entities use CacheKey PK - fix duplicate AddHttpClient<RagAiClient>/AddScoped registrations in rag-api - MigrationsAssembly("rag-data") Phase 6: extract cv-matcher-data from cv-matcher-api - new project: Apis/cv-matcher-data with CvMatcherDbContext + entities + migrations - CvMatchResultEntity inherits BaseEntity; CvMatcherChatCacheEntity uses CacheKey PK - MigrationsAssembly("cv-matcher-data") Phase 7: create empty cv-cleanup-job-models and cv-search-job-models Phase 8: update all 5 Dockerfiles for renamed/new projects Phase 9: reorganise .sln virtual folders (Apis/Jobs/Models/Data/Helpers) - update root CLAUDE.md with new project taxonomy and migration commands - update cv-matcher-api/CLAUDE.md and cv-search-job/CLAUDE.md Phase 10: add Directory.Packages.props for centralised NuGet versions - remove Version= from all PackageReference elements in active .csproj files No database changes. No runtime behaviour changes. All MigrationId strings in __EFMigrationsHistory are unaffected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-27 15:26:03 +03:00
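
Commit 99e5cfb76b describes `{keywords}`, `{location}` and `{location-slug}` substitution in the provider SearchUrlTemplates. The sketch below illustrates that idea in Python; it is not the repository's HtmlJobSearcher code. The slug rule (strip diacritics, lowercase, hyphenate), the space-joined keyword encoding, and the helper names `slugify` and `build_search_url` are assumptions made for illustration. The template strings are copied from the commit message.

```python
import re
import unicodedata
from urllib.parse import quote_plus


def slugify(location: str) -> str:
    """Hypothetical slug rule: strip diacritics, lowercase, hyphenate."""
    ascii_text = (
        unicodedata.normalize("NFKD", location)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def build_search_url(template: str, keywords: list[str], location: str) -> str:
    """Fill {keywords}, {location} and {location-slug} in a URL template.

    Assumption: keywords are joined with spaces and URL-encoded; the
    real service may join or encode them differently.
    """
    return (
        template
        .replace("{location-slug}", slugify(location))
        .replace("{location}", quote_plus(location))
        .replace("{keywords}", quote_plus(" ".join(keywords)))
    )


# Templates as listed in the commit message (paths only, no host).
ejobs_template = "/locuri-de-munca/{location-slug}?q={keywords}"
linkedin_template = "?keywords={keywords}&location={location}"

ejobs_url = build_search_url(ejobs_template, ["C#", "developer"], "Cluj-Napoca")
linkedin_url = build_search_url(linkedin_template, ["python"], "Cluj-Napoca")
```

The slug form suits path segments such as the ejobs.ro and bestjobs.eu templates, while the plain URL-encoded `{location}` suits a query-string parameter such as the linkedin.com template.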

12 Commits
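
Commits 209325ace5 and 99e5cfb76b refer to a two-stage link filter: a stage-1 href filter driven by JobLinkContains, and a stage-2 anchor-text filter that is skipped when RequireKeywordInAnchor is false. The Python sketch below illustrates that flow under stated assumptions; it is not the project's implementation. The `Link` type, the `filter_job_links` name, the case-insensitive "any keyword" matching rule, and the sample job slug are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    href: str
    text: str


def filter_job_links(
    links: list[Link],
    job_link_contains: str,
    keywords: list[str],
    require_keyword_in_anchor: bool = True,
) -> list[Link]:
    # Stage 1: keep links whose href contains the provider's pattern.
    candidates = [link for link in links if job_link_contains in link.href]

    # Stage 2 is skipped for providers whose search URL already
    # filters by relevance server-side.
    if not require_keyword_in_anchor:
        return candidates

    # Stage 2 (assumed rule): anchor text must contain at least one
    # keyword, compared case-insensitively.
    lowered = [keyword.lower() for keyword in keywords]
    return [
        link
        for link in candidates
        if any(keyword in link.text.lower() for keyword in lowered)
    ]


links = [
    Link("/ro/locuri-de-munca/vanzari", "Vanzari"),  # category navigation
    Link("/loc-de-munca/senior-python-developer", "Senior Python Developer"),
]

# Old seeded pattern: only the category link passes stage 1, and
# stage 2 then drops it.
old_result = filter_job_links(links, "/ro/locuri-de-munca/", ["python"])

# Corrected pattern from commit 209325ace5.
new_result = filter_job_links(links, "/loc-de-munca/", ["python"])
```

With the old pattern the result is empty, which matches the symptom described in commit 209325ace5; with the corrected pattern the job listing passes both stages.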