From 3ae2f8e43704deab8071ff37f73ceb1430c413db Mon Sep 17 00:00:00 2001
From: gelu
Date: Fri, 22 May 2026 19:01:05 +0300
Subject: [PATCH] Add Features/Internet-Job-Search wiki page

---
 Features-Internet-Job-Search.md | 81 +++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 Features-Internet-Job-Search.md

diff --git a/Features-Internet-Job-Search.md b/Features-Internet-Job-Search.md
new file mode 100644
index 0000000..d1d2082
--- /dev/null
+++ b/Features-Internet-Job-Search.md
@@ -0,0 +1,81 @@
+# Internet Job Search from CV Match Email
+
+## Summary
+
+When a user supplies an email address for a CV match, the result email contains a one-time job search link. Clicking it triggers an automated internet job search: the system scrapes configured job boards, pre-filters the listings using CV keywords, runs the existing LLM CV-to-job match on the top candidates, and emails a ranked list of real-world job matches back to the user.
+
+## Status
+
+Deployed to staging 2026-05-22. End-to-end testing pending.
+
+## Commits
+
+- `6293fa8` -- Add internet job search feature (cv-search-job)
+- `a4c128f` -- Fix cv-matcher-api Dockerfile: add cv-search-models to build context
+- `cf06453` -- Refactor docker-compose: single deployable file + local override
+
+## Related Issues
+
+- #2 cv-search-models shared library
+- #3 cv-search-job background worker
+- #4 cv-matcher-api job search token + session endpoints
+- #5 api job-search proxy route and email link
+- #6 fix: cv-matcher-api Dockerfile missing cv-search-models COPY
+- #7 chore: docker-compose refactor
+
+## Architecture
+
+### Flow
+
+1. User runs a CV match and provides an email address
+2. cv-matcher-api saves a JobSearchToken (UUID, 7-day expiry) in the cvSearch DB schema
+3. Match result email includes the link: https://myai.ro/api/cv-matcher/job-search/start?t=tokenId
+4. User clicks the link -- api validates the token, creates a JobSearchSession (Pending), and returns a styled HTML confirmation
+5. cv-search-job polls every 30s and picks up Pending sessions:
+   - Extracts CV keywords (heuristic, no LLM cost)
+   - Scrapes each enabled provider via HtmlJobSearcher
+   - Runs the existing ScorePairAsync LLM match on the top MaxJobsToMatch URLs
+   - Emails ranked results to the user and to Contact:ToEmail, with the CV PDF attached
+
+### New Projects
+
+| Project | Path | Role |
+|---------|------|------|
+| cv-search-models | Apis/cv-search-models/ | Shared EF entities + CvSearchDbContext (schema cvSearch) |
+| cv-search-job | Jobs/cv-search-job/ | Worker: polls Pending sessions, scrapes, scores, emails |
+
+### Database Schema (cvSearch)
+
+| Table | Purpose |
+|-------|---------|
+| JobSearchTokens | One-time tokens created after each CV match with email |
+| JobSearchSessions | Lifecycle tracking: Pending -> Processing -> Done / Failed |
+| JobSearchResults | Per-job score, title, full extracted text, LLM result JSON |
+
+### Provider Configuration
+
+Providers are configured in appsettings under JobSearch:Providers. Each entry has SearchUrlTemplate (with a {keywords} placeholder), JobLinkContains, InitialKeywords (merged with the CV keywords), and MaxResults. All providers are disabled by default. Configured providers: ejobs.ro, bestjobs.eu, linkedin.com.
+
+## Docker Compose
+
+docker-compose.yml is the single file to paste into Portainer for both staging and production.
+
+| Variable | Default (Portainer) | Local dev (.env) |
+|----------|--------------------|--------------------|
+| IMAGE_TAG | staging / production | local |
+| LOGS_PATH | /opt/myai/logs | ./logs |
+| FILES_PATH | /opt/myai/files | ../Apis/api/Files |
+
+docker-compose.override.yml adds build, ports, and env_file for local dev (auto-merged by docker compose up).
+
+## EF Migrations
+
+    dotnet ef migrations add <MigrationName> --context CvSearchDbContext --project Apis/cv-search-models --startup-project Apis/cv-matcher-api
+
+## Key Implementation Notes
+
+- cv-matcher-api/Dockerfile MUST COPY Apis/cv-search-models/ -- omitting it fails the CI build
+- Use $$""" raw string literals for HTML generation -- CSS braces cause CS9006 with $"""
+- Orphaned Processing sessions (left behind by a container crash) are reset to Pending on cv-search-job startup
+- cv-search-job and api share the same bind-mount volume so the CV PDF can be attached to the results email
+- No captcha on the job-search start link -- the one-time UUID token is the credential
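
The token and session rules described in the page above (one-time UUID token, 7-day expiry, Pending sessions, orphaned Processing sessions reset on startup) can be illustrated with a small in-memory model. This is only a sketch in Python, not the project's C# code: `Store`, `Token`, `Session`, the method names, and the `used` flag are all hypothetical, and the patch does not say how one-time use is actually enforced.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from uuid import UUID, uuid4

TOKEN_LIFETIME = timedelta(days=7)  # 7-day expiry, as stated in the Flow section


class Status(Enum):
    PENDING = "Pending"
    PROCESSING = "Processing"
    DONE = "Done"
    FAILED = "Failed"


@dataclass
class Token:
    id: UUID
    email: str
    expires_at: datetime
    used: bool = False  # assumption: one-time use tracked with a flag


@dataclass
class Session:
    token_id: UUID
    status: Status = Status.PENDING


class Store:
    """Hypothetical in-memory stand-in for the cvSearch schema."""

    def __init__(self):
        self.tokens = {}
        self.sessions = []

    def issue_token(self, email, now):
        """Create a token after a CV match that supplied an email address."""
        token = Token(uuid4(), email, now + TOKEN_LIFETIME)
        self.tokens[token.id] = token
        return token

    def start_search(self, token_id, now):
        """Redeem a token once; return the new Pending session, or None."""
        token = self.tokens.get(token_id)
        if token is None or token.used or now >= token.expires_at:
            return None
        token.used = True
        session = Session(token_id)
        self.sessions.append(session)
        return session

    def reset_orphans(self):
        """Worker startup: move sessions stuck in Processing back to Pending."""
        orphans = [s for s in self.sessions if s.status is Status.PROCESSING]
        for session in orphans:
            session.status = Status.PENDING
        return len(orphans)
```

In this model a second click on the same link returns `None` because the token is already marked used, which is one way to let the token act as the only credential without a captcha.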