Add internet job search feature (cv-search-job)
Build and Push Docker Images / build (push) Failing after 1m36s
- New cv-search-models shared library: EF entities + CvSearchDbContext for cvSearch schema (JobSearchTokens, JobSearchSessions, JobSearchResults tables)
- New cv-search-job worker service: polls DB for pending sessions, scrapes job boards via configurable HTML scraping, runs LLM scoring via cv-matcher-api, emails ranked results
- cv-matcher-api: JobTokenService creates one-time tokens; JobSearchController handles link clicks and creates sessions
- api: proxies job-search start endpoint, appends job search link to match result email
- CI workflow updated to build and push myai-cv-search-job:staging image
- CLAUDE.md documentation added for all affected services

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -0,0 +1,90 @@
# cv-search-job: Internet Job Search Worker

Background worker. Polls the database for pending job search sessions (every 30 s by default) and processes one session per tick, oldest first.

## What it does (per session)

1. Reads session from DB (`Status = Pending`)
2. Sets `Status = Processing`
3. Deserializes `ProviderConfigJson` (snapshot of provider configs taken at session creation time); falls back to the currently enabled providers if the snapshot is missing or invalid
4. For each enabled provider: calls `HtmlJobSearcher` to scrape job URLs
5. Deduplicates URLs across providers, caps at `MaxJobsToMatch` (default 15)
6. Calls `cv-matcher-api POST /api/cv/match-job` for each URL (uses existing LLM scoring)
7. Drops results with `Score < MinMatchScore` (default 15)
8. Saves each remaining result as `JobSearchResultEntity`
9. Sets `Status = Done` (or `Failed` if processing throws)
10. Sends results email, ranked by score, via `CvSearchEmailSender` (dual-recipient: user + `Contact:ToEmail`)
11. Attaches CV PDF from shared file storage if it exists
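
The selection steps in the list above (deduplicate, cap, threshold, rank) can be sketched in isolation. This is a minimal illustration, not the worker's actual code: `RankingSketch`, `SelectCandidates`, and `Rank` are hypothetical names, and the score is assumed to be an integer percentage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; names and the int score type are assumptions for illustration.
public static class RankingSketch
{
    // Deduplicate URLs case-insensitively in first-seen order, then cap the list.
    public static List<string> SelectCandidates(IEnumerable<string> urls, int maxJobsToMatch)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var result = new List<string>();
        foreach (var url in urls)
        {
            if (result.Count >= maxJobsToMatch) break;
            if (seen.Add(url)) result.Add(url);
        }
        return result;
    }

    // Keep results at or above the threshold, highest score first.
    public static List<(string Url, int Score)> Rank(
        IEnumerable<(string Url, int Score)> scored, int minMatchScore) =>
        scored.Where(r => r.Score >= minMatchScore)
              .OrderByDescending(r => r.Score)
              .ToList();
}
```

Because the cap is applied before scoring, raising `MaxJobsToMatch` increases the number of LLM calls per session, while `MinMatchScore` only affects what is stored and emailed.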

## Crash recovery

On every tick, sessions with `Status = Processing` AND `CreatedAt < UtcNow - 10 min` are reset to `Pending`. This handles container restarts mid-processing.
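
The reset rule can be expressed as a small predicate. The sketch below is illustrative only: `SessionStatus`, `CrashRecoverySketch`, and `IsStuck` are hypothetical names, and the 10-minute window is taken from the description above.

```csharp
using System;

// Hypothetical stand-in for the session status enum.
public enum SessionStatus { Pending, Processing, Done, Failed }

public static class CrashRecoverySketch
{
    // Window after which a Processing session is treated as orphaned.
    public static readonly TimeSpan StuckAfter = TimeSpan.FromMinutes(10);

    // True when the session is Processing and was created before the cutoff.
    public static bool IsStuck(SessionStatus status, DateTime createdAtUtc, DateTime nowUtc) =>
        status == SessionStatus.Processing && createdAtUtc < nowUtc - StuckAfter;
}
```

Note that the rule compares `CreatedAt`, not a processing-start timestamp, so a session that waited in `Pending` for more than 10 minutes matches as soon as it is picked up. That is worth keeping in mind if ticks can overlap or more than one worker instance runs.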

## HtmlJobSearcher: generic HTML scraper

No per-provider logic. Config-driven. For each provider:
1. Combines `provider.InitialKeywords` + CV keywords from session, joins them with spaces, and URL-encodes the result
2. `GET {SearchUrlTemplate}` with `{keywords}` replaced by the encoded string
3. Regex-parses all `<a href="..." >text</a>` tags
4. Two-stage filter:
- Stage 1: `href` must contain `JobLinkContains`
- Stage 2: anchor text must contain at least one CV keyword
5. Makes hrefs absolute, strips query string and fragment, deduplicates, returns up to `MaxResults` URLs
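
A self-contained sketch of steps 3 to 5 follows. The anchor regex matches the one used by the service; the class and method names (`AnchorFilterSketch`, `ExtractJobUrls`) and the parameter list are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the two-stage anchor filter.
public static class AnchorFilterSketch
{
    // Captures the href (group 1) and the inner HTML (group 2) of each anchor tag.
    private static readonly Regex Anchor = new(
        @"<a[^>]+href=[""']([^""']+)[""'][^>]*>(.*?)</a>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    public static List<string> ExtractJobUrls(
        string html, Uri pageUri, string jobLinkContains,
        IReadOnlyList<string> cvKeywords, int maxResults)
    {
        var results = new List<string>();
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        foreach (Match m in Anchor.Matches(html))
        {
            if (results.Count >= maxResults) break;

            var href = m.Groups[1].Value.Trim();
            // Strip nested tags so only the visible anchor text is compared.
            var text = Regex.Replace(m.Groups[2].Value, "<[^>]+>", " ").Trim();

            // Stage 1: the link must look like a job detail page.
            if (!href.Contains(jobLinkContains, StringComparison.OrdinalIgnoreCase)) continue;
            // Stage 2: the anchor text must mention at least one CV keyword.
            if (!cvKeywords.Any(k => text.Contains(k, StringComparison.OrdinalIgnoreCase))) continue;

            // Resolve relative links against the search page; absolute links pass through.
            if (!Uri.TryCreate(pageUri, href, out var absolute)) continue;

            // Drop query string and fragment so tracking parameters do not defeat deduplication.
            var url = absolute.GetLeftPart(UriPartial.Path);
            if (seen.Add(url)) results.Add(url);
        }

        return results;
    }
}
```

Regex-based HTML parsing is fragile by nature: it will miss anchors whose `href` is unquoted or injected by JavaScript, which is one reason a provider can return zero URLs even when the page shows results in a browser.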

## Provider config

Defined under `JobSearch:Providers` in appsettings / docker-compose env vars. Three providers ship as defaults (all `Enabled: false`):

| Name | Notes |
|------|-------|
| `ejobs.ro` | Romanian job board; reliable HTML structure |
| `bestjobs.eu` | Romanian job board |
| `linkedin.com` | Likely to return empty results due to bot detection |

Provider config is snapshotted to `JobSearchSessionEntity.ProviderConfigJson` at session creation time (in `cv-matcher-api`), so changes to config do not affect in-flight sessions.

To enable a provider via docker-compose env var (index-based):
```
JobSearch__Providers__0__Enabled=true # ejobs.ro
JobSearch__Providers__1__Enabled=true # bestjobs.eu
JobSearch__Providers__2__Enabled=true # linkedin.com
```
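
The env var names work because the .NET environment variables configuration provider treats a double underscore as the `:` hierarchy separator, and array elements are addressed by index. The mapping can be illustrated with a one-line function; `EnvKeySketch` and `ToConfigKey` are hypothetical names used only here, not part of the service.

```csharp
// Hypothetical illustration of how an env var name maps to a configuration key.
public static class EnvKeySketch
{
    // "__" in an environment variable name stands in for the ":" separator.
    public static string ToConfigKey(string envVarName) => envVarName.Replace("__", ":");
}
```

For example, `JobSearch__Providers__0__Enabled` maps to `JobSearch:Providers:0:Enabled`. Since the index refers to the position in the `Providers` array, reordering the defaults in appsettings would change which provider each variable enables.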

## Email

`CvSearchEmailSender` reads SMTP config directly from `IConfiguration` (same `Smtp:*` keys as `api`).
Sends to both `toEmail` (from session) and `Contact:ToEmail` (operator copy).
CV PDF attached from `{FileStorage:Path}/{cvDocumentId}.pdf` if the file exists.
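
The dual-recipient rule can be sketched as below: the operator address is added only when it differs from the user address, so a user who is also the operator gets one email rather than two. `RecipientSketch` and `BuildRecipients` are hypothetical names for this illustration.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper mirroring the recipient selection described above.
public static class RecipientSketch
{
    public static List<string> BuildRecipients(string? toEmail, string? contactToEmail)
    {
        var recipients = new List<string>();

        if (!string.IsNullOrWhiteSpace(toEmail))
            recipients.Add(toEmail);

        // Add the operator copy unless it is the same mailbox (case-insensitive).
        if (!string.IsNullOrWhiteSpace(contactToEmail) &&
            !recipients.Any(r => string.Equals(r, contactToEmail, StringComparison.OrdinalIgnoreCase)))
            recipients.Add(contactToEmail);

        return recipients;
    }
}
```

An empty list means there is nobody to notify, in which case no SMTP connection needs to be opened at all.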

## Shared volume

`../Apis/api/Files:/app/Files` (the same bind mount as `api` and `cv-cleanup-job`).
CV PDFs written by `api` are readable here without any API call.

## Key settings

| Section | Env var | Notes |
|---------|---------|-------|
| `Database` | `Database__*` | Same SQL Server as other services |
| `CvMatcherApi` | `CvMatcherApi__BaseUrl`, `CvMatcherApi__InternalApiKey` | Internal call to match-job endpoint |
| `Smtp` | `Smtp__*` | Same vars as `api` |
| `Contact` | `Contact__ToEmail` | Operator copy recipient |
| `FileStorage` | `FileStorage__Path` | Must match the shared volume mount path |
| `JobSearch` | `JobSearch__Enabled`, `JobSearch__MinMatchScore`, `JobSearch__MaxJobsToMatch` | Core search limits |
| `Jobs:Tasks:0` | `Jobs__Tasks__0__Interval` | Poll interval (default `00:00:30`) |

## Logging

Follows the same scheme as `cv-cleanup-job`:
- **Console**: `[HH:mm:ss LVL] SourceContext: Message`
- **File**: `logs/cv-search-job-.log`, daily rolling, 30-day retention
- **Email** (index 2): Errors only, wired via `Serilog__WriteTo__2__Args__*` env vars in docker-compose
- **Enrich**: `FromLogContext`, `WithMachineName`, `WithEnvironmentName`

`Serilog.Sinks.Email` is available transitively through `startup-helpers`, so no extra package is needed in the csproj.

## EF migrations

This project runs `CvSearchDbContext.Database.Migrate()` on startup.
Migrations live in `Apis/cv-search-models/Migrations/`.
To add a migration: see root CLAUDE.md.
@@ -0,0 +1,11 @@
using CvMatcher.Models.Requests;
using CvMatcher.Models.Responses;
using Refit;

namespace CvSearchJob.Clients;

public interface ICvMatcherInternalApi
{
    [Post("/api/cv/match-job")]
    Task<JobMatchResponse> MatchJobAsync([Body] MatchJobRequest request, CancellationToken ct);
}
@@ -0,0 +1,28 @@
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build
ARG BUILD_CONFIGURATION=Release
WORKDIR /src

COPY Jobs/cv-search-job/cv-search-job.csproj Jobs/cv-search-job/
COPY Jobs/job-scheduler/job-scheduler.csproj Jobs/job-scheduler/
COPY Apis/cv-search-models/cv-search-models.csproj Apis/cv-search-models/
COPY Apis/cv-matcher-api-models/cv-matcher-api-models.csproj Apis/cv-matcher-api-models/
COPY Apis/shared-models/shared-models.csproj Apis/shared-models/
COPY Helpers/startup-helpers/startup-helpers.csproj Helpers/startup-helpers/

RUN dotnet restore Jobs/cv-search-job/cv-search-job.csproj

COPY Jobs/cv-search-job/ Jobs/cv-search-job/
COPY Jobs/job-scheduler/ Jobs/job-scheduler/
COPY Apis/cv-search-models/ Apis/cv-search-models/
COPY Apis/cv-matcher-api-models/ Apis/cv-matcher-api-models/
COPY Apis/shared-models/ Apis/shared-models/
COPY Helpers/startup-helpers/ Helpers/startup-helpers/

RUN dotnet publish Jobs/cv-search-job/cv-search-job.csproj -c $BUILD_CONFIGURATION -o /app/publish /p:UseAppHost=false

FROM mcr.microsoft.com/dotnet/aspnet:10.0 AS final
WORKDIR /app

COPY --from=build /app/publish .

ENTRYPOINT ["dotnet", "cv-search-job.dll"]
@@ -0,0 +1,86 @@
using System.Reflection;
using CvSearch.Models.Data;
using CvSearch.Models.Settings;
using CvSearchJob.Clients;
using CvSearchJob.Services;
using CvSearchJob.Tasks;
using JobScheduler.Scheduling;
using JobScheduler.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Refit;
using Serilog;
using Shared.Models.Settings;
using StartupHelpers;

const string ServiceName = "cv-search-job";

StartupExtensions.LoadDotEnvFile();
var appVersion = StartupExtensions.GetApplicationVersion(Assembly.GetExecutingAssembly());

try
{
    var builder = Host.CreateApplicationBuilder(args);

    builder.ConfigureJsonSerilog(ServiceName, appVersion);
    Log.Information("Starting {Service} version {AppVersion}", ServiceName, appVersion);

    builder.Services.Configure<JobSearchSettings>(builder.Configuration.GetSection("JobSearch"));
    builder.Services.Configure<DatabaseSettings>(builder.Configuration.GetSection("Database"));

    builder.Services.AddDbContext<CvSearchDbContext>(options =>
    {
        var connectionString = builder.Services.GetConfiguredDbConnectionString(builder.Configuration);
        options.UseSqlServer(connectionString, sql =>
        {
            sql.MigrationsAssembly("cv-search-models");
            sql.MigrationsHistoryTable(CvSearchDbContext.MigrationTableName, CvSearchDbContext.SchemaName);
        });
    });

    builder.Services.AddRefitClient<ICvMatcherInternalApi>()
        .ConfigureHttpClient((sp, client) =>
        {
            var config = sp.GetRequiredService<Microsoft.Extensions.Configuration.IConfiguration>();
            var baseUrl = config["CvMatcherApi:BaseUrl"] ?? string.Empty;
            if (!string.IsNullOrWhiteSpace(baseUrl))
                client.BaseAddress = new Uri(baseUrl.TrimEnd('/') + "/");
            var key = config["CvMatcherApi:InternalApiKey"];
            if (!string.IsNullOrWhiteSpace(key))
                client.DefaultRequestHeaders.Add("X-Internal-Api-Key", key);
        });

    builder.Services.AddHttpClient<HtmlJobSearcher>();
    builder.Services.AddSingleton<CvSearchEmailSender>();

    builder.Services.AddSingleton<CvSearchJobTask>();
    builder.Services.AddSingleton<IEnumerable<IJobTask>>(sp => new IJobTask[]
    {
        sp.GetRequiredService<CvSearchJobTask>(),
    });

    builder.Services.AddHostedService<JobSchedulerHostedService>();

    var host = builder.Build();

    host.LogHostStartupDiagnostics(ServiceName);

    Log.Information("Running EF Core migrations for CvSearchDbContext");
    using (var scope = host.Services.CreateScope())
    {
        var db = scope.ServiceProvider.GetRequiredService<CvSearchDbContext>();
        db.Database.Migrate();
    }

    Log.Information("{Service} startup complete. Background scheduler is running.", ServiceName);
    await host.RunAsync();
}
catch (Exception ex)
{
    Log.Fatal(ex, "{Service} terminated unexpectedly", ServiceName);
}
finally
{
    Log.CloseAndFlush();
}
@@ -0,0 +1,108 @@
using CvMatcher.Models.Responses;
using CvSearch.Models.Data.Entities;
using MailKit.Net.Smtp;
using MailKit.Security;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Logging;
using MimeKit;

namespace CvSearchJob.Services;

public sealed class CvSearchEmailSender
{
    private readonly IConfiguration _config;
    private readonly ILogger<CvSearchEmailSender> _logger;

    public CvSearchEmailSender(IConfiguration config, ILogger<CvSearchEmailSender> logger)
    {
        _config = config;
        _logger = logger;
    }

    public async Task SendResultsAsync(
        string toEmail,
        string? attachmentPath,
        IReadOnlyList<JobSearchResultEntity> results,
        CancellationToken ct)
    {
        var smtpHost = _config["Smtp:Host"];
        var smtpPort = int.TryParse(_config["Smtp:Port"], out var port) ? port : 587;
        var smtpUser = _config["Smtp:Username"];
        var smtpPass = _config["Smtp:Password"];
        var useStartTls = bool.TryParse(_config["Smtp:UseStartTls"], out var tls) && tls;
        var contactToEmail = _config["Contact:ToEmail"];

        // Smtp:Username doubles as the From address, so both host and username are required.
        if (string.IsNullOrWhiteSpace(smtpHost) || string.IsNullOrWhiteSpace(smtpUser))
        {
            _logger.LogWarning("SMTP host or username is not configured; skipping job search results email");
            return;
        }

        var recipients = new List<string>();
        if (!string.IsNullOrWhiteSpace(toEmail)) recipients.Add(toEmail);
        if (!string.IsNullOrWhiteSpace(contactToEmail) &&
            !recipients.Any(r => string.Equals(r, contactToEmail, StringComparison.OrdinalIgnoreCase)))
            recipients.Add(contactToEmail);

        if (recipients.Count == 0) return;

        var body = BuildBody(results);
        var subject = $"MyAi.ro: {results.Count} joburi potrivite CV-ului tau";
        var environmentName = Environment.GetEnvironmentVariable("APP_ENVIRONMENT_NAME") ?? "Development";

        foreach (var recipient in recipients)
        {
            // Build and send inside the try so one malformed address cannot block the other recipient.
            try
            {
                var msg = new MimeMessage();
                msg.From.Add(MailboxAddress.Parse(smtpUser));
                msg.To.Add(MailboxAddress.Parse(recipient));
                msg.Subject = $"[{environmentName}] {subject}";

                var builder = new BodyBuilder { TextBody = body };
                if (!string.IsNullOrWhiteSpace(attachmentPath) && File.Exists(attachmentPath))
                    builder.Attachments.Add(attachmentPath);

                msg.Body = builder.ToMessageBody();

                using var client = new SmtpClient();
                var socketOptions = useStartTls ? SecureSocketOptions.StartTls : SecureSocketOptions.Auto;
                await client.ConnectAsync(smtpHost, smtpPort, socketOptions, ct);
                await client.AuthenticateAsync(smtpUser, smtpPass ?? string.Empty, ct);
                await client.SendAsync(msg, ct);
                await client.DisconnectAsync(true, ct);
                _logger.LogInformation("Job search results email sent to {Recipient}", recipient);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Failed to send job search results email to {Recipient}", recipient);
            }
        }
    }

    private static string BuildBody(IReadOnlyList<JobSearchResultEntity> results)
    {
        if (results.Count == 0)
            return "MyAi.ro nu a gasit joburi care sa corespunda CV-ului tau. Incercati mai tarziu sau ajustati CV-ul.";

        var lines = new System.Text.StringBuilder();
        lines.AppendLine($"MyAi.ro a gasit {results.Count} joburi potrivite CV-ului tau:");
        lines.AppendLine();

        for (int i = 0; i < results.Count; i++)
        {
            var r = results[i];
            var matchResp = TryParseResult(r.ResultJson);
            lines.AppendLine($"{i + 1}. {r.JobTitle} ({r.Score}% match) [{r.ProviderName}]");
            lines.AppendLine($" {r.JobUrl}");
            if (matchResp is not null && !string.IsNullOrWhiteSpace(matchResp.Summary))
                lines.AppendLine($" {matchResp.Summary}");
            lines.AppendLine();
        }

        return lines.ToString();
    }

    // Returns null instead of throwing when the stored JSON cannot be parsed.
    private static JobMatchResponse? TryParseResult(string json)
    {
        try { return System.Text.Json.JsonSerializer.Deserialize<JobMatchResponse>(json, new System.Text.Json.JsonSerializerOptions(System.Text.Json.JsonSerializerDefaults.Web)); }
        catch { return null; }
    }
}
@@ -0,0 +1,86 @@
using System.Text.RegularExpressions;
using System.Web;
using CvSearch.Models.Settings;
using Microsoft.Extensions.Logging;

namespace CvSearchJob.Services;

public sealed class HtmlJobSearcher
{
    private readonly HttpClient _http;
    private readonly ILogger<HtmlJobSearcher> _logger;

    public HtmlJobSearcher(HttpClient http, ILogger<HtmlJobSearcher> logger)
    {
        _http = http;
        _logger = logger;
        _http.Timeout = TimeSpan.FromSeconds(20);
        _http.DefaultRequestHeaders.UserAgent.ParseAdd("Mozilla/5.0 (compatible; MyAi.ro CV-Search/1.0)");
    }

    public async Task<IReadOnlyList<string>> SearchJobUrlsAsync(
        JobProviderConfig provider,
        IReadOnlyList<string> cvKeywords,
        CancellationToken ct)
    {
        var allKeywords = provider.InitialKeywords
            .Concat(cvKeywords)
            .Where(k => !string.IsNullOrWhiteSpace(k))
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();

        if (allKeywords.Count == 0)
            return [];

        var keywordsEncoded = HttpUtility.UrlEncode(string.Join(" ", allKeywords));
        var searchUrl = provider.SearchUrlTemplate.Replace("{keywords}", keywordsEncoded);

        string html;
        try
        {
            html = await _http.GetStringAsync(searchUrl, ct);
        }
        catch (Exception ex)
        {
            _logger.LogWarning(ex, "Failed to fetch search results from {Provider} at {Url}", provider.Name, searchUrl);
            return [];
        }

        var baseUri = new Uri(searchUrl);
        var results = new List<string>();
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        // Match all anchor tags capturing href and inner text
        var anchorPattern = new Regex(@"<a[^>]+href=[""']([^""']+)[""'][^>]*>(.*?)</a>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        foreach (Match match in anchorPattern.Matches(html))
        {
            if (results.Count >= provider.MaxResults) break;

            var href = match.Groups[1].Value.Trim();
            // Strip nested tags so only the visible anchor text remains
            var anchorText = Regex.Replace(match.Groups[2].Value, "<[^>]+>", " ").Trim();

            // Stage 1: href must contain the provider's job-link marker
            if (!href.Contains(provider.JobLinkContains, StringComparison.OrdinalIgnoreCase))
                continue;

            // Stage 2: anchor text must contain at least one CV keyword
            if (!cvKeywords.Any(k => anchorText.Contains(k, StringComparison.OrdinalIgnoreCase)))
                continue;

            // Make absolute URL
            if (!Uri.TryCreate(href, UriKind.Absolute, out var absoluteUri))
            {
                if (!Uri.TryCreate(baseUri, href, out absoluteUri))
                    continue;
            }

            // Drop query string and fragment before deduplicating
            var url = absoluteUri.GetLeftPart(UriPartial.Path);
            if (seen.Add(url))
                results.Add(url);
        }

        _logger.LogInformation("Provider {Provider}: found {Count} job URLs", provider.Name, results.Count);
        return results;
    }
}
@@ -0,0 +1,203 @@
using System.Text.Json;
using CvMatcher.Models.Requests;
using CvSearch.Models.Data;
using CvSearch.Models.Data.Entities;
using CvSearch.Models.Settings;
using CvSearchJob.Clients;
using CvSearchJob.Services;
using JobScheduler.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;

namespace CvSearchJob.Tasks;

public sealed class CvSearchJobTask : IJobTask
{
    private readonly IServiceScopeFactory _scopeFactory;
    private readonly JobSearchSettings _settings;
    private readonly HtmlJobSearcher _searcher;
    private readonly ICvMatcherInternalApi _matcherApi;
    private readonly CvSearchEmailSender _emailSender;
    private readonly ILogger<CvSearchJobTask> _logger;
    private readonly string _fileStoragePath;

    public string TaskType => "CvSearch";

    public CvSearchJobTask(
        IServiceScopeFactory scopeFactory,
        IOptions<JobSearchSettings> settings,
        HtmlJobSearcher searcher,
        ICvMatcherInternalApi matcherApi,
        CvSearchEmailSender emailSender,
        IConfiguration config,
        ILogger<CvSearchJobTask> logger)
    {
        _scopeFactory = scopeFactory;
        _settings = settings.Value;
        _searcher = searcher;
        _matcherApi = matcherApi;
        _emailSender = emailSender;
        _logger = logger;
        _fileStoragePath = config["FileStorage:Path"] ?? "Files";
        if (!Path.IsPathRooted(_fileStoragePath))
            _fileStoragePath = Path.GetFullPath(Path.Combine(Directory.GetCurrentDirectory(), _fileStoragePath));
    }

    public async Task ExecuteAsync(IConfiguration parametersSection, CancellationToken cancellationToken)
    {
        if (!_settings.Enabled) return;

        using var scope = _scopeFactory.CreateScope();
        var db = scope.ServiceProvider.GetRequiredService<CvSearchDbContext>();

        // Recover orphaned Processing sessions (container crashed mid-run).
        // Note: the cutoff is measured from CreatedAt, not from when processing started.
        var stuckCutoff = DateTime.UtcNow.AddMinutes(-10);
        var stuckSessions = await db.JobSearchSessions
            .Where(s => s.Status == JobSearchStatus.Processing && s.CreatedAt < stuckCutoff)
            .ToListAsync(cancellationToken);
        foreach (var stuck in stuckSessions)
        {
            stuck.Status = JobSearchStatus.Pending;
            _logger.LogWarning("Reset stuck session {SessionId} back to Pending", stuck.Id);
        }
        if (stuckSessions.Count > 0)
            await db.SaveChangesAsync(cancellationToken);

        // Pick the oldest pending session; one session is processed per tick.
        var pending = await db.JobSearchSessions
            .Where(s => s.Status == JobSearchStatus.Pending)
            .OrderBy(s => s.CreatedAt)
            .FirstOrDefaultAsync(cancellationToken);

        if (pending is null) return;

        _logger.LogInformation("Processing job search session {SessionId}", pending.Id);
        pending.Status = JobSearchStatus.Processing;
        await db.SaveChangesAsync(cancellationToken);

        try
        {
            var results = await RunSearchAsync(pending, db, cancellationToken);

            pending.Status = JobSearchStatus.Done;
            await db.SaveChangesAsync(cancellationToken);

            var attachmentPath = BuildCvPath(pending.CvDocumentId);
            await _emailSender.SendResultsAsync(pending.Email, attachmentPath, results, cancellationToken);
            _logger.LogInformation("Session {SessionId} done. {Count} results sent.", pending.Id, results.Count);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Session {SessionId} failed.", pending.Id);
            pending.Status = JobSearchStatus.Failed;
            // Use CancellationToken.None so the Failed status is still saved when the failure was a cancellation.
            await db.SaveChangesAsync(CancellationToken.None);
        }
    }

    private async Task<List<JobSearchResultEntity>> RunSearchAsync(
        JobSearchSessionEntity session,
        CvSearchDbContext db,
        CancellationToken ct)
    {
        var cvKeywords = session.Keywords
            .Split(',', StringSplitOptions.RemoveEmptyEntries)
            .Select(k => k.Trim())
            .Where(k => k.Length > 0)
            .ToList();

        var providers = GetProviders(session.ProviderConfigJson);
        var jobUrls = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        foreach (var provider in providers)
        {
            var urls = await _searcher.SearchJobUrlsAsync(provider, cvKeywords, ct);
            foreach (var url in urls) jobUrls.Add(url);
        }

        var candidates = jobUrls.Take(_settings.MaxJobsToMatch).ToList();
        _logger.LogInformation("Session {SessionId}: {Count} candidate job URLs to match", session.Id, candidates.Count);

        var results = new List<JobSearchResultEntity>();

        foreach (var url in candidates)
        {
            try
            {
                var matchRequest = new MatchJobRequest
                {
                    CvDocumentId = session.CvDocumentId,
                    JobUrl = url,
                    GdprConsent = true
                };

                var matchResult = await _matcherApi.MatchJobAsync(matchRequest, ct);
                if (matchResult.Score < _settings.MinMatchScore)
                {
                    _logger.LogDebug("Session {SessionId}: {Url} scored {Score}% (below threshold)", session.Id, url, matchResult.Score);
                    continue;
                }

                // Use the first sentence of the summary as the title; fall back to "Job" when it is missing or empty.
                var firstSentence = matchResult.Summary?.Split('.').FirstOrDefault()?.Trim();

                var entity = new JobSearchResultEntity
                {
                    Id = Guid.NewGuid().ToString("N"),
                    SessionId = session.Id,
                    ProviderName = GuessProvider(url, providers),
                    JobUrl = url,
                    JobTitle = string.IsNullOrWhiteSpace(firstSentence) ? "Job" : firstSentence,
                    JobText = string.Empty,
                    Score = matchResult.Score,
                    ResultJson = JsonSerializer.Serialize(matchResult, new JsonSerializerOptions(JsonSerializerDefaults.Web)),
                    CreatedAt = DateTime.UtcNow
                };

                db.JobSearchResults.Add(entity);
                await db.SaveChangesAsync(ct);
                results.Add(entity);
            }
            catch (Exception ex)
            {
                _logger.LogWarning(ex, "Session {SessionId}: match failed for {Url}", session.Id, url);
            }
        }

        // Highest score first, so the email lists the best matches at the top.
        results.Sort((a, b) => b.Score.CompareTo(a.Score));
        return results;
    }

    // Prefers the snapshot stored on the session; falls back to the currently enabled providers.
    private List<JobProviderConfig> GetProviders(string? providerConfigJson)
    {
        if (string.IsNullOrWhiteSpace(providerConfigJson)) return _settings.Providers.Where(p => p.Enabled).ToList();
        try
        {
            return JsonSerializer.Deserialize<List<JobProviderConfig>>(providerConfigJson,
                    new JsonSerializerOptions(JsonSerializerDefaults.Web))
                ?? _settings.Providers.Where(p => p.Enabled).ToList();
        }
        catch
        {
            return _settings.Providers.Where(p => p.Enabled).ToList();
        }
    }

    private static string GuessProvider(string url, List<JobProviderConfig> providers)
    {
        foreach (var p in providers)
        {
            if (!string.IsNullOrWhiteSpace(p.JobLinkContains) &&
                url.Contains(p.JobLinkContains, StringComparison.OrdinalIgnoreCase))
                return p.Name;
        }

        return Uri.TryCreate(url, UriKind.Absolute, out var uri) ? uri.Host : "unknown";
    }

    // Keeps only letters and digits from the document id so it cannot escape the storage folder.
    private string BuildCvPath(string cvDocumentId)
    {
        var safeId = string.Concat(cvDocumentId.Where(char.IsLetterOrDigit));
        if (string.IsNullOrWhiteSpace(safeId)) safeId = "cv";
        return Path.Combine(_fileStoragePath, $"{safeId}.pdf");
    }
}
@@ -0,0 +1,139 @@
{
  "Logging": {
    "LogLevel": {
      "Default": "Information",
      "Microsoft.Hosting.Lifetime": "Information",
      "Microsoft.Extensions.Hosting": "Information",
      "System.Net.Http.HttpClient": "Warning",
      "CvSearchJob": "Information",
      "JobScheduler": "Information"
    }
  },
  "LogEnvironmentOnStartup": true,
  "Serilog": {
    "Using": [
      "Serilog.Sinks.Console",
      "Serilog.Sinks.File",
      "Serilog.Sinks.Email"
    ],
    "MinimumLevel": {
      "Default": "Information",
      "Override": {
        "Microsoft.AspNetCore": "Warning",
        "Microsoft.Extensions.Hosting": "Information",
        "Microsoft.Hosting.Lifetime": "Information",
        "System.Net.Http.HttpClient": "Warning",
        "CvSearchJob": "Information",
        "JobScheduler": "Information"
      }
    },
    "WriteTo": [
      {
        "Name": "Console",
        "Args": {
          "outputTemplate": "[{Timestamp:HH:mm:ss} {Level:u3}] {SourceContext}: {Message:lj}{NewLine}{Exception}"
        }
      },
      {
        "Name": "File",
        "Args": {
          "path": "logs/cv-search-job-.log",
          "rollingInterval": "Day",
          "retainedFileCountLimit": 30,
          "outputTemplate": "{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} [{Level:u3}] {SourceContext}: {Message:lj}{NewLine}{Exception}"
        }
      },
      {
        "Name": "Email",
        "Args": {
          "restrictedToMinimumLevel": "Error",
          "fromEmail": "",
          "toEmail": "",
          "mailServer": "",
          "networkCredential": {
            "userName": "",
            "password": ""
          },
          "port": 587,
          "enableSsl": true,
          "emailSubject": "[mihes.ro CV search job] Error Alert",
          "outputTemplate": "{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} [{Level:u3}] {SourceContext}{NewLine}{Message:lj}{NewLine}{Exception}",
          "batchPostingLimit": 10,
          "period": "0.00:05:00"
        }
      }
    ],
    "Enrich": [
      "FromLogContext",
      "WithMachineName",
      "WithEnvironmentName"
    ]
  },
  "Database": {
    "Host": "localhost",
    "Port": 1433,
    "Name": "MyAiDb",
    "User": "sa",
    "Password": "",
    "TrustServerCertificate": true
  },
  "CvMatcherApi": {
    "BaseUrl": "http://cv-matcher-api:8080",
    "InternalApiKey": ""
  },
  "FileStorage": {
    "Path": "Files"
  },
  "Smtp": {
    "Host": "",
    "Port": 587,
    "Username": "",
    "Password": "",
    "UseStartTls": false
  },
  "Contact": {
    "ToEmail": ""
  },
  "JobSearch": {
    "Enabled": true,
    "JobSearchLinkBaseUrl": "https://myai.ro",
    "TokenExpiryDays": 7,
    "MinMatchScore": 15,
    "MaxJobsToMatch": 15,
    "Providers": [
      {
        "Name": "ejobs.ro",
        "Enabled": false,
        "SearchUrlTemplate": "https://www.ejobs.ro/locuri-de-munca/{keywords}/",
        "JobLinkContains": "/user/locuri-de-munca/job/",
        "InitialKeywords": [],
        "MaxResults": 20
      },
      {
        "Name": "bestjobs.eu",
        "Enabled": false,
        "SearchUrlTemplate": "https://www.bestjobs.eu/ro/locuri-de-munca?q={keywords}",
        "JobLinkContains": "/ro/locuri-de-munca/",
        "InitialKeywords": [],
        "MaxResults": 20
      },
      {
        "Name": "linkedin.com",
        "Enabled": false,
        "SearchUrlTemplate": "https://www.linkedin.com/jobs/search/?keywords={keywords}&location=Romania",
        "JobLinkContains": "/jobs/view/",
        "InitialKeywords": [],
        "MaxResults": 20
      }
    ]
  },
  "Jobs": {
    "Tasks": [
      {
        "TaskType": "CvSearch",
        "Enabled": true,
        "Interval": "00:00:30"
      }
    ]
  }
}
@@ -0,0 +1,30 @@
<Project Sdk="Microsoft.NET.Sdk.Worker">

  <PropertyGroup>
    <TargetFramework>net10.0</TargetFramework>
    <Nullable>enable</Nullable>
    <ImplicitUsings>enable</ImplicitUsings>
    <RootNamespace>CvSearchJob</RootNamespace>
    <AssemblyName>cv-search-job</AssemblyName>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="MailKit" Version="4.16.0" />
    <PackageReference Include="Microsoft.Extensions.Hosting" Version="10.0.0" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.7" />
    <PackageReference Include="Refit.HttpClientFactory" Version="10.1.6" />
  </ItemGroup>

  <ItemGroup>
    <Folder Include="logs\" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\..\Apis\cv-matcher-api-models\cv-matcher-api-models.csproj" />
    <ProjectReference Include="..\..\Apis\cv-search-models\cv-search-models.csproj" />
    <ProjectReference Include="..\..\Apis\shared-models\shared-models.csproj" />
    <ProjectReference Include="..\..\Helpers\startup-helpers\startup-helpers.csproj" />
    <ProjectReference Include="..\job-scheduler\job-scheduler.csproj" />
  </ItemGroup>

</Project>