One full screening ReAct loop (Semantic Kernel)

Appendix A

— One full screening ReAct loop (Semantic Kernel)

This is the full code behind a single screening agent: the tools the model can call, instructions that demand reasons (and refuse to screen on protected characteristics), and a run kicked off through the gateway with automatic function calling. It fleshes out the resume screening use case, where the same loop is walked through in prose. The IAtsClient implementation is the only thing that differs between Bullhorn and JobAdder.

// Illustrative excerpt — not a copy-paste product.
public sealed class ScreeningPlugin(IAtsClient ats)   // the plugin only talks to the ATS; the gateway belongs to the agent
{
    [KernelFunction, Description("Fetch a job's title and requirements by id.")]
    public Task<JobBrief> GetJob(string jobId) => ats.GetJobAsync(jobId);

    [KernelFunction, Description("Fetch a candidate's parsed CV text by id.")]
    public Task<CvText> GetCandidateCv(string candidateId) => ats.GetCandidateCvAsync(candidateId);

    [KernelFunction, Description("Persist the screening verdict back to the ATS as a note.")]
    public Task SaveVerdict(string candidateId, string jobId, ScreeningResult result) =>
        ats.WriteScreeningNoteAsync(candidateId, jobId, result);
}

public sealed class ScreeningAgent(Kernel kernel, ILlmGateway gateway)
{
    private const string Instructions = """
        You screen ONE candidate against ONE job. You do not hire or reject.
        1. Call GetJob, then GetCandidateCv.
        2. For each job requirement, decide Met / Partially met / Not met, and quote the
           CV line that justifies it. No quote => Not met.
        3. Score 0-100 for fit, and set recommendation:
           Shortlist | Reject | Flag-for-human (use Flag when evidence is thin or anything
           looks discriminatory or off).
        4. Call SaveVerdict with your reasoning attached.
        Never infer gender, age, ethnicity or nationality. Never reward or penalise a candidate
        on those grounds. If a requirement is unlawful to screen on, flag it for a human and move on.
        """;

    public async Task<AgentOutcome> ScreenAsync(string candidateId, string jobId, CancellationToken ct)
    {
        var settings = new OpenAIPromptExecutionSettings
        {
            ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,  // SK runs the ReAct loop
            Temperature = 0.0                                               // repeatable triage: minimises, but does not eliminate, run-to-run variation
        };

        // The loop is BOUNDED — fewer steps is the single biggest reliability lever (Ch.10).
        const int MaxSteps = 6;
        var request = ScreenRequest.For(candidateId, jobId);

        for (var step = 0; step < MaxSteps; step++)
        {
            // Every model turn goes through the gateway — allowlist -> DLP -> fail-closed -> call.
            var result = await gateway.InvokeAgentAsync(kernel, Instructions, request.ToInput(), settings, ct);

            if (!ScreeningResult.TrySchemaParse(result.Text, out var verdict))
            {
                // ONE repair pass: hand the model its own broken output + the error, ask again.
                var repaired = await gateway.InvokeAgentAsync(
                    kernel, Instructions, request.AsRepair(result.Text, "must match ScreeningResult schema"),
                    settings, ct);
                if (!ScreeningResult.TrySchemaParse(repaired.Text, out verdict))
                    return AgentOutcome.NeedsHuman("schema failed after repair");   // the leash
            }

            if (verdict.IsComplete) return AgentOutcome.Done(verdict);
            request = request.With(verdict);          // feed the VALIDATED result forward, never the raw text
        }
        return AgentOutcome.NeedsHuman("exceeded max steps");   // never loop forever
    }
}

The trace that runs underneath reads like a junior teammate thinking out loud — Thought → Action → Observe, repeated — and that trace, not the bare score, is the product a recruiter reads and either nods at or overrules.
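As a rough illustration of what that trace could look like as data, the sketch below models each step as a record and renders the run as numbered text a recruiter can skim. TraceStep and TraceRenderer are hypothetical names, not part of Semantic Kernel or of the listing above; a real trace would be assembled from whatever the gateway and the function filters actually capture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for one turn of the loop: what the model intended,
// which tool it called, and what came back.
public sealed record TraceStep(string Thought, string Action, string Observation);

public static class TraceRenderer
{
    // Renders the steps as a numbered, indented list in the order they ran.
    public static string Render(IEnumerable<TraceStep> steps) =>
        string.Join("\n", steps.Select((s, i) =>
            $"{i + 1}. Thought: {s.Thought}\n   Action: {s.Action}\n   Observe: {s.Observation}"));
}
```

The observations would carry ATS references and summaries rather than CV text, for the same reason the log sink later in this appendix refuses raw strings.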

The agent does the tireless reading. The schema gate, the repair pass, and the human flag make sure a confident-but-wrong answer never reaches the recruiter as if it were right.
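The listing calls ScreeningResult.TrySchemaParse without showing it. One way such a gate could be written, assuming the model is asked to reply in JSON, is sketched below with System.Text.Json. The three fields and the allowed recommendation values are assumptions taken from the instructions text; the real ScreeningResult (which also exposes IsComplete) is not shown in this appendix.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical, simplified shape: score, recommendation, and the "because".
public sealed record ScreeningResult(int Score, string Recommendation, string Reasoning)
{
    private static readonly HashSet<string> Allowed =
        new(StringComparer.Ordinal) { "Shortlist", "Reject", "Flag-for-human" };

    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    // Returns true only when the text is valid JSON AND passes the business rules.
    // Anything else (chatty prose, wrong types, out-of-range values) is rejected.
    public static bool TrySchemaParse(string? text, out ScreeningResult? verdict)
    {
        verdict = null;
        if (string.IsNullOrWhiteSpace(text)) return false;
        try
        {
            var parsed = JsonSerializer.Deserialize<ScreeningResult>(text, Options);
            if (parsed is null) return false;
            if (parsed.Score is < 0 or > 100) return false;
            if (parsed.Recommendation is null || !Allowed.Contains(parsed.Recommendation)) return false;
            if (string.IsNullOrWhiteSpace(parsed.Reasoning)) return false;   // no reason, no verdict
            verdict = parsed;
            return true;
        }
        catch (JsonException)
        {
            return false;   // not JSON, or JSON of the wrong shape
        }
    }
}
```

Note that a missing "score" property would deserialize to 0 and pass the range check in this sketch; a stricter gate would treat absent fields as failures too.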

— Redaction & structured extraction (the allowlist in practice)

The mechanism behind "the model never sees the document": first parse the CV into fields locally, then pass only the allowlisted fields to the gateway. Name, DOB, photo, and address are never selected, so they cannot leak; this holds by construction, not by scrubbing. It is how PII is kept out of the LLM, and the same extraction feeds the CV formatting and redaction use case.

// Illustrative excerpt — not a copy-paste product.
public sealed class CvIntake(ILocalCvParser parser, ICvSanitiser sanitiser, ILlmGateway gateway)
{
    public async Task<ScreeningResult> ScreenAsync(byte[] rawCv, string jobId, CancellationToken ct)
    {
        // 1. Sanitise on parse: strip hidden text, zero-width chars, white-on-white (prompt-injection).
        var text = sanitiser.Strip(parser.ToText(rawCv));

        // 2. Extract LOCALLY, on your own infrastructure. The raw document never leaves this method.
        CvFields fields = parser.Extract(text);

        // 3. Select ONLY what the task needs. Name / DOB / address / photo are never chosen.
        var req = AllowlistedRequest.From(new ScreeningFields(
            Skills:          fields.Skills,
            Titles:          fields.Titles,
            YearsExperience: fields.YearsExperience,
            Qualifications:  fields.Qualifications));
        // No Name. No DOB. No Address. No Photo. They were never put on the request.

        // 4. Send through the gate. The wrong path below WON'T COMPILE — there is no From(string).
        // var bad = AllowlistedRequest.From(text);   // <-- compile error by design (Ch.9.3)
        var result = await gateway.SendAsync(req, ct);
        return ScreeningResult.Parse(result.Text);
    }
}

// For Use Case 2 (CV formatting & redaction), the same extraction feeds the branded template —
// and the redacted output is what leaves the building toward a client.
public RedactedCv ToClientReady(CvFields f, BrandTemplate template) => template.Render(new
{
    f.Skills, f.Titles, f.YearsExperience, f.Qualifications, f.WorkHistorySummary
    // PII fields deliberately absent: this object is what a client receives.
});

The output guardrail in the guarded gateway is the backstop: even if a sanitiser missed something, the DLP pass on the response catches PII or an injected instruction before anything is stored or sent onward.
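The gateway's DLP pass is not shown in this appendix, so the following is only a minimal sketch of what a response-side backstop could look like. RegexDlpInspector, its rule names, and the three patterns are assumptions for illustration; a production system would more likely call a managed DLP service. ScanStatus mirrors the ScanStatus.Clean value that the log sink later in this appendix checks against.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public enum ScanStatus { Clean, Flagged }

// Hypothetical regex-based inspector. It deliberately over-flags: for a backstop,
// a false positive costs a human review, a false negative costs a leak.
public sealed class RegexDlpInspector
{
    private static readonly (string Type, Regex Pattern)[] Rules =
    {
        ("email", new Regex(@"[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)),
        // Nine or more digits, allowing short separators: catches most phone formats.
        ("phone", new Regex(@"\+?\d(?:[\s().-]{0,2}\d){8,}", RegexOptions.CultureInvariant)),
        // One well-known injection phrasing; a real list would be longer and maintained.
        ("injection", new Regex(@"ignore\s+(?:all\s+|any\s+)?(?:previous|prior)\s+instructions",
            RegexOptions.IgnoreCase | RegexOptions.CultureInvariant)),
    };

    // Returns the TYPES of finding only, never the matched text itself.
    public IReadOnlyList<string> FindingTypes(string? text)
    {
        var found = new List<string>();
        if (string.IsNullOrEmpty(text)) return found;
        foreach (var (type, pattern) in Rules)
            if (pattern.IsMatch(text)) found.Add(type);
        return found;
    }

    public ScanStatus QuickScan(string? text) =>
        FindingTypes(text).Count == 0 ? ScanStatus.Clean : ScanStatus.Flagged;
}
```

Returning finding types rather than matches keeps the inspector's own output safe to log.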

Redaction asks "what should I strip?" An allowlist asks "what does this task actually need?" — and the answer is always a short, structured list. You cannot leak a field you never sent.
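The "no From(string)" guarantee in the listing comes down to a private constructor and a single typed factory. A minimal sketch of how that could be arranged follows; the member layout and the ToPrompt helper are assumptions, since the source only shows AllowlistedRequest.From(ScreeningFields) being called.

```csharp
using System;
using System.Collections.Generic;

// Only the fields the screening task needs. There is nowhere to put a name or a DOB.
public sealed record ScreeningFields(
    IReadOnlyList<string> Skills,
    IReadOnlyList<string> Titles,
    int YearsExperience,
    IReadOnlyList<string> Qualifications);

public sealed class AllowlistedRequest
{
    public ScreeningFields Fields { get; }

    // Private: callers cannot construct a request around arbitrary content.
    private AllowlistedRequest(ScreeningFields fields) => Fields = fields;

    // The only way in. There is deliberately no From(string) overload,
    // so AllowlistedRequest.From(rawCvText) fails at compile time.
    public static AllowlistedRequest From(ScreeningFields fields) =>
        new(fields ?? throw new ArgumentNullException(nameof(fields)));

    // Hypothetical helper: flattens the allowlisted fields into prompt text.
    public string ToPrompt() =>
        $"skills: {string.Join(", ", Fields.Skills)}; " +
        $"titles: {string.Join(", ", Fields.Titles)}; " +
        $"years: {Fields.YearsExperience}; " +
        $"qualifications: {string.Join(", ", Fields.Qualifications)}";
}
```

The type narrows what can be sent but does not inspect values: a parser that wrongly files an email address under Skills would still pass through, which is why the response-side DLP pass remains as a backstop.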

— The Polly resilience policy

The model is the most capable dependency you have and the least predictable. This is the fuller version of the layered defence against a flaky dependency: a retry strategy (transient failures), a circuit breaker (a dependency that's clearly down), and a per-attempt timeout (a call that hangs), composed into one pipeline and wrapped around the gateway call, with a per-run token budget on top.

// Illustrative excerpt — Polly v8 resilience pipeline around the guarded gateway. Not a copy-paste product.
public sealed class ResilientLlmCaller(ILlmGateway gateway)
{
    private readonly ResiliencePipeline<LlmResult> _pipeline =
        new ResiliencePipelineBuilder<LlmResult>()
            // 1. RETRY — transient failures only: 429s, timeouts, 5xx. Backoff + jitter so a
            //    batch of CVs doesn't retry in lockstep and DDoS your own model endpoint.
            .AddRetry(new RetryStrategyOptions<LlmResult>
            {
                ShouldHandle = new PredicateBuilder<LlmResult>()
                    .Handle<HttpRequestException>()
                    .Handle<TimeoutRejectedException>()
                    .HandleResult(r => r.StatusCode == 429 || r.StatusCode >= 500),
                MaxRetryAttempts = 4,
                BackoffType = DelayBackoffType.Exponential,
                UseJitter = true,                              // de-correlate the retry storm
                Delay = TimeSpan.FromSeconds(1),               // Bullhorn 429 guidance: wait ~1s, retry
            })
            // 2. CIRCUIT BREAKER — stop hammering a dependency that's clearly down.
            .AddCircuitBreaker(new CircuitBreakerStrategyOptions<LlmResult>
            {
                ShouldHandle = new PredicateBuilder<LlmResult>()
                    .HandleResult(r => r.StatusCode >= 500),
                FailureRatio = 0.5,                            // trip if half of recent calls fail
                MinimumThroughput = 10,
                SamplingDuration = TimeSpan.FromSeconds(30),
                BreakDuration = TimeSpan.FromSeconds(30),      // back off, then probe
            })
            // 3. TIMEOUT — a single call may not hang forever.
            .AddTimeout(TimeSpan.FromSeconds(30))
            .Build();

    public async Task<LlmResult> SendAsync(AllowlistedRequest req, RunBudget run, CancellationToken ct)
    {
        // Hard per-run token ceiling — a runaway loop is a budget bug AND an availability bug (Ch.10).
        if (run.TokensUsed + run.Estimate(req) > run.TokenBudget)
            throw new BudgetExceededException(run.TokenBudget);

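        // The gateway call still runs inside the pipeline: guardrails AND resilience, composed.
        // Whoever owns RunBudget must add the tokens actually spent to run.TokensUsed after each
        // call (not shown here); otherwise the ceiling above never trips.
        return await _pipeline.ExecuteAsync(async token => await gateway.SendAsync(req, token), ct);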
    }
}

The same 429 → wait ~1s → retry shape with exponential backoff is what you wrap around the ATS clients too. Bullhorn does publish a hard ceiling — 1,500 requests per minute per OAuth Client ID, and its own SDK's default 429 handling is "wait 1s, retry" — so respect it and back off the moment a 429 lands. JobAdder applies throttling too, but its exact numbers sit behind the vendor's help centre rather than the public spec — consult JobAdder's API Throttling guide (or api@jobadder.com) and implement the same defensive 429 backoff regardless. Retries fix the line; the schema validation in the screening loop above fixes the lie; the human flag fixes everything else.
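For an ATS client that does not go through Polly, the delay calculation behind that backoff can be sketched by hand. The version below uses "equal jitter" (half the exponential delay fixed, half random), which is a different algorithm from the jitter Polly applies internally. The Backoff class, its parameters, and the optional server hint are assumptions for illustration; whether either ATS sends a Retry-After header is not something the text above states.

```csharp
using System;

public static class Backoff
{
    // attempt is 0-based: 0 is the wait before the first retry.
    // Returns a delay in [ceiling/2, ceiling), where ceiling = min(cap, baseDelay * 2^attempt).
    public static TimeSpan Delay(
        int attempt, TimeSpan baseDelay, TimeSpan cap, Random rng, TimeSpan? retryAfter = null)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // If the server said how long to wait, believe it rather than guessing.
        if (retryAfter is { } hint && hint > TimeSpan.Zero) return hint;

        var ceiling = Math.Min(cap.TotalMilliseconds,
                               baseDelay.TotalMilliseconds * Math.Pow(2, attempt));

        // Equal jitter: keep half the delay, randomise the other half so a batch
        // of workers that all hit a 429 together do not retry together.
        var ms = ceiling / 2 + rng.NextDouble() * (ceiling / 2);
        return TimeSpan.FromMilliseconds(ms);
    }
}
```

With a base of two seconds, the first retry waits between one and two seconds, which keeps to the "wait about a second" shape while spreading the retries out.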

Bound the loops, or the loops bound your budget. The cost of these guards is a handful of lines. The cost of not having them is a bill you discover after it's spent.
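The RunBudget used in the listing is never shown. A minimal sketch, under stated assumptions, is below: Estimate takes the prompt text directly (the listing passes an AllowlistedRequest), the estimate uses the rough "about four characters per token" rule of thumb for English rather than a real tokenizer, and Record is a hypothetical method for adding what a call actually spent. It is not thread-safe.

```csharp
using System;

public sealed class BudgetExceededException : Exception
{
    public BudgetExceededException(int budget)
        : base($"Run exceeded its token budget of {budget}.") { }
}

public sealed class RunBudget
{
    public RunBudget(int tokenBudget)
    {
        if (tokenBudget <= 0) throw new ArgumentOutOfRangeException(nameof(tokenBudget));
        TokenBudget = tokenBudget;
    }

    public int TokenBudget { get; }
    public int TokensUsed { get; private set; }

    // Pre-flight guess only: ceil(length / 4). Swap in a real tokenizer if precision matters.
    public int Estimate(string prompt) => (prompt.Length + 3) / 4;

    // Call BEFORE sending: refuses the call if it would push the run over its ceiling.
    public void EnsureCanAfford(string prompt)
    {
        if (TokensUsed + Estimate(prompt) > TokenBudget)
            throw new BudgetExceededException(TokenBudget);
    }

    // Call AFTER each model response with the provider-reported usage.
    public void Record(int actualTokens)
    {
        if (actualTokens < 0) throw new ArgumentOutOfRangeException(nameof(actualTokens));
        TokensUsed += actualTokens;
    }
}
```

The check only has teeth if Record is called after every model turn, including the repair pass.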

— The Serilog / OpenTelemetry redacting log sink

Observability without hoarding PII. Serilog gives structured logs; OpenTelemetry gives distributed traces; both are deliberately cloud-neutral so nothing is locked in. The sink itself is guarded: it accepts only a typed SafeLogEvent, refuses raw strings by construction, and runs a DLP backstop. It underpins both auditability without storing PII and the structured logging the running system depends on.

// Illustrative excerpt — not a copy-paste product.

// 1. The only thing the log path accepts is a TYPED event — refs, reasoning, versions; never raw CVs.
public sealed record SafeLogEvent(
    string CorrelationId,
    string JobRef, string CandidateRef,    // ATS IDs, not names
    string Recommendation, int Score,
    string ReasoningSummary,               // the "because", already redacted
    string ModelVersion, string PromptVersion,
    string? HumanAction = null)
{
    public static SafeLogEvent Completed(string promptV, int tokens) => /* … */ default!;
    public static SafeLogEvent Blocked(string dir, IReadOnlyList<string> findingTypes) => /* … */ default!;
}

// 2. The sink. There is NO overload that takes a string — raw payloads are refused at compile time.
public sealed class SafeLogSink(ILogger logger, IDlpInspector dlp) : ISafeLogSink
{
    public void Write(SafeLogEvent e)
    {
        // Backstop: scan the reasoning summary before it is written, even though it is meant to be clean.
        if (dlp.QuickScan(e.ReasoningSummary) != ScanStatus.Clean)
            e = e with { ReasoningSummary = "[redacted: DLP backstop]" };

        // Serilog structured log: typed fields you can filter and query, no raw CV anywhere.
        // The scanned reasoning summary is written too, so the audit trail keeps the "because".
        logger.Information(
            "decision corr={Corr} job={Job} cand={Cand} rec={Rec} score={Score} " +
            "why={Why} modelv={Mv} promptv={Pv} human={Human}",
            e.CorrelationId, e.JobRef, e.CandidateRef, e.Recommendation, e.Score,
            e.ReasoningSummary, e.ModelVersion, e.PromptVersion, e.HumanAction);
    }
}

// 3. OpenTelemetry tracing, hooked once via a Semantic Kernel function-invocation filter (Ch.11.2).
public sealed class TracingFilter(ISafeLogSink log) : IFunctionInvocationFilter
{
    public static readonly ActivitySource Source = new("Recruiter.Agent");

    public async Task OnFunctionInvocationAsync(
        FunctionInvocationContext ctx, Func<FunctionInvocationContext, Task> next)
    {
        using var activity = Source.StartActivity(ctx.Function.Name);  // one span per tool call
        var sw = Stopwatch.StartNew();
        try
        {
            await next(ctx);                                           // run the actual function
        }
        finally
        {
            // Tag in finally so a throwing tool call still records its duration and correlation id.
            activity?.SetTag("duration.ms", sw.ElapsedMilliseconds);
            activity?.SetTag("corr", ctx.Arguments.CorrelationId());
            activity?.SetTag("promptv", PromptVersion.Current);       // tags are refs/versions, never CV text
        }
    }
}

// 4. Wiring — Serilog + OTel exported to whichever cloud. The neutrality is the feature.
//    builder.Host.UseSerilog(...);
//    builder.Services.AddOpenTelemetry()
//        .WithTracing(t => t.AddSource("Recruiter.Agent").AddOtlpExporter())   // -> Cloud Trace / X-Ray / App Insights
//        .WithMetrics(m => m.AddMeter("Recruiter.Agent").AddOtlpExporter());   // -> Cloud Monitoring / CloudWatch / Azure Monitor

On Google Cloud, logs land in Cloud Logging, traces in Cloud Trace, and metrics and dashboards in Cloud Monitoring. The AWS equivalents are CloudWatch Logs, X-Ray, and CloudWatch; on Azure they are Azure Monitor Logs, Azure Monitor with Application Insights, and Azure Monitor. Prometheus/Grafana is the portable option for metrics. The OpenTelemetry layer means none of that is hostage to one provider's pricing: you can move the export target without touching the code that emits.

A good trace tells you exactly what the agent did and why, and tells you nothing you'd be sorry to leak. You can't log what you never put in a variable that reaches the log path.

That's the fuller plumbing behind the book's snippets. Read together, they make the same point the prose does: building each piece is an afternoon; keeping all of them correct, tested, and alive against two ATSs that keep moving — that's the part that never ends.

See also: the .env and deployment reference that puts these listings into a single-tenant container you can actually run.
