Sidecar / Ambassador

Chapter 9

Part II

Some concerns belong next to your order service but not inside it: a metrics exporter, a log shipper, a proxy that handles mTLS or retries to a flaky payment gateway. Bolting them into your application code couples your release cycle to theirs and bloats the image with things that have nothing to do with placing an order. The Sidecar pattern (Azure Cloud Design Patterns) runs them as a separate container in the same deployment unit, sharing the network and lifecycle but not the codebase. The OpenTelemetry traces and Serilog logs you wired at the observability altitude leave the order container and land in a telemetry sidecar that ships them onward.
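
As a rough sketch of what "same deployment unit" looks like on Kubernetes, the manifest below puts the order container and a telemetry sidecar in one pod. The image names and tags are placeholders, the OpenTelemetry Collector is only one possible choice of sidecar, and the collector's own configuration is omitted.

```yaml
# Illustrative only: names, images, and tags are assumptions, not the book's setup.
apiVersion: v1
kind: Pod
metadata:
  name: order-service
spec:
  containers:
    - name: order-api
      image: registry.example.com/order-api:1.0.0   # hypothetical application image
      env:
        # Standard OpenTelemetry SDK variable; containers in a pod share localhost.
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: http://localhost:4317
    - name: telemetry-sidecar
      image: otel/opentelemetry-collector:0.100.0   # pin whichever version you have tested
      ports:
        - containerPort: 4317                       # default OTLP gRPC port
```

The order image carries no exporter credentials or shipping logic. Swapping the collector version or its backend is a change to the second container only.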

The Ambassador is the outbound-facing variant: a local proxy your service talks to as if it were the remote dependency, while the proxy handles connection management, retries, and routing. This is the same idea as the Proxy pattern from the object altitude, lifted to the deployment level: a stand-in that controls access, except now it is a container, not a class.
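
From the application's side, an ambassador can be as small as a base address. The sketch below assumes the proxy container listens on localhost port 9000 and that an optional PAYMENT_GATEWAY_URL variable overrides it. Both the port and the variable name are made up for illustration.

```csharp
using System;
using System.Net.Http;

public static class PaymentGatewayClient
{
    // Assumption: the ambassador container listens on this loopback port inside the pod.
    public const string DefaultAmbassadorUrl = "http://localhost:9000/";

    // PAYMENT_GATEWAY_URL is a hypothetical variable name, not a platform convention.
    public static HttpClient Create(string? overrideUrl = null)
    {
        var baseUrl = overrideUrl
            ?? Environment.GetEnvironmentVariable("PAYMENT_GATEWAY_URL")
            ?? DefaultAmbassadorUrl;

        return new HttpClient
        {
            // The service addresses the proxy as if it were the gateway itself.
            BaseAddress = new Uri(baseUrl),
            // One short local timeout; retries and routing are left to the ambassador.
            Timeout = TimeSpan.FromSeconds(5)
        };
    }
}
```

The order code never learns the real gateway host or its retry policy, so either can change by redeploying the proxy container alone.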

What it buys you in production: cross-cutting infrastructure you can upgrade independently, written once and reused across the order, menu, and courier services, whatever language each is written in. Your application stays about the order. The sidecar handles the plumbing.

Skip-if: you run on a platform that already gives you the concern for free. Cloud Run and Azure Container Apps provide telemetry collection and ingress without you managing a sidecar; the explicit pattern earns its place on Kubernetes, where the pod is the unit and the sidecar is idiomatic. Don't add a proxy container to chase a problem the platform already solved.

Scale-to-Zero (and Graceful Shutdown)

A worker that costs nothing when no one is using it is a small team's quiet advantage. Take the nightly analytics and report worker that rolls up the day's orders per restaurant: it does real work for an hour after close and sits idle the rest of the day. Scale-to-Zero (Knative, the model behind Cloud Run; Azure Container Apps) drops its running instance count to zero between runs and spins one back up on the next message or request. AWS App Runner comes close but keeps provisioned instances around and bills for their memory while idle, so it is not true scale-to-zero. You pay for the rollup, not for idle CPU through the small hours. The tradeoff is the cold start: the first request after an idle period waits for an instance to boot.
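
On Knative, the scaling floor is an annotation on the revision template. The fragment below is a minimal sketch for the analytics worker. The service name and image are placeholders, and other platforms expose the same idea under their own minimum-instances setting.

```yaml
# Illustrative only: service name and image are assumptions.
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: analytics-worker
spec:
  template:
    metadata:
      annotations:
        # Zero is normally Knative's default floor; stated explicitly to show intent.
        autoscaling.knative.dev/min-scale: "0"
        # One instance is enough for a nightly rollup.
        autoscaling.knative.dev/max-scale: "1"
    spec:
      containers:
        - image: registry.example.com/analytics-worker:1.0.0   # hypothetical image
```

Raising min-scale to "1" on a different service keeps one instance warm and removes the cold start for it, at the price of paying for that instance around the clock.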

The half of this pattern people skip is the half that bites. When the platform scales you down, reclaims an instance, or rolls a new revision, it sends SIGTERM and then waits a grace period (30 seconds by default on Kubernetes, 10 seconds on Cloud Run) before SIGKILL. If the order service ignores SIGTERM mid-rush, in-flight orders get cut before they are written and queued courier-assignment work is lost. Graceful shutdown means catching that signal, refusing new work, and draining what you already accepted.

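// Minimal Program.cs. The host listens for SIGTERM and starts a graceful
// shutdown; IHostApplicationLifetime exposes that moment as ApplicationStopping.
var builder = WebApplication.CreateBuilder(args);

// Give in-flight requests time to finish before the host exits.
// Keep this below the platform's grace period (Kubernetes defaults to 30 seconds).
builder.Services.Configure<HostOptions>(o =>
    o.ShutdownTimeout = TimeSpan.FromSeconds(25));

var app = builder.Build();
var lifetime = app.Services.GetRequiredService<IHostApplicationLifetime>();
// ApplicationStopping fires for any graceful shutdown, SIGTERM included.
lifetime.ApplicationStopping.Register(() =>
    app.Logger.LogInformation("Shutdown requested, draining in-flight work"));

app.Run();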
What it buys you in production: near-zero cost for the off-peak worker, and clean reclaims that don't drop a single order when the platform decides to move you. The two halves are one pattern. Elasticity you can't shut down gracefully is just a faster way to lose orders.

Skip-if: the service is the order API itself. It is latency-critical and never truly idle through the lunch-to-dinner stretch, so the cold-start penalty on a first request costs more than the idle instance saves. Keep a warm minimum there and let the analytics worker scale to zero. Drain on SIGTERM regardless, because reclaims and rollouts happen whether or not you ever scale to zero.
