Altitude 5: Resilience

Appendix A

Altitude 5 — Resilience

Every network call is a coin flip that sometimes lands wrong. The patterns below turn a dependency hiccup into a non-event. The reference library is Polly, though none of the concepts depend on it.

Pattern	What it solves	The shape	Skip-if
Retry (backoff + jitter)	Ride out a transient failure	Re-issue with exponential backoff and jitter; pair with Idempotency	The op isn't safe to repeat and can't be made idempotent
Circuit Breaker	Stop hammering a failing dependency	Trip open after a failure threshold, fail fast, probe before closing	A single in-process call
Rate Limiting / Throttling	Cap request rate, in and out	A limiter that sheds or queues traffic above a ceiling	Trusted, low-volume internal traffic
Bulkhead	Stop one slow dependency draining everything	Isolated resource pools per dependency	One dependency, low concurrency
Timeout + Fallback	Never wait forever; degrade instead of erroring	A bounded wait with a sensible degraded answer on expiry	No sensible fallback: then fail fast and loud
Idempotency	Make retries and at-least-once delivery safe	The same request applied twice has one effect, keyed by an idempotency token	Naturally idempotent reads
Steady State	Stop a 3am failure from something filling up	Bound every growing resource: rotate logs, purge old data, cap caches and pools	Nothing in the process grows unbounded (rare: check first)
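
The Retry and Idempotency rows above work as a pair, and a small sketch makes the pairing concrete. The article's library is Polly; this is a plain Python stand-in, not Polly's API. `TransientError`, `retry_with_backoff` and `IdempotentLedger` are hypothetical names invented for illustration, and the "full jitter" formula (a random wait between zero and the exponential ceiling) is one common choice, assumed here rather than taken from the article.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 503s)."""


def retry_with_backoff(op, max_attempts=4, base_delay=0.1, max_delay=2.0,
                       sleep=time.sleep, rng=random.random):
    """Call op(); on TransientError, wait and try again.

    The wait before the next attempt is random in [0, ceiling], where the
    ceiling doubles each time up to max_delay. The randomness (jitter) keeps
    many recovering clients from retrying in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * ceiling)


class IdempotentLedger:
    """Hypothetical downstream that applies each idempotency key at most once."""

    def __init__(self):
        self._seen = {}    # idempotency key -> stored result
        self.applied = 0   # number of charges that actually took effect

    def charge(self, key, amount):
        if key not in self._seen:
            self.applied += 1
            self._seen[key] = {"status": "charged", "amount": amount}
        # A repeated key returns the original result without a second effect.
        return self._seen[key]
```

If a charge succeeds but the response is lost, the caller sees a transient failure and retries with the same key; the ledger returns the stored result and `applied` stays at 1. Without the key, the same retry would charge twice, which is the case the Retry row's skip-if warns about.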

Honorable mentions: Graceful Degradation, Load Shedding, Failover/Redundancy.
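
The Circuit Breaker row in the table above describes three states (closed, open, half-open), which can be sketched as a small state machine. This is a minimal illustration in plain Python, not Polly's implementation: the class and parameter names are hypothetical, it counts consecutive failures rather than a failure rate, and it is not thread-safe.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is rejected unattempted."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # cool-down elapsed: allow a probe call
        return "open"

    def call(self, op):
        state = self.state
        if state == "open":
            raise CircuitOpenError("dependency marked unhealthy; failing fast")
        try:
            result = op()
        except Exception:
            self._failures += 1
            # A failed probe re-opens at once; otherwise trip at the threshold.
            if state == "half-open" or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping a dependency call as `breaker.call(lambda: client.get(url))` (where `client` is whatever HTTP client the service already uses) means that after the threshold is hit, callers get an immediate `CircuitOpenError` instead of queueing behind a dependency that is already failing.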

Altitude 6 — Observability & Diagnostics

You can't operate what you can't see, and seeing is worthless without something that wakes someone. Vendor-neutral: OpenTelemetry plus Serilog.

Pattern	What it solves	The shape	Skip-if
Health Endpoint Monitoring	Let an orchestrator restart or drain bad instances	Liveness and readiness endpoints it can probe	Nothing's orchestrating it
Structured Logging	Make "what happened at 2am" answerable	Machine-parseable logs with context (Serilog), not string soup	Never
Metrics (Four Golden Signals)	See degradation before customers do	Latency, traffic, errors, saturation: start with four, not fifty	Never
Distributed Tracing + Correlation IDs	Root-cause across a call graph in minutes	One request followed across services via a propagated trace context (OpenTelemetry)	A single process, no fan-out
Externalised Configuration	Run the same image in every environment	Config read from the environment, not baked into the image	Never
Alerting & SLOs	Page a human before the customer notices	Thresholds tied to objectives, not raw noise	No on-call or SLA yet (but wire it before you have users)
Audit Logging	Keep a defensible who-did-what trail	An immutable log of actions, references not PII, separate from diagnostics	No regulatory or security need, no sensitive actions
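
To make the Structured Logging and Correlation ID rows above concrete, here is a minimal sketch using only the Python standard library. It stands in for Serilog and OpenTelemetry rather than reproducing either: the `correlation_id` variable, the `fields` convention and `build_logger` are assumptions made for illustration, and a real trace context carries more than a single ID.

```python
import contextvars
import io
import json
import logging

# Correlation ID of the request currently being handled (hypothetical name).
correlation_id = contextvars.ContextVar("correlation_id", default=None)


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        event = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        # Values passed as extra={"fields": {...}} become top-level keys,
        # so they can be filtered on instead of parsed out of a string.
        event.update(getattr(record, "fields", {}))
        return json.dumps(event)


def build_logger(stream, name="orders"):
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    buffer = io.StringIO()
    log = build_logger(buffer)
    correlation_id.set("req-123")  # normally set once per incoming request
    log.info("order placed", extra={"fields": {"order_id": 42}})
    print(buffer.getvalue(), end="")
```

Because every line carries the same `correlation_id`, a log query for one ID returns the full story of one request, which is what makes "what happened at 2am" answerable.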

Honorable mentions: Log Aggregation, Synthetic Monitoring, the RED/USE methods.

Altitude 7 — Hosting (cloud-agnostic, container-first)

The container is the portability boundary; state lives outside it. The same image runs on GCP, AWS or Azure. 12-factor is the backbone.

Pattern	What it solves	The shape	Skip-if
Container as the Unit (12-factor)	Run the same artifact everywhere	One OCI image as the deployable, portable boundary	A managed-runtime function where a container adds nothing
Stateless + Externalised State	Scale out and restart freely	No durable state in process memory; it lives in an external store, so instances are cattle, not pets	A genuinely single-instance tool
Sidecar / Ambassador	Add cross-cutting infra without app changes	A helper container (proxy, agent) co-deployed beside the app	The helper's job is a library call away
Scale-to-Zero (+ graceful shutdown)	Pay for use, lose no in-flight work	Idle instances drop to zero; SIGTERM drains work on reclaim	Latency-critical, always-warm workloads
Orchestrator-Agnostic Deploy	Avoid cloud lock-in	The same OCI image deployed to Cloud Run, ECS, Container Apps or Kubernetes	You've deliberately committed to one platform's primitives
Infrastructure as Code	Reproduce environments without a platform team	Version-controlled, reviewable infra definitions	A single hand-clicked environment you'll never rebuild (usually a false economy)
Blue-Green / Canary Deploy	Release with zero downtime and instant rollback	Traffic shifted to a parallel or partial slice, rolled back fast on trouble	A low-traffic internal tool where a few seconds of downtime is fine
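
The "SIGTERM drains work on reclaim" shape in the Scale-to-Zero row above can be sketched as a worker that finishes the job in hand and then stops taking new ones. This is an illustrative outline in Python under simple assumptions: `Worker` and `install_sigterm_handler` are hypothetical names, the job list stands in for an external queue, and the platform is assumed to send SIGTERM and allow a grace period before killing the process.

```python
import signal
import threading


class Worker:
    """Processes jobs until asked to stop, finishing the current job first."""

    def __init__(self, jobs):
        self._jobs = list(jobs)
        self._stop = threading.Event()
        self.completed = []

    def request_stop(self, signum=None, frame=None):
        # The (signum, frame) signature matches what signal.signal() passes.
        self._stop.set()

    def run(self, process):
        # The stop flag is only checked between jobs, so a job that has
        # started always runs to completion.
        while self._jobs and not self._stop.is_set():
            job = self._jobs.pop(0)
            process(job)
            self.completed.append(job)
        # Unstarted jobs are returned so they can stay in the external queue
        # for another instance to pick up.
        return self._jobs


def install_sigterm_handler(worker):
    """Route SIGTERM to the worker; call this from the main thread."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

The design choice is that shutdown is a request, not an interruption: work in flight completes, and work not yet started is never lost because it was never taken out of the external store.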

Honorable mentions: Feature Flags, Gateway/Backend-for-Frontend, Secrets Management, Service Discovery.

One vocabulary, seven rungs. The skip-if column is the half of the card most teams need most.

Next: Appendix B — The Skip List, for the patterns deliberately left off the ladder and the reason each is a tax a small team rarely needs.
