Altitude 6: Observability & Diagnostics

Appendix C

Altitude 6 — Observability & Diagnostics

  • The Four Golden Signals (latency, traffic, errors, saturation) — Google, Site Reliability Engineering (O'Reilly, 2016), Ch. 6, Monitoring Distributed Systems. https://sre.google/sre-book/monitoring-distributed-systems/ Watched on the order service at the dinner rush. Complements: the RED method (Tom Wilkie) and the USE method (Brendan Gregg, https://www.brendangregg.com/usemethod.html).
  • Health Endpoint Monitoring — Azure Cloud Design Patterns (https://learn.microsoft.com/azure/architecture/patterns/health-endpoint-monitoring); ASP.NET Core health checks. Liveness and readiness for the order service's orchestrator.
  • Distributed Tracing, Metrics, Logs (the three pillars) + Correlation IDs — OpenTelemetry (CNCF). https://opentelemetry.io/docs/ Trace one order across order → payment → courier. Vendor-neutral; export to Cloud Trace / Monitoring (X-Ray / CloudWatch · Azure Monitor).
  • Structured Logging — Serilog. https://serilog.net/ Logs keyed on order_id and tenant_id.
  • Externalised Configuration — The Twelve-Factor App, "III. Config." https://12factor.net/config Surge thresholds as config, not code. Overlaps Altitude 7; taught here as the diagnostics-friendly habit.
  • Alerting & SLOs — Google SRE, Service Level Objectives (https://sre.google/sre-book/service-level-objectives/) and Alerting on SLOs (https://sre.google/workbook/alerting-on-slos/). "99% of orders confirmed within 10s": page on objectives, not noise.
  • Audit Logging — OWASP Logging Cheat Sheet. https://cheatsheetseries.owasp.org/cheatsheets/Logging_Cheat_Sheet.html An immutable who-did-what trail for refunds and cancellations: references, not PII.
  • Honorable mentions — Log Aggregation; Synthetic Monitoring; the RED and USE methods.
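
The "99% of orders confirmed within 10s" objective in the Alerting & SLOs entry can be turned into a paging rule with a little arithmetic. The following is a minimal sketch, not the book's implementation: the names `Slo`, `burn_rate`, and `should_page` are hypothetical, and the default threshold of 14.4 is an assumption borrowed from the style of worked example in Alerting on SLOs; tune it to your own window and budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    target: float       # required fraction of good events, e.g. 0.99
    threshold_s: float  # an order is "good" if confirmed within this many seconds


def burn_rate(latencies_s, slo):
    """Observed bad-order rate divided by the rate the error budget allows.

    A value of 1.0 means the budget is being spent exactly on schedule;
    higher values mean it will run out early.
    """
    if not latencies_s:
        return 0.0
    bad = sum(1 for seconds in latencies_s if seconds > slo.threshold_s)
    observed_bad_rate = bad / len(latencies_s)
    allowed_bad_rate = 1.0 - slo.target
    return observed_bad_rate / allowed_bad_rate


def should_page(latencies_s, slo, max_burn_rate=14.4):
    """Page only when the budget is burning fast (threshold is an assumption)."""
    return burn_rate(latencies_s, slo) >= max_burn_rate


order_slo = Slo(target=0.99, threshold_s=10.0)
dinner_rush = [2.0] * 800 + [12.0] * 200  # hypothetical confirmation times
print(should_page(dinner_rush, order_slo))
```

Paging on the burn rate rather than on any single slow order is what "page on objectives, not noise" amounts to in practice: a handful of slow confirmations spends budget without waking anyone.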

[Figure: A trace of one order]
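
To make "logs keyed on order_id and tenant_id" concrete, here is a small sketch using Python's standard `logging` module in place of Serilog, which the entry above cites for .NET. The formatter class, the logger name, and the `trace_id` field standing in for a correlation ID are all illustrative assumptions; a real service would more likely take the trace ID from its OpenTelemetry context.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object so fields stay queryable."""

    FIELDS = ("order_id", "tenant_id", "trace_id")  # hypothetical field set

    def format(self, record):
        event = {"level": record.levelname, "message": record.getMessage()}
        for name in self.FIELDS:
            value = getattr(record, name, None)
            if value is not None:
                event[name] = value
        return json.dumps(event, sort_keys=True)


def build_logger(stream):
    """Attach the JSON formatter to a dedicated logger writing to `stream`."""
    logger = logging.getLogger("order-service-sketch")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


stream = io.StringIO()
log = build_logger(stream)

# The same identifiers travel with every line written for this order.
context = {"order_id": "ord-123", "tenant_id": "tenant-7", "trace_id": "trace-abc"}
log.info("order accepted", extra=context)
log.info("payment authorised", extra=context)

print(stream.getvalue())
```

Because every line carries the same keys, a log query on `order_id` or `trace_id` reconstructs one order's path across services without parsing free text.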

Altitude 7 — Hosting

  • The Twelve-Factor App — Adam Wiggins / Heroku, 2011. https://12factor.net/ The backbone: stateless processes, externalised config, disposability, port binding. The order service runs stateless, state in DB and queue.
  • Container as the unit — OCI Image Specification (https://opencontainers.org/); Docker. One image for the order service.
  • Sidecar / Ambassador — Azure Cloud Design Patterns (https://learn.microsoft.com/azure/architecture/patterns/sidecar and …/ambassador); Kubernetes pod sidecars. Telemetry shipped by a sidecar.
  • Scale-to-Zero — Knative / Cloud Run (https://cloud.google.com/run/docs); AWS App Runner · Azure Container Apps. The analytics/report worker scales to zero off-peak. Note the cold-start tradeoff.
  • Graceful Shutdown (SIGTERM) — Kubernetes pod lifecycle and container lifecycle hooks. https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/ Finish in-flight orders before exit.
  • Orchestrator-Agnostic Deploy — the same OCI image to Cloud Run / ECS / Container Apps / Kubernetes.
  • Infrastructure as Code — Terraform / Pulumi / Bicep; Fowler bliki. https://martinfowler.com/bliki/InfrastructureAsCode.html The whole food-delivery stack as reproducible, reviewable, cloud-portable environments.
  • Blue-Green Deployment — Fowler bliki. https://martinfowler.com/bliki/BlueGreenDeployment.html
  • Canary Release — Danilo Sato, Fowler bliki. https://martinfowler.com/bliki/CanaryRelease.html Route 5% of orders through a new pricing engine first, with an instant rollback path.
  • Honorable mentions — Feature Flags / Feature Toggles (Fowler, https://martinfowler.com/articles/feature-toggles.html); Gateway / Backend-for-Frontend (Sam Newman; Fowler, https://martinfowler.com/articles/gateway-pattern.html) — a BFF each for the customer and courier apps; Secrets Management (GCP Secret Manager · HashiCorp Vault); Service Discovery.
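
The Graceful Shutdown entry ("finish in-flight orders before exit") can be sketched as follows. This is an illustration under stated assumptions rather than a production worker: `OrderWorker` and `install_handler` are hypothetical names, orders are plain strings, and "finishing" an order is reduced to moving it to a completed list.

```python
import signal
from collections import deque


class OrderWorker:
    """Toy worker: stop accepting on SIGTERM, but finish what was accepted."""

    def __init__(self, queued):
        self.queued = deque(queued)  # waiting, not yet accepted
        self.in_flight = deque()     # accepted; must complete before exit
        self.completed = []
        self.stopping = False

    def request_stop(self, signum=None, frame=None):
        # Signal handlers should do very little: just record the request.
        self.stopping = True

    def accept(self):
        """Take one queued order unless shutdown has been requested."""
        if self.stopping or not self.queued:
            return False
        self.in_flight.append(self.queued.popleft())
        return True

    def drain(self):
        """Complete every accepted order; untouched ones stay queued."""
        while self.in_flight:
            self.completed.append(self.in_flight.popleft())


def install_handler(worker):
    """Route SIGTERM to the worker (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

In a real deployment the drain has to finish inside the orchestrator's termination grace period, after which the process is killed regardless. Orders left in `queued` are not lost as long as state lives in the database and queue, which is the point of running the service stateless.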

[Figure: One image, any cloud]
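
The Canary Release entry ("route 5% of orders through a new pricing engine first, with an instant rollback path") can be sketched with a deterministic hash bucket. The function names and the choice of SHA-256 are assumptions made for illustration; in practice the split often lives in a load balancer or service mesh rather than in application code.

```python
import hashlib


def canary_bucket(order_id: str) -> int:
    """Map an order ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(order_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_new_pricing(order_id: str, canary_percent: int) -> bool:
    """True if this order falls inside the canary slice.

    `canary_percent` is meant to come from externalised configuration,
    so setting it to 0 is the rollback path.
    """
    return canary_bucket(order_id) < canary_percent


orders = [f"ord-{n}" for n in range(1000)]  # hypothetical order IDs
routed = sum(use_new_pricing(order_id, 5) for order_id in orders)
print(f"{routed} of {len(orders)} orders routed to the new pricing engine")
```

Hashing the order ID rather than drawing a random number keeps the decision stable across retries, so one order never sees both pricing engines. Raising the percentage only adds orders to the canary slice; it never moves one back out.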

Other foundational and framing texts

  • Evans, Eric. Domain-Driven Design: Tackling Complexity in the Heart of Software. Addison-Wesley, 2003. The source for Anti-Corruption Layer (over the legacy restaurant POS), Domain Events, and Specification.
  • Microsoft. Azure Cloud Design Patterns. A vendor-published but broadly applicable catalogue of cloud patterns. https://learn.microsoft.com/azure/architecture/patterns/ Cited as "(Azure Cloud Design Patterns)" throughout; individual pages are listed by altitude above.
  • Beyer, Betsy; Jones, Chris; Petoff, Jennifer; Murphy, Niall Richard (eds.). Site Reliability Engineering: How Google Runs Production Systems. O'Reilly, 2016. Free online: https://sre.google/sre-book/table-of-contents/ The source for the four golden signals and SLO-based alerting.
  • OpenTelemetry (CNCF). https://opentelemetry.io/docs/ The vendor-neutral standard behind the observability altitude.

Framing & delivery metrics

Used to frame what good delivery looks like, not as quoted benchmarks.

  • DORA — the four key metrics (deployment frequency, lead time for changes, change failure rate, time to restore service). https://dora.dev/guides/dora-metrics-four-keys/ A framework, not a multiplier; don't quote elite-versus-low figures without the report.
  • Standish Group, CHAOS 2015 — small projects succeed far more often than grand ones, which underpins the case for small, scoped work. Cite as "(Standish CHAOS 2015)"; use sparingly.

A note on the numbers: this book is built on provenance. If a claim has no source above, it does not belong in the book.
