The Over-Engineering Tax

Chapter 11
Part III

The Order Service in Chapter 10 uses a lot of the ladder. That was the point of it: show the patterns composing cleanly into one shape. The danger is that you read it as a starting template and build all of it on day one, for a food-delivery app with one restaurant signed up and a dozen orders a week.

You won't have over-built because you were careless. You'll have over-built because each pattern, on its own, looked obviously correct. Multitenancy is correct. So is an outbox, and so is a circuit breaker, and that defensible judgement on each one is exactly the trap. No single decision is the tax. The sum of them is, taken before the problem that justified each one ever showed up.

Every pattern has a carrying cost. It's code a junior has to understand before they can change anything near it, a failure mode you now own, a line in the runbook, a thing that breaks in a way the boring version never could. A five-person team that adopts a pattern it doesn't yet need hasn't bought insurance. It's bought a second job.

A pattern you don't need yet is not free. It is a liability you chose to carry early.

The deferring question

There's one question that does most of the work here, and you ask it of every pattern before you adopt it: what breaks if I don't add this yet?

If the honest answer is "nothing breaks; it would just be tidier" or "nothing breaks until we hit a scale we're nowhere near," you defer. Write down the trigger that would change the answer ("when a restaurant's order data has to live in its own database for compliance," "when the courier-assignment queue regularly backs up past a minute at dinner rush") and move on. The pattern isn't rejected. It's parked, with the condition that un-parks it written next to it.
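
One way to keep a parked pattern honest is to write it down as data rather than as a memory. The sketch below is a minimal illustration, not a prescribed format: the ParkedPattern record, its field names, and the pattern labels are all hypothetical, and only the two trigger sentences come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParkedPattern:
    """A deferred pattern and the condition that would un-park it."""
    pattern: str        # hypothetical label for the deferred pattern
    breaks_today: str   # honest answer to "what breaks if I don't add this yet?"
    trigger: str        # observable condition that changes that answer


# Hypothetical entries. The pattern names are assumptions; the triggers
# are the two examples quoted in the text.
PARKED = [
    ParkedPattern(
        pattern="Database per restaurant",
        breaks_today="nothing",
        trigger="a restaurant's order data has to live in its own database for compliance",
    ),
    ParkedPattern(
        pattern="Competing consumers on courier assignment",
        breaks_today="nothing",
        trigger="the courier-assignment queue regularly backs up past a minute at dinner rush",
    ),
]


def render(parked: list[ParkedPattern]) -> str:
    """Format the list as plain text for a README or decision log."""
    return "\n".join(
        f"PARKED {p.pattern}: un-park when {p.trigger}" for p in parked
    )


print(render(PARKED))
```

A text file in the repository does the same job. What matters is that the trigger sits next to the decision, so whoever sees the trigger fire knows what to reach for.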

This is the inverse of the skip-if from Part II. The skip-if tells you a pattern's adoption signal. The deferring question makes you say out loud what it costs you to not have it today. Most of the time the cost is zero, and zero is a price worth paying for a smaller system you can hold in your head.

What each pattern costs, and what un-parks it

The trap at each altitude

The over-engineering tax has a signature shape at every rung of the ladder. The pattern is real and so is its production payoff; the trap is reaching for that payoff before it exists. What follows is the premature-pattern trap altitude by altitude, each with the question that defers it.

Object

The trap is the abstraction with one implementation. You introduce a CourierMatchingStrategy interface while you only ever match couriers one way, a Factory for an object you only ever construct one way, an interface on a class nobody else implements. The indirection costs a reader a jump to a second file to learn there was nothing there.

What breaks if I don't add this yet? Nothing, until a second implementation actually shows up. When you only assign the nearest free courier, a plain method that finds the nearest free courier is the correct amount of code, not technical debt. Add the Strategy seam the day "fastest" and "cheapest" become real options; the change is small, and the second case will tell you the right shape for the seam, which you'd have guessed wrong today. The open-closed principle, as Robert C. Martin presents it in Agile Software Development: Principles, Patterns, and Practices, cuts both ways: a seam you open before there's a second case to vary is open for nothing.
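
For scale, here is roughly what "a plain method that finds the nearest free courier" amounts to. This is a sketch under stated assumptions: the Courier fields are hypothetical, and positions are treated as flat x/y coordinates rather than real geography.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Courier:
    courier_id: str
    x: float      # assumed flat coordinates, not latitude/longitude
    y: float
    free: bool


def nearest_free_courier(
    couriers: list[Courier], x: float, y: float
) -> Optional[Courier]:
    """Return the free courier closest to (x, y), or None if nobody is free."""
    free = [c for c in couriers if c.free]
    if not free:
        return None
    return min(free, key=lambda c: math.hypot(c.x - x, c.y - y))
```

There is no interface and no factory, and nothing to jump to. If a second matching rule ever becomes real, this function body becomes the first implementation behind the seam.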

Component

The trap is structure that outruns the domain: ports and adapters around an Order Service with one adapter, a mediator over three handlers, a full anti-corruption layer for a restaurant POS you call in two places. The cousin every team meets is the heavy ORM, sold as a shortcut and paid back as a maintenance tax you can't reason about. The book's stance is the smaller one: a thin OrderGateway over stored procedures, SQL you can read. Fowler's Patterns of Enterprise Application Architecture (PoEAA) names the contrast directly: a Table Data Gateway owns one table's SQL, where the Data Mapper an ORM gives you hides that SQL behind change-tracking you don't control.

What breaks if I don't add this yet? Your Order Service stays a service you can read top to bottom. Hexagonal architecture earns its keep when you have real adapters to swap or a domain core worth protecting from I/O. Until then it's ceremony, and ceremony is the thing juniors cargo-cult next.
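
A thin gateway in the Table Data Gateway sense can be very small. The sketch below is illustrative only: it uses Python's built-in sqlite3 module as a stand-in database, and since SQLite has no stored procedures the SQL sits inline where the book's OrderGateway would call one. The table and column names are assumptions.

```python
import sqlite3
from typing import Optional


class OrderGateway:
    """Owns every SQL statement that touches the orders table."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def create_schema(self) -> None:
        # Hypothetical schema, kept to the columns this sketch needs.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS orders ("
            " order_id INTEGER PRIMARY KEY,"
            " restaurant TEXT NOT NULL,"
            " total_cents INTEGER NOT NULL,"
            " status TEXT NOT NULL DEFAULT 'placed')"
        )

    def insert(self, restaurant: str, total_cents: int) -> int:
        """Insert one order and return its generated id."""
        cur = self._conn.execute(
            "INSERT INTO orders (restaurant, total_cents) VALUES (?, ?)",
            (restaurant, total_cents),
        )
        self._conn.commit()
        return cur.lastrowid

    def find(self, order_id: int) -> Optional[tuple]:
        """Return (order_id, restaurant, total_cents, status) or None."""
        return self._conn.execute(
            "SELECT order_id, restaurant, total_cents, status"
            " FROM orders WHERE order_id = ?",
            (order_id,),
        ).fetchone()


gateway = OrderGateway(sqlite3.connect(":memory:"))
gateway.create_schema()
new_id = gateway.insert("Luigi's", 1850)
print(gateway.find(new_id))
```

Every query the service runs against orders is visible in one class, with no change tracking between the caller and the SQL.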

Data and persistence

This altitude carries the two heaviest taxes in the book. The first is multitenancy: shared schema, tenant_id on every table, row-level security, a filter you can never forget, before you have a second restaurant on the platform. The second is the golden combo, Event Sourcing with CQRS and a materialized read model, stood up for a menu admin that is plainly CRUD.

What breaks if I don't add this yet? For multitenancy: nothing, if you have one restaurant. Book #1 chose single-tenant on purpose and shipped. Onboard the first restaurant with its menu and orders in one place; the move to shared-schema-plus-RLS keyed on tenant_id is a migration you do once, when the second restaurant signs, not a tax you pay from commit one.

For Event Sourcing: you keep a database a new hire understands on their first afternoon. Event Sourcing is the book's loudest skip-if. It buys you an audit-grade history and independent read scale, and it charges you eventual consistency, projection rebuilds, versioned events, and a debugging story most teams underestimate. A menu admin that edits prices and toggles items in and out of stock is a handful of tables and a few stored procedures. A status column on the Order row answers "what state is this in" without an event log behind it. Fowler's own bliki write-up of Event Sourcing is candid about the complexity it adds; reach for events when the history is the product, not before.
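
To make the status-column point concrete, here is a minimal sketch of an order lifecycle kept in a single column. It assumes SQLite via Python's sqlite3 module as a stand-in database; the state names, the ALLOWED table, and the advance function are hypothetical, not the book's schema.

```python
import sqlite3

# Hypothetical schema and lifecycle; names are assumptions for illustration.
SCHEMA = "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, status TEXT NOT NULL)"

ALLOWED = {
    "placed": {"accepted", "cancelled"},
    "accepted": {"picked_up", "cancelled"},
    "picked_up": {"delivered"},
}


def advance(conn: sqlite3.Connection, order_id: int, new_status: str) -> bool:
    """Move an order to new_status if that transition is allowed."""
    row = conn.execute(
        "SELECT status FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    if row is None or new_status not in ALLOWED.get(row[0], set()):
        return False
    # Guard on the old status so a concurrent change is not overwritten.
    cur = conn.execute(
        "UPDATE orders SET status = ? WHERE order_id = ? AND status = ?",
        (new_status, order_id, row[0]),
    )
    conn.commit()
    return cur.rowcount == 1


def current_status(conn: sqlite3.Connection, order_id: int) -> str:
    """Answer 'what state is this in' with one indexed read."""
    return conn.execute(
        "SELECT status FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()[0]
```

This keeps no history, which is the trade being made: the column answers the current-state question and nothing else. If the history itself becomes something customers or auditors need, that is the trigger to revisit.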

CRUD is not the thing you graduate from. For a menu admin it's the thing you should still be running in year three.

Messaging and scale

The trap is the broker you don't have load for: a courier-assignment queue between two services that exchange a hundred orders a day, or competing consumers scaled out for a workload one process handles in its sleep. The worst version is a saga, the heaviest coordination pattern in the messaging chapter, written for an order flow that is two updates in one database and one transaction.

What breaks if I don't add this yet? Nothing, until a real dinner-rush spike knocks you over or a slow consumer falls behind. Handling OrderPlaced with an in-process call is simpler than a queue in every way that matters at low volume: no broker to run, no message to lose, no dead-letter queue to drain, no at-least-once semantics forcing idempotency on you. Add the queue the day the rush is real. Add the order-fulfilment saga when you genuinely have a distributed transaction across charge, courier, and restaurant, which is later than you think and possibly never.
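
An in-process handler for OrderPlaced can be as plain as a loop over function calls. The sketch below is an assumption-laden illustration: the event fields, the two handler classes, and place_order are hypothetical names, and the handlers only record what they were asked to do.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderPlaced:
    order_id: int
    restaurant: str


@dataclass
class CourierAssigner:
    """Hypothetical handler; stands in for real courier assignment."""
    assigned: list = field(default_factory=list)

    def handle(self, event: OrderPlaced) -> None:
        self.assigned.append(event.order_id)


@dataclass
class RestaurantNotifier:
    """Hypothetical handler; stands in for notifying the restaurant."""
    notified: list = field(default_factory=list)

    def handle(self, event: OrderPlaced) -> None:
        self.notified.append((event.restaurant, event.order_id))


def place_order(order_id: int, restaurant: str, handlers: list) -> OrderPlaced:
    """Build the event and deliver it with ordinary synchronous calls."""
    event = OrderPlaced(order_id, restaurant)
    for handler in handlers:
        # A failure here raises in the caller: no broker, no redelivery,
        # no dead-letter queue to inspect.
        handler.handle(event)
    return event


assigner, notifier = CourierAssigner(), RestaurantNotifier()
place_order(7, "Luigi's", [assigner, notifier])
```

Keeping the event as its own type costs almost nothing and preserves the migration path: the day a queue is justified, the same OrderPlaced value is what gets serialised onto it.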

Resilience

The trap here is subtle because resilience patterns feel like pure virtue. A circuit breaker on a dependency that has never once failed. Three retry policies, two timeout budgets, and a bulkhead around a call to the orders database on the same machine. Resilience machinery is itself a source of incidents: a breaker tuned wrong trips under normal load, a retry storm turns a blip into an outage.

What breaks if I don't add this yet? For an in-process or same-region call that doesn't fail, nothing. Resilience is a tax you pay per network boundary, and you pay it where calls actually cross an unreliable boundary and actually fail. A timeout on every outbound call is cheap and you should have one. The payment gateway, a third party over the public internet, earns the full Polly pipeline of retry, breaker, and timeout. That same pipeline wrapped around the local OrderGateway call that can't fail is configuration nobody will dare touch in a year.
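
The cheap end of that spectrum, a timeout and nothing else, fits in a dozen lines. The sketch below is a generic Python stand-in, not the Polly API: call_with_timeout is a hypothetical helper built on the standard library's concurrent.futures. In practice most HTTP and database clients accept a timeout parameter directly, and that is usually the better place to set one.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(fn, timeout_s: float, *args):
    """Run fn(*args); raise TimeoutError if no result arrives in timeout_s seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(fn, *args).result(timeout=timeout_s)
    except FutureTimeout:
        raise TimeoutError(f"call exceeded {timeout_s}s") from None
    finally:
        # Stop waiting for the worker. The abandoned call is not cancelled;
        # it keeps running in the background until it finishes.
        pool.shutdown(wait=False)


def slow_dependency() -> str:
    """Hypothetical stand-in for a slow outbound call."""
    time.sleep(0.3)
    return "late"
```

There is no retry, no breaker, and no bulkhead here, so there is nothing to tune. That is the baseline worth having everywhere; the rest is for boundaries that have actually failed.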

Observability

This is the altitude where under-building hurts more than over-building, so the trap is shaped differently. The mistake here is rarely too many patterns. It's standing up the platform-grade rig before you have a platform's problems. Think a full SLO regime with error budgets for an internal restaurant dashboard three staff use, custom dashboards nobody reads, distributed tracing wired through an order flow that is still one process.

What breaks if I don't add this yet? Less than at any other altitude, which is the point: get the cheap, high-value pieces in early and defer the rest. Structured logs keyed on order_id and tenant_id, a health endpoint, and the four golden signals are not a tax; they're the floor, and you want them from the start. A formal "99% of orders confirmed within 10s" SLO with paging tied to an error budget is worth it once a real customer is hurt by its absence. Buy the floor early. Buy the rig when the on-call rotation asks for it.
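
The logging part of that floor needs nothing beyond a standard library. Here is a minimal sketch using Python's logging module; JsonFormatter and build_logger are hypothetical names, and the set of JSON fields is an assumption you would extend to taste.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object keyed on order_id and tenant_id."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # Values passed via `extra=` become attributes on the record.
            "order_id": getattr(record, "order_id", None),
            "tenant_id": getattr(record, "tenant_id", None),
        })


def build_logger(name: str = "orders") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


log = build_logger()
log.info("order confirmed", extra={"order_id": 1042, "tenant_id": "luigis"})
```

One JSON object per line is enough for most log tooling to filter on order_id without custom parsing, which is the payoff you want on day one.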

Hosting

The trap is the platform-team architecture without the platform team: a service mesh for the three services your food-delivery app actually runs, or Kubernetes for a workload that Cloud Run (or AWS App Runner, or Azure Container Apps) runs with a single command and scales to zero for free. Worst of all is a multi-region active-active deploy for a marketplace whose restaurants and couriers are all in one city and would not notice an hour of downtime a year.

What breaks if I don't add this yet? Nothing a small team can't absorb, and a great deal of operational surface you'd otherwise own. The Order Service as a 12-factor container behind a managed runtime is cloud-agnostic, scales, and rolls back, with no mesh to operate and no control plane to patch at 3am. A config file is a service mesh you don't have to run. One managed Postgres instance is sharding-by-city you've deferred until a single instance genuinely can't hold the orders. Reach up the hosting altitude when the managed boring option visibly runs out of room, not in anticipation of a scale you're modelling on a whiteboard.
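
Much of what keeps a 12-factor container portable across managed runtimes is reading its configuration from the environment. A minimal sketch follows; the Config fields and the DATABASE_URL and PAYMENT_TIMEOUT_S variable names are hypothetical, while PORT follows the convention Cloud Run uses to tell a container which port to listen on.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Config:
    port: int
    database_url: str
    payment_timeout_s: float


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Build the service config from environment variables."""
    return Config(
        port=int(env.get("PORT", "8080")),
        # Required: a missing value raises KeyError at startup instead of
        # surfacing later as a failed query.
        database_url=env["DATABASE_URL"],
        payment_timeout_s=float(env.get("PAYMENT_TIMEOUT_S", "2.0")),
    )
```

Passing the environment in as a mapping keeps the function easy to test: hand it a plain dict instead of mutating os.environ.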

The honest case for boring

Notice the pattern across the altitudes. The boring choice isn't a worse version of the sophisticated one. It's a different bet, and at small scale it's usually the better bet.

Single-tenant before multitenancy. CRUD before Event Sourcing. From there the same instinct keeps repeating: a monolith before microservices, a config file before a service mesh, one database before sharding, an in-process call before a queue. Each pair is the same trade. Less to build now, far less to maintain forever, and a clean migration path to the heavier option if and when the trigger fires. You are not betting against scale. You're refusing to pay for it before it arrives.

The small-team rule underneath all of this: you maintain everything you ship. There is no platform team to absorb the patterns you adopted speculatively. The breaker you wrapped round the local OrderGateway "to be safe" is yours to tune. The order event log you stood up "for flexibility" is yours to rebuild the courier and customer projections against at midnight. Capacity spent maintaining a pattern you don't need yet is capacity not spent on the feature a restaurant is actually asking for.

Maintain what you ship. The corollary is the whole chapter: ship less, so there's less to maintain, so the small team stays fast.

Production-readiness per unit of team capacity is the selection criterion the whole book runs on, and restraint is half of it. The patterns in Part II raise the numerator. The discipline in this chapter protects the denominator. A team that knows the whole catalogue and deploys six patterns from it on purpose will out-ship a team that deploys everything it knows because it could.

Production-ready is a deliberate bar you clear, not a maximum you chase. That's the note the conclusion lands on.
