Software delivery is the plainest of the five accountabilities, which is exactly why it gets skipped in the conversation about who to hire. When a founder pictures a CTO, they picture strategy: architecture diagrams, a technology vision, a seat in the board meeting. Look at how the role is described in public, by the vendors who staff it and the consultancies who define it, and delivery is barely there. The descriptions skew to vision and oversight. Shipping is assumed.
That assumption is where companies get hurt. Delivery is the accountability for turning intent into something a customer can use, this week, and again the week after. Not a prototype. Not a demo that works on the founder's laptop. Software in production, doing the job it was built to do, on a cadence the business can plan around.
The cadence matters as much as the shipping. A team that delivers a big release every nine months, on a date that slips twice, is not delivering reliably even if the release is good. A founder can't sell against it, can't staff against it, can't promise a customer a date. What you want from this accountability is a drumbeat: small things, landing often, predictably enough that you stop wondering whether the next one will arrive.
Here is what it looks like when nobody owns delivery. The work is stop-start. A sprint produces something; the next two produce a refactor nobody asked for and a fortnight lost to an integration that "should have been simple." Dates are given with confidence and missed without explanation. You ask for a status and get a paragraph about complexity. A quarter ends and you realise that nothing a customer can touch went out the door.
This is not usually a story about lazy engineers. It is a story about size. The Standish Group's CHAOS data is blunt on this point: across modern software projects, just 29% land as outright successes, while 52% come in challenged (late, over budget, or stripped of scope) and 19% fail outright (Standish CHAOS 2015). Those are sobering odds before you've written a line of code. The instinct, when delivery is shaky, is to plan harder and scope bigger so that "this time we get it right." That instinct is the trap.
The strongest predictor of whether a project ships, in this data, is not the team's talent or the quality of the plan. It's the size of the bet. CHAOS found that small projects succeed 61% of the time; "grand" projects succeed 6% of the time (Standish CHAOS 2015). Same survey, same industry, wildly different outcomes. The variable that moved was how much was attempted at once. A company with a delivery problem has often, without noticing, talked itself into a grand project.
The other half of the failure is the tail. Look at large IT projects and the average cost overrun runs to 27% (Flyvbjerg & Budzier). That is an overrun you could plan for. What you can't plan for is the part the average hides: roughly one project in six becomes a "black swan," with a cost overrun averaging 200% and a schedule overrun of almost 70% (Flyvbjerg & Budzier). That's the one that doesn't just make you late. It takes the runway with it. When a founder says they got burned by an engineering project, this is usually the project they mean.
The average overrun is the bill you can survive. The one-in-six catastrophe is the one that ends the company.
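To put numbers on that difference, here is a rough sketch of the arithmetic. The percentages are the Flyvbjerg & Budzier figures quoted above; the budget and schedule are made-up inputs chosen only for illustration, and `with_overrun` is a hypothetical helper, not anything from the study.

```python
# Illustrative arithmetic only. The overrun percentages come from the
# figures quoted in the text; the plan itself is a hypothetical example.

def with_overrun(planned: float, overrun_pct: float) -> float:
    """Return the actual figure after an overrun given as a percentage of plan."""
    return planned * (1 + overrun_pct / 100)

planned_budget = 500_000  # hypothetical budget, in any currency
planned_months = 12       # hypothetical schedule

average_cost = with_overrun(planned_budget, 27)       # the typical overrun
black_swan_cost = with_overrun(planned_budget, 200)   # roughly one project in six
black_swan_months = with_overrun(planned_months, 70)  # schedule in the same case

print(f"Average case cost:  {average_cost:,.0f}")
print(f"Black swan cost:    {black_swan_cost:,.0f}")
print(f"Black swan months:  {black_swan_months:.1f}")
```

A 200% overrun means paying three times the plan, not twice. Swap in your own budget and runway to see which of the two cases you could actually absorb.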
Reliable execution of software delivery is unglamorous, and that's the point. It looks like a steady stream of small, finished things. Each piece of work is scoped so it can ship on its own. Each has a clear definition of done, so "finished" is a fact you can check rather than a feeling someone reports. Nothing in the pipeline is so large that its failure would take a quarter with it.
Concretely, the unit of work is the task: small enough to ship in days, scoped tightly enough that you can verify it landed. A founder doesn't need to read the code to know a task is done. They need to see the thing it was supposed to do, working. That's the whole bargain. Make the work small, make "done" checkable, and the CHAOS odds start moving in your favour, because you have stopped betting the company on one grand release and started making a series of small bets you can win.
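The "series of small bets" argument can be sketched as simple probability. The success rates below are the CHAOS figures quoted earlier. Applying the small-project rate to individual tasks, and treating tasks as independent of each other, are simplifying assumptions made for this sketch; CHAOS does not claim either. The function names are hypothetical.

```python
# A back-of-envelope comparison: one grand bet versus many small ones.
# Rates are the CHAOS 2015 figures quoted in the text. Independence between
# tasks is an assumption of this sketch, not a finding of the survey.

SMALL_SUCCESS = 0.61  # quoted success rate for small projects
GRAND_SUCCESS = 0.06  # quoted success rate for "grand" projects

def expected_successes(n_bets: int, p_success: float) -> float:
    """Average number of bets that land, out of n_bets."""
    return n_bets * p_success

def chance_nothing_ships(n_bets: int, p_success: float) -> float:
    """Probability that every single bet fails, assuming independent bets."""
    return (1 - p_success) ** n_bets

n_tasks = 10  # hypothetical: ten small tasks instead of one grand release

print(f"Grand release, chance nothing ships: {chance_nothing_ships(1, GRAND_SUCCESS):.0%}")
print(f"{n_tasks} small tasks, chance nothing ships: {chance_nothing_ships(n_tasks, SMALL_SUCCESS):.4%}")
print(f"{n_tasks} small tasks, expected to land: {expected_successes(n_tasks, SMALL_SUCCESS):.1f}")
```

The exact figures matter less than the shape. With one grand bet, ending the period with nothing in production is the likely outcome; with many small ones, it is the rare outcome.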
What does that look like on the ground? A shared view of what's moving, like this:
| Task | Status | Definition of done | Shipped |
|---|---|---|---|
| Add CSV export to the reporting page | In review | User can download a filtered report as CSV; matches on-screen totals | — |
| Email receipt after checkout | Shipped | Customer receives a branded receipt within 60s of payment | Tue |
| Fix duplicate-charge edge case | In progress | No customer is charged twice on a retried payment; covered by a regression check | — |
| Search by customer reference | Backlog | Support can find an order by reference in under 5s | — |
You don't have to manage that board to read it. You can see what's moving, what's stuck, and what shipped this week without sitting in a single standup. Predictability comes from the same place reliability does: small units, visible state, a definition of done you can hold someone to.
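For teams that keep the board in a tool or a spreadsheet export, the same three questions can be asked of the data directly. What follows is a minimal sketch using the rows from the table above; the `Task` class and the helper functions are hypothetical names invented for this example, not the schema of any particular tracker.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Task:
    """One row of the board. Field names are illustrative assumptions."""
    title: str
    status: str                       # "Backlog", "In progress", "In review" or "Shipped"
    definition_of_done: str           # the checkable outcome, in plain language
    shipped_on: Optional[str] = None  # day it shipped, if it has

board: List[Task] = [
    Task("Add CSV export to the reporting page", "In review",
         "User can download a filtered report as CSV; matches on-screen totals"),
    Task("Email receipt after checkout", "Shipped",
         "Customer receives a branded receipt within 60s of payment", "Tue"),
    Task("Fix duplicate-charge edge case", "In progress",
         "No customer is charged twice on a retried payment; covered by a regression check"),
    Task("Search by customer reference", "Backlog",
         "Support can find an order by reference in under 5s"),
]

def shipped(tasks: List[Task]) -> List[Task]:
    """What reached customers."""
    return [t for t in tasks if t.status == "Shipped"]

def in_flight(tasks: List[Task]) -> List[Task]:
    """What is moving right now."""
    return [t for t in tasks if t.status in ("In progress", "In review")]

def missing_done_criteria(tasks: List[Task]) -> List[Task]:
    """Tasks nobody could verify, because 'done' was never written down."""
    return [t for t in tasks if not t.definition_of_done.strip()]

for task in shipped(board):
    print(f"Shipped {task.shipped_on}: {task.title}")
print(f"In flight: {len(in_flight(board))}")
print(f"Missing a definition of done: {len(missing_done_criteria(board))}")
```

The last check is the one worth running most often. A task with no definition of done is where "finished" goes back to being a feeling someone reports.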
This is the shape Dev on Demand is built around. Work comes in as a task, one engineer ships it, and you approve it before the next one begins. The cycle is short by design, a few days per task, with the approval gate sitting between every task and the next. You're never staring at a quarter-long black box wondering what's inside it; you're looking at the last thing that shipped and deciding what ships next.
That's not a claim that delivery becomes effortless. It's a claim about where the risk goes. When the unit of delivery is a small, verifiable task and you sign off on each one, the grand-project failure mode has nowhere to form. You can't accidentally drift into a 200% overrun when the longest thing you've committed to is a few days of work you'll inspect before continuing. The odds, the ones CHAOS measured, are quietly on your side again.
Delivery runs on people. So the next question is the obvious one: who's actually shipping, and what happens the day they leave?