I can’t begin to count the number of times I have heard development teams say things like this:
- “We don’t have time to fix that.”
- “Product won’t let us fix our technical debt.”
- “All we do is build new features.”
Which I mentally translate to:
We don’t know how to explain the impact of our technical projects.
This post is an attempt to explain how quantifying Cost of Delay can help teams successfully understand and prioritize their technical project backlog.
The Prioritization Problem
Prioritization is key to ensuring that a company is investing in efforts that will sustainably grow the business. Product teams spend a tremendous amount of time working to understand and quantify project impacts so that financial benefits are maximized (or at least they should!).
However, it’s not always the case that there is a corresponding understanding of the technical backlog — infrastructure projects, security upgrades, etc. Depending on the organizational structure, these projects may be in separate tools, spreadsheets, or simply in people’s heads. Even if there is a central technical project list, project benefits are often described in technical terms and may require in-depth system knowledge to appreciate. These inconsistencies make prioritizing technical projects within development a real challenge.
Outside of development, this lack of understanding becomes a bigger problem. When the Product team has already quantified the financial impacts of six months of feature development, hand-wavy technical project lists aren’t going to cut it.
With that in mind, the question becomes: How can the impacts of technical projects be quantified such that they can be directly compared to other projects?
Introducing Cost of Delay
The term “Cost of Delay” captures the idea that you can quantify the financial impact of delaying work. It is measured in dollars / time unit, and should represent the total financial impact to the organization if a project is delayed for a single time unit.
I first came across the concept in the writings of Donald Reinertsen, who describes how he came to recommend quantifying the Cost of Delay to his customers.
Why Cost of Delay?
I have found Cost of Delay to be an excellent means of prioritizing projects because it provides a consistent way to compare the impact of any type of project. Since Cost of Delay is measured in dollars / time unit, there’s no ambiguity when comparing impacts. By quantifying Cost of Delay it’s possible to have a much more focused conversation about technical initiatives, and these can then be compared directly according to financial impact.
Calculating Cost of Delay for a Project
If you want to calculate the Cost of Delay for a project, you need a model that describes the impact of the project to the organization in financial terms. There are many ways to do this. Projects can reduce operational costs, reduce security risks, generate revenue, eliminate time-wasting development friction, etc. Depending on the needs of the organization, models can be basic so as to provide high-level directional guidance, or detailed to the point that historical operational cost data is included.
Personally, I prefer to start with a simple model if at all possible. I’ve found that the prioritization conversation that happens across projects has the most impact, so I try to get there as quickly as possible.
Example: Reducing Build Time
Let’s work through an example of how to calculate Cost of Delay for a single project: reducing build times.
Scenario: Say you are part of a 100-person development organization where everyone works in a monorepo. Builds average about 20 minutes, and people are tired of waiting for them during the Pull Request review process. Is it worth spending time to make the build faster?
Because the goal of this project is to reduce build times, that impact can be modeled in (at least) two areas:
- Reduction in continuous integration provider costs
- Reduction in developer “downtime” (i.e., waiting for builds to complete, context switching, etc.)
If you make some assumptions about how many PRs each developer creates per week, and how many reviews on average each of those PRs requires, you can create a model that calculates the amount spent on CI infrastructure as well as the cost of having developers wait for builds to complete. (A spreadsheet containing this model is available here.)
Note: the goal of this model is not to perfectly reflect reality, but to represent the project’s impacts given a shared set of organizational assumptions. This consistency will make it possible to compare the impacts of one project to others.
Based on this model, keeping the same build infrastructure and reducing build times by five minutes would result in savings of nearly $1,500/week, or about $78,000 annually. Reducing build times by 10 minutes would save $3,000/week, or $156,000 annually.
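As a sketch, this kind of model could look like the following. This is not the linked spreadsheet; every parameter here (PR volume, builds per PR, wait fraction, hourly rate, CI cost) is an assumption for illustration that your team would replace with its own data.

```python
# Hypothetical weekly cost model for CI builds. All parameters are
# assumptions for illustration, not figures from the actual spreadsheet.

DEVELOPERS = 100
PRS_PER_DEV_PER_WEEK = 3      # assumed PR volume
BUILDS_PER_PR = 4             # assumed pushes + review iterations per PR
WAIT_FRACTION = 0.15          # assumed share of build time lost to waiting
DEV_COST_PER_HOUR = 75        # assumed fully loaded hourly rate
CI_COST_PER_BUILD_MIN = 0.05  # assumed CI infrastructure cost per build-minute

def weekly_cost(build_minutes):
    """Total weekly cost: developer downtime plus CI infrastructure."""
    builds = DEVELOPERS * PRS_PER_DEV_PER_WEEK * BUILDS_PER_PR
    downtime = builds * build_minutes * WAIT_FRACTION / 60 * DEV_COST_PER_HOUR
    ci = builds * build_minutes * CI_COST_PER_BUILD_MIN
    return downtime + ci

baseline = weekly_cost(20)
print(f"5-minute reduction saves ${baseline - weekly_cost(15):,.0f}/week")
print(f"10-minute reduction saves ${baseline - weekly_cost(10):,.0f}/week")
```

Because this model is linear in build minutes, a 10-minute reduction saves exactly twice what a 5-minute reduction does; a more detailed model need not behave that way.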
Additionally, because this model scales with the number of developers, if the team were to start hiring aggressively (and who isn’t?) the weekly Cost of Delay would increase over time as more engineers join the organization. (You might choose to include this growth in the model, depending on the conversation.)
Yes, there are valid questions as to the amount of downtime that developers actually incur when waiting for builds to complete, and you could certainly model things differently. To me that’s the beauty of a model with documented assumptions — your team can collaboratively determine which assumptions make the most sense for your organization.
Bonus: Using the Model to Explore Options
Once a model has been created for a project, it can sometimes be used to explore various implementation options. For example, for this project there are at least two options to reduce build times:
- Have someone with a detailed understanding of the build process work to manually optimize the build
- Throw hardware at it
Since you now have a model, you can get a high-level understanding of how each option might work out in practice, which can be quite enlightening.
For example, if it turns out that you can reduce build times by five minutes using a larger instance (say, one that is four times more expensive), then while your weekly CI provider costs might increase from $288 to $864, the overall weekly cost to the organization would decrease from $6,479 to $5,628. In that case, spending more money would actually save money.
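One way those CI numbers could line up: the 4x-priced instance also shortens each build, so the bill triples rather than quadruples. This is a sketch of that arithmetic, assuming the pricier instance cuts the build from 20 to 15 minutes:

```python
# CI cost scales with (price per minute) x (build minutes).
# The 20 -> 15 minute assumption is illustrative, not from the spreadsheet.
ci_before = 288                       # weekly CI cost on the current instance
ci_after = ci_before * 4 * (15 / 20)  # 4x the price, but 3/4 the build time
print(ci_after)                       # 864.0

total_before, total_after = 6479, 5628  # overall weekly cost from the model
print(total_before - total_after)       # 851 -> net weekly saving
```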
Prioritizing Projects with Cost of Delay
Okay, so once you have the Cost of Delay for a project, how can that be used to prioritize?
The recommended way to do this is to divide the Cost of Delay by an estimate of how long the work will take. This is often referred to as CD3 (Cost of Delay Divided by Duration), and ordering projects by CD3, highest first, maximizes the value delivered over a given time period. (For a detailed explanation of why this ordering optimizes value,¹ see Reinertsen's The Principles of Product Development Flow.)
For example, here are the CD3 values for several projects:
| ID | Project | Cost of Delay | Duration | CD3 |
|----|---------|---------------|----------|-----|
| 1a | Reduce build time by 5 minutes (move from Large to 2X-large) | $1,714/week | 1/5 week | 8,570 |
| 1b | Reduce build time by 10 minutes (manually optimize build) | $2,997/week | 1 week | 2,997 |
| 2 | Increase log retention from 7 to 30 days | $37,860/week | 1 week | 37,860 |
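As a sketch, the CD3 ranking can be computed and sorted directly (figures copied from the projects above; higher CD3 means higher priority):

```python
# CD3 = Cost of Delay ($/week) divided by duration (weeks).
projects = [
    ("1a: reduce build time by 5 min", 1714, 1 / 5),
    ("1b: reduce build time by 10 min", 2997, 1.0),
    ("2: increase log retention", 37860, 1.0),
]

ranked = sorted(projects, key=lambda p: p[1] / p[2], reverse=True)
for name, cost_of_delay, duration in ranked:
    print(f"{name}: CD3 = {cost_of_delay / duration:,.0f}")
```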
Even with this simple list there are some interesting results.
Because missing logs dramatically slow down problem investigations, the Cost of Delay for increasing log retention is very large relative to the other projects on this list. CD3 says that project should be done first (if a single team has to choose). Only once it is complete should the team experiment with other instance types to reduce build times, even though that build time reduction is only expected to take a day.
Is this really the best order to complete these projects? If reducing build times by 5 minutes only takes a day, shouldn’t that get done first? Let’s see.
If the team tried the new instance size first (project 1a), and then increased the log retention (project 2), they would incur the Cost of Delay of both project 1a and project 2 for the first day, and then the cost of project 2 alone for the remainder of its implementation. That’s ($1,714 / 5) + ($37,860 / 5) + $37,860 = $45,775.
If the team were to complete the projects in the order indicated by CD3, the costs would be $37,860 + $1,714 + ($1,714 / 5) = $39,917. The CD3 ordering saves $5,858 while delivering the same results in the same time period.
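The sequencing arithmetic above generalizes: each project's Cost of Delay accrues until that project ships. A small sketch using the figures from this example:

```python
def total_delay_cost(ordered_projects):
    """Sum each project's Cost of Delay ($/week) times its finish time (weeks)."""
    elapsed = 0.0
    total = 0.0
    for cost_of_delay, duration in ordered_projects:
        elapsed += duration  # this project finishes `elapsed` weeks from now
        total += cost_of_delay * elapsed
    return total

build_5min = (1714, 1 / 5)    # project 1a: $1,714/week, 1 day
log_retention = (37860, 1.0)  # project 2: $37,860/week, 1 week

print(round(total_delay_cost([build_5min, log_retention])))  # 45775
print(round(total_delay_cost([log_retention, build_5min])))  # 39917
```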
I hope I’ve provided some insights into how Cost of Delay can help you understand project impacts and assist in project prioritization. Once your organization is able to clearly describe project impacts in financial terms, it becomes much easier to advocate for technical initiatives and discuss priority trade-offs with other teams.
¹ tl;dr: it minimizes the area under the project cost curve.