Planning & estimating large-scale software projects
This weekend I was talking with a friend about how I'd planned and estimated a huge software project in one of my previous jobs. By that I mean not something which can be done by a single team in a sprint or two, but something which requires several teams, across several quarters of effort, involving many other departments of non-engineering team members.
Sensitive commercial details and company names, obviously, aren't real - to protect the guilty (wait, that's me!) innocent! 😂
This will be a long post, because we're dealing with projects that can cost eight figures in staff salaries, easy. And make or break entire companies. No apologies made.
Foreword: But Tom, #NoEstimates!
Let's get this out of the way first: Estimates are one of the hardest parts of software development. They're never fully correct - in fact, most of the time they're no more correct than well-informed guesses - and that's hard to accept as an engineer used to working with certainties.
That said, I have yet to work in a company where they're not a necessary evil. Engineering does not work in a vacuum, and there are commercial contracts that will be signed, at a high level, without your involvement. Deals that will have been agreed before you joined the organization. "Tie-ins" with other companies that have specific launch dates. Marketing and finance departments that want to know when they'll need to spend £xm on producing a Christmas ad campaign.
Entire teams of people, who aren't software engineers, and are used to being held accountable for delivery dates in their fields, will expect you to be able to tell them when your part of a project is likely to be done.
There's plenty of information out there about how to estimate velocity for a single team - whether that be through planning poker, story points, or backlog story tracking without estimates at all - but none of those work across huge projects involving multiple teams and departments.
If you're in leadership -- maybe you're an architect, or an engineering manager -- you will be expected to produce estimates for major projects, and you should expect to be held accountable if your estimates are too far off the mark because you failed to do your due diligence when coming up with them.
Step 0: Information Gathering
You'll need some idea of what you need to build. In some cases - like where what you're building is the result of a contract between large companies - this might be contained in reams of contract paperwork, presentations, and appendices.
You might be lucky enough to have business analysts working with you (or reading the contract, or working with clients!), or user researchers working with your users, or a Head of Product with deep subject-matter expertise, who can tell you roughly what needs building.
If you're doubly lucky, they'll have written it down somewhere you can access.
If you're unlucky, you'll need to spend a long time talking to them, and then write it down yourself. But it must be written down. At this stage, don't worry about format; I've had 200-page Google Docs, 10,000-line spreadsheets, 1.5GB PowerPoints and entire books of user stories. You'll organize it later.
Now you have to read it. All of it.
You have to understand what you're being asked to build, or you'll be working blind, and everything you do after this will be bullshit. Ask questions as you go, to make sure you understand what you're being asked to build.
And keep the people who got this information for you close, and happy; you'll need them alongside you, checking your work (and the development teams' work!) every step of the way.
The first rule of estimating complex projects: You must understand what you are being asked to build.
Step 1: The Breakdown
Now you understand what you're being asked to plan out, here's where you start to earn your pay as a technology leader.
The information that you read, and understood, before, now needs grouping together into coherent "boxes". The best tool I've found for this is a physical index card - but there are digital alternatives, too.
An actual index card - redacted - that I wrote while breaking down a recent project. Top-left is the "code" that can be used in planning software to refer to this task, when allocating it to a team. Top center is a reference to the original spreadsheet containing the underlying requirements, and top right is a colour coding that indicates that this feature is required for the MVP we were targeting.
I'm going to use the example of an e-commerce marketplace, here and in the next few steps. Let's say you've got these bullet points in a contract or spreadsheet somewhere, from the previous step:
- A user can add a product to their shopping cart
- A user can see what's in their shopping cart
- A user can update the quantity of items in their cart
- A user can remove items from their cart
- Discounts are applied if the products in their shopping cart are part of a promotion
- A user can see their order total price
- An estimated delivery date is shown for the products in their cart
- An estimate of shipping costs is shown for the products in their cart.
An efficient agile practitioner should be asking some questions here. She'll be asking "Do we absolutely need discount functionality at launch?", for example; at every opportunity you should be trying to drop scope as much as possible. In general, these kinds of large-scale software projects are based on contract needs, or product launches - so you should try and descope as much as possible at every stage, to make sure you're only building the true MVP.
But let's say you get the highly unhelpful answer of "We've agreed in the contract to build all of this" (this happens way more than I'd like). Okay. Assuming that all of these are required before launch, you might group these together and write the following index cards:
- Product detail page
- Shopping cart containing products
- Shopping cart price calculation
- Discount/promotions engine
- Cart shipping estimation
Hopefully, you can see how your engineering skills are starting to come into play; does it make sense to split the requirements in this way?
Will engineering teams, given each of these boxes, be able to build them as a distinct "thing", with a celebration when they're "done", in a way that each builds a shippable feature, that can be evaluated in isolation?
Are there cross-cutting concerns you'll have to build in two or more of these steps that might make more sense to split out as an earlier requirement?
A skilled technologist will use her experience here to try and avoid dependencies at this early stage, and push back on "kitchen sink" requirements, too, pointing out even at this early stage pitfalls that the company would do well to avoid by reducing scope or opening a dialogue with the client.
Great. Now you've got a stack of index cards, which grouped together cover all of the needs in your original, unstructured requirements. And they're also in a format which you're comfortable can be parceled out to engineering teams as needed!
Link them to the underlying requirements. Write them on the back of the index card, or reference the document you got the requirements from in the first place if there are too many.
aside: Essentially, what you're doing here is producing a Work Breakdown Structure, a project management and systems engineering tool.
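If you'd rather keep the cards digitally, here's a minimal sketch of the same idea in Python. Everything in it is my own illustration rather than the tooling I actually used: the `Card` fields mirror the physical card described above, the requirement IDs (`R1` to `R8`) are made up, and the mapping of requirements to cards is one plausible grouping, not the only one. The useful part is the check at the end, which confirms that every requirement has landed on some card.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    """One index card: a box of work a team could build and ship on its own."""
    code: str                  # short code used in planning software (format is made up)
    title: str                 # what's written on the front of the card
    requirements: list[str] = field(default_factory=list)  # the back of the card
    mvp: bool = True           # the colour coding: is this needed for the MVP?


# Hypothetical requirement IDs; in practice these would point at spreadsheet
# rows or contract clauses from the information-gathering step.
requirements = {
    "R1": "A user can add a product to their shopping cart",
    "R2": "A user can see what's in their shopping cart",
    "R3": "A user can update the quantity of items in their cart",
    "R4": "A user can remove items from their cart",
    "R5": "Discounts are applied if the products are part of a promotion",
    "R6": "A user can see their order total price",
    "R7": "An estimated delivery date is shown for the products in their cart",
    "R8": "An estimate of shipping costs is shown for the products in their cart",
}

# One possible grouping (an assumption, open to challenge by the teams).
cards = [
    Card("PDP", "Product detail page", ["R1"]),
    Card("CART", "Shopping cart containing products", ["R2", "R3", "R4"]),
    Card("PRICE", "Shopping cart price calculation", ["R6"]),
    Card("PROMO", "Discount/promotions engine", ["R5"]),
    Card("SHIP", "Cart shipping estimation", ["R7", "R8"]),
]


def uncovered(requirements: dict[str, str], cards: list[Card]) -> list[str]:
    """Return the requirement IDs that no card claims, in sorted order."""
    covered = {req_id for card in cards for req_id in card.requirements}
    return sorted(set(requirements) - covered)


print(uncovered(requirements, cards))
```

An empty list means the stack of cards covers the whole original requirements set; anything left over is a requirement you've forgotten to put in a box.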
Step 2: Dependencies
Assuming, at this step, you've pushed back to make sure you're only left with things you absolutely have to build, now we've got to decide what order to do them in.
Which tasks depend on which other tasks?
Again, this is where your skill as an engineer, and your experience at doing work of this type will come in very useful. Will you need to build an integration with a payment provider before you can build your checkout experience?
Maybe, but not necessarily. You could build "everything-but" and avoid a dependency, having the payment step be a simple button on the page which "bypasses" it for now - but does that run the risk of having to go back and rework more than just that button when you eventually do integrate a payment provider?
Remember, you've already pushed back on as much as you can to make sure you're only building an MVP; if you've got to integrate a payment provider before launch, when is the most efficient and least risky time to do it? Is it worth taking on additional risk to avoid a dependency? Maybe it is - maybe it isn't - but make a note of the possibility anyway, whatever you decide. It could save your backside later.
I commandeered an entire conference room for several weeks at this stage. Or at least, the entire wall of one: I painted it with whiteboard paint and stuck my index cards to it with Blu Tack. Remember - this project is going to cost millions, so you can justify taking over some space.
Let's go back to our example of the e-commerce marketplace, above, with a couple of added items -- and see what that looks like as a dependency chart:
You'll note I've written down why some of the dependencies exist - that's because soon we're going to involve the engineering teams, who we want to challenge our assumptions - so we need to remember what assumptions we've made!
Realistically, you should do this for every dependency. It'll help you figure out how to drop scope later on in the project to avoid dependencies in future.
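For those who prefer a keyboard to a wall, a dependency chart with its "why" notes can be sketched as plain data. This is a rough illustration, assuming Python 3.9+ for the standard library's `graphlib`; the edges and reasons below are my guesses for the example cards, not a copy of the real chart.

```python
from graphlib import TopologicalSorter

# Reads as: task -> {task it depends on: why we believe that}.
# Both the edges and the reasons are illustrative assumptions.
depends_on = {
    "Product detail page": {},
    "Shopping cart containing products": {
        "Product detail page": "users add products to the cart from here",
    },
    "Shopping cart price calculation": {
        "Shopping cart containing products": "totals are computed from cart contents",
    },
    "Discount/promotions engine": {
        "Shopping cart price calculation": "discounts adjust an already calculated price",
    },
    "Cart shipping estimation": {
        "Shopping cart containing products": "shipping depends on what's in the cart",
    },
}

# TopologicalSorter wants {node: predecessors}; the reasons stay in depends_on.
graph = {task: set(reasons) for task, reasons in depends_on.items()}
order = list(TopologicalSorter(graph).static_order())  # raises CycleError on a loop

# Tasks with no dependencies are the ones a team could start on today.
ready_now = [task for task, reasons in depends_on.items() if not reasons]

for task in order:
    for dependency, why in depends_on[task].items():
        print(f"{task} <- {dependency}: {why}")
```

Keeping the reason next to each edge is the point: when a tech lead challenges a dependency, you can see exactly which assumption they're challenging.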
Step 3: Estimation
The second rule of estimating complex projects: You must always involve those who will actually do the work.
This is where you bring in your tech leads, principal engineers and the like. They know their teams, they're aware of what they're capable of, and they're familiar - more than you are - with the systems already in operation that the features you're proposing will touch.
Give them an introduction to the project, then walk them through your digital board - or physical wall - explaining the very top-level - what's written on the front of the card, not the requirements underlying it - as you go.
As a group, identify which team has the subject-matter expertise or best prior experience for each of the cards on your wall.
Now ask each team lead to go through their cards, referencing the source material, and assign an estimate in weeks for an entire team. Don't go more granular than 1 team-week, and you can use roughly doubling scales - 1 week, 3 weeks, 6 weeks, 12 weeks - too. Too many 12-week estimates -- or whatever you settle on as your maximum -- can imply too much uncertainty, so put a marker on those, and explore them in more detail later on.
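As a small sketch of what that scale looks like in practice: the function names here are hypothetical, and rounding a raw guess up to the next bucket is my assumption about how you'd apply the scale, not a rule I'm laying down.

```python
SCALE = (1, 3, 6, 12)    # team-weeks: the example scale from the text
MAX_BUCKET = SCALE[-1]   # whatever you settle on as your maximum


def to_bucket(raw_weeks: float) -> int:
    """Round a raw guess up to the next bucket, capped at the maximum."""
    for bucket in SCALE:
        if raw_weeks <= bucket:
            return bucket
    return MAX_BUCKET


def needs_follow_up(estimates: dict[str, int]) -> list[str]:
    """Cards at the maximum bucket: put a marker on them and dig in later."""
    return [card for card, weeks in estimates.items() if weeks >= MAX_BUCKET]


# Invented raw guesses from a hypothetical estimation session.
raw = {"Product detail page": 2, "Discount/promotions engine": 10}
estimates = {card: to_bucket(weeks) for card, weeks in raw.items()}
print(estimates, needs_follow_up(estimates))
```

Rounding up rather than to the nearest bucket is deliberately pessimistic; pick whichever your teams are more comfortable defending.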
aside: Civil engineering uses a series of price books, here in the UK, which "ground" estimations for massive civil engineering projects, based on what has been done before. Software engineering is too young a discipline to have this kind of depth and rigor, which is a shame.
Timebox this. Each card should take 5-10 minutes, maximum, for them to put a weeks-number on, or your teams risk getting lost in the detail. You want to strike a balance between giving a team enough time to understand the rough requirements, vs solutionising and "going deep" into the need, at this stage.
Ensure the teams know that you take the responsibility, not them, for their estimates, and promise them you'll make sure they're taken as estimates, not cast-iron guarantees - and then follow through on that in your conversations with management. You have to trust your teams -- if you don't give them psychological safety, they'll dramatically overestimate to protect themselves.
You don't get that psychological safety; unfortunately, such is life at the top.
Your head is firmly above the parapet, here, and theirs isn't -- which is as it should be. And for that reason, your skill as a technologist is needed here, too; to challenge estimates that surprise you, and write down why.
I strongly, strongly suggest you don't use this as an opportunity to challenge your teams to reduce their estimates, at least not in this group setting. Again; trust your team leads and senior engineers -- they know more than you about what will be involved in actually doing this work.
But a good technologist will be able to identify areas where teams are being over-cautious, or misunderstanding the requirements, or know something she doesn't, and she'll explore those later in 1-on-1 conversations.
A great technologist will use this opportunity to check her assumptions on dependencies, too; she'll explore them with her senior engineers and alter the dependency chart based on those conversations.
Step 4: Analysis
Wonderful. We've got a dependency chart -- which can double as a project plan -- we know which team is likely to do each piece of work, and we've got an estimate, in weeks, for each item on it.
Manna from heaven for a project manager, and if you've got a good one available, they'll love being asked for help from here on.
What I did next is apply something basic which I learnt at university; critical path analysis on a PERT chart network diagram.
This is not new -- it comes from 1958, leads to the dreaded Gantt chart, and has been widely criticized as something to be avoided for software projects. But I've still found it helpful to go from "I have no idea how long this will take" to an initial timeframe, and provide an indication of whether the company is "staffed-up" to perform the project at all.
Again, let's take our fictional e-commerce marketplace example, and see what that looks like on an activity-on-node critical path network diagram:
Here, we've added our estimates to each of the tasks, based on the number of weeks each task will take a team to accomplish. We've added our dependency lines, and we've added "start" and "finish" boxes.
Now we need to fill out the rest, using a forward pass, and then a backward pass. The activities (tasks) with a zero in the bottom centre column - our 'slack' - indicate the critical path. Explaining how to do this is a blog post in and of itself, but you can find good explanations of critical path analysis online.
Now, we've got our critical path, in red (those with 'zero' slack/float), and our smallest possible time this project could be completed in - 22 weeks (the earliest finish of the "Finish" box). In essence, it gives us a No Earlier Than or NET date for the entire project.
It also tells us how many team-weeks this fictional, idealised project would require -- i.e. how long it would take a single team, working continuously, to complete -- by adding all the estimates together: 43 team-weeks.
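If you want to check the arithmetic without drawing boxes, the forward and backward passes can be sketched in a few lines of Python. The tasks, estimates and dependencies below are invented for illustration, so the results will not match the 22 weeks and 43 team-weeks from my diagram; `critical_path` is a hypothetical helper name, and `graphlib` needs Python 3.9+.

```python
from graphlib import TopologicalSorter

# task: (estimate in team-weeks, [tasks it depends on]), all figures invented
tasks = {
    "Product detail page": (3, []),
    "Shopping cart": (3, ["Product detail page"]),
    "Price calculation": (1, ["Shopping cart"]),
    "Discount engine": (6, ["Price calculation"]),
    "Shipping estimation": (3, ["Shopping cart"]),
}


def critical_path(tasks):
    graph = {task: deps for task, (_, deps) in tasks.items()}
    order = list(TopologicalSorter(graph).static_order())

    # Forward pass: a task starts when its slowest predecessor finishes.
    early_finish = {}
    for task in order:
        weeks, deps = tasks[task]
        early_start = max((early_finish[d] for d in deps), default=0)
        early_finish[task] = early_start + weeks
    net_weeks = max(early_finish.values())  # earliest finish of the project

    # Backward pass: a task must finish before its most urgent successor starts.
    successors = {task: [] for task in tasks}
    for task, (_, deps) in tasks.items():
        for d in deps:
            successors[d].append(task)
    late_start, slack = {}, {}
    for task in reversed(order):
        late_finish = min((late_start[s] for s in successors[task]), default=net_weeks)
        late_start[task] = late_finish - tasks[task][0]
        slack[task] = late_finish - early_finish[task]

    critical = [task for task in order if slack[task] == 0]
    return net_weeks, slack, critical


net_weeks, slack, critical = critical_path(tasks)
total_team_weeks = sum(weeks for weeks, _ in tasks.values())
print(net_weeks, total_team_weeks, critical, slack)
```

With these made-up numbers the no-earlier-than figure is 13 weeks, the total is 16 team-weeks, and only "Shipping estimation" has any slack (4 weeks); everything else is on the critical path.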
There are whole books in project management written about other insights you can glean - and pitfalls to avoid - when using a diagram like this, but that's for another time.
Step 5: Review & Inform
We've gone from Zero to Something, and we've done a reasonable amount of due diligence on getting this "Something". That's great, and more, unfortunately, than most large-scale software projects seem to get.
But there's an important caveat to realise:
- The no-earlier-than date assumes you have an infinite number of teams.
We haven't done any Gantt-style allocation across our already existing teams yet -- and I wouldn't bother (the CEO can hire in a project manager to do that if she so chooses), so the "no earlier than" date is truly the earliest possible date the project can be completed - it could easily be two or three times as long.
But you've also got the total team-weeks the project will take.
Now, you're at a crossroads. You've got three possible outcomes of your no-earlier-than date:
- Easy case: Your NET date is well before your deadline.
- Harder case: Your NET date is well after your deadline...perhaps significantly.
- Frustratingly common, impossible case: Your NET date is close to your deadline, either just before or just after, or your "total team-weeks" is close to 100% of what's available.
In the easy case, you just need to make sure your total team-weeks is no more than around 50% of the team-weeks your current staffing level can provide before the deadline, and you should be comfortable having a pretty easy conversation with your CEO. If it's not, you can have a conversation about hiring, taking into account the significant overhead and time-lag in onboarding new team members, and the diminishing returns inherent in growing your team.
In the harder case, you're way, way over even in the earliest, most fictional version of your estimation, and you have to have a difficult conversation with your CEO -- and probably your sales director -- where you'll have to talk cutting scope, extending the deadline, or disappointing the client.
And crucially, you're doing it with actual data at your fingertips and actual due diligence having been performed. You're not "gut feeling" it. You're not "whining about deadlines like every other engineer". You're talking cold, hard facts, and hopefully you're doing this well in advance of the project deadline, so there's still time to make changes.
But in the impossible case... We've managed to box ourselves into a corner, because now it looks to anyone seeing the figures like we can do this, and you're going to have to balance on the head of a pin in order to avoid being forced into an impossible situation.
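To make the three cases concrete, here's a rough sketch of the triage as a function. The thresholds are entirely my assumptions: I never put a number on "close", so `close_margin` and `near_full_load` are placeholders to tune, and `available_team_weeks` is assumed to be your number of teams multiplied by the weeks left before the deadline.

```python
def classify(net_weeks, deadline_weeks, total_team_weeks, available_team_weeks,
             close_margin=0.15, comfortable_load=0.5, near_full_load=0.9):
    """Sort a plan into the easy / harder / impossible cases (thresholds assumed)."""
    load = total_team_weeks / available_team_weeks
    # Positive gap: the no-earlier-than date lands before the deadline.
    gap = (deadline_weeks - net_weeks) / deadline_weeks

    if abs(gap) <= close_margin or (gap > 0 and load >= near_full_load):
        return "impossible"   # looks doable on paper, with no room for error
    if gap < 0:
        return "harder"       # over the deadline even in the idealised plan
    if load <= comfortable_load:
        return "easy"
    return "easy, but talk about hiring"


print(classify(net_weeks=13, deadline_weeks=40,
               total_team_weeks=16, available_team_weeks=60))
```

The point isn't the exact thresholds; it's that you decide what "close" means before you look at the numbers, not after.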
- Double-check all of your estimates with your teams. As estimates are, by nature, imperfect and usually lower than reality, you will likely discover things you didn't consider that push the estimate up, and now you're in the harder-but-not-impossible case I described above.
  - This is not cheating -- if you're that tight on a perfect, spherical cow in a vacuum estimate, then you need to recheck your assumptions.
  - Now is the time to go deeper on your estimation, in conversation with your team members, spending more than that initial 5-10 minutes on each task.
- Look again at the risks you can take to avoid dependencies.
- Consider any changes you can make to the underlying project structure.
  - I once offered - as a "nuclear option" - forking a project so a set of client requirements could be implemented without migrating the existing data. As I'd explained up front would be a consequence of this option, it took 6 months to re-integrate the changes at the conclusion of the project, but it was the right commercial decision to make.
- Can you accept any considered, appropriate technical debt to speed things up, while providing context on when that debt will need to be addressed, and how much addressing it will cost?
- Consider if any of your tasks can be undertaken by consultants or bought "off-the-shelf", but be wary of the overhead involved.
- Truly consider other, ongoing work to decide if your "total team-weeks" vs "available team weeks" is feasible - will your teams really get 50% of their time to work on this project?
- Challenge scope at a lower level - with the subject matter experts and original requirements authors - to see if certain parts can be dropped or pushed later.
Step 6: Kick things off
Without noticing, you've also developed a project plan, which means engineering teams can now be given tasks to explore, flesh out, and build. Your teams can start building on zero-dependency items now, and you're that much more confident they won't run into blockers.
A skilled technologist can also use this to explain and explore the full requirements with her teams, making sure they feel informed, trusted, aware of the entire plan, and encouraged to speak up about things they're working on that will have an impact down the line. Her teams will feel like they are pulling together towards a larger set of goals, and will see how the work they are doing relates to the work everyone else is doing.
This is invaluable to team cohesion, and maintaining a positive, focused and effective working environment.
A few words on staying Agile™
I strongly believe that an approach like this is not contrary to agile methods. Note I say "agile" with a lowercase 'a' -- an Agile™, Scrum, SAFe or XP practitioner may not enjoy themselves in an environment like this.
If you've got a SaaS platform, which is already launched, and you have a "grab bag" of features that could be worked on, even if there is no ominous, looming deadline, having a dependency chart of those features will help with prioritising what has the most value to the company. The same feature, if we know we can work on it immediately, will have more value than if we know we can't work on it for 6 months because we don't have a payment system.
aside: This is similar to the accounting concept of "the time value of money", a conjecture in financial accounting which says "there is greater benefit to receiving a sum of money now rather than an identical sum later". Working out the "net present value" of a feature is an exercise best left to a product owner 😉
What we've produced here is simply a roadmap which product and engineering teams can use to prioritise what's next. Sometimes, you'll find it'll have no dependencies; great, now you can prioritise based on pure user value, without any engineering concerns!
Continue to deploy early, deploy often, and demo -- even if it's only to a staging/test environment. This will keep your subject matter experts & your users in the loop, able to integrate feedback on that working software quickly and move on when "good enough" is reached.
Don't fuss about detailing every requirement up front; get your engineering teams as close to the client/user/original source of requirements as possible, give them a few words on an index card, and urge them to demo often; trust your engineering teams to figure out exact user stories, acceptance tests, and the like.
Revisit your project plan in light of developments; update and refine it based on things you learn (and reconsider dependencies and scope!) regularly. Compare actual time taken to the original estimates you received. Re-estimate in light of that, as needed.
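One naive way to do that re-estimation, sketched below: work out how far off the finished cards were, and scale the unfinished ones by the same ratio. This is an assumption of mine rather than a recommended formula, the `reestimate` helper and the figures are hypothetical, and a conversation with the team will always beat a multiplier.

```python
def reestimate(estimates, actuals):
    """Scale unfinished cards by the actual/estimated ratio of the finished ones."""
    done = [card for card in actuals if card in estimates]
    if not done:
        return dict(estimates), 1.0  # nothing finished yet, nothing to learn from
    ratio = sum(actuals[c] for c in done) / sum(estimates[c] for c in done)
    remaining = {
        card: round(weeks * ratio, 1)
        for card, weeks in estimates.items()
        if card not in actuals
    }
    return remaining, ratio


# Invented figures, in team-weeks.
estimates = {"Product detail page": 3, "Shopping cart": 3,
             "Price calculation": 1, "Discount engine": 6}
actuals = {"Product detail page": 4, "Shopping cart": 5}

remaining, ratio = reestimate(estimates, actuals)
print(ratio, remaining)
```

Here the first two cards took 9 weeks against an estimate of 6, so the remaining cards are scaled by 1.5. Feed the new numbers back into your dependency chart and see whether the critical path has moved.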
Do not follow The Plan™ as a hard and fast, unbreakable rule of what must be done, but instead recognise that the core tenets of agile software development -- that requirements change all the time, as we learn more by building and releasing software -- mean the plan is always flexible and always changing.
And now, all of this being said, do remember that this is one approach, and there is no one-size-fits-all strategy for engineering leadership. Your situation might be very different, and not require any of this work. As always, match the tools and approaches you use to your reality!
And that's how I've estimated large, complex software projects.
Thanks go to Rob Yurkowski, Dave Sullivan, sevenseacat, Hanna, Dom and my father (also, confusingly, named Tom Russell, an accomplished PM in the Civil Engineering sector) for reviewing drafts of this post.