Estimating software development is hard, but that's no reason to ignore it. What can we learn from the scientific method and how can we apply it to our projects?
Estimating how long something is going to take is a part of everyday life, and as most of us know, we humans stink at it. Yet as software developers, estimating how long features and functionality will take to develop is all part of the job description.
I was explaining Red Badger’s project sizing technique the other day to a client who noted that we had a “very scientific” process. I think what they really meant was that it appeared very mathematical and exact (which it isn’t and beware anyone who tries to convince you that estimating can be), but taking the comment at face value, they had inadvertently hit upon something.
In the scientific world, it is a generally accepted principle that one can only draw conclusions from the evidence that presents itself. In other words – I cannot say something is so, if I have no supporting evidence.
More importantly, however, it is accepted that a scientific theory is rarely presented as fact, but rather our best guess given the current evidence.
Estimating software is, or should be, much the same.
A customer explains the solution they ultimately want from 50,000ft and we have to provide a theory for how long (and therefore how much) it will take to achieve. It should come as no great surprise then to realise that the fidelity of the theory produced is going to correlate to the fidelity of the evidence upon which it is based.
Let’s do an experiment. Look up from wherever you’re sitting right now and look out your nearest window. Pick out 2 objects in the distance, one quite close to you and one further away. Next take a guess, in centimetres, as to how far away from you the first object is. Now do the same for the second object. We now have 3 data points; the distance to the first object, the distance to the second object and thus the distance between the first and second objects.
How accurate do you think the estimates you just made are and how much money are you willing to bet on them being correct?
If you’re willing to bet more than a tenner on your guesses, then please do give me a call and I’ll happily take your money.
So how do you set a budget?
Here is an almost universal fact – most projects have a budget assigned to them before anyone’s really done much thinking about how much work is actually required. What does that mean to the people who will work on that project and what does it mean in relation to what the customer can expect at the end?
Whilst there are exceptions to every rule, more often than not projects are completed and the results put to use. If we put aside the quality of the result for a moment and instead think about the solution created, I would argue that its complexity will correlate directly to the original budget dictated for the project.
If you want a website building and set a budget of £500 – you can almost certainly get a website that will present the content you provided. You probably won’t be able to update the content yourself after the fact and maybe it doesn’t work or look quite right in some browsers – but you did it, you got yourself a website.
If on the other hand, you set a budget of £500,000 – you will also get a website. It will be entirely content manageable, have great design, be SEO friendly and look good in every browser and on every mobile device.
So if estimation is unreliable and budgets are set ahead of time anyway – why bother estimating?
If you wanted me to build you a table, I can already start chunking that requirement up into units of work to give us an idea of how much effort might be involved. We know we need to source the correct wood, create the tabletop, turn the legs, stain the wood, varnish it and get it delivered.
We don’t need to tie down every single specific at this point and nor should we. Picking the correct thickness of wood, number and placement of table legs are things we can discover and agree on together along the way when we collectively have more data and have maybe got some wood samples to play with.
What is important however is that we agree, in general terms, on the primary tasks that we need to complete so that we’re going to have a table at the end of the project that you can use and with which you’ll be satisfied.
Once we’ve broadly agreed on the chunks of work that need completing, we can start to estimate them using relative estimating. If we think back to our earlier experiment, when we tried to estimate the distance to 2 objects visible out of the window in centimetres, it probably felt pretty futile to be thinking at such high fidelity.
Rather than saying the first object was 10,734cm away and the second 19,764cm away – wouldn’t it be much easier to say the second object is further away than the first, probably by a factor of about two? This is relative estimating.
Using our experience, we can quickly estimate that turning the legs is about two or three times more complicated a task than creating the tabletop, that staining the wood is about half the effort of creating the tabletop, and so on.
At the end of this exercise, we have a list of the tasks involved in building the table and a low fidelity feel for how large each task is relative to the others. Next, we take a few of those tasks, ideally the ones we understand best, and think about them in more detail, down to how many days or hours we think each task might take. Finally, we can use that data to extrapolate a likely time for all our other tasks using the relative estimates we’ve already created.
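The extrapolation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the relative sizes and the day figures are all made up for the table example (only the rough ratios between tabletop, legs and staining come from the text above), and a simple average of days per unit of relative size is one of several reasonable ways to do the scaling.

```python
# Relative sizes for the table example. Only the ratios matter.
# All numbers here are illustrative assumptions, not real data.
relative_sizes = {
    "source the wood": 1.0,
    "create the tabletop": 2.0,
    "turn the legs": 5.0,      # roughly 2-3x the tabletop
    "stain the wood": 1.0,     # roughly half the tabletop
    "varnish": 1.0,
    "deliver": 1.0,
}

# The few tasks we understand best, estimated in more detail (in days).
calibrated_days = {
    "create the tabletop": 3.0,
    "stain the wood": 1.5,
}


def extrapolate_days(relative_sizes, calibrated_days):
    """Scale relative sizes into days using the calibrated tasks as a yardstick."""
    calibrated_size = sum(relative_sizes[task] for task in calibrated_days)
    days_per_unit = sum(calibrated_days.values()) / calibrated_size
    # Keep the detailed figure where we have one; extrapolate everything else.
    return {
        task: calibrated_days.get(task, size * days_per_unit)
        for task, size in relative_sizes.items()
    }


estimates = extrapolate_days(relative_sizes, calibrated_days)
for task, days in estimates.items():
    print(f"{task}: about {days:g} days")
print(f"total: about {sum(estimates.values()):g} days")
```

The output is a guide at the same low fidelity as the inputs: change one relative size or one calibrated figure and every extrapolated number moves with it.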
We now have a low fidelity view of how long all the tasks might take to complete. It’s important to stress that this isn’t a guarantee but a guide. It is enough data for us to match the effort we think is required to the available budget and rule ourselves in or out at an early stage without having expended too much effort or set any false expectations.
We have a theory based on the data available to us.
In Scrum, the list of work units we described is the product backlog, each individual unit of work is an epic user story, and the relative estimates are story points.
In Scrum, when it comes to estimating sprint backlog tasks, you estimate in hours.
This estimate is not a promise, however; it is a guide used to provide feedback on the status of the sprint. At the end of each day, the developer updates that estimate.
They don’t simply subtract the number of hours they’ve spent working on the task. Instead, they use the insight and knowledge they have gained from working so closely with the task to provide a new, higher fidelity estimate.
Typically this estimate would be lower than the original, given that they’ve spent time working on it, but it might not be. If they’ve started working on something we thought would be simple, but it turned out to be more than had been bargained for, they simply increase the estimate.
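As a rough sketch of how those daily re-estimates can act as an early warning, the snippet below compares each day's fresh estimate of remaining hours with the previous day's. The function name and the sample figures are hypothetical and not part of Scrum itself. Index 0 is the original estimate, and each later entry is the estimate given at the end of that day.

```python
def flag_rising_estimates(remaining_hours_by_day):
    """Return (day, increase_in_hours) for each day the remaining estimate went up."""
    flags = []
    for day in range(1, len(remaining_hours_by_day)):
        previous = remaining_hours_by_day[day - 1]
        current = remaining_hours_by_day[day]
        if current > previous:
            # The task turned out to be bigger than we thought the day before.
            flags.append((day, current - previous))
    return flags


# Hypothetical task: estimated at 8 hours, re-estimated at the end of each day.
history = [8, 5, 9, 4, 0]
print(flag_rising_estimates(history))
```

Here the estimate rises at the end of day 2, which is exactly the kind of signal that gives the team a chance to react early.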
This is the iterative feedback loop at the heart of Scrum and one of the tenets that makes it agile. Those daily, iterative estimates provide you with the indicators that things are either on track or not going as you had planned – they provide you with the opportunity to be agile and to react to the situation early.
The more you do something, the better you become at estimating how long it will take, not least because you’ve done it before and so can build your experience into that estimate.
What time do you get up in the morning and why? For most people, the time they get out of bed will be dictated by the time they need to arrive at their place of work or study.
Most people, more or less, make it to work on time each day – which means they did a good job of estimating how long it was going to take to get showered, dressed, have breakfast and travel to work. That’s because they do it nearly every day. Even with this practice in hand, however, sometimes things go wrong. It snows, the train’s delayed, you hit road works or maybe you just got too drunk the night before and slept through your alarm.
As a project progresses it is natural to see the team become better aligned as it finds its rhythm, and in turn for its estimates to become more accurate. Scrum takes advantage of this fact to give a more and more accurate view of the future the closer you get to crunch time, whilst factoring in unforeseen issues that can come out of the blue. In both cases it empowers you to take early corrective action.
There’s a great blog post from Dr Tom Crick, about the so-called Feynman Problem-Solving Algorithm which states:
Often when we write down a problem we presume to know the answer, but when it comes down to the nitty gritty we discover we perhaps didn’t know as much as we thought. Estimating tasks as a team forces us to think things through and collectively agree on a broad approach to a problem before we start.
Estimating can be a painful process, and it is never more painful than when you don’t have enough data to base your estimate on. It’s at this point that you should stop, step down the fidelity of the estimate, ask for more data and move on.
Once again we’re able to identify issues or unknowns early in the process, flag them and attempt to get them resolved before they become a blocker to the project delivery.
People have a penchant for fixed price projects because they’ve had too many experiences where projects have run horribly over budget. You could argue that they ran over because the estimates upon which the budget was based were poor. Or you could accept that estimating is such a fallible tool that to stake the size of a project budget on initial estimates was foolish in the first place.
Software development isn’t about “ones and zeros”, as people often like to say; it’s about human interaction with a system. And when humans are involved, countless variables are introduced that need to be continually accounted for.
The truth is that whilst estimating is a valuable tool, it has to be applied in the right manner at the right time. Crucially, it should be treated as an iterative process that is continually revisited to increase its fidelity in line with the available data.
Estimating is a science. And science is iterative.