Wednesday, September 23, 2009

Why IT has difficulty with estimates

Estimating has always been a difficult task in IT, especially for development efforts. How long will it take to write the program? How much will it cost? How many people do we need? For decades, we have struggled with estimates and project plans. Development projects run over their allotted time (and over their allotted budget). Why?

I observe that the problem with estimates is on the development side of IT. The other major parts of IT, support and operations, have loads that can be reliably estimated. For support, we have experience with the number of customers who call and the complexity of their issues. For operations, we have the experience of nightly jobs and the time it takes to run them. It's only on the development side, where we gather requirements, prepare designs, and do the programming, that we have the problem with estimates. (I'm including testing as part of the development effort.)

The process of estimation works for repeated tasks. That is, you can form a reasonable estimate for a task that you have performed before. The more often you have performed the task, the better your estimate.

For example, most people have very good estimates for the amount of time they need for their morning commute. We know when to leave to arrive at the office on time. Every once in a while our estimate is incorrect, due to an unforeseen event such as traffic delays or water main breaks, but on average we do a pretty good job.

We're not perfect at estimates. We cannot make them out of nothing. We need some initial values, some basis for the estimate. When we are just hired and are making our first trips to the new office, we allow extra time. We leave early and probably arrive early -- or perhaps we leave at what we think is a good time and arrive late. We try different departure times and eventually find one that works for us. Once we have a repeating process, we can estimate the duration.
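To make the commute example concrete, here is a minimal sketch in Python of estimating from history. The function name `estimate_duration` and the commute times are made up for illustration; the point is only that the estimate comes from past repetitions, and that with no history there is nothing to compute.

```python
from statistics import mean, stdev


def estimate_duration(history):
    """Estimate a task's duration from past runs of the same task.

    Returns (typical_minutes, spread_minutes), or None when there are
    fewer than two past runs and therefore no basis for an estimate.
    """
    if len(history) < 2:
        return None  # no repetition yet: anything we said would be a guess
    return mean(history), stdev(history)


# Hypothetical commute times, in minutes, from a week at the new office.
commutes = [42, 38, 45, 40, 41]

print(estimate_duration(commutes))  # a typical time and how much it varies
print(estimate_duration([]))        # first day on the job: None
```

The spread matters as much as the average: it tells us how much extra time to allow, which is what we are working out by trial and error in those first weeks.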

Hold that thought while I shift to a different topic. I'll come back to estimates, I promise.

The fundamental job of IT is to automate tasks. The task could be anything, from updating patient records to processing the day's sales transactions. It could be monitoring a set of servers and restarting jobs when necessary. It could be serving custom web pages. It is not a specific kind of task that we automate; it is the repetition of *any* task.

Once we identify a repeating task, we automate it. That's what we do. We develop programs, scripts, and sometimes even new hardware to automate well-defined, repeating tasks.

Once a task has been automated, it becomes part of the operation. As an operation task, it is run on a regular schedule with an expected duration. We can plan for the CPU load, network load, and other resources. And it is no longer part of the development task set.

The repeating tasks, the well-defined tasks, the tasks that can be predicted, move to operations. The tasks remaining for development -- the ones that need estimates -- are the ones that are not repeating. They are new. They are not well-defined. They cover unexplored territory.

And here's where estimates come back into the discussion. Since we are constantly identifying processes that can be automated, automating them, and moving them from development to operations, the well-defined, repeatable tasks fall out of the development basket, leaving the ill-defined and non-repeating tasks. These are the tasks that cannot be estimated, since they are neither well-defined nor repeating, and we have no past runs on which to base a number.

Well, you *can* write down some numbers and call them estimates. But without experience to validate your numbers, I'm not sure how you can call them anything but guesses.

