Showing posts with label automated testing. Show all posts

Sunday, June 15, 2014

Untangle code with automated testing

Of all of the tools and techniques for untangling code, the most important is automated testing.

What does automated testing have to do with the untangling of code?

Automated testing provides insurance. It provides a back-stop against which developers can make changes.

The task of untangling code, of making code readable, often requires changes across multiple modules and multiple classes. A few improvements can be made within a single module (or class), but most require changes in several. Improvements can change the methods exposed by a class, or remove access to member variables. These changes ripple through other classes.

Moreover, the improvement of tangled code often requires a re-thinking of the organization of the code. You move functions from one class to another. You rename variables. You split classes into smaller classes.

These are significant changes, and they can have significant effects on the operation of the code. Of course, while you want to change the organization of the code you want the results of calculations to remain unchanged. That's how automated tests can help.

Automated tests verify that your improvements have no effect on the calculations.

The tests must be automated. Manual tests are expensive: they require time and attention. Manual tests are easy to skip, easy to "flub", and can be difficult to verify. Automated tests are consistent, accurate, and, most of all, cheap. They require no ongoing attention or effort, and a good suite covers the code completely.

Automated tests let programmers make significant improvements to the code base and have confidence that their changes are correct. That's how automated tests help you untangle code.
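As a small illustration (the function names and numbers here are hypothetical, not from any real project), here is what that back-stop looks like in Python: a calculation, a reorganized version of the same calculation, and an automated test that insists the results stay identical:

```python
# A tangled-but-working calculation and its refactored twin.

def net_price(gross, tax_rate):
    """Original version: works, but the structure is awkward."""
    result = gross + gross * tax_rate
    return round(result, 2)

def net_price_refactored(gross, tax_rate):
    """Reorganized version -- the structure changed, the math did not."""
    return round(gross * (1 + tax_rate), 2)

def test_refactoring_preserves_results():
    # The test is the back-stop: both versions must agree on every case.
    for gross, rate in [(100.0, 0.06), (19.99, 0.0), (0.0, 0.2)]:
        assert net_price(gross, rate) == net_price_refactored(gross, rate)

test_refactoring_preserves_results()
```

Run the test before the change, make the change, run the test again. If it still passes, the reorganization left the calculations untouched.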

Monday, March 24, 2014

Software is not always soft

Tim O'Reilly asks, "Can hardware really change as fast as software?"

We're used to the idea that software changes faster than hardware. It's widely accepted as common knowledge. ("Of course software changes faster than hardware! Software is soft and hardware is hard!")

Yet it's not that simple.

Software is easy to change... sometimes. There are times when software is easier to change, and there are times when software is harder to change.

Tim O'Reilly's tweet (and the referenced article) consider software in the context of cell phones. While cell phones have been changing over time, the apps for phones tend to "rev" faster. But consider the software used on PCs. Sometimes PC software changes at a rate much slower than the hardware. The "Windows XP problem" is an example: people stay with Windows XP because their software runs on Windows XP (and not later versions of Windows).

Long-term software is not limited to PCs. Corporations and governments have large systems built with mainframe technology (COBOL, batch processing), and these systems have outlasted several generations of mainframe hardware. They are resistant to change and do not easily translate to our current technology stack of virtualized servers and cloud computing.

What makes some software easy to change and other software hard? In my view, the answer is not in the software, but in the culture and processes of the organization. "Hard" software is a result, not a cause.

Development teams that use automated testing and refactor code frequently have a better chance of building "soft" software -- software that is easy to change. Tests keep developers "honest" and alert them to problems. Automated tests are cheap to run and therefore run frequently, giving developers immediate feedback. When the tests are comprehensive, that feedback is complete: developers are alerted to any deviation from requirements.

Refactoring is important; it allows developers to improve the code over time. We rarely get code right the first time. Often we are happy that it works, and we don't care about the simplicity or the consistency of the code. Refactoring lets us re-visit that code and make it simpler and consistent -- both of which make it easier to understand and change.
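A tiny sketch of that revisit (invented code, not from any real project): the first version works but is noisy; the second says the same thing simply; an automated test makes the revision safe:

```python
def total_v1(items):
    # First attempt: it works, and at the time that was enough.
    total = 0
    i = 0
    while i < len(items):
        if items[i] is not None:
            total = total + items[i]
        i = i + 1
    return total

def total_v2(items):
    # Revisited: simpler, and consistent with the rest of the code base.
    return sum(x for x in items if x is not None)

# The test that makes the refactoring a low-risk change:
assert total_v1([1, None, 2, 3]) == total_v2([1, None, 2, 3]) == 6
```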

Development teams that use manual methods of testing (or no testing!) have little chance at building "soft" software. Without automated tests, the risk of introducing a defect while making a change is high. Developers and managers will both avoid unnecessary changes and will consider refactoring to be an unnecessary change. The result is that code is developed but never simplified or made consistent. The code remains hard to read and difficult to change.

If you want software to be soft -- to be easy to change -- then I encourage automated testing. I see no way to get "soft" software without it.

On the other hand, if you want "hard" software -- software that is resistant to change -- then skip the automated testing. Build a culture that avoids improvements to the code and allows only those changes that are necessary to meet new requirements.

But please don't complain about the difficulty of changes.

Wednesday, February 26, 2014

Legacy code isn't code -- it's a lack of tests

We have all heard of legacy code. Some of us have had the (mis)fortune to work on it. But where does it come from? How is it created?

If legacy code is nothing more than "really old code", then when does normal code become legacy code? How old does code have to be to earn the designation "legacy"?

I've worked on a number of projects. The projects used different programming languages (C, Visual Basic, C++, Java, C#, Perl). They had different team sizes. They were at different companies with different management styles. Some projects had old code but it wasn't legacy code. Some projects created new code that was legacy code from the first day.

Legacy code is not merely old code. Legacy code is code that few (if any) team members want to modify or enhance. It is code that has a reputation, code that contains risk. It is hard to maintain and easy to break.

Legacy code is code without tests.

With a set of tests -- a comprehensive set of tests -- one can change code and be sure that it still works. That is a powerful position. With tests to verify the operation of the code, programmers can refactor code and simplify it, knowing that mistakes will be caught.

Without tests, programmers limit their changes to the bare minimum. Changes are small, surgical operations that adjust the smallest number of lines of code. The objective is not to improve the code, not to make it readable or more reliable, but to avoid breaking something.

Changes to code with tests also strive to avoid breaking things, but the programmer doesn't need the paranoia-like fear of changes. The tests verify the code, and frequent testing identifies errors quickly. The programmer can focus on improvements to the code.
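One common first step with such code -- a sketch here, using an invented function -- is a "characterization" test: before changing anything, write tests that record what the code does today, quirks and all:

```python
def legacy_discount(order_total):
    # Imagine this function arrived without tests and with a reputation.
    if order_total > 100:
        return round(order_total * 0.90, 2)
    if order_total > 50:
        return round(order_total * 0.95, 2)
    return order_total

def test_characterize_current_behavior():
    # These assertions document the behavior as-is, quirks included.
    assert legacy_discount(200) == 180.0
    assert legacy_discount(60) == 57.0
    assert legacy_discount(10) == 10
    # Boundary quirk: exactly 100 falls into the 5% tier ("> 100" is strict).
    assert legacy_discount(100) == 95.0

test_characterize_current_behavior()
```

With the current behavior pinned down, the code stops being "legacy" in the sense above: it can now be changed with confidence.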

Don't ask the question "is our code legacy code" -- ask the question "do we have comprehensive tests".

Monday, October 14, 2013

Executables, source code, and automated tests

People who use computers tend to think of the programs as the "real" software.

Programmers tend to have a different view. They think of the source code as the "real" software. After all, they can always create a new executable from the source code. The generative property of source code gives it priority over the merely performant property of executable code.

But that logic leads to an interesting conclusion. If source code is superior to executable code because the former can generate the latter, then how do we consider tests, especially automated tests?

Automated tests can be used to "generate" source code. One does not use tests to generate source code in the same, automated manner that a compiler converts source code to an executable, but the process is similar. Given a set of tests, a framework in which to run the tests, and the ability to write source code (and compile it for testing), one can create the source code that produces a program that conforms to the tests.

That was a bit of a circuitous route. Here's the concept in a diagram:


     automated tests --> source code --> executable code


This idea appears in a number of development techniques: test-driven development (TDD), extreme programming (XP), and other agile methods. All use the concept of "test first, then code", in which automated tests are defined first, and only then is code changed to conform to the tests.

The advantage of "test first" is that you have tests for all of your code. You are not allowed to write code "because we may need it someday". You either have a test (in which case you write code) or you don't (in which case you don't write code).
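A small test-first sketch (the feature and names are hypothetical): the test exists before the code, and the code is only what the test demands.

```python
# Step 1: the test, written first. It fails until the code exists.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Automated  Testing ") == "automated-testing"

# Step 2: just enough code to make the test pass -- no speculative
# "we may need it someday" features.
def slugify(title):
    return "-".join(title.lower().split())

test_slugify()
```

The test is the specification; the function is derived from it. That is the sense in which the tests can "generate" the source code.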

A project that follows the "test first" method has tests for all features. If the source code is lost, one can re-create it from the tests. Granted, it might take some time -- this is not a simple re-compile operation. A complex system will have thousands of tests, perhaps hundreds of thousands. Writing code to conform to all of those tests is a manual operation.

But it is possible.

A harder task is going in the other direction, that is, writing tests from the source code. It is too easy to omit cases, to skip functionality, to misunderstand the code. Given the choice, I would prefer to start with tests and write code.

Therefore, I argue that the tests are the true "source" of the system, and the entity we call "source code" is a derived entity. If I were facing a catastrophe and could keep only one of the three -- the tests, the source code, or the executable -- I would pick the tests, provided that they were automated and complete.

Thursday, September 5, 2013

Measure code complexity

We measure many things on development projects, from the cost to the time to user satisfaction. Yet we do not measure the complexity of our code.

One might find this surprising. After all, the complexity of code is closely tied to quality (or so I like to believe) and is also an indication of future effort (simple code is easier to change than complicated code).

The problem is not in the measurement of complexity. We have numerous techniques and tools, spanning the range from "lines of code" to function points. There are commercial tools and open source tools that measure complexity.
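As a sketch of how simple such a measurement can be (real tools do far more; this only shows the idea), here is a rough cyclomatic-complexity count for Python code, approximated as one plus the number of branch points:

```python
import ast

# Node types that add a decision point (an approximation; real metrics
# define the counting rules more carefully).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try,
                ast.BoolOp, ast.IfExp, ast.comprehension)

def cyclomatic_complexity(source):
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES)
                   for node in ast.walk(tree))

sample = """
def classify(n):
    if n < 0:
        return "negative"
    for _ in range(n):
        if n % 2 == 0:
            return "even"
    return "odd-or-zero"
"""
print(cyclomatic_complexity(sample))  # 1 + if + for + if = 4
```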

No, the problem is not in techniques or tools.

It is a matter of will. We don't measure complexity because, in short, we don't want to.

I can think of a few reasons that discourage the measurement of source code complexity.

- Complexity is a negative measurement: more complexity is worse. A result of 170 is better than a result of 270, and this inverted scale is awkward. We are trained to like positive measurements, like baseball scores. (Perhaps golf enthusiasts would see more interest if they changed their scoring system.)

- There is no direct way to connect complexity to cost. While we understand that a complicated code base is harder to maintain than a simple one, we have no way of converting that extra complexity into dollars. If we reduce our complexity from 270 to 170 (or 37 percent), do we reduce the cost of development by the same percentage? Why or why not? (I suspect that there is a lot to be learned in this area. Perhaps several Masters theses can be derived from it.)

- Not knowing the complexity shifts risk from managers to developers. In organizations with antagonistic relations between managers and developers, a willful ignorance of code complexity pushes risk onto developers. Estimates, if made by managers, will ignore complexity. Estimates made by developers may be optimistic (or pessimistic) but may be adjusted by managers. In either case, schedule delays will be the fault of the developer, not the manager.

- Developers (in shops with poor management relations) may avoid the use of any metrics, fearing that they will be used for performance evaluations.

Looking forward, I can see a time when we do measure code complexity.

- A company considering the acquisition of software (including the source code), may want an unbiased opinion of the code. They may not completely trust the seller (who is biased towards the sale) and they may not trust their own people (who may be biased against 'outside' software).

- A project team may want to identify complex areas of their code, to identify high-risk areas.

- A development team may wish to estimate the effort for maintaining code, and may include the complexity as a factor in that effort.

The tools are available.

I believe that we will, eventually, consider complexity analysis a regular part of software development. Perhaps it will start small, like the adoption of version control and automated testing. Both of those techniques were at one time considered new and unproven. Today, they are considered 'best practices'.