Showing posts with label test-driven development. Show all posts

Tuesday, May 8, 2018

Refactor when you need it

The development cycle for Agile and TDD is simple:
  • Define a new requirement
  • Write a test for that requirement
  • Run the test (and see that it fails)
  • Change the code to make the test pass
  • Run the test (and see that it passes)
  • Refactor the code to make it clean
  • Run the test again (and see that it still passes)
Notice that refactor step near the end? That is what keeps your code clean. It also lets you write a messy solution quickly, because you know you will clean it up before you move on.

A working solution gives you a good understanding of the requirement, and its effect on the code. With that understanding, you can then improve the code, making it clear for other programmers. The test keeps your revised solutions correct -- if a cleanup change breaks a test, you have to fix the code.
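Here is a minimal sketch of that cycle in Python. The requirement (count the words in a string) and the names word_count_messy, word_count, and check are made up for illustration; they are not from any real project. The messy version is the "make the test pass" step, and the short version is the result of the refactor step. The same checks run against both.

```python
def word_count_messy(text):
    # First pass: written quickly, only to make the test pass.
    count = 0
    in_word = False
    for ch in text:
        if ch.isspace():
            in_word = False
        elif not in_word:
            count += 1
            in_word = True
    return count


def word_count(text):
    # After the refactor step: same behavior, clearer intent.
    return len(text.split())


def check(word_count_fn):
    # The test for the (hypothetical) requirement. It does not care
    # which implementation it is given, only that the answers are right.
    assert word_count_fn("") == 0
    assert word_count_fn("refactor") == 1
    assert word_count_fn("refactor  when you need it") == 5


check(word_count_messy)  # passes before the cleanup
check(word_count)        # still passes after the cleanup
```

If the cleanup had changed the behavior, the second call to check would fail, and you would know to fix the code.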

But refactoring is not limited to after a change. You can refactor before a change.

Why would you do that? Why would you refactor before making any changes? After all, if your code is clean, it doesn't need to be refactored. It is already understandable and maintainable. So why refactor in advance?

It turns out that code is not always perfectly clean. Sometimes we stop refactoring early. Sometimes we think our refactoring is complete when it is not. Sometimes we have duplicate code, or poorly named functions, or overweight classes. And sometimes we are enlightened by a new requirement.

A new requirement can force us to look at the code from a different angle. We can see new patterns, or see opportunities for improvement that we failed to see earlier.

When that happens, we see new ways of organizing the code. Often, the new organization allows for an easy change to meet the requirement. We might refactor classes to hold data in a different arrangement (perhaps a dictionary instead of a list) or break large-ish blocks into smaller blocks.
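As a rough illustration of the list-to-dictionary case, consider the sketch below. The classes PriceList and PriceTable, and the "change a price" requirement, are hypothetical. It also assumes that item names are unique; with duplicate names the list version and the dictionary version would disagree.

```python
class PriceList:
    """Before: prices held as a list of (name, price) pairs."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def price_of(self, name):
        for item, price in self._pairs:
            if item == name:
                return price
        raise KeyError(name)


class PriceTable:
    """After the preparatory refactoring: same lookups, dictionary inside."""

    def __init__(self, pairs):
        # Assumes item names are unique.
        self._prices = dict(pairs)

    def price_of(self, name):
        return self._prices[name]

    def set_price(self, name, price):
        # The new requirement, added only after the existing test
        # passed against the refactored class.
        self._prices[name] = price


def existing_test(cls):
    prices = cls([("apple", 3), ("pear", 5)])
    assert prices.price_of("pear") == 5


existing_test(PriceList)   # old code passes
existing_test(PriceTable)  # refactored code passes the same test

table = PriceTable([("apple", 3)])
table.set_price("apple", 4)        # the new feature is now a one-line change
assert table.price_of("apple") == 4
```

With the list arrangement, changing a price would have meant searching for the pair and rebuilding it. With the dictionary in place first, the new requirement is a single assignment.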

In this situation, it is better to refactor the code before adding the new requirement. Instead of adding the new feature and refactoring, perform the operations in reverse sequence: refactor and then add the requirement. (Of course, you still test and you can still refactor at the end.) The full sequence is:
  • Define a new requirement
  • Write a test for that requirement
  • Run the test (and see that it fails)
  • Examine the code and identify improvements
  • Refactor the code (without adding the new requirement)
  • Run tests to verify that the code still works (skip the new test)
  • Change the code to make the test pass
  • Run the test (and see that it passes)
  • Refactor the code to make it clean
  • Run the test again (and see that it still passes)
The new steps are the fourth, fifth, and sixth: examine the code, refactor it, and run the existing tests.

Agile has taught us to change our processes when the changes are beneficial. Changing the Agile process is part of that. You can refactor before making changes. You should refactor before making changes, when the refactoring will help you.

Wednesday, February 26, 2014

Legacy code isn't code -- it's a lack of tests

We have all heard of legacy code. Some of us have had the (mis)fortune to work on it. But where does it come from? How is it created?

If legacy code is nothing more than "really old code", then when does normal code become legacy code? How old does code have to be to earn the designation "legacy"?

I've worked on a number of projects. The projects used different programming languages (C, Visual Basic, C++, Java, C#, Perl). They had different team sizes. They were at different companies with different management styles. Some projects had old code but it wasn't legacy code. Some projects created new code that was legacy code from the first day.

Legacy code is not merely old code. Legacy code is code that few (if any) team members want to modify or enhance. It is code that has a reputation, code that contains risk. It is hard to maintain and easy to break.

Legacy code is code without tests.

With a set of tests -- a comprehensive set of tests -- one can change code and be sure that it still works. That is a powerful position. With tests to verify the operation of the code, programmers can refactor code and simplify it, knowing that mistakes will be caught.

Without tests, programmers limit their changes to the bare minimum. Changes are small, surgical operations that adjust the smallest number of lines of code. The objective is not to improve the code, not to make it readable or more reliable, but to avoid breaking something.

Changes to code with tests also strive to avoid breaking things, but the programmer doesn't need the paranoia-like fear of changes. The tests verify the code, and frequent testing identifies errors quickly. The programmer can focus on improvements to the code.

Don't ask the question "is our code legacy code?" -- ask the question "do we have comprehensive tests?"

Monday, October 14, 2013

Executables, source code, and automated tests

People who use computers tend to think of the programs as the "real" software.

Programmers tend to have a different view. They think of the source code as the "real" software. After all, they can always create a new executable from the source code. The generative property of source code gives it priority over the merely performant property of executable code.

But that logic leads to an interesting conclusion. If source code is superior to executable code because the former can generate the latter, then how do we consider tests, especially automated tests?

Automated tests can be used to "generate" source code. One does not use tests to generate source code in the same, automated manner that a compiler converts source code to an executable, but the process is similar. Given a set of tests, a framework in which to run the tests, and the ability to write source code (and compile it for testing), one can create the source code that produces a program that conforms to the tests.

That was a bit of a circuitous route. Here's the concept in a diagram:


     automated tests --> source code --> executable code


This idea has been used in a number of development techniques. There is test-driven development (TDD), extreme programming (XP), and agile methods. All use the concept of "test first, then code" in which tests (automated tests) are defined first and only then is code changed to conform to the tests.

The advantage of "test first" is that you have tests for all of your code. You are not allowed to write code "because we may need it someday". You either have a test (in which case you write code) or you don't (in which case you don't write code).

A project that follows the "test first" method has tests for all features. If the source code is lost, one can re-create it from the tests. Granted, it might take some time -- this is not a simple re-compile operation. A complex system will have thousands of tests, perhaps hundreds of thousands. Writing code to conform to all of those tests is a manual operation.

But it is possible.
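A tiny sketch of the idea, in Python. Suppose the only thing that survived is the table of test cases in SPEC; the function is_leap_year is then written by hand until every case passes. Both names, and the choice of leap years as the example, are made up for illustration.

```python
# The surviving tests: (input, expected result) pairs.
SPEC = [
    (1996, True),   # divisible by 4
    (2018, False),  # not divisible by 4
    (1900, False),  # century year, not divisible by 400
    (2000, True),   # century year, divisible by 400
]


def is_leap_year(year):
    # Re-created by a programmer to conform to SPEC; nothing
    # generated this automatically.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def conforms_to_spec():
    return all(is_leap_year(year) == expected for year, expected in SPEC)


assert conforms_to_spec()
```

The re-created code is only as good as the tests are complete. Remove the 1900 case from SPEC and a simpler, wrong implementation would pass just as well.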

A harder task is going in the other direction, that is, writing tests from the source code. It is too easy to omit cases, to skip functionality, to misunderstand the code. Given the choice, I would prefer to start with tests and write code.

Therefore, I argue that the tests are the true "source" of the system, and the entity we consider "source code" is a derived entity. If I were facing a catastrophe and had to pick one (and only one) of the tests, the source code, and the executable code, I would pick the tests -- provided that they were automated and complete.