Agile promises clean code. That's the purpose of the 'refactor' phase. After creating a test and modifying the code, the developer refactors the code to eliminate compromises made during the changes.
But how much refactoring is enough? One might flippantly say "as much as it takes", but that is not a usable answer.
For many shops, the answer seems to be "as much as the developer thinks is needed". Other shops allow refactoring until the end of the development cycle. The first is subjective and opens the development team to the risk of spending too much time on refactoring and not enough on adding features. The second is arbitrary and risks short-changing the refactoring phase and allowing messy code to remain in the system.
Agile removes risk by creating automated tests, creating them before modifying the code, and having developers run those automated tests after all changes. Developers must ensure that all tests pass; they cannot move on to other changes while tests are failing.
This process removes judgement from the developer. A developer cannot say that the code is "good enough" without the tests confirming it. The tests are the deciders of completeness.
I believe that we want the same philosophy for code quality. Instead of allowing a developer to decide when refactoring has reached "good enough", we will instead use an automated process to make that decision.
We already have code quality tools. C and C++ have had lint for decades. Other languages have tools as well. (Wikipedia has a page for static analysis tools.) Some are commercial, others open source. Most can be tailored to meet the needs of the team, placing more weight on some issues and ignoring others. My favorite at the moment is 'RuboCop', a style-checking tool for Ruby.
I expect that Agile processes will adopt a measured approach to refactoring. By using one (or several) code assessors, a team can ensure quality of the code.
Such a change is not without ramifications. This change, like the use of automated tests, takes judgement away from the programmer. Code assessment tools can consider many things, some of which are style. They can examine indentation, names of variables or functions, the length or complexity of a function, or the length of a line of code. They can check the number of layers of 'if' statements or 'while' loops.
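To make the idea concrete, here is a minimal sketch of such an assessor, written in Python. It is not lint or RuboCop; the function names (`assess`, `refactoring_done`) and the thresholds are my own assumptions, chosen only for illustration. It checks three of the items mentioned above: line length, function length, and the number of nested 'if' and loop layers.

```python
import ast

# Hypothetical thresholds; a real team would tune these to its own taste.
MAX_LINE_LENGTH = 79
MAX_FUNCTION_LINES = 20
MAX_NESTING = 3

NESTING_NODES = (ast.If, ast.While, ast.For)


def nesting_depth(node, depth=0):
    """Return the deepest level of nested if/while/for blocks under node.

    Note: Python parses an 'elif' as an 'if' inside the 'else' branch,
    so this simple count treats a long elif chain as nesting.
    """
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
        deepest = max(deepest, nesting_depth(child, child_depth))
    return deepest


def assess(source):
    """Return a list of findings (strings) for the given source text."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            findings.append(f"line {number}: too long")
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                findings.append(f"{node.name}: {length} lines long")
            depth = nesting_depth(node)
            if depth > MAX_NESTING:
                findings.append(f"{node.name}: nesting depth {depth}")
    return findings


def refactoring_done(source):
    """The gate: refactoring is finished only when nothing is reported."""
    return not assess(source)
```

The point of `refactoring_done` is that the tool, not the developer, answers the question "is this good enough?", just as a test suite answers "does this work?".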
Deferring judgement to the style checkers will affect managers as well as programmers. If a developer must refactor code until it passes the style checker, then a manager cannot cut short the refactoring phase. Managers will probably not like this change -- it takes away some control. Yet it is necessary to maintain code quality. By ending refactoring before the code is at an acceptable quality, managers allow poor code to remain in the system, which will affect future development.
Agile is all about code quality.
Monday, August 8, 2016
Sunday, March 30, 2014
How to untangle code: Start at the bottom
Messy code is cheap to make and expensive to maintain. Clean code is not so cheap to create but much less expensive to maintain. If you can start with clean code and keep the code clean, you're in a good position. If you have messy code, you can reduce your maintenance costs by improving your code.
But where to begin? The question is difficult to answer, especially on a large code base. Some ideas are:
- Re-write the entire code
- Re-write logical sections of code (vertical slices)
- Re-write layers of code (horizontal slices)
- Make small improvements everywhere
All of these ideas have merit -- and risk. For very small code sets, a complete re-write is possible. For a system larger than "small", though, a re-write entails a lot of risk.
Slicing the system (either vertically or horizontally) has the appeal of independent teams. The idea is to assign a number of teams to the project, with each team working on an independent section of code. Since the code sets are independent, the teams can work independently. This is an appealing idea but not always practical. It is rare that a system is composed of independent sub-systems. More often, the system is composed of several mutually-dependent sub-systems, and adjustments to any one sub-system will ripple throughout the code.
One can make small improvements everywhere, but this has its limits. The improvements tend to be narrow in scope and systems often need high-level revisions.
Experience has taught me that improvements must start at the "bottom" of the code and work upwards. Improvements at the bottom layer can be made with minimal changes to higher layers. Note that there are some changes to higher layers -- in most systems there are some effects that ripple "upwards". Once the bottom layer is "clean", one can move upwards to improve the next-higher level.
How to identify the bottom layer? In object-oriented code, the process is easy: classes that can stand alone are the bottom layer. Object-oriented code consists of different classes, and some (usually most) classes depend on other classes. (A "car system" depends on various subsystems: "drive train", "suspension", "electrical", etc., and those subsystems in turn depend on smaller components.)
No matter how complex the hierarchy, there is a bottom layer. Some classes are simple enough that they do not include other classes. (At least not other classes that you maintain. They may contain framework-provided classes such as strings and lists and database connections.)
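Finding the bottom layer can be done mechanically once the dependencies are written down. Here is a minimal sketch in Python. The dependency map is a made-up example loosely based on the car analogy, and the function name `bottom_layer` is my own; how you extract the real dependencies from your code base is a separate problem that this sketch does not address.

```python
# Hypothetical dependency map: each class name maps to the classes it uses.
# "str" and "list" stand in for framework-provided classes.
DEPENDENCIES = {
    "Car": {"DriveTrain", "Suspension", "Electrical"},
    "DriveTrain": {"Engine", "Gearbox"},
    "Suspension": {"Spring"},
    "Electrical": {"Battery", "list"},
    "Engine": {"str"},
    "Gearbox": set(),
    "Spring": set(),
    "Battery": {"str"},
}


def bottom_layer(dependencies):
    """Return the classes that use no other class that we maintain.

    Only the keys of the map count as "maintained"; anything else
    (framework strings, lists, and so on) is ignored.
    """
    maintained = set(dependencies)
    return sorted(
        name
        for name, uses in dependencies.items()
        if not set(uses).intersection(maintained)
    )


print(bottom_layer(DEPENDENCIES))
```

Note that "Engine" and "Battery" qualify even though they use `str`, because framework classes are not ours to clean.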
These bottom classes are where I start. I make improvements to these classes, often making them immutable (so they can hold state but they cannot change state). I change their public methods to use consistent names. I simplify their code. When these "bottom" classes are complex (when they hold many member variables) I split them into multiple classes.
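As an illustration of the "immutable" part, here is a small sketch in Python. The `Money` class is a hypothetical bottom-layer class that I invented for this example; the idea carries over to other languages, though the mechanics differ.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Money:
    """A hypothetical bottom-layer class: it holds state but cannot change it."""

    amount_cents: int
    currency: str

    def plus(self, other):
        """Return a new Money; neither operand is modified."""
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount_cents + other.amount_cents, self.currency)


price = Money(500, "USD")
total = price.plus(Money(250, "USD"))

try:
    price.amount_cents = 0  # any attempt to change state is rejected
except FrozenInstanceError:
    print("Money cannot change state after construction")
```

Operations that would have modified the object return a new object instead, so code in higher layers can pass these values around without worrying about who else might change them.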
The result is a set of simpler, cleaner classes that are reliable and readable.
Most of these changes affect the other parts of the system. I make changes gradually, introducing one or two and then re-building the system and fixing broken code. I create unit tests for the revised classes. I share changes with other members of the team and ask for their input.
I don't stop with just these "bottom" classes. Once cleaned, I move up to the next level of code: the classes that depend only on framework and the newly-cleaned classes. With a solid base of clean code below, one can improve the next layer of classes. The improvements are the same: make classes immutable, use consistent names for functions and variables, and split complex classes into smaller classes.
Using this technique, one works from the bottom of the code to the top, cleaning all of the code and ensuring that the entire system is maintainable.
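The bottom-to-top order can be computed by repeatedly peeling off the classes whose dependencies have all been cleaned already. The following Python sketch shows one way to do it; the function name `cleaning_order` and the sample classes are assumptions of mine, not part of any tool.

```python
def cleaning_order(dependencies):
    """Group classes into layers, bottom layer first.

    `dependencies` maps each class we maintain to the classes it uses.
    Names that are not keys (framework classes) are ignored.
    """
    remaining = {
        name: set(uses).intersection(dependencies)
        for name, uses in dependencies.items()
    }
    layers = []
    while remaining:
        # A class is ready when everything it uses has already been cleaned.
        ready = sorted(name for name, uses in remaining.items() if not uses)
        if not ready:
            raise ValueError(
                "cyclic dependency among: " + ", ".join(sorted(remaining))
            )
        layers.append(ready)
        for name in ready:
            del remaining[name]
        for uses in remaining.values():
            uses.difference_update(ready)
    return layers


# Hypothetical system, loosely following the car analogy.
car_system = {
    "Car": {"DriveTrain", "Suspension"},
    "DriveTrain": {"Engine", "Gearbox"},
    "Suspension": {"Spring"},
    "Engine": set(),
    "Gearbox": set(),
    "Spring": set(),
}

for level, names in enumerate(cleaning_order(car_system)):
    print(level, names)
```

Each layer depends only on the framework and on the layers before it, which is the property that keeps each round of changes small.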
This method is not without drawbacks. Sometimes there are cyclic dependencies between classes and there is no clear "bottom" class. (Good judgement and refactoring can usually resolve that issue.) The largest challenge is not technical but political -- large code bases with large development teams often have developers with egos, developers who think that they own part of the code. They are often reluctant to give up control of "their" code. This is a management issue, and much has been written on "egoless programming".
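On the technical half of that problem: a cyclic dependency can at least be located automatically, even if breaking it still takes judgement. Here is a sketch using `graphlib` from the Python standard library (Python 3.9 or later). The class names are hypothetical and `find_cycle` is a helper name of my own.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(dependencies):
    """Return one dependency cycle as a list of class names, or None.

    `dependencies` maps each class to the classes it uses. In the
    returned list the first and last names are the same class.
    """
    try:
        TopologicalSorter(dependencies).prepare()
    except CycleError as error:
        # graphlib reports the offending cycle as the second argument.
        return error.args[1]
    return None


# Hypothetical example: Order and Customer refer to each other.
tangled = {
    "Order": {"Customer"},
    "Customer": {"Order"},
    "Address": set(),
}

print(find_cycle(tangled))
```

Once the cycle is visible, the usual remedy is to move the shared piece into a new, smaller class that both sides can depend on, which creates the missing "bottom".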
Despite the difficulties, this method works. It is the only method that I have found to work. The other approaches too often run into the problem of doing too much at once. The "bottom up" method allows for small, gradual changes. It reduces risk, but cannot eliminate it. It lets the team work at a measured pace, and lets the team measure their progress (how many classes cleaned).
Labels: clean code, egoless programming, refactoring, untangling code