Friday, August 24, 2012

How I fix old code

Over the years (and multiple projects) I have developed techniques for improving object-oriented code. My techniques work for me (and the code that has been presented to me). here is what I do:

Start at the bottom Not the base classes, but the bottom-most classes. The classes that are used by other parts of the code, and have no dependencies. These classes can stand alone.

Work your way up After fixing the bottom classes, move up one level. Fix those classes. Repeat. Working up from the bottom is the only way I have found to be effective. One can have an idea of the final result, a vision of the finished product, but only by fixing the problems at the bottom can one achieve any meaningful results.

Identify class dependencies To start at the bottom, one must know the class dependencies. Not the class hierarchy, but the dependencies between classes. (Which classes use which other classes at run-time.) I use some custom Perl scripts to parse code and create a list of dependencies. The scripts are not perfect but they give me a good-enough picture. The classes with no dependencies are the bottom classes. Often they are utility classes that perform low-level operations. They are the place to start.

Create unit tests Tests are your friends! Unit tests for the bottom (stand-alone) classes are generally easy to create and maintain. Tests for higher-level classes are a little trickier, but possible with immutable lower-level classes.

Make objects immutable The Java String class (and the C# String class) showed us a new way of programming. I ignored it for a long time (too long, in my opinion). Immutable objects are unchangeable, and do not have the "classic" object-oriented functions for setting properties. Instead, they are fixed to their original value. When you want to change a property, the immutable object techniques dictate that instead of modifying an object you create a new object.

I start by making the lowest-level classes immutable, and then working my way up the "chain" of class dependencies.

Make member variables private Create accessor functions when necessary. I prefer to create "get" accessors only, but sometime it is necessary to create "set" accessors. I find that it easier to track and identify access with functions than with member variables, but that may be an effect of Visual Studio. Once the accessors are in place, I forget about the "get" accessors and look to remove the "set" accessors"

Create new constructors Constructors are your friends. They take a set of data and build an object. Create the ones that make sense for your application.

Fix existing constructors to be complete Sometimes people use constructors to partially construct objects, relying on the code to call "set" accessors later. Immutable object programming has none of that nonsense: when you construct an object you must provide everything. If you cannot provide everything, then you are not allowed to construct the object! No soup (or object) for you!

When possible, make member functions static Static functions have no access to member variables, so one must pass in all "ingredient" variables. This makes it clear which variables must be defined to call the function. Not all member functions can be static; make the functions called by constructors static when possible. (Really, put the effort into this task.) Calls to static functions can be re-sequenced at will, since they cannot have side effects on the object.

Static functions can also be moved from one class to another, at will. Or at least easier than member functions. It's a good attribute when re-arranging code.

Reduce class size Someone (I don't remember where) claimed that the optimum class size was 70 lines of code. I tend to agree with this size. Bottom classes can easily be expressed in 70 lines. (if not, they are probably composites of multiple elementary classes.) Higher-level classes can often be represented in 70 lines or less, sometimes more. (But never more than 150 lines.)

Reducing class size usually means increasing the number of classes. You code size may shrink somewhat (my experience shows a reduction of 40 to 60 percent) but it does not reduce to zero. Smaller classes often means more classes. I find that a system with more, smaller classes is easier to understand than one with fewer, large classes.

Name your classes well Naming is one of the great challenges of programming. Pick names carefully, and change names when it makes sense. (If your version control system resists changes to class names, get a new version control system. It is the servant, not you!)

Talk with other developers Discuss changes with other developers. Good developers can provide useful feedback and ideas. (Poor developers will waste your time, though.)

Discuss code with non-developers Our goal is to create code that can be read by non-developers who are experts in the subject matter. We want them to read our code, absorb it, and provide feedback. We want them to say "yes, that seems right" (or even better, "oh, there is a problem here with this calculation"). To achieve that level of understanding, we need to strip away all of the programming overhead: temporary variables, memory allocation, and sequence/iteration gunk. With immutable object programming, meaningful names, and modern constructs (in C++, that means BOOST) we can create high-level routines that are readable by non-programmers.

(Note that we are not asking the non-programmers to write code, merely to read it. That is enough.)

These techniques work for me (and the folks on my projects). Your mileage may vary.

No comments: