Sunday, March 9, 2014

How to untangle code: Remove the tricks

We all have our specialties. Mine is the untangling of code. That is, I can refactor messy code to make it readable (and therefore maintainable).

The process is sometimes complex and sometimes tedious. I have found (discovered?, identified?) a set of practices that allow me to untangle code. As practices, they are imprecise and subject to judgement. Yet they can be useful.

The first practice is to get rid of the tricks. "Tricks" are the neat little features of the language.

In C++, two common types of tricks are pointers and preprocessor macros. (And sometimes they are combined.)

Pointers are to be avoided because they invite unintended operations: a pointer can be null, can dangle, and can be re-pointed at something else. In C, one must use pointers; in C++ they are to be used only when necessary. One can pass a reference to an object instead of a pointer (or better yet, a reference to a const object). A reference is bound to one object when it is created and cannot be re-bound; a pointer, on the other hand, can be changed to point to something else (if you are very disciplined, that something else will be another instance of the same class).
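Here is a minimal sketch of the difference. The Account type and both function names are made up for illustration; they are not from any real program.

```cpp
#include <cassert>
#include <string>

// Hypothetical type, used only for this illustration.
struct Account {
    std::string owner;
    double balance;
};

// Pointer version: the caller may pass a null pointer, and inside the
// function `acct` could be re-pointed at some other object.
double balance_via_pointer(const Account* acct) {
    assert(acct != nullptr);  // a check the reference version does not need
    return acct->balance;
}

// Reference-to-const version: always bound to an object, cannot be
// re-bound, and cannot modify the object it refers to.
double balance_via_reference(const Account& acct) {
    return acct.balance;
}
```

The reference version has fewer ways to go wrong, and the call site reads more simply: balance_via_reference(a) instead of balance_via_pointer(&a).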

We use pointers in C (and in early C++) to manage elements in a data structure such as a list or a tree. While we can use references, it is better to use the containers of the C++ STL (or the Boost library). These containers handle memory allocation and de-allocation. I have successfully untangled programs and eliminated all "new" and "delete" calls from the code.
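As a rough sketch of what that looks like, here is a buffer of samples held in a std::vector instead of a "new double[n]" array. The function names and the sample values are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// The vector owns its storage. The memory is released automatically when
// the vector goes out of scope, so there is no delete[] to forget.
std::vector<double> make_samples(std::size_t n) {
    std::vector<double> samples;
    samples.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        samples.push_back(i * 0.5);  // made-up data: 0.0, 0.5, 1.0, ...
    }
    return samples;
}

// Takes a reference to const: no copy is made, and no pointer is needed.
double average(const std::vector<double>& samples) {
    assert(!samples.empty());  // the average of nothing is undefined
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}
```

Calling average(make_samples(4)) allocates, uses, and frees the buffer without a single "new" or "delete" in the source.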

The other common trick of C++ is the preprocessor. Preprocessor macros are powerful constructs that let one perform all sorts of mischief, including changing function names, language keywords, and constant values. Simple macro definitions such as

#define PI 3.1415

can be written in C++ or C# as

const double PI = 3.1415;

(Java spells it "static final double PI = 3.1415;"), so one does not really need the preprocessor for those definitions.

More sophisticated macros such as

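#define array_value(x, y) ((y) < 100 ? (x)[(y)] : (x)[0])

let you check the bounds of an array, but the STL std::vector<> container can perform this checking for you: its at() member function throws an exception when the index is out of range (the plain [] operator does not check).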
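A small sketch of the same idea without the macro, assuming the intent is "give me the element, or the first element if the index is bad". The function reuses the macro's name for comparison; the int element type is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns the element at `index`, or the first element when `index` is
// out of range. The vector knows its own size, so there is no magic 100.
int array_value(const std::vector<int>& values, std::size_t index) {
    assert(!values.empty());  // the fallback needs at least one element
    try {
        return values.at(index);  // at() throws std::out_of_range on a bad index
    } catch (const std::out_of_range&) {
        return values.front();
    }
}
```

Unlike the macro, this is an ordinary function: it has a type, it evaluates its arguments once, and a debugger can step into it.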

The preprocessor also lets one construct function calls at compile time:

#define call_func(x, y, a1, a2) func_##x##y(a1, a2)

to convert this code

call_func(stats, avg, v1, v2);

to this

func_statsavg(v1, v2);

Mind you, the source code contains only the unconverted line, never the converted line. Your debugger does not know about the post-processed line either. In a sense, #define macros are lies that we programmers tell ourselves.
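To make the point concrete, here is a sketch that puts the macro call and the plain call side by side. The body of func_statsavg is invented (the original function could do anything), and the wrapper names are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the function the macro ends up naming.
double func_statsavg(double a, double b) {
    return (a + b) / 2.0;
}

// The trick: the function name is pasted together at compile time.
#define call_func(x, y, a1, a2) func_##x##y(a1, a2)

// Tangled: a search for "func_statsavg" will not find this call.
double average_via_macro(double v1, double v2) {
    return call_func(stats, avg, v1, v2);
}

// Untangled: the real name is in the source, where tools and people can see it.
double average_direct(double v1, double v2) {
    return func_statsavg(v1, v2);
}

// Sanity check that both spellings reach the same function.
void check_equivalence() {
    assert(average_via_macro(2.0, 4.0) == average_direct(2.0, 4.0));
}
```

Once every use of call_func has been replaced with the direct call, the macro itself can be deleted.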

Worse, they are specific to C++ (and to C, when they avoid object-oriented notation). When you write code that relies on the preprocessor, you lock the code into those languages. Java has no preprocessor at all; C# keeps a few directives such as #if and #define for conditional compilation, but it has no macro substitution.

So when untangling code (sometimes with the objective of moving code to another language), one of the first things I do is get rid of the tricks.
