Showing posts with label class size. Show all posts

Sunday, June 8, 2014

Untangle code with small classes

If you want to simplify code, build small classes.

I have written (for different systems) classes for things such as ZIP Codes, account numbers, weights, years, year-month combinations, and file names.

These are small, simple classes, usually equipped with a constructor, comparison operators, and a "to string" operator. Sometimes they have other operators. For example, the YearMonth class has next_month() and previous_month() functions.

Why create a class for something so simple? After all, a year can easily be represented by an int (or an unsigned int, if you prefer). A file name can be held in a string. Why have a separate class for them?

Small classes provide a number of benefits.

Check for validity. The constructor can check for the validity of the contents. With the proper checks in place, you know that every instance of the class is a valid instance. With primitive types (such as a string to hold a ZIP Code), you are never sure.
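As a minimal sketch of the validity check described above, assuming a five-digit US ZIP Code and using hypothetical names (ZipCode, is_valid, zip_code_example) that are not taken from any actual project:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical ZipCode class. Assumption: only five-digit codes are valid.
class ZipCode {
public:
    explicit ZipCode(std::string digits) : digits_(std::move(digits)) {
        if (!is_valid(digits_)) {
            throw std::invalid_argument("not a five-digit ZIP Code: " + digits_);
        }
    }

    // The validity rule lives in exactly one place.
    static bool is_valid(const std::string& text) {
        return text.size() == 5 &&
               std::all_of(text.begin(), text.end(),
                           [](unsigned char c) { return std::isdigit(c) != 0; });
    }

    const std::string& to_string() const { return digits_; }

    friend bool operator==(const ZipCode& a, const ZipCode& b) { return a.digits_ == b.digits_; }
    friend bool operator<(const ZipCode& a, const ZipCode& b) { return a.digits_ < b.digits_; }

private:
    std::string digits_;
};

// Usage sketch: a bad value never becomes a ZipCode object.
inline void zip_code_example() {
    const ZipCode ok("20500");
    assert(ok.to_string() == "20500");

    bool rejected = false;
    try {
        const ZipCode bad("2050A");
    } catch (const std::invalid_argument&) {
        rejected = true;
    }
    assert(rejected);
}
```

Any function that receives a ZipCode can then skip its own validity checks, because an invalid one cannot be constructed.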

Consolidate redundant code. A class can hold the logic that is duplicated in the main code. The Year class can tell if a year is a leap year, instead of repeating if (year % 4 == 0) in the code. It is easier (and more readable) to have code say if (year.is_leap_year()).

Consistent operations. Our Year class performs the proper calculation for leap years (not the simple one listed above). Using the Year class for all instances of a year means that the calculations for leap year are consistent (and correct).

Clear names for operations. Our Year class has operations named next_year() and previous_year() which give clear meaning to the operations year + 1 and year - 1.
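The last three points can be illustrated with one small class. This is only a sketch: the 1 to 9999 range, the value() accessor, and the year_example function are assumptions made for illustration, and the leap-year test is the standard Gregorian rule.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical Year class. Assumption: valid years are 1..9999.
class Year {
public:
    explicit Year(int value) : value_(value) {
        if (value < 1 || value > 9999) {
            throw std::out_of_range("year out of range");
        }
    }

    // Full Gregorian rule: divisible by 4, except century years,
    // unless the century year is also divisible by 400.
    bool is_leap_year() const {
        return (value_ % 4 == 0 && value_ % 100 != 0) || value_ % 400 == 0;
    }

    // Named operations instead of bare "year + 1" and "year - 1".
    Year next_year() const { return Year(value_ + 1); }
    Year previous_year() const { return Year(value_ - 1); }

    int value() const { return value_; }

    friend bool operator==(Year a, Year b) { return a.value_ == b.value_; }
    friend bool operator<(Year a, Year b) { return a.value_ < b.value_; }

private:
    int value_;
};

// Usage sketch.
inline void year_example() {
    assert(Year(2012).is_leap_year());
    assert(!Year(1900).is_leap_year());  // year % 4 == 0 alone gets this one wrong
    assert(Year(2000).is_leap_year());
    assert(Year(2013).next_year() == Year(2014));
    assert(Year(2013).previous_year() == Year(2012));
}
```

Because the rule is written once, every caller gets the same answer for 1900 and 2000.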

Limit operations. Custom classes provide the operations you specify and no others. The standard library provides classes with lots of operations, some of which may be inappropriate for your needs.

Add operations. Our YearMonth class has operations next_month() and previous_month(), operations which are not supplied in the standard library's Date class. (Yes, one can add a TimeSpan object, provided one gets the right number of days in the TimeSpan, but the code is more complex.) Also, our YearMonth class can calculate the quarter of the year, something we need for our computations.
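A YearMonth class along these lines might be sketched as follows. The member names other than next_month() and previous_month() are assumptions, and quarter() assumes calendar quarters that start in January.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical YearMonth class; months are numbered 1..12.
class YearMonth {
public:
    YearMonth(int year, int month) : year_(year), month_(month) {
        if (month < 1 || month > 12) {
            throw std::out_of_range("month must be 1..12");
        }
    }

    // Rolling over the year boundary is handled here, once.
    YearMonth next_month() const {
        return month_ == 12 ? YearMonth(year_ + 1, 1) : YearMonth(year_, month_ + 1);
    }
    YearMonth previous_month() const {
        return month_ == 1 ? YearMonth(year_ - 1, 12) : YearMonth(year_, month_ - 1);
    }

    // Calendar quarter 1..4 (assumption: quarters start in January).
    int quarter() const { return (month_ - 1) / 3 + 1; }

    int year() const { return year_; }
    int month() const { return month_; }

    friend bool operator==(const YearMonth& a, const YearMonth& b) {
        return a.year_ == b.year_ && a.month_ == b.month_;
    }

private:
    int year_;
    int month_;
};

// Usage sketch.
inline void year_month_example() {
    assert(YearMonth(2014, 12).next_month() == YearMonth(2015, 1));
    assert(YearMonth(2014, 1).previous_month() == YearMonth(2013, 12));
    assert(YearMonth(2014, 6).quarter() == 2);
}
```

No day counts are involved, so there is no need to know how long each month is.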

Prevent accidental use. An object of a specific class cannot be used accidentally. If passed to a function or class, the target must be ready to accept the class. Our Year class cannot be carelessly passed to another function. If we stored our years in ints, those ints could be passed to any function that expected an int.
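In C++, one way to get this protection is to mark the constructors explicit and to provide no conversion operators. The sketch below uses hypothetical classes (CalendarYear, WeightInGrams) and a hypothetical function (years_between); the static_assert lines document which conversions the compiler refuses.

```cpp
#include <cassert>
#include <type_traits>

// Two hypothetical wrappers around int. Neither converts implicitly.
class CalendarYear {
public:
    explicit CalendarYear(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

class WeightInGrams {
public:
    explicit WeightInGrams(int value) : value_(value) {}
    int value() const { return value_; }
private:
    int value_;
};

// Accepts CalendarYear only; a bare int or a WeightInGrams will not compile.
inline int years_between(CalendarYear from, CalendarYear to) {
    return to.value() - from.value();
}

static_assert(!std::is_convertible<int, CalendarYear>::value,
              "an int is not silently treated as a year");
static_assert(!std::is_convertible<CalendarYear, int>::value,
              "a year is not silently treated as an int");
static_assert(!std::is_convertible<WeightInGrams, CalendarYear>::value,
              "a weight is not silently treated as a year");

// Usage sketch: the caller must say what each number means.
inline void accidental_use_example() {
    assert(years_between(CalendarYear(2012), CalendarYear(2014)) == 2);
}
```

With plain ints, a call such as years_between(2012, weight) would compile without complaint; here it is rejected at compile time.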

These benefits simplify the main code. Custom classes for small data elements let you ensure that objects are complete and internally consistent. They let you consolidate logic into a single place. They let you tailor the operations to your needs. They prevent accidental use or assignment.

Simplifying the main code means that the main code becomes, well, simpler. By moving low-level operations to low-level classes, your "mainline" code focusses on higher-level concepts. You spend less of your time worrying about low-level things and more of your time thinking about high-level (that is, application-level) ideas.

If you want to simplify code, build small classes.

Friday, August 24, 2012

How I fix old code

Over the years (and multiple projects) I have developed techniques for improving object-oriented code. My techniques work for me (and the code that has been presented to me). Here is what I do:

Start at the bottom. Not the base classes, but the bottom-most classes. The classes that are used by other parts of the code, and have no dependencies. These classes can stand alone.

Work your way up. After fixing the bottom classes, move up one level. Fix those classes. Repeat. Working up from the bottom is the only way I have found to be effective. One can have an idea of the final result, a vision of the finished product, but only by fixing the problems at the bottom can one achieve any meaningful results.

Identify class dependencies. To start at the bottom, one must know the class dependencies. Not the class hierarchy, but the dependencies between classes. (Which classes use which other classes at run-time.) I use some custom Perl scripts to parse code and create a list of dependencies. The scripts are not perfect but they give me a good-enough picture. The classes with no dependencies are the bottom classes. Often they are utility classes that perform low-level operations. They are the place to start.
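The Perl scripts themselves are not shown here. As a rough sketch of what can be done with their output, the function below takes a list of dependencies and groups the classes into levels, bottom classes first. Everything in it (DependencyMap, dependency_levels, the sample class names) is hypothetical, and it assumes that every class appears as a key in the map, even when it depends on nothing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a class name to the names of the classes it uses.
// Assumption: every class appears as a key, even with no dependencies.
using DependencyMap = std::map<std::string, std::set<std::string>>;

// Level 0 holds the classes with no dependencies (the "bottom" classes).
// Level n holds the classes whose dependencies all sit in lower levels.
inline std::vector<std::set<std::string>> dependency_levels(const DependencyMap& uses) {
    std::vector<std::set<std::string>> levels;
    std::set<std::string> placed;
    while (placed.size() < uses.size()) {
        std::set<std::string> level;
        for (const auto& entry : uses) {
            if (placed.count(entry.first) != 0) continue;
            bool ready = true;
            for (const auto& dependency : entry.second) {
                if (placed.count(dependency) == 0) { ready = false; break; }
            }
            if (ready) level.insert(entry.first);
        }
        if (level.empty()) break;  // whatever remains is part of a cycle
        placed.insert(level.begin(), level.end());
        levels.push_back(level);
    }
    return levels;
}

// Usage sketch with made-up class names.
inline void dependency_levels_example() {
    const DependencyMap uses = {
        {"Invoice", {"Account", "YearMonth"}},
        {"Account", {"ZipCode"}},
        {"YearMonth", {}},
        {"ZipCode", {}},
    };
    const auto levels = dependency_levels(uses);
    assert(levels.size() == 3);
    assert(levels[0].count("YearMonth") == 1 && levels[0].count("ZipCode") == 1);
    assert(levels[1].count("Account") == 1);
    assert(levels[2].count("Invoice") == 1);
}
```

Classes caught in a dependency cycle are left out of the result, which is itself a useful signal that they need attention.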

Create unit tests. Tests are your friends! Unit tests for the bottom (stand-alone) classes are generally easy to create and maintain. Tests for higher-level classes are a little trickier, but possible with immutable lower-level classes.

Make objects immutable. The Java String class (and the C# String class) showed us a new way of programming. I ignored it for a long time (too long, in my opinion). Immutable objects are unchangeable, and do not have the "classic" object-oriented functions for setting properties. Instead, they are fixed to their original value. When you want to change a property, the immutable object techniques dictate that instead of modifying an object you create a new object.

I start by making the lowest-level classes immutable, and then work my way up the "chain" of class dependencies.
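A minimal sketch of an immutable low-level class, assuming a hypothetical Weight class measured in whole grams: it has no "set" functions, and the operation that "changes" a weight returns a new object instead.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical immutable Weight class. Assumption: whole grams, never negative.
class Weight {
public:
    explicit Weight(long grams) : grams_(grams) {
        if (grams < 0) {
            throw std::invalid_argument("weight cannot be negative");
        }
    }

    // Returns a NEW Weight; neither operand is modified.
    Weight plus(Weight other) const { return Weight(grams_ + other.grams_); }

    long grams() const { return grams_; }

    friend bool operator==(Weight a, Weight b) { return a.grams_ == b.grams_; }

private:
    long grams_;  // private, with no "set" accessor
};

// Usage sketch: the original objects keep their values.
inline void weight_example() {
    const Weight box(250);
    const Weight label(5);
    const Weight shipped = box.plus(label);
    assert(shipped == Weight(255));
    assert(box == Weight(250));
    assert(label == Weight(5));
}
```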

Make member variables private. Create accessor functions when necessary. I prefer to create "get" accessors only, but sometimes it is necessary to create "set" accessors. I find that it is easier to track and identify access with functions than with member variables, but that may be an effect of Visual Studio. Once the accessors are in place, I forget about the "get" accessors and look to remove the "set" accessors.

Create new constructors. Constructors are your friends. They take a set of data and build an object. Create the ones that make sense for your application.

Fix existing constructors to be complete. Sometimes people use constructors to partially construct objects, relying on the code to call "set" accessors later. Immutable object programming has none of that nonsense: when you construct an object you must provide everything. If you cannot provide everything, then you are not allowed to construct the object! No soup (or object) for you!
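One possible shape for a "complete" constructor, using a hypothetical Customer class with two required fields. There is no default constructor and there are no "set" accessors, so a half-built object cannot exist.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical Customer class. Assumption: both fields are required.
class Customer {
public:
    // Everything the object needs is supplied here, or construction fails.
    Customer(std::string name, std::string account_number)
        : name_(std::move(name)), account_number_(std::move(account_number)) {
        if (name_.empty()) {
            throw std::invalid_argument("name is required");
        }
        if (account_number_.empty()) {
            throw std::invalid_argument("account number is required");
        }
    }

    const std::string& name() const { return name_; }
    const std::string& account_number() const { return account_number_; }

private:
    std::string name_;
    std::string account_number_;
};

static_assert(!std::is_default_constructible<Customer>::value,
              "a Customer cannot be built empty and filled in later");

// Usage sketch.
inline void complete_constructor_example() {
    const Customer customer("Ada", "12345");
    assert(customer.name() == "Ada");
    assert(customer.account_number() == "12345");

    bool rejected = false;
    try {
        const Customer incomplete("Ada", "");
    } catch (const std::invalid_argument&) {
        rejected = true;
    }
    assert(rejected);
}
```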

When possible, make member functions static. Static functions have no access to an object's member variables, so one must pass in all "ingredient" variables. This makes it clear which variables must be defined to call the function. Not all member functions can be static; make the functions called by constructors static when possible. (Really, put the effort into this task.) Calls to static functions can be re-sequenced at will, since they cannot have side effects on the object.

Static functions can also be moved from one class to another, at will. Or at least more easily than member functions. It's a good attribute when re-arranging code.
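To illustrate, here is a sketch of a constructor that leans on static functions. The Invoice class, its members, and the integer-cents arithmetic are all assumptions chosen to keep the example short.

```cpp
#include <cassert>

// Hypothetical Invoice class. Assumption: amounts are whole cents and
// the tax rate is a whole percentage.
class Invoice {
public:
    Invoice(long subtotal_cents, int tax_percent)
        : subtotal_cents_(subtotal_cents),
          tax_cents_(compute_tax(subtotal_cents, tax_percent)),
          total_cents_(compute_total(subtotal_cents,
                                     compute_tax(subtotal_cents, tax_percent))) {}

    // Static: every "ingredient" is a parameter, so the inputs are obvious
    // and the function cannot touch any object's members.
    static long compute_tax(long subtotal_cents, int tax_percent) {
        return subtotal_cents * tax_percent / 100;  // integer division truncates
    }

    static long compute_total(long subtotal_cents, long tax_cents) {
        return subtotal_cents + tax_cents;
    }

    long subtotal_cents() const { return subtotal_cents_; }
    long tax_cents() const { return tax_cents_; }
    long total_cents() const { return total_cents_; }

private:
    long subtotal_cents_;
    long tax_cents_;
    long total_cents_;
};

// Usage sketch: the static functions can be called (and tested) with no object.
inline void static_function_example() {
    assert(Invoice::compute_tax(1000, 6) == 60);
    assert(Invoice::compute_total(1000, 60) == 1060);

    const Invoice invoice(1000, 6);
    assert(invoice.tax_cents() == 60);
    assert(invoice.total_cents() == 1060);
}
```

Since compute_tax and compute_total depend only on their arguments, either one could later move to another class without changing its body.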

Reduce class size. Someone (I don't remember who, or where) claimed that the optimum class size was 70 lines of code. I tend to agree with this size. Bottom classes can easily be expressed in 70 lines. (If not, they are probably composites of multiple elementary classes.) Higher-level classes can often be represented in 70 lines or fewer, sometimes more. (But never more than 150 lines.)

Reducing class size usually means increasing the number of classes. Your code size may shrink somewhat (my experience shows a reduction of 40 to 60 percent) but it does not reduce to zero. Smaller classes often mean more classes. I find that a system with more, smaller classes is easier to understand than one with fewer, large classes.

Name your classes well. Naming is one of the great challenges of programming. Pick names carefully, and change names when it makes sense. (If your version control system resists changes to class names, get a new version control system. It is the servant, not you!)

Talk with other developers. Discuss changes with other developers. Good developers can provide useful feedback and ideas. (Poor developers will waste your time, though.)

Discuss code with non-developers. Our goal is to create code that can be read by non-developers who are experts in the subject matter. We want them to read our code, absorb it, and provide feedback. We want them to say "yes, that seems right" (or even better, "oh, there is a problem here with this calculation"). To achieve that level of understanding, we need to strip away all of the programming overhead: temporary variables, memory allocation, and sequence/iteration gunk. With immutable object programming, meaningful names, and modern constructs (in C++, that means Boost) we can create high-level routines that are readable by non-programmers.

(Note that we are not asking the non-programmers to write code, merely to read it. That is enough.)

These techniques work for me (and the folks on my projects). Your mileage may vary.

Monday, May 14, 2012

Just how big is too big?

I recently read (somewhere on the internet) that the optimal size of a class is 70 lines of code (LOC).

My initial thought on such a size for classes was that it was extremely small, too small to be practical. Indeed, with some languages and frameworks, it is not possible to create a class with fewer than 70 lines of code.

Yet after working with "Immutable Object Programming" techniques, I have come to believe that classes of size 70 LOC are possible -- and practical. A recent project saw a number of classes (not all of them, but many) on the order of 70 LOC. Some were slightly larger (perhaps 100 LOC), some larger still (250 LOC), and a few very large (1000 LOC). A few classes were smaller.

The idea of smaller classes is not new. Edward Yourdon, in his 1975 work "Techniques of Program Structure and Design", states that some organizations limited module size to 50 LOC. At the time, object-oriented programming was largely unknown to the profession (although the notion of classes had existed since Simula 67), so a module is a reasonable substitute for a class.

What I find interesting is the similarity of optimal sizes. For classes, 70 LOC. For modules, 50 LOC. I think that this may tell us something about our abilities as programmers.

I will also observe that 70 lines is about the size of three screens of text -- if we consider a "screen" to be the olde standard size of 24 lines with 80 characters. That may tell us about our abilities, too.