Fitzpatrick's Fabulous Future: coding style

Wednesday, October 5, 2022

Success with C++

Having recently written on the possible decline of C++, it is perhaps only fair that I share a success story about C++. The C++ programming language is still alive, and still useful. I should know, because a recent project used C++, and successfully!

The project was to maintain and enhance an existing C++ program. The program was written by other programmers before I arrived, over a period of years. Most of the original developers were no longer on the project. (In other words, a legacy system.)

The program itself is small by today's standards, with fewer than 300,000 lines of source code. It also has an unusual design (again, by today's standards): The program calculates economic forecasts from a set of input data. It has no interaction with the user; the calculations are made with nothing more than the input data and program logic.

We (the development team) have successfully maintained and enhanced this program by following some rules, and placing some constraints upon ourselves. The goal was to make the code easy to read, easy to debug, and easy to modify. We made some design decisions for performance, but only after our initial design was shown to be slow. These constraints, I think, were key to our success.

We use a subset of C++. The language is large and offers many capabilities; we pick those that are necessary. We use classes. We rarely use inheritance. Instead, we build classes by composition. Thus, we have no problems with slicing of objects. (Slicing is an effect that can occur in C++ when an object of a derived class is copied or assigned, by value, to an object of its base class: the derived part is dropped. It generally does not occur in OOP languages that handle objects by reference.) A very small number of classes use inheritance, and in those cases we often want slicing.
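For readers who have not met slicing, here is a minimal sketch. The classes Forecast and RegionalForecast are hypothetical, invented for this illustration; they are not from the project.

```cpp
#include <string>

// Hypothetical classes, for illustration only.
class Forecast {
public:
    virtual ~Forecast() = default;
    virtual std::string label() const { return "forecast"; }
};

class RegionalForecast : public Forecast {
public:
    std::string label() const override { return "regional forecast"; }
};

// The parameter is a Forecast by value, so only the Forecast part of the
// argument is copied. The RegionalForecast override is lost (sliced).
std::string label_by_value(Forecast f)
{
    return f.label();
}

// The parameter is a reference, so the object keeps its dynamic type.
std::string label_by_reference(const Forecast& f)
{
    return f.label();
}
```

Passing a RegionalForecast to label_by_value yields "forecast", while label_by_reference yields "regional forecast". With classes built by composition, there is no derived part to lose.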

We use the STL but not Boost. The STL (the Standard Template Library) is enough for our needs, and we use only what we need: strings, vectors, maps, and an occasional algorithm.
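As a rough sketch of how far that small set goes, the following uses only a string, a vector, a map, and one algorithm. The function names and the "rates by region" data are assumptions made up for the example.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: the largest value in a list, or 0.0 for an empty list.
double largest_rate(const std::vector<double>& rates)
{
    if (rates.empty()) {
        return 0.0;
    }
    return *std::max_element(rates.begin(), rates.end());
}

// Hypothetical helper: the largest rate for each region name.
std::map<std::string, double> largest_rate_by_region(
    const std::map<std::string, std::vector<double>>& rates_by_region)
{
    std::map<std::string, double> result;
    for (const auto& entry : rates_by_region) {
        result[entry.first] = largest_rate(entry.second);
    }
    return result;
}
```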

We followed the Java convention for files and class names. That is, each class is stored in its own file. (In C++, we have two files, the header file and the source file.) The name of the file is the name of the class (with a ".h" or ".cpp" extension). The class name uses upper camel-case, with a capital letter at the beginning of each word, for names such as "Year" or "HitAdjustment". Function names (unlike Java's) use snake-case with all lower-case letters and underscores between words. This naming convention simplified a lot of our code. When creating objects, we could create an object of type Year and name it "year". (The older code used no naming conventions, and many classes had lower-case names, which meant that when creating an object of type "ymd" (for example) we had to pick a name like "my_ymd" and keep track mentally of what was a class name and what was a variable name.)
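A small sketch of the convention. The class name Year comes from our code, but the members shown here (value_, value, is_leap_year) and the free function are invented for the illustration.

```cpp
// Year.h (hypothetical contents): one class per file, file named after the class.
class Year {
public:
    explicit Year(int value) : value_(value) {}

    int value() const { return value_; }

    // Function names are snake-case.
    bool is_leap_year() const
    {
        return (value_ % 4 == 0 && value_ % 100 != 0) || (value_ % 400 == 0);
    }

private:
    int value_;
};

// Because the class name is capitalized, the obvious variable name "year"
// is free to use, with no need for a name like "my_year".
bool starts_in_leap_year(const Year& year)
{
    return year.is_leap_year();
}
```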

We do not import namespaces. That is, we do not write "using namespace std" or a using-directive for any other namespace. This forces us to specify the namespace for every class and function we take from it. While tedious, it provides the benefit that one can easily see where a class or function comes from. There is no need to search through the code, or guess about a function.
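A sketch of what this looks like in practice. The function region_names and its data are hypothetical; the point is only that every standard name carries its "std::" prefix.

```cpp
#include <map>
#include <string>
#include <vector>

// No using-directive anywhere: the origin of each type is visible in place.
std::vector<std::string> region_names(const std::map<std::string, double>& totals)
{
    std::vector<std::string> names;
    for (std::map<std::string, double>::const_iterator it = totals.begin();
         it != totals.end(); ++it) {
        names.push_back(it->first);
    }
    return names;
}
```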

We use operator overloading only for a few classes, and only when the operators are obvious. Most of our code uses function calls. This also reduces guesswork by developers.
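To show where we might draw the line, here is a sketch with a hypothetical Month class (not from the project). Comparison has one obvious meaning, so it gets operators; "the month after this one" does not, so it gets a named function.

```cpp
// Hypothetical value class, for illustration only.
class Month {
public:
    explicit Month(int number) : number_(number) {}

    int number() const { return number_; }

    // A named function rather than operator+: "month + 1" would leave open
    // whether December wraps around to January. The name states that it does.
    Month next_wrapping() const { return Month(number_ % 12 + 1); }

private:
    int number_;   // 1 through 12
};

// Equality and ordering are unambiguous, so operators are reasonable here.
inline bool operator==(const Month& a, const Month& b)
{
    return a.number() == b.number();
}

inline bool operator<(const Month& a, const Month& b)
{
    return a.number() < b.number();
}
```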

We have no friend classes and no friend functions. (We could use them, but we don't need them.)

Our attitude towards memory management is casual. Current operating systems provide a 2 gigabyte space for our programs, and that is enough for our needs. (It has been so far.) We avoid pointers and dynamic allocation of memory. STL allocates memory for its objects, and we assume that it will manage that memory properly.
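A sketch of the style this leads to, with a made-up function (cumulative_totals is not from the project). The vector owns its storage, so there is no "new", no "delete", and no raw pointer to track.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: running totals of a list of values.
std::vector<double> cumulative_totals(const std::vector<double>& values)
{
    std::vector<double> totals;
    totals.reserve(values.size());

    double running_total = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        running_total += values[i];
        totals.push_back(running_total);
    }

    // Returned by value; the storage is released when the caller's
    // vector goes out of scope.
    return totals;
}
```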

We do not use lambdas or closures. (We could use them, but we don't need them.)

We use spacing in our code to separate sections of code. We also use spacing to denote statements that are split across multiple lines. (A blank line in front and a blank line after.)

We use simple expressions. This increases the number of source lines, which eases debugging (we can see intermediate results). We let the C++ compiler optimize expressions for "release" builds.
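A sketch of the idea. The function and its formula are invented for the example and are not one of our forecasts; the single-expression form would be "return base + base * growth_rate + adjustment;".

```cpp
// Hypothetical forecast step, written one simple expression per line.
// Each intermediate value has a name, so it can be inspected in a debugger.
double projected_value(double base, double growth_rate, double adjustment)
{
    const double growth = base * growth_rate;
    const double grown = base + growth;
    const double adjusted = grown + adjustment;
    return adjusted;
}
```

An optimizing compiler is free to fold the named intermediates away in a release build, so the extra lines need not cost anything at run time.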

----

By using a subset of C++, and carefully picking which features make up that subset, we have successfully developed, deployed, and maintained a modest-sized C++ application.

These constraints are not part of the C++ language; we enforce them ourselves, for our code. They provide us with a consistent style of code, and one that we find readable. New team members find that they can read and understand the code, which was one of our goals. We can quickly make changes, test them, and deploy them -- another goal.

These choices work for us, but we don't claim that they will work for other teams. You may have an application that has a different design, a different user interface, or a different set of computations, and it may require a different set of C++ code.

I don't say that you should use these constraints on your project. But I do say this: you may want to consider some constraints for your code style. We found that these constraints let us move forward, slowly at first and then rapidly.

Tuesday, July 16, 2019

Across and down

All programming languages have rules. These rules define what can be done and what cannot be done in a valid program. Some languages even have rules for certain things that must be done. (COBOL, for example, requires the four 'DIVISION' sections in each program.)

Beyond rules, there are styles. Styles are different from rules. Rules are firm. Styles are soft. Styles are guidelines: good to follow, but break them when necessary.

Different languages have different styles. Some style guidelines are common: Many languages have guidelines for indentation and the naming of classes, functions, and variables. Some style guidelines are unique to languages.

The Python programming language has a style guide (PEP 8) which limits line length. (To 79 characters, if you are interested.)

Ruby has a style for line length, too. (That is, if you use Rubocop with its default configuration.)

They are not the first languages to care about line length. COBOL and FORTRAN limited line length to 72 columns. These were rules, not guidelines. The origin was in punch cards: the language standards specified the column layout and set column 72 as the limit. Compilers ignored anything past column 72, and woe to the programmer who let a line exceed that length.
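The effect of that rule can be sketched in a few lines. This is an illustration of the behavior, not code from any actual compiler; the function name is made up.

```cpp
#include <string>

// Keep only columns 1 through 72 of a card image. Anything beyond that
// is silently dropped, which is how a too-long statement lost its tail.
std::string significant_columns(const std::string& card)
{
    const std::string::size_type limit = 72;
    return card.substr(0, limit);
}
```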

The limit in Python is a guideline. One is free to write Python with lines that exceed the recommended length, and the Python interpreter will run the code. Similarly, Ruby's style checker, Rubocop, can be configured to warn about any line length. Ruby itself will run the long lines of code. But limits on line length make for code that is more readable.
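A guideline of this kind is easy to check mechanically. The following is a minimal sketch of such a check, not how Rubocop or any Python tool is implemented; the function name is an assumption, and it counts bytes rather than characters, which matters for non-ASCII text.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the 1-based numbers of the lines longer than max_length.
// The caller chooses the limit, and can ignore the result entirely.
std::vector<std::size_t> long_lines(const std::string& source, std::size_t max_length)
{
    std::vector<std::size_t> offenders;
    std::istringstream input(source);
    std::string line;
    std::size_t line_number = 0;

    while (std::getline(input, line)) {
        ++line_number;
        if (line.size() > max_length) {
            offenders.push_back(line_number);
        }
    }
    return offenders;
}
```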

Programs exist in two dimensions. Not just across, but also down. Code consists of lines of text.

While some languages limit the width of the code (the number of columns), no language limits the "height" of the code -- the number of lines in a program, or a module, or a class.

Some implementations of languages impose a limit on the number of lines. Some early BASIC interpreters, for example, limited line numbers to four digits, and since each line had to have a unique line number, that imposed an upper bound of 10,000 lines. Some compilers can handle as many lines as will fit in memory -- and no more. But these are limits imposed by the implementation. I am free, for example, to create an interpreter for BASIC that can handle more than 10,000 lines. (Or fewer, stopping at 1,000.) The language does not dictate the limit.

I don't want the harshly-enforced and unconfigurable limits of the days of early computing. But I think we could do with some guidelines for code length. Rubocop, to its credit, does warn about functions that exceed a configurable limit. There are tools for other languages that warn about the complexity of functions and classes. The idea of "the code is too long" has been bubbling in the development community for decades.

Perhaps it is time we gave it some serious thought.

One creative idea (I do not remember who posed it) was to use the IDE (or the editor) to limit program size. The idea was this: Don't allow scrolling in the window that holds the code. Instead of scrolling, as a programmer increased the length of a function, the editor reduced the font size. (The idea was to keep the entire function visible.) As the code grows in size, the text shrinks. Eventually, one reaches a point when the code becomes unreadable.

The idea of shrinking code on the screen is amusing, but the idea of limiting code size may have merit. Could we set style limits for the length of functions and classes? (Such limits and warnings already exist in Rubocop, so the answer is clearly 'yes'.)
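The check itself is simple once something has found where each function starts and ends. Here is a minimal sketch under that assumption: the FunctionSpan record is hypothetical and would be filled in by a parser, which is not shown. This is not how Rubocop is implemented.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: where a function sits in its source file.
// Assumes last_line is not less than first_line.
struct FunctionSpan {
    std::string name;
    std::size_t first_line;
    std::size_t last_line;
};

std::size_t length_in_lines(const FunctionSpan& span)
{
    return span.last_line - span.first_line + 1;
}

// Returns the names of the functions longer than max_lines.
// The limit is a parameter: a guideline a team can tune, not a fixed rule.
std::vector<std::string> functions_over_limit(
    const std::vector<FunctionSpan>& spans, std::size_t max_lines)
{
    std::vector<std::string> names;
    for (std::size_t i = 0; i < spans.size(); ++i) {
        if (length_in_lines(spans[i]) > max_lines) {
            names.push_back(spans[i].name);
        }
    }
    return names;
}
```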

The better question is: How do limits on code length (number of lines) help stakeholders? How do they help developers, and how do they help users?

The obvious response is that shorter functions (and shorter classes) are easier to read and comprehend, perform fewer tasks, and are easier to verify (and to correct). At least, that is what I want the answer to be -- I don't know that we have hard observations that confirm that point of view. I can say that my experience confirms this opinion; I have worked on several systems, in different languages, splitting large functions and classes into smaller ones, with the result being that the re-designed code is easier to maintain. Smaller functions are easier to read.

I believe that code should consist of small classes and small functions. Guidelines and tools that help us keep functions short and classes small will improve our code. Remember that code exists in two dimensions (across and down) and that it should be moderate in both.
