Friday, April 3, 2020

Information in code

We programmers think of code as providing information, to the programmer as well as the computer. Code is converted into an executable that performs tasks for us, which is information for the computer. Code is also a description of those tasks, suitable for a programmer to read.

But information in code is not uniformly distributed. Some constructs in code provide more information than others.

Let's look at three different constructs that each provide information: data types, variable names, and comments. They all provide information to the reader of the code, and are all useful.

The first construct is the variable type. It conveys information. It is most likely correct, and it is certainly consistent with the operations performed by the code. The variable type exists in multiple places in the code, although it may not be obvious. Anywhere the variable is used, the type is present.

Variable types prevent errors by restricting the contents of a variable and the operations on it, at least in statically-typed languages.
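As a rough illustration of a type restricting both contents and operations, here is a minimal Python sketch. The names `Celsius` and `warmer` are hypothetical, invented for this example. Python enforces the restriction when the code runs; a static checker such as mypy would be expected to report the same mistake before the program runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Celsius:
    """Hypothetical type: a temperature (or temperature change) in Celsius."""
    degrees: float

    def __add__(self, other: "Celsius") -> "Celsius":
        # Only Celsius + Celsius is defined; any other operand is rejected.
        if not isinstance(other, Celsius):
            return NotImplemented
        return Celsius(self.degrees + other.degrees)


def warmer(reading: Celsius, delta: Celsius) -> Celsius:
    # The annotations state the constraint everywhere the values are used.
    return reading + delta


result = warmer(Celsius(20.0), Celsius(1.5))

try:
    warmer(Celsius(20.0), "1.5")  # a string is not a Celsius
    rejected = False
except TypeError:
    rejected = True
```

The type travels with the variable: every use of `reading` and `delta` carries the same constraint, even though the type is written only once in the signature.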

The second construct is the variable name. It also conveys information, but information that is different from the variable type. The variable name provides information about the intent of the variable. A good variable name describes the contents in such a way as to be useful to the programmer.

The name exists in multiple places in the code. It is (syntactically) linked to the variable -- where the variable is used, the name is used.
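To make the difference between type and name concrete, here is a small Python sketch. Both functions are hypothetical and perform the same arithmetic on the same types; only the names differ.

```python
# Uninformative names: the types are clear, the intent is not.
def f(a: float, b: float) -> float:
    return a * b / 100


# Descriptive names: same types, same arithmetic, but the intent is visible
# at the definition and at every place the names are used.
def sales_tax(price: float, tax_rate_percent: float) -> float:
    return price * tax_rate_percent / 100


tax = sales_tax(price=200.0, tax_rate_percent=5.0)
```

The computer treats the two functions identically. Only the reader gains from the second version.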

The third construct is the comment. A comment in code conveys information, or at least it is capable of conveying information. A well-written comment is useful to the programmer: It can inform the reader of reasons for the code. Comments can explain why the code was built in a particular way. No other construct (in code) can provide this information.

A comment exists in only one place in the code. It does not appear in multiple places, like variable types or variable names. Thus, the placement of a comment is important.

Comments may be incorrect. (They can be wrong from the start, or they can be correct and then left unchanged as the associated code is changed.)
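A short Python sketch may help show both sides of comments. The carrier rule, the rate, and the names are all assumptions made up for this example. The first comment has gone stale on purpose; the second explains a reason that neither the type nor the name can carry.

```python
import math

# Stale comment, kept here on purpose: "the rate is 4.00 per kilogram".
# The constant was later changed and the comment was not.
RATE_PER_KG = 5.00


def shipping_cost(weight_kg: float) -> float:
    # Why: the (hypothetical) carrier bills in whole kilograms and rounds up,
    # so a 1.2 kg parcel is charged as 2 kg.
    billable_kg = math.ceil(weight_kg)
    return billable_kg * RATE_PER_KG


cost = shipping_cost(1.2)
```

Nothing checks the stale comment against the constant, which is how a comment can be correct when written and wrong later.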

Notice that all three of these elements provide value to the programmer. Together they provide a comprehensive set of information: reasons (from comments), purpose (from variable names), and constraints (from variable types).

All three of these elements are important when writing a program. I won't say "necessary", as some programs can be written with expressive data types and descriptive variable names, and no comments. But any program beyond the trivial will benefit from comments.

Also notice that these three elements are different. Selecting the type of a variable is mostly a technical decision, with clear right-and-wrong answers. Choosing a name for a variable is often difficult -- although it can be easy when a variable's contents correspond to a real-world concept. Composing a comment to explain the reason for a decision requires an understanding of that decision and the skill to convey it in clear (and concise) language.

Effective programmers will use all of these constructs (variable types, variable names, and comments). They will develop skills for selecting the right data types, for assigning descriptive variable names, and for writing comments that are helpful to themselves and other programmers.
