Tuesday, December 29, 2020

For readability, BASIC was the best

One language stands alone in terms of readability.

That language is BASIC.

BASIC -- the old, pre-Visual Basic of the 1980s -- has a unique characteristic: a single line can be read and understood.

One may think that a line from any programming language can be read and understood. After all, we read and understand programs all the time, don't we? That's true, but we read entire programs, or large sections of programs. Those large fragments of programs contain information that defines the classes, functions, and variables in the programs, and we use that information to understand the code. But if we strip away that extra information, if we limit ourselves to a single line, then we cannot read and completely understand the code.

Let's look at an example line of code:

a = b * 5

Can you tell what this code does? For certain? I cannot.

A naive assessment is that the code retrieves the value of variable 'b', multiplies it by 5, and stores the result in the variable 'a'. It is easy to assume that the variables 'a' and 'b' are numeric. Yet we don't know that -- we only assume it.

If I tell you that the code is Python code, and that 'b' refers to a string object, then our understanding of this code changes. The code still performs a 'multiply' operation, but 'multiply' for a string object is very different from 'multiply' for a numeric object.

Instead, if I tell you that the code is C++, then we must identify the type for 'b' (which is not provided in the single line of code) and we must know if the class for 'b' defines the '*' operator. That operation could do anything, from casting b's contents to a number and multiplying 5 to sending some text to 'cout'.

We like to think we understand the code, but instead we are constantly making assumptions about the code and building an interpretation that is consistent with those assumptions.

But the language BASIC is different.

In BASIC, the line

a = b * 5

or, if you prefer

100 LET A = B * 5

is completely defined. We know that the variable B contains a numeric value. (The syntax and grammar rules for BASIC require that a variable with no trailing sigil is a numeric variable.) We also know that the value of B is defined. (Variables are always defined. If not initialized in our code, that have the value 0.) We know the behavior of the '*' operator -- it cannot be overridden or changed. We know that the variable 'A' is numeric, and that it can receive the results of the multiply operation.

We know these things. We do not need other parts of the program to identify the type for a variable, or a possible redefinition of an operator.

This property of BASIC means that BASIC is readable in a way that other programming languages are not. Other programming languages require knowledge of declarations. All of the C-based languages (C, Objective-C, C++, C#, and even Java) require this. Perl, Python, and Ruby don't have variables; they have names that can refer to any type of object. The only other programming language that comes close is FORTRAN II. It might have had the same readability; it had rules for names of variables and functions.

BASIC's readability is possible because it requires the data type of a variable to be encoded in the name of the variable. This is completely at odds with every modern language, which allow variables to be named with no special markings for type. BASIC used static typing; not only static typing, but overt typing -- the type was expressed in the name.

Static, overt typing was possible in BASIC because BASIC used a limited number of types (numeric, integer, single-precision floating point, double-precision floating point, and string) each of which could be represented by a single punctuation character. Each variable name had a sigil for the type. (Or no sigil, in which case the type was numeric.)

Those sigils were so useful that programmers who switched to Visual Basic kept the idea, through the use of programming style conventions that asked for prefixes for each variable. That effort become unwieldy, as there were many types (Visual Basic used many libraries for Windows functions and classes) and there was no all-encompassing standard and no was to enforce a standard.

Overt typing is possible with a language that has a limited number of types. It won't work (or it hasn't worked) for object-oriented languages. Those languages are designed for large systems with large code bases. They have built-in types and allow for user-defined types, with no support to indicate the type in the name of the variable. And as we saw with Visual Basic, expressing the different types is complicated.

But that doesn't mean the idea is useless. Overt typing worked for BASIC, a language that was designed for small programs. (BASIC was meant to be a language for teaching the skills of programming. The name was an acronym: Beginner's All-Purpose Symbolic Instruction Code.) Overt typing might be helpful for a small language, one designed for small programs.

It strikes me that cloud computing is the place for small languages. Cloud computing uses multiple processors to perform calculations, and split code bases. A well-designed cloud application consists of lots of small programs. Those small programs don't have to be built with object-oriented programming languages. I expect to see new programming languages for cloud-based computing, programming languages that are designed for small programs.

I'm not recommending that we switch from our current set of programming languages to BASIC. But I do think that the readability of BASIC deserves some attention.

Because programs, large or small, are easier to understand when they are readable.

No comments: