Tuesday, July 19, 2016

How programming languages change

Programming languages change. That's not news. Yet programming languages cannot change arbitrarily; the changes are constrained. We should be aware of this, and pick our technology with this in mind.

If we think of a programming language as a set of features, then programming languages can change in three ways:

Add a feature
Modify a feature
Remove a feature

The easiest change (that is, the type with the least resistance from users) is adding a feature. That's no surprise; it allows all of the old programs to continue working.

Modifying an existing feature or removing a feature is a difficult business. It means that some programs will no longer work. (If you're lucky, they won't compile, or the interpreter will reject them. If you're not lucky, the compiler or interpreter will accept them but process them differently.)

So as a programming language changes, the old features remain. Look inside a modern Fortran compiler and you will find FORMAT statements and arithmetic IF constructs, elements of Fortran's early days.

When a programming language changes enough, we change its name. We (the tech industry) modified the C language to mandate prototypes and in doing so we called the revised language "ANSI C". When Stroustup enhanced C to handle object-oriented concepts, he called it "C with Classes". (We've since named it "C++".)

Sometimes we change not the name but the version number. Visual Basic 4 was quite different from Visual Basic 3, and Visual Basic 5 was quite different from Visual Basic 4 (two of the few examples of non-compatible upgrades). Yet the later versions retained the flavor of Visual Basic, so keeping the name made sense.

Perl 6 is different from Perl 5, yet it still runs old code with a compatibility layer.

Fortran can add features but must remain "Fortranish", otherwise we call it "BASIC" or "FOCAL" or something else. Algol must remain Algol or we call it "C". An enhanced Pascal is called "Object Pascal" or "Delphi".

Language names bound a set of features for the language. Change the feature set beyond the boundary, and you also change the name of the language. Which means that a language can change only so much, in only certain dimensions, while remaining the same language.

When we start a project and select a programming language, we're selecting a set of features for development. We're locking ourselves into a future, one that may expand over time -- or may not -- but will remain centered over its current point. COBOL will always be COBOL, C++ will always be C++, and Ruby will always be Ruby. A COBOL program will always be a COBOL program, a C++ program will always be a C++ program, and a Ruby program will always be a Ruby program.

A lot of this is psychology. We certainly could make radical changes to a programming language (any language) and keep the name. But while we *could* do this, we don't. We make small, gradual changes. The changes to programming languages (I hesitate to use the words "improvements" or "progress") are glacial in nature.

I think that tells us something about ourselves, not the technology.

No comments: