Thursday, February 26, 2015

The names of programming languages

A recent project involved a new programming language (a variant of the classic Dartmouth BASIC), and the new language needed a name. Of course, a new name should be different from existing names, so I researched the names of programming languages.

My first observation was that we, as an industry, have created a lot of programming languages! I usually think of the set of languages as BASIC, FORTRAN, COBOL, Pascal, C, C++, Java, C#, Perl, Python, and Ruby -- the languages that I use currently or have used in the past. If I think about it, I add some other common languages: RPG, Eiffel, F#, Modula, Prolog, LISP, Forth, AWK, ML, Haskell, and Erlang. (These are programming languages that I have either read about or discussed with fellow programmers.)

As I surveyed existing programming languages, I found many more languages. I found extinct languages, and extant languages. And I noticed various things about their names.

Programming languages, except for a few early languages, have names that are easily pronounceable. Aside from the early "A-0" and "B-0", most languages have recognizable names. We switched quickly from designations of letters and numbers to names like FORTRAN and COBOL.

I also noticed that some names last longer than others. Not just the languages, but the names. The best example may be "BASIC". Created in the 1960s, the BASIC language has undergone a number of changes (some of them radical) and has had a number of implementations. Yet despite its changes, the name has remained. The name has been extended with letters ("CBASIC", "ZBASIC", "GW-BASIC"), numbers ("BASIC-80", "BASIC09"), symbols ("BASIC++"), prefix words ("Visual Basic", "True Basic", "Power Basic"), and sometimes suffixes ("BASIC-PLUS"). Each of these names was used for a variant of the original BASIC language, with separate enhancements.

Other long-lasting names include "LISP", "FORTRAN", and "COBOL".

Long-lasting names tend to have two syllables. Longer names do not stay around. The early languages "BACAIC", "COLINGO", "DYNAMO", "FLOW-MATIC", "FORTRANSIT", "JOVIAL", "MATH-MATIC", "MILITRAN", "NELIAC", and "UNICODE" (yes it was a programming language, different from today's character set) are no longer with us.

Short names of single letters have little popularity. C is the one exception; other single-letter languages (B, D, J) see limited acceptance. The up-and-coming R language for numeric analysis (derived from S, another single-letter language) may have limited acceptance, based on the name. It may be better to change the name to "R-squared" with the designation "R2".

Our current set of popular languages has two-syllable names: "VB" (pronounced "vee bee"), "C#" ("see sharp"), Java, Python, and Ruby. Even the database language SQL is pronounced "see kwell" to give it two syllables. Popular languages with only one syllable are Perl (which seems to be on the decline), C, and Swift.

PHP and C++ have names with three syllables. Objective-C clocks in with a possibly unwieldy four syllables; perhaps this was an incentive for Apple to change to Swift.

I expect our two-syllable names to stay with us. The languages may change, as they have changed in the past.

As for my new programming language, the one that was derived from BASIC? I picked a new name, not a variant of BASIC. As someone has already snagged the name "ACIDIC", I chose the synonym alkaline, but changed it to a two-syllable form: Alkyl.
