Monday, May 10, 2021

Large programming languages considered harmful

I have become disenchanted with the C# programming language. When it was introduced in 2001, I liked the language. But over the last few years I have become less interested in it. I finally figured out why.

The reason for my disenchantment is the size of C#. The original version was a medium-sized language. It was an object-oriented language, and in many ways a copy of Java (which was also a medium-sized language in 2001).

Over the years, Microsoft has released new versions of C#. Each new version added features, and increased the capabilities of the language. But as Microsoft increased the capabilities, it also increased the size of the language.

The size of a programming language is an imprecise concept. It is more than a simple count of the keywords, or the number of rules for syntax. The measure I like to use is a rough guess of how much space it requires in the head of a programmer; how much brainpower is required to learn the language and how many neurons are needed to remember the different concepts, keywords, and rules of the language.

Such a measure has not been made with any tools, at least not that I know of. All I have is a rough estimate of a language's size. But that rough estimate is good enough to classify languages into small (BASIC, AWK, original FORTRAN), medium (Ruby, Python), and large (COBOL, C#, and Perl).

It may seem natural that languages expand over time. Languages other than C# have been expanded: Java (by Sun and later Oracle), Visual Basic (by Microsoft), C++ (by committee), Perl, Python, Ruby, and even languages such as COBOL and Fortran.

But such expansions of languages worry me. The source of my worry goes back to the "language wars" of the early days of computing.

In the 1960s, 1970s, and 1980s, programmers argued (passionately) over programming languages. C vs Pascal, BASIC vs FORTRAN, Assembly language vs... everything.

Those arguments were fueled, in my opinion, mostly by the high cost of changing. Programming languages were not free. Compilers and interpreters were sold (or licensed). Changing languages meant spending for the new language -- and abandoning the investment in the old. And that meant that, once invested in a language, you were loath to give it up. And that meant you would defend that choice of programming language. People would rather fight than switch.

In the 2000s, thanks to open source, compilers and interpreters became free. The financial cost of changing from one language to another disappeared. And that meant that people could switch programming languages. And that meant that people could switch rather than fight.

So why am I worried, now, in 2021, about a new round of language wars?

The reason is the size of programming languages. More specifically, the size of the environment for any one programming language. That environment includes the language, the compiler (or interpreter), the standard library (or common packages used for development), and the IDE. Each of these components requires some amount of effort to learn and remember.

As each of these environments grows, the effort to learn it grows. And that means that the effort to switch from one language to another also grows. Changing from C# to Python, for example, requires learning not only the Python syntax but also the common packages that are necessary for effective Python programs and the IDE (probably PyCharm, which is quite different from Visual Studio).

We are rebuilding the barriers between programming languages. The old barrier was financial: it cost a lot to switch from one language to another. The new barrier is not financial but technical: the tools are free but the time to learn them is significant.

Barriers to switching programming languages can put us back in the position of defending our choices. Once again, programmers may rather fight than switch.
