Tuesday, May 8, 2012

Why we change languages

An acquaintance asked me: "Why do computer programmers change their languages?"

The question gave me pause. Why *do* we change our languages? It seems that every few years we introduce a new language. (BASIC in the mid-1960s, Pascal and C in the 1970s, C++ in the late 1980s, Java in 1995, C# in 2000... the list goes on.)

The question is not really about switching from one language to another. It is about why we modify our languages: why does language syntax change over time?

The obvious answer is that hardware gets more powerful, and we (as programmers) can do more with languages when they take advantage of that hardware. The original FORTRAN was little more than a macro-assembler, converting source code into op-codes for the IBM 704. Later systems had more memory and more powerful processors, so compilers and interpreters could take advantage of the hardware. What was considered extravagant in the days of FORTRAN would be considered acceptable in the days of BASIC.

I have a different idea of languages, and our (seemingly-infinite) desire for new languages.

A programming language is a representation of our thought process. It (the language) allows us to define data in specific structures (or types), organize it in collections (classes or containers), and process it according to well-defined rules.

We change our languages according to our understanding of data, the collections and relationships of data, and our ideas for processing. Part of this is driven by hardware, but a lot of it is driven by psychology.

FORTRAN and COBOL were languages that considered data to be structured and mostly linear. The COBOL record definition was little more than a flexible punch card. (FORTRAN, too.)
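To make that "flexible punch card" idea concrete, here is a rough sketch in Python (the field names, column widths, and values are invented for illustration): the record is just a flat run of characters, carved up by position.

```python
# A hypothetical 80-column-style "card" with fixed field positions,
# roughly the way a COBOL record (or a FORTRAN FORMAT statement)
# treats data: flat, linear, defined by position and width.
card = "00123SMITH     JOHN      0000150000"

record = {
    "emp_id":     card[0:5],             # columns 1-5
    "last_name":  card[5:15].strip(),    # columns 6-15
    "first_name": card[15:25].strip(),   # columns 16-25
    "salary":     int(card[25:35]),      # columns 26-35
}

print(record)
```

The structure lives entirely in the column layout; the data itself is one undifferentiated line.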

Pascal (and all of the Algol languages) viewed data as a linked list or tree.

C++, Java, and C# considered data to be something contained in classes.

Perl included hashes ("dictionaries", in some venues) as first-class members of the language. Python and Ruby have done the same.
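As a quick illustration of what "first-class" means here, consider Python (the names and values below are made up for the example):

```python
# In Python (as in Perl and Ruby), a hash/dictionary is built into
# the language: it has literal syntax, and it can be nested, passed
# to functions, and returned like any other value.
person = {"name": "Alice", "languages": ["Python", "Ruby", "Perl"]}

# Dictionaries nest naturally, so an ad-hoc structure needs no class.
team = {"lead": person, "size": 3}

def summarize(record):
    return f'{record["lead"]["name"]} leads a team of {record["size"]}'

print(summarize(team))   # Alice leads a team of 3
```

No declarations, no container library, no class definitions: the hash is simply part of how the language expects you to think about data.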

As our view of data has matured, so have our languages. (Or perhaps, as our languages have changed, so has our view of data. I'm not sure about the direction of causality.)

It is not so much the language as it is (in my mind) the data structures that the language provides. Those structures are still evolving, still growing as our knowledge increases.

We might think that our current collection of arrays, hashtables, and lists is sufficient. But I suspect that our confidence is ill-founded. I'm fairly sure that the FORTRAN programmers believed that plain one-based arrays were sufficient for their programming challenges.

When we find a set of data structures that are "good enough", I think we will find that our languages are also "good enough".
