Sunday, April 4, 2010

One language?

Last week I was working with a co-worker who was not that familiar with C++. She was attempting to make (and understand) changes to the code. She's a smart person, but not a programmer, and I realized that part of her difficulty with C++ is that it is not a single language.

She was working on code that had been written by a third person, and she was making some minor changes. The code was not the best C++ code, nor was it the worst, but it was complex enough to confuse.

As I looked at the code, I realized that I was looking at not one but three different languages. The code was "standard" C++ code, with some assignments, a printf() call, and a macro. While all are part of normal C++, these three parts of the code are written in different languages.

The assignment is a normal C++ statement. No problems there.

The printf() call (actually, it was a call to sprintf(), but I consider the printf() and sprintf() calls as part of one family) is C++, yet the format specifier is a different language. The format (in this case "%10.3lf") is its own little language, quite distinct from C++.
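To make that concrete, here is a minimal sketch using the same "%10.3lf" specifier. The function name format_value is my own invention for illustration, not something from the code in question. The C++ grammar sees the format only as a string literal; the width, precision, and conversion letter inside it are interpreted by the printf family's own rules.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not from the original code): formats one value
// with the "%10.3lf" specifier mentioned above.
std::string format_value(double value) {
    char buffer[64];
    // To the C++ grammar this is just a string literal. The printf family
    // reads it as: minimum field width 10, 3 digits after the decimal
    // point, right-aligned, double argument.
    int written = std::snprintf(buffer, sizeof buffer, "%10.3lf", value);
    if (written < 0) {
        return std::string();  // formatting failed
    }
    return std::string(buffer);
}
```

Calling format_value(3.14159) gives "3.142" padded on the left with five spaces, for ten characters in total. Nothing in the C++ statement itself tells you that; you have to read the little language inside the quotes.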

The macro (a multi-line expansion that declared a pointer, called malloc(), and then executed a for() loop to initialize members to zero) is also a language of its own. Macros look like C++, yet there are differences: the space required after the name, the lack of braces, and the backslashes at the end of lines that indicate continuation.
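I don't have the original macro to show, but a sketch in the same spirit might look like the following. The names MAKE_ZEROED and sum_of_fresh_array are hypothetical, made up for this illustration, and the array of doubles is my assumption. Note the backslash continuations, the absence of braces around the body, and the way the invocation reads like a function call but is really a text substitution.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical macro in the style described above: it declares a pointer,
// calls malloc(), and runs a for() loop to zero the elements.
// Every line but the last ends in a backslash, and there are no braces.
#define MAKE_ZEROED(name, count)                                                 \
    double *name = static_cast<double *>(std::malloc((count) * sizeof(double))); \
    for (std::size_t i_ = 0; name != nullptr && i_ < (count); ++i_)              \
        name[i_] = 0.0;

// Hypothetical caller: the macro invocation below expands into a
// declaration plus a loop, which is why it takes no trailing semicolon.
double sum_of_fresh_array(std::size_t count) {
    MAKE_ZEROED(values, count)
    if (values == nullptr) {
        return 0.0;  // allocation failed (or count was zero); nothing to sum
    }
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    std::free(values);
    return total;
}
```

Someone reading sum_of_fresh_array() has to know the pre-processor's rules to see where the variable values comes from, because no ordinary C++ declaration of it appears in the function.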

The task of understanding C++ is made harder by the use of multiple languages. I eventually explained the code, and the person making changes understood it, but the cognitive load was higher due to the three different languages. (And the fact that this person did not recognize that there were three different languages with their own scopes did not help. Mind you, I did not recognize that there were three languages at the time, either.)

Templates in C++ could be considered a fourth language, and a friend of mine thinks that templates are two additional languages to C++.
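As a rough illustration of why templates feel like a separate language, here is the same computation written twice: once as ordinary run-time C++ and once in the template style, where recursion and specialization stand in for loops and conditionals. The names Factorial and factorial are my own, chosen only for this sketch, and it assumes a C++11 compiler.

```cpp
// Template version: evaluated entirely by the compiler. "Looping" is done
// by recursive instantiation, and the stopping condition is a separate
// specialization rather than an if statement.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// Ordinary version: the same calculation in plain run-time C++.
unsigned long long factorial(unsigned n) {
    unsigned long long result = 1;
    for (unsigned i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// Checked at compile time, before the program ever runs.
static_assert(Factorial<5>::value == 120, "5! should be 120");
```

Both compute the same number, but a reader has to switch mental models to follow the first one.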

Egads! *Five* different languages for a single program?

But wait, don't forget the ability to include assembly language! That brings the total up to six!

Yes, it would be possible to write a C++ program that used six different languages.

This problem is not unique to C++. Other common languages have such "extensions", although not to the extreme of C++. Perl, for example, uses regular expressions. These are their own language, so one could easily claim that a Perl program with regular expressions is really a program in two languages.

Beyond regular expressions, SQL is its own language, so a C# program could be written in three languages: C#, regular expressions (if you used them), and SQL.

Come to think of it, you can use SQL in C++, so our possible total for languages in a C++ program is up to seven. Add in some embedded HTML or XML, and you've reached eight.

This is too much. Any program that uses more languages than Snow White had dwarves is extreme. Eight is more than I want, and more than a reasonable person can keep straight. The cost of shifting from one language context to another is greater than zero, and must be considered in the maintenance of a program.

I don't think that we will ever get back to "one language in one program". Regular expressions and SQL are too useful. But we can keep other languages out. C# and Java have removed pre-processor macros and thereby removed the macro language. They've also removed assembly language. These were two changes that I liked, and now I know why. But they've added templates (or "generics", as they call them), which I see as a drawback.

Well, a little progress is better than none.
