Wednesday, July 31, 2019

Programming languages, structured or not, immediate or not

I had some spare time on my hands, and any of my friends will tell you that when I have spare time, I think about things. This time, I thought about programming languages.

That's not a surprise. I often think about programming languages. This time I thought about two aspects of programming languages that I call structuredness and immediacy. Immediacy is simply the rapidity with which a program can respond. The languages Perl, Python, and Ruby all have high immediacy, as one can start a REPL (for read-evaluate-print loop) that takes input and provides the result right away. (In contrast, programs in the languages C#, Java, Go, and Rust must be compiled, so there is an extra step to get a response.)
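To make the idea of a REPL concrete, here is a minimal sketch in Python. It is only an illustration, not how any real interpreter is built: the function name `repl` is my own, it reads its "input" from a list rather than the keyboard, and it relies on `eval`, which should never be given untrusted input.

```python
def repl(lines):
    """Read each expression, evaluate it, and print the result right away."""
    results = []
    for line in lines:        # read the next expression
        value = eval(line)    # evaluate it (unsafe for untrusted input)
        print(value)          # print the result immediately
        results.append(value)
    return results            # the loop ends when the input runs out


repl(["1 + 2", "'ab' * 3", "max(4, 9)"])
```

Each expression gets its answer before the next one is read. There is no separate compile step between typing and seeing a result, which is the whole point of immediacy.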

Structuredness, in a language, is how much organization is encouraged by the language. I say "encouraged" because many languages will allow unstructured code. Some languages do require careful thought and organization prior to coding. Functional programming languages require a great deal of thought. Object-oriented languages such as C++, C#, and Java provide some structure. Old-school BASIC did not provide structure at all, with only a GOTO and a simple IF statement to organize your code. (Visual Basic has much more structure than old-school BASIC, and it is closer to C# and Java, although it has a bit more immediacy than those languages.)

My thoughts on structuredness and immediacy led me to think about the combination of the two. Some languages are high in one aspect, and some languages mix the two aspects. Was there an overall pattern?

I built a simple grid with structure on one axis and immediacy on the other. Structure was on the vertical axis: languages with high structure were higher on the chart, languages with less structure were lower. Immediacy was on the horizontal axis, with high-immediacy languages to the right and languages that provided slower response to the left.

Here's the grid:

                         structured
                              ^
     Go C++ Objective-C Swift |
          C# Java VB.NET      |
                              | Python Ruby
       (Pascal)               |    Matlab
                              |      Visual Basic
      C                       | SQL   Perl
      COBOL Fortran           |    JavaScript (Forth)
slow <------------------------------------------------> immediate
       (FORTRAN)              |          R
                              |            (BASIC)
                              |
                              |
                              |
                              |            spreadsheet
                              v
                        unstructured

Some notes on the grid:
- Languages in parentheses are older, less-used languages.
- Fortran appears twice: "Fortran" is the modern version and "(FORTRAN)" is the 1960s version.
- I have included "spreadsheet" as a programming language.

Compiled languages appear on the left (slow) side. This is not related to the performance of programs written in these languages, but the development experience. When programming in a compiled language, one must edit the code, stop and compile, and then run the program. Languages on the right-hand side (the "immediate" side) do not need the compile step and provide feedback faster.

Notice that, aside from the elder FORTRAN, there are no slow, unstructured languages. Also notice that the structured immediate languages (Python, Ruby, et al.) cluster away from the extreme corner of structured and immediate. They are closer to the center.

The result is (roughly) a "main sequence" of programming languages, similar to the main sequence astronomers see in the types of stars. Programming languages tend to a moderate zone, where trade-offs are made between structure and immediacy.

The unusual entry was the spreadsheet, which I consider a programming language for this exercise. It appears in the extreme corner for unstructured and immediate. The spreadsheet, as a programming environment, is the fastest thing we have. Enter a value or a formula in a cell and the change "goes live" immediately. ("Before your finger is off the ENTER key", as a colleague would say.) This is faster than any IDE or compiler or interpreter for any other language.

Spreadsheets are also unstructured. There are no structures in spreadsheets, other than multiple sheets for different sets of data. While it is possible to carefully organize data in a spreadsheet, there is nothing that mandates the organization or even encourages it. (I'm thinking about the formulas in cells. A sophisticated macro programming language is a different thing.)

I think spreadsheets took over a specific type of computing. They became the master of immediate, unstructured programming. BASIC and Forth could not compete with them, and no language since has tried to compete with the spreadsheet. The spreadsheet is the most effective form of this kind of computing, and I see nothing that will replace it.

Therefore, we can predict that spreadsheets will stay with us for some time. It may not be Microsoft Excel, but it will be a spreadsheet.

We can also predict that programming languages will stay within the main sequence of compromise between structure and immediacy.

In other words, BASIC is not going to make a comeback. Nor will Forth, regrettably.

Tuesday, July 16, 2019

Across and down

All programming languages have rules. These rules define what can be done and what cannot be done in a valid program. Some languages even have rules for certain things that must be done. (COBOL, for example, requires the four 'DIVISION' sections in each program.)

Beyond rules, there are styles. Styles are different from rules. Rules are firm. Styles are soft. Styles are guidelines: good to follow, but break them when necessary.

Different languages have different styles. Some style guidelines are common: Many languages have guidelines for indentation and the naming of classes, functions, and variables. Some style guidelines are unique to languages.

The Python programming language has a style which limits line length. (To 79 characters, per the PEP 8 style guide, if you are interested.)

Ruby has a style for line length, too. (That is, if you use Rubocop with its default configuration.)

They are not the first languages to care about line length. COBOL and FORTRAN limited line length to 72 characters. These were rules, not guidelines. The origin was in punch cards, and the language standards specified the column layout and set 72 as the limit. Compilers ignored anything past column 72, and woe to the programmer who let a line exceed that length.

The limit in Python is a guideline. One is free to write Python with lines that exceed the recommended length, and the Python interpreter will run the code. Similarly, Ruby's style checker, Rubocop, can be configured to warn about any line length. Ruby itself will run the long lines of code. But limits on line length make for code that is more readable.
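A guideline like this is easy to check mechanically. The following is a minimal sketch of such a check, not the code of any real linter; the name `long_lines` is hypothetical, and the default of 79 is simply PEP 8's figure, passed as a parameter so it can be changed.

```python
def long_lines(source: str, limit: int = 79) -> list[tuple[int, int]]:
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


# A two-line sample: the second line is deliberately too long.
sample = "short = 1\n" + "x = " + "1 + " * 30 + "1\n"
for number, length in long_lines(sample):
    print(f"line {number}: {length} characters")
```

The check only reports. Like the style guide itself, it leaves the decision to the programmer, and the interpreter will still run the long line.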

Programs exist in two dimensions. Not just across, but also down. Code consists of lines of text.

While some languages limit the width of the code (the number of columns), no language limits the "height" of the code -- the number of lines in a program, or a module, or a class.

Some implementations of languages impose a limit on the number of lines. Microsoft's GW-BASIC, for example, accepted line numbers only from 0 to 65529, and since each line had to have a unique line number, that imposed an upper bound of 65,530 lines. Some compilers can handle as many lines as will fit in memory -- and no more. But these are limits imposed by the implementation. I am free, for example, to create an interpreter for BASIC that can handle more lines than that. (Or fewer, stopping at 1,000.) The language does not dictate the limit.

I don't want the harshly-enforced and unconfigurable limits of the days of early computing. But I think we could do with some guidelines for code length. Rubocop, to its credit, does warn about functions that exceed a configurable limit. There are tools for other languages that warn about the complexity of functions and classes. The idea of "the code is too long" has been bubbling in the development community for decades.

Perhaps it is time we gave it some serious thought.

One creative idea (I do not remember who posed it) was to use the IDE (or the editor) to limit program size. The idea was this: Don't allow scrolling in the window that holds the code. Instead of scrolling, as a programmer increased the length of a function, the editor reduced the font size. (The idea was to keep the entire function visible.) As the code grows in size, the text shrinks. Eventually, one reaches a point when the code becomes unreadable.
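The arithmetic behind the shrinking-font idea is simple, and a small sketch shows how quickly the text becomes unreadable. Everything here is an assumption of mine, not part of the original proposal: the function name `font_size_to_fit`, the 600-pixel window, the 14-pixel maximum font, and a line height of 1.25 times the font size.

```python
def font_size_to_fit(
    line_count: int,
    window_height: float = 600.0,   # assumed window height, in pixels
    max_size: float = 14.0,         # assumed normal font size, in pixels
    line_spacing: float = 1.25,     # assumed line height / font size
) -> float:
    """Largest font size at which `line_count` lines fit without scrolling."""
    if line_count <= 0:
        return max_size
    fitted = window_height / (line_count * line_spacing)
    return min(max_size, fitted)    # never grow beyond the normal size


for lines in (20, 100, 400):
    print(lines, "lines ->", round(font_size_to_fit(lines), 1), "px")
```

Under these assumptions a 20-line function keeps the full 14-pixel font, a 100-line function drops to about 4.8 pixels, and a 400-line function to about 1.2 pixels.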

The idea of shrinking code on the screen is amusing, but the idea of limiting code size may have merit. Could we set style limits for the length of functions and classes? (Such limits and warnings already exist in Rubocop, so the answer is clearly 'yes'.)
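A length check for functions can be sketched in a few lines of Python with the standard `ast` module. This is my own illustration, not how Rubocop works; `long_functions` and `max_lines` are hypothetical names, the default limit of 10 is arbitrary, and the count includes the `def` line. It needs Python 3.8 or later for `end_lineno`.

```python
import ast


def long_functions(source: str, max_lines: int = 10) -> list[tuple[str, int]]:
    """Return (name, line_count) for each function longer than `max_lines`."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count from the `def` line through the last line of the body.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                findings.append((node.name, length))
    return findings


sample = '''
def short():
    return 1

def longer():
    a = 1
    b = 2
    c = 3
    return a + b + c
'''
print(long_functions(sample, max_lines=3))
```

With a limit of three lines, only `longer` (five lines) is reported. The same walk could count the lines of each `ast.ClassDef` to put a similar limit on classes.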

The better question is: How do limits on code length (number of lines) help stakeholders? How do they help developers, and how do they help users?

The obvious response is that shorter functions (and shorter classes) are easier to read and comprehend, perform fewer tasks, and are easier to verify (and to correct). At least, that is what I want the answer to be -- I don't know that we have hard observations that confirm that point of view. I can say that my experience confirms this opinion; I have worked on several systems, in different languages, splitting large functions and classes into smaller ones, with the result being that the re-designed code is easier to maintain. Smaller functions are easier to read.

I believe that code should consist of small classes and small functions. Guidelines and tools that help us keep functions short and classes small will improve our code. Remember that code exists in two dimensions (across and down) and that it should be moderate in both.

Monday, July 8, 2019

Lots of (obsolete) Chromebooks

We users of PCs are used to upgrades, for both hardware and software. We comfortably expect this year's PC to be faster than last year's PC, and this year's Windows (or macOS, or Linux) to be better than last year's Windows.

We're also used to obsolescence with hardware and software. Very few people use Windows XP these days, and the number of people using Windows 3.1 (or MS-DOS) is vanishingly small. The modern PC uses an Intel or AMD 64-bit processor.

Hardware and software both follow a pattern of introduction, acceptance, popularity, and eventual replacement. It should not surprise us that Chromebooks follow the same pattern. Google specifies hardware platforms and manufacturers build those platforms and install Chrome OS. After some time, Google drops support for a platform. (That period of time is a little over six years.)

For obsolete PCs (those not supported by Windows) and MacBooks (those not supported by macOS) the usual "upgrade" is to install Linux. There are several Linux distros that are suitable for older hardware. (I myself am running Ubuntu 16.04 on an old 32-bit Intel-based MacBook.)

Back to Chromebooks. What will happen with all of those Chromebooks that are marked as "obsolete" by Google?

There are a few paths forward.

The first (and least effort) path is to simply continue using the Chromebook and its version of Chrome. Chrome OS should continue to run, and Chrome should continue to run. The Chromebook won't receive updates, so Chrome will be "frozen in time" and gradually become older, compared to other browsers. There may come a time when its certificates expire, and it will be unable to initiate secure sessions with servers. At that point, Chrome (and the Chromebook) will have very few uses.

Another obvious path is to replace it. Chromebooks are typically less expensive than PCs, and one could easily buy a new Chromebook. (And since the Chromebook model of computing is to store everything on the server and nothing on the Chromebook, there is no data to migrate from the old Chromebook to the new one.)

Yet there is another option between "continue as is" and "replace".

One could replace the operating system (and the browser). The Chromebook is a PC, effectively, and there are ways to replace its operating system. Microsoft has instructions for installing Windows 10 on a Chromebook, and there are many sites that explain how to install Linux on a Chromebook.

Old Chromebooks will be fertile ground for tinkerers and hobbyists. Tinkerers and hobbyists are willing to open laptops (Chromebooks included), adjust hardware, and install operating systems. When Google drops support for a specific model of Chromebook, there is little to lose in replacing Chrome OS with something like Linux. (Windows 10 on a Chromebook is tempting, but many Chromebooks have minimal hardware, and Linux may be the better fit.)

I expect to see lots of Chromebooks on the used market, in stores and online, and lots of people experimenting with them. They are low-cost PCs suitable for small applications. The initial uses will be as web browsers or remote terminals to server-based applications (because that is what we use Chromebooks for now). But tinkerers and hobbyists are clever and imaginative, and we may see new uses, such as low-end games or portable word processors.

Perhaps a new operating system will emerge, one that is specialized for low-end hardware. There are already Linux distros which support low-end PCs (Puppy Linux, for one) and we may see more interest in those.

Those Chromebooks that are converted to Linux will probably end up running a browser. It may be Firefox, or, in an ironic twist, they may run Chromium -- or even Chrome! The machine that Google says is "not good enough" may be just good enough to run Google's browser.