Monday, July 25, 2011

The incredible shrinking program

Computer programs have been shrinking. They have been doing so since the beginning of the computer age. You may think this claim strange, given that computers are much bigger than those of earlier eras and programs certainly look bigger, with their millions of lines of source code. And you are right -- computer programs have gotten bigger. Yet they have also gotten smaller.

In absolute terms, computer programs are larger. They have more lines of source code, larger memory footprints, and greater complexity.

Relative to the size of the computer, however, computer programs are smaller.

The earliest computers were programmed with plugboards, so the "software" was hardware. For these computers, there was only a thin boundary between the machine and the program, so in a sense the program was the machine.

Computers in the 1940s and 1950s were programmable in the sense we have today, with the hardware and the software being two different kinds of things. The program was loaded into memory and executed, often by an operator. While running, the program was the only thing in memory -- there was no operating system or monitor. The program had to perform all tasks, from input to processing to output.

With the 1960s we saw the advent of operating systems. The operating system contained common functions and the program called the operating system to perform actions. The model of hardware, operating system, and application program was a solid one, and continues to this day.
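To illustrate that division of labor, here is a hedged sketch (not from the original post -- the class name and file name are invented): a small Java program prepares some data but asks the operating system to do the actual work of writing it to disk.

    import java.io.FileWriter;
    import java.io.IOException;

    public class WriteReport {
        public static void main(String[] args) throws IOException {
            // The program prepares the data...
            String report = "records processed: 42\n";

            // ...but writing it is a request to the operating system.
            // The Java runtime passes the call down to the OS, which owns
            // the file system, the device drivers, and the disk layout.
            FileWriter out = new FileWriter("report.txt");
            out.write(report);
            out.close();
        }
    }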

But notice that the program is now smaller than the machine. The application program is constrained by the operating system. Combined, the program and operating system fill the machine. (For time-sharing systems, the combination is the operating system and all running programs.) Thus, the program has become smaller, giving some functions over to the operating system.

Microcomputers followed this path. The earliest microcomputers (the Altair, the IMSAI, and others of the late 1970s) were processors, memory, and front panels that allowed a person to load a program and run it. This was the same model as the electronic computers of the 1940s. Hobbyists quickly developed "disk operating systems", although one can argue that the first true operating systems for microcomputers were IBM's OS/2, Microsoft's Windows NT, and variants of Unix, all of which needed the Intel 80386 processor to be effective. As with the mainframe path, the application shrank and gave up processing to the operating system.

The iOS and Android operating systems extend this trend. Programs are even smaller under these operating systems, yielding more control and code to the operating system.

In the earlier model, the operating system launches a program and the program runs to completion, signalling the operating system when it is finished. There is one entry point for the program; there can be many exit points.
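As a minimal sketch of that older model (the class and its arguments are hypothetical), a conventional Java console program has exactly one entry point, main, and may exit from several places -- but in every case the program decides when it is finished and reports that to the operating system.

    public class BatchJob {
        // The single entry point: the operating system starts the program
        // here and steps aside until the program announces that it is done.
        public static void main(String[] args) {
            if (args.length == 0) {
                System.err.println("usage: BatchJob <input-file>");
                System.exit(1);    // one exit point: bad arguments
            }

            if (!process(args[0])) {
                System.exit(2);    // another exit point: processing failed
            }

            // Falling off the end of main is the normal exit point; the
            // runtime reports completion to the operating system.
        }

        private static boolean process(String inputFile) {
            // Placeholder for the real work of the program.
            return true;
        }
    }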

The model used by iOS and Android is different. Instead of starting the program and letting it run to completion, the operating system issues multiple calls to the program. Instead of a single entry point, a program has multiple (well-defined) entry points. Anyone familiar with event-driven programming will recognize the similarity: Windows sends programs messages about various events (to a single entry point) and the program must fan them out to separate routines. iOS and Android have removed that top layer of the program and fan out the messages themselves.
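To make the contrast concrete, here is a hedged sketch of an Android activity (the class name and log messages are invented for illustration). The program no longer has a single main; instead it exposes several well-defined entry points, and the operating system decides when each one is called.

    import android.app.Activity;
    import android.os.Bundle;
    import android.util.Log;

    public class ExampleActivity extends Activity {
        private static final String TAG = "ExampleActivity";

        // The OS calls this entry point when the activity is first created.
        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            Log.d(TAG, "onCreate: build views and restore state here");
        }

        // The OS calls this entry point when the activity comes to the foreground.
        @Override
        protected void onResume() {
            super.onResume();
            Log.d(TAG, "onResume: start animations, sensors, and so on");
        }

        // The OS calls this entry point when the activity leaves the foreground.
        @Override
        protected void onPause() {
            super.onPause();
            Log.d(TAG, "onPause: save transient state, release resources");
        }
    }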

Thus, programs have once again yielded code to the operating system, and have shrunk in size.

I think that this is a good change. It removes "boilerplate" code from applications and consolidates what were many copies into a single place. Application code can focus on the business problem and ignore infrastructure issues. Programs gain a measure of uniformity.

The model of the operating system lasted for fifty years (twenty in the PC world) and served us well. I want to think of the new model as something different: an operating system with pluggable tasks. I think it will serve us for a long time.

1 comment:

Антон said...

I think it is still hard to tell whether this is a trend or whether we are just drawing lines through the proverbial plot with too few data points. For programming to be successful it is necessary to follow good practices, and one of those practices is the reduction of code duplication. In this light it is irrelevant whether the common code moved to a thing called an operating system or to things called libraries, language environments, utility programs, you name it.
What I am curious about is how other trends, like 1) cloud computing and 2) virtualization, fit into this picture. Another point is the share of short- and long-term storage occupied by data. Today we have largish datasets, which has also decreased the share of code in the overall memory budget. SIMD instruction sets and separate processors used for specialized data transformation (e.g. GPUs) reflect this trend.