Sunday, September 26, 2010

Programming is more than coding

Programming is more than simply writing code and getting it to "work". Computers are calculators, non-thinking entities that perform specified instructions in a given sequence. The task of programming requires the translation of a desired function into the machine code for the processor and the data storage scheme. The instructions must be specific and precise.

The desired function is often a high-level description of the problem. The task of programming converts that fuzzy, high-level description into the detailed instructions for the computer.

For a single programmer, the task is what we normally think of as programming. He (and it is most often a man) must write the program to perform the desired function. In doing so, the programmer converts the statement of the request (sometimes as informal as "change the discount to exclude items already on sale") into the language of the computer.
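To make that conversion concrete, here is a rough sketch in Python of what the one-sentence discount request might become. Everything in it is an assumption made for illustration: the Item record, the 10% rate, the rounding to cents, and the reading that an item on sale simply keeps its sale price. The request itself specifies none of these, which is rather the point.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Hypothetical discount rate; the request never states one.
DISCOUNT_RATE = Decimal("0.10")


@dataclass(frozen=True)
class Item:
    """Hypothetical item record, invented for this sketch."""
    name: str
    list_price: Decimal
    # Assumption: "on sale" means a sale price has been recorded.
    sale_price: Optional[Decimal] = None


def price_after_discount(item: Item) -> Decimal:
    """Return the price charged for one item.

    Assumed reading of the request: an item already on sale keeps its
    sale price and receives no further discount.
    """
    if item.sale_price is not None:
        return item.sale_price
    # Assumption: the discount applies per item and rounds to cents.
    discounted = item.list_price * (1 - DISCOUNT_RATE)
    return discounted.quantize(Decimal("0.01"))


lamp = Item("lamp", Decimal("20.00"))
rug = Item("rug", Decimal("50.00"), sale_price=Decimal("40.00"))
print(price_after_discount(lamp))  # regular item, discount applied
print(price_after_discount(rug))   # sale item, discount skipped
```

Each comment marked "Assumption" is a question the English sentence left open and the code could not.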

The typical large IT development shop, with multiple programmers and larger systems, divides the tasks of requirements, design, and coding across multiple teams. Some shops add a fourth phase, business initiatives, ahead of the other three. The first phase creates documents with a description of the desired functionality. Each successive phase accepts documents and provides documents with a higher degree of specificity. Each phase resolves ambiguities, "tightening" the solution. By the coding phase, the ambiguities have been removed. Indeed, they must be resolved, as the computer cannot use ambiguous instructions.

We have made great strides in improving the tail end of this process.

Decades ago, programmers specified instructions in machine code. This code was the instruction set of the processor, and the coding was done in numeric values. The technique was tedious and required painstaking attention to detail.

We quickly moved from that technique to assembly language, which allowed programmers to use alphanumeric symbols instead of numeric codes. A step up -- and a big step -- but coding was still tedious and error-prone.

Today, programmers use high-level programming languages to provide the instructions to the computer. Early high-level languages were COBOL and FORTRAN, and modified versions of these languages are used to this day. COBOL and FORTRAN are by no means the only high-level languages. Many have been created in the decades of computing, ranging from AutoCoder to RPG, Algol to Pascal, BASIC to Visual Basic, and C to C++ and C#. These languages provide powerful constructs for programmers to express concepts concisely and precisely.

These languages reduce the effort for the tail end of the development process: coding. They do nothing for the earlier stages: initiatives, requirements, and design. Much of the work for systems design occurs in these phases, and most of that work is resolving ambiguities. The documents of the requirements and design phases (and of the business initiative phase, for shops that have one) are most often written in English, a language that allows (and some say relies on) ambiguity.

Wouldn't it be nice to expand the precision beyond the coding phase and into the design and requirements phases? James Martin, in his 1985 book "System Design from Provably Correct Constructs," attempted that very thing. The technique was not adopted by the industry.

It failed, not because it was wrong, or expensive, or bizarre, but because it required disciplined and organized thinking for the specification of systems. The analysts and designers who create the requirements documents and design documents work in an environment that allows ambiguities. The English language prevents enforcement of specificity at the level at which programmers must work. (There is no compiler for English, no list of syntax errors.) James Martin attempted to replace English with a precise language named HOS (Higher Order Software).

One can view the technique advocated by James Martin as a "very high level" programming language, but a programming language nonetheless. As a programming language, it enforced discipline in thinking and careful, precise specification and organization of ideas. It is this discipline that makes programming hard. HOS failed not because it was broken or imperfect, but because it was still programming and no one recognized the need for precise thinking.

The recent UML notation is another attempt to bring precision to requirements and design. It has received more attention than James Martin's HOS, and may succeed, but only if we recognize the need for disciplined thinking. In short, we have to convert requirements analysts into programmers.

When UML is accepted as a notation for requirements (or even design), and the people creating the UML documents can create those documents with exact specificity and precision, they shall be the new programmers.

Just as assemblers eliminated the need for machine-level programmers, and high-level languages eliminated the need for assembly programmers, UML will eliminate the need for high-level language programmers. (Mostly. Just as we still have a few machine-level programmers and a few assembly language programmers for specialized tasks, so will we need a few -- but only a few -- programmers who understand the "old" languages of COBOL, Pascal, C++, or C#.)

Programming isn't about one language or its syntax. It's not about parse trees, or data structures, or compiler optimizations. It's about thinking precisely.


