Programming languages have, as one of their features, variables. A variable is a thing that holds a value and that value can vary over time. A simple example:
The statement
a = 1
defines a variable named 'a' and assigns a value of 1. Later, the program may contain the statement
a = 2
which changes the value from 1 to 2.
The exact operations vary from language to language. In C and C++, the name is closely associated with the underlying memory for the value. Python and Ruby separate the name from the underlying memory, which means that the name can be re-assigned to point to a different underlying value. In C and C++, the names cannot be changed in that manner. But that distinction has little to do with this discussion. Read on.
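The difference is easy to see in a short Python sketch (the names and values here are invented for illustration):

```python
a = [1, 2, 3]
b = a                 # 'b' is a second name for the same underlying list
a.append(4)           # a change to the value is visible through both names
print(b)              # [1, 2, 3, 4]

a = "something else"  # re-assigning 'a' points it at a different value...
print(b)              # [1, 2, 3, 4] -- ...but 'b' still names the list
```

In C, by contrast, assigning to a variable always writes to that variable's memory; there is no re-pointing of names.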
Some languages have the notion of constants. A constant is a thing that holds a value and that value cannot change over time. It remains constant. C, C++, and Pascal have this notion. In C, a program can contain the statement
const int a = 1;
A later statement that attempts to change the value of 'a' will cause a compiler error. Python and Ruby have no such notion.
Note that I am referring to constants, not literals such as '1' or '3.14' that appear in the code. Literals are truly constant and cannot be assigned new values. (Some early language implementations did allow a literal's stored value to be modified -- a side effect of passing literals by reference. The behavior was never popular.)
The notion of 'constness' is useful. It allows the compiler to optimize the code for certain operations. When applied to a parameter of a function, it informs the programmer that the value cannot be changed. In C++, a member function of a class can be declared 'const', and then that function cannot modify member variables. (I find this capability helpful for organizing code, separating functions that change an object from functions that do not.)
The notion of 'constness' is a specific form of a more general concept, one that we programmers tend to not think about. That concept is 'read but don't write', or 'look but don't touch'. Or as I like to think of it, the "Museum Principle".
The Museum Principle states that you can observe the value of a variable, but you cannot change it. This principle is different from 'constness', which states that the value of a variable cannot (and will not) change. The two are close but not identical. The Museum Principle allows the variable to change; but you (or your code) are not making the change.
It may surprise readers to learn that the Museum Principle has been used already, and for quite a long time.
The idea of "look but don't touch" is implemented in Fortran and Pascal, in loop constructs. In these languages, a loop has an index value. The index value is set to an initial value and later modified for each iteration of the loop. Here are some examples that print the numbers from 1 to 10:
An example in Fortran:
do 100 i = 1, 10
write(*,*) 'i =', i
100 continue
An example in Pascal:
for i := 1 to 10 do
begin
  writeln('i =', i)
end;
In both of these loops, the variable i is initialized to the value 1 and incremented by 1 until it reaches the value 10. The body of each loop prints the value of i.
Now here is where the Museum Principle comes into play: In both Fortran and Pascal, you cannot change the value of i within the loop.
That is, the following code is illegal and will not compile:
In Fortran:
do 100 i = 1, 10
i = 20
write(*,*) 'i =', i
100 continue
In Pascal:
for i := 1 to 10 do
begin
  i := 20;
  writeln('i =', i)
end;
The lines that assign to i are not permitted. It is part of the specification of both Fortran and Pascal that the loop index is not to be assigned. (Early versions of Fortran and Pascal guaranteed this behavior. Later versions of the languages, which allowed aliases via pointers, could not.)
Compare this to a similar loop in C or C++:
for (unsigned int i = 1; i <= 10; i++)
{
  printf("%u\n", i);
}
The specifications for the C and C++ languages have no such restriction on loop indexes. (In fact, C and C++ do not have the notion of a loop index; they merely allow a variable to be declared and assigned at the beginning of the loop.)
The following code is legal in C and C++ (and does what you expect):
for (unsigned int i = 1; i <= 10; i++)
{
  i = 20;
  printf("%u\n", i);
}
My point here is not to say that Fortran and Pascal are superior to C and C++ (or that C and C++ are superior to Fortran and Pascal). My point is to show that the Museum Principle is useful.
Preventing changes to a loop index variable is the Museum Principle. The programmer can see the value of the variable, and the value does change, but the programmer cannot change the value. The programmer is constrained.
Some might chafe at the idea of such a restraint. Many have complained about the restrictions of Pascal and lauded the freedom of C. Yet over time, modern languages have adopted the restraints of Pascal, such as bounds-checking and restrictions on type conversion.
Modern languages often eliminate loop index variables by providing "for-each" loops that iterate over a collection. This feature is a stronger form of the "look but don't touch" restriction on loop index variables. One cannot complain about Fortran's limitations on loop index variables unless one also dislikes the "for-each" construct. A for-each iterator has a loop index, invisible (and untouchable!) inside.
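A small Python sketch shows the hidden index at work: re-assigning the loop variable in the body does not disturb the iteration, because the iterator keeps its own untouchable state.

```python
values = []
for i in range(1, 11):
    i = 20             # re-binds the name 'i' for this pass only...
    values.append(i)

print(len(values))     # 10 -- the loop still runs ten times
```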
For the "normal" loop (in which the index variable is not modified), there is no benefit from a prohibition of change to the index variable. (The programmer makes no attempt to change it.) It is the unusual loops, the loops which have extra logic for special cases, that benefit. Changing the loop index value is a shortcut, often serving a purpose that is not clear (and many times not documented). Preventing that shortcut forces the programmer to write code that is more explicit. A hassle in the short term, but better in the long term.
Constraints -- the right type of constraints -- are useful to programmers. The "structured programming" method was all about constraints for control structures (loops and conditionals) and the prohibition of "goto" operations. Programmers at the time complained, but looking back we can see that it was the right thing to do.
Constraints on loop index variables are also the right thing to do. Applying the Museum Principle to loop index variables will improve code and reduce errors.
Monday, August 19, 2019
Thursday, December 17, 2015
Indentation and brackets in languages
Apple has released Swift, an object-oriented language for developing applications. It may be that Swift marks the end of the C-style indentation and brace syntax.
Why are indentation and brace style important? Maybe we should start with a definition of indentation and brace style, and go from there.
In the far past, programs were punched onto cards and the syntax of the programming language reflected that. FORTRAN and COBOL languages reserved columns 1 through 6 for line numbers, column 7 for line continuation and comments, and columns 73 to 80 for card sequencing. (The last was used to sort the card deck when it was dropped and cards spilled on the floor.)
The limitations of the punch card, and the notion that one statement appears on one card (or possibly continues onto more), had a heavy influence on the syntax of the language.
Algol introduced a number of changes. It introduced the 'begin/end' keywords that were later used by Pascal and became the braces in most modern languages. It removed the importance of newlines, allowing multiple statements on a single line, and allowing a single statement to span multiple lines without special continuation markers.
The result was the syntax we have in C, C++, Java, and C# (and a bunch of other languages). Semicolons (not newlines) terminate statements. Braces group statements. Indentation doesn't matter, statements can begin in any column we desire. All of these features come from Algol. (At the beginning of this essay I referred to it as "C-style indentation and brace syntax", but the ideas really originated in Algol.)
The "Algol revolution" threw off the shackles of column-based syntax. Programmers may not have rejoiced, but they did like the new world. They wrote programs gleefully, indenting as they liked and arguing about the "one true brace style".
Some programmers wanted this style:

function_name(params) {
    statements
}

Others wanted:

function_name(params)
{
    statements
}

And yet others wanted:

function_name(params)
    {
    statements
    }
There were some programmers who wanted to use whatever style felt comfortable at the time, even if that meant that their code was inconsistent and difficult to read.
It turns out that the complete freedom of indentation and brace placement is not always a good thing. In the past decades, we have taken some steps in the direction of constraints on indentation and braces.
For years, programming teams have held code reviews. Some of these reviews look at the formatting of the code. Inconsistent indentation is flagged as something to be corrected. Variants of the lint program warn on inconsistent indentation.
Visual Studio, Microsoft's IDE for professionals, auto-formats code. It did so with the old Visual Basic. Today it auto-indents and auto-spaces C# code. It even aligns braces according to the style you choose.
The Python language uses indentation, not braces, to mark code blocks. It reports inconsistent indentation and refuses to run code until the indentation is consistent.
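A small demonstration (the source string here is deliberately mis-indented): Python rejects the code outright rather than warning about style.

```python
bad_source = (
    "if True:\n"
    "    x = 1\n"
    "      y = 2\n"    # indented differently from the line above
)

try:
    compile(bad_source, "<example>", "exec")
    result = "accepted"
except IndentationError:
    result = "refused"

print(result)          # refused
```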
The Go language uses braces to mark code blocks and it requires a specific style of braces (the first style shown above). It won't compile programs that use other styles.
We have designed our processes, our tools, and our languages to care about indentation and brace style. We are gradually moving to language syntax that uses them, that considers them significant.
As programmers, as people who read code, we want consistent indentation. We want consistent brace style. We want these things because it makes the code easier to read.
Which gets us back to Swift.
Swift has restrictions on brace style. It uses brace placement to assist in the determination of statements, visible in the syntax for if statements and for loops (and I suspect while loops). Indentation doesn't matter. Brace style does.
We now have three popular languages (Python, Go, and Swift) that care about indentation or brace style. That, I think, shows a trend. Language designers are beginning to care about these aspects of syntax, and developers are willing to work with languages that enforce them. We will not return to the harsh constraints of FORTRAN and COBOL, but we will retreat from the complete freedom of Algol, Pascal, and C. And I think that the middle ground allows us to develop and share programs effectively.
Tuesday, May 12, 2015
Cloud programs are mainframe programs, sort of
I was fortunate to start my programming career in the dawn of the age of BASIC. The BASIC language was designed with the user in mind and had several features that made it easy to use.
To truly appreciate BASIC, one must understand the languages that came before it. Comparing BASIC to JavaScript, or Swift, or Ruby makes little sense; each of those came after BASIC (long after) and built on the experience of BASIC. The advantages of BASIC are clear when compared to the languages of the time: COBOL and Fortran.
BASIC was interpreted, which meant that a program could be typed and run in one fast session. COBOL and Fortran were compiled, which meant that a program had to be typed, saved to disk, compiled, linked, and then run. With BASIC, one could change a program and re-run it; with other languages you had to go through the entire edit-save-compile-link cycle.
Where BASIC really had an advantage over COBOL and Fortran was with input. BASIC had a flexible INPUT statement that let a program read values from the user. COBOL was designed to read data from punch cards; Fortran was designed to read data from magnetic tape. Both were later modified to handle input from "the console" -- the terminal a programmer used for an interactive session -- but even with those changes, interactive programs were painful to write. Yet in BASIC it was easy to write a program that asked the user "would you like to run again?".
The interactive properties of BASIC made it a hit with microcomputer users. (Its availability, due to Microsoft's aggressive marketing, also helped.) Fortran and COBOL achieved minimal success with microcomputers, setting up the divide between "mainframe programming" (COBOL and Fortran) and "microcomputer programming" (BASIC, and later Pascal). Some rash young members of the computing field called the two divisions "ancient programming" and "modern programming".
But the division wasn't so much between mainframe and microcomputer (or old and new) as we thought. Instead, the division was between interactive and non-interactive. Microcomputers and their applications were interactive and mainframes and their applications were non-interactive. (Mainframe applications were also batch-oriented, which is another aspect.)
What does all of this history have to do with computing in the current day? Well, cloud computing is pretty modern stuff, and it is quite different from the interactive programming on microcomputers. I don't see anyone building cloud applications with BASIC or Pascal; people use Python or Ruby or Java or C#. But cloud computing is close to mainframe computing (yes, that "ancient" form of computing) in that it is non-interactive. A cloud application gets a request, processes it, and returns a response -- and that's it. There is no "would you like to run again?" option from cloud applications.
Which is not to say that today's systems are not interactive -- they are. But it is not the cloud portion of the system that is interactive. The interactivity with the user has been separated from the cloud; it lives in the mobile app on the user's phone, or perhaps in a JavaScript app in a browser.
With all of the user interaction in the mobile app (or browser app), cloud apps can go about their business and focus on processing. It's a pretty good arrangement.
But it does mean that cloud apps are quite similar to mainframe apps.
Sunday, February 1, 2015
The return of multi-platform, part one
A long time ago, there was a concept known as "multi-platform". This concept was an attribute of programs. The idea was that a single program could run on computer systems of different designs. This is not a simple thing to implement. Different systems are, well, different, and programs are built for specific processors and operating systems.
The computing world, for the past two decades, has been pretty much a Windows-only place. As such, programs on the market had to run on only one platform: Windows. That uniformity has simplified the work of building programs. (To anyone involved in the creation of programs, the idea that building programs is an easy task may be hard to believe. But I'm not claiming that building programs for a single platform is simple -- I'm claiming that it is simpler than building programs for multiple platforms.)
Programs require a user interface (or an API), processing, access to memory, and access to storage devices. Operating systems provide many of those services, so instead of tailoring a program to a specific processor, memory, and input-output devices, one can tailor it to the operating system. Thus we have programs that are made for Windows, or for MacOS, or for Linux.
If we want a program to run on multiple platforms, we need it to run on multiple operating systems. So how do we build a program that can run on multiple operating systems? We've been working on answers for a number of years. Decades, actually.
The early programming languages FORTRAN and COBOL were designed for computers from different manufacturers. (Well, COBOL was. FORTRAN was an IBM creation that was flexible enough to implement on non-IBM systems.) They were standard, which meant that a program written in FORTRAN could be compiled and run on an IBM system, and compiled and run on a system from another vendor.
The "standard language" solution has advantages and disadvantages. It requires a single language standard and a set of compilers for each "target" platform. For COBOL and FORTRAN, the compiler for a platform was (generally) made by the platform vendor. The hardware vendors had incentives to "improve" or "enhance" their compilers, adding features to the language. The idea was to get customers to use one of their enhancements; once they were "hooked" it would be hard to move to another vendor. So the approach was less "standard language" and more "standard language with vendor-specific enhancements", or not really a standard.
The C and C++ languages overcame the problem of vendor enhancements with strong standards committees. They prevented vendors from "improving" languages by creating a "floor equals ceiling" standard which prohibited enhancements. For C and C++, a compliant compiler must do exactly what the standard says, and no more than that.
The more recent programming languages Java, Perl, Python, and Ruby use a different approach. They each have run-time engines that interpret or compile the code. Unlike the implementations with FORTRAN and COBOL, the implementations of these later languages are not provided by the hardware vendors or operating system vendors. Instead, they are provided by independent organizations who are not beholden to vendors.
The result is that we now have a set of languages that let us write programs for multiple platforms. We can write a Java program and run it on Windows, or MacOS, or Linux. We can do the same with Perl. And with Python. And with... well, you get the idea.
Programs for multiple platforms weaken the draw of any one operating system or hardware. If my programs are written in Visual Basic, I must run them on Windows. But if they are written in Java, I can run them on any platform.
With the fragmentation of the tech world and the rise of alternative platforms, a multi-platform program is a good thing. I expect to see more of them.
Sunday, August 17, 2014
Reducing the cost of programming
Different programming languages have different capabilities. And not surprisingly, different programming languages have different costs. Over the years, we have found ways of reducing those costs.
Costs include infrastructure (disk space for compiler, memory) and programmer training (how to write programs, how to compile, how to debug). Notice that the load on the programmer can be divided into three: infrastructure (editor, compiler), housekeeping (declarations, memory allocation), and business logic (the code that gets stuff done).
Symbolic assembly code was better than machine code. In machine code, every instruction and memory location must be laid out by the programmer. With a symbolic assembler, the computer did that work.
COBOL and FORTRAN reduced cost by letting the programmer not worry about the machine architecture, register assignment, and call stack management.
BASIC (and time-sharing) made editing easy, eliminated compiling, and made running a program easy. Results were available immediately.
Today we are awash in programming languages. The big ones today (C, Java, Objective C, C++, BASIC, Python, PHP, Perl, and JavaScript -- according to Tiobe) are all good at different things. That is perhaps not a coincidence. People pick the language best suited to the task at hand.
Still, it would be nice to calculate the cost of the different languages. Or if numeric metrics are not possible, at least rank the languages. Yet even that is difficult.
One can easily state that C++ is more complex than C, and therefore conclude that programming in C++ is more expensive than C. Yet that's not quite true. Small programs in C are easier to write than equivalent programs in C++. Large programs are easier to write in C++, since the ability to encapsulate data and group functions into classes helps one organize the code. (Where 'small' and 'large' are left to the reader to define.)
Some languages are compiled and some are interpreted, and one can argue that a separate step to compile is an expense. (It certainly seems like an expense when I am waiting for the compiler to finish.) Yet the languages with compilers (C, C++, Java, C#, Objective-C) all have static typing, which means that the editor built into an IDE can provide information about variables and functions. When editing a program written in one of the interpreted languages, on the other hand, one does not have that help from the editor. The interpreted languages (Perl, Python, PHP, and JavaScript) have dynamic typing, which means that the type of a variable (or function) is not constant but can change as the program runs.
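The dynamic-typing point is easy to demonstrate in Python, where the type travels with the value rather than with the name:

```python
x = 42
print(type(x).__name__)    # int

x = "forty-two"            # the same name, re-bound to a value of another type
print(type(x).__name__)    # str
```

An editor cannot know, without running the program, which type 'x' holds at a given line -- which is exactly the help that static typing gives an IDE.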
Switching from an "expensive" programming language (let's say C++) to a "reduced cost" programming language (perhaps Python) is not always possible. Programs written in C++ perform better. (On one project, the C++ program ran for several hours; the equivalent program in Perl ran for several days.) C and C++ let one have access to the underlying hardware, something that is not possible in Java or C# (at least not without some add-in trickery, usually involving... C++.)
The line between "cost of programming" and "best language" quickly blurs, and nailing down the costs for the different dimensions of programming (program design, speed of coding, speed of execution, ability to control hardware) get in our way.
In the end, I find that it is easy to rank languages in the order of my preference rather than in an unbiased scheme. And even my preferences are subject to change, given the nature of the project. (Is there existing code? What are other team members using? What performance constraints must we meet?)
Reducing the cost of programming is really about trade-offs. What capabilities do we desire, and what capabilities are we willing to cede? To switch from C++ to C# may mean faster development but slower performance. To switch from PHP to Java may mean better organization of code through classes but slower development. What is it that we really want?
Costs include infrastructure (disk space for compiler, memory) and programmer training (how to write programs, how to compile, how to debug). Notice that the load on the programmer can be divided into three: infrastructure (editor, compiler), housekeeping (declarations, memory allocation), and business logic (the code that gets stuff done).
Symbolic assembly code was better than machine code. In machine code, every instruction and memory location must be laid out by the programmer. With a symbolic assembler, the computer did that work.
COBOL and FORTRAN reduced cost by letting the programmer not worry about the machine architecture, register assignment, and call stack management.
BASIC (and time-sharing) made editing easy, eliminated compiling, and made running a program easy. Results were available immediately.
Today we are awash in programming languages. The big ones today (C, Java, Objective C, C++, BASIC, Python, PHP, Perl, and JavaScript -- according to Tiobe) are all good at different things. That is perhaps not a coincidence. People pick the language best suited to the task at hand.
Still, it would be nice to calculate the cost of the different languages. Or if numeric metrics are not possible, at least rank the languages. Yet even that is difficult.
One can easily state that C++ is more complex than C, and therefore conclude that programming in C++ is more expensive than C. Yet that's not quite true. Small programs in C are easier to write than equivalent programs in C++. Large programs are easier to write in C++, since the ability to encapsulate data and group functions into classes helps one organize the code. (Where 'small' and 'large' are left to the reader to define.)
Some languages are compiled and some are interpreted, and one can argue that a separate step to compile is an expense. (It certainly seems like an expense when I am waiting for the compiler to finish.) Yet languages with compilers (C, C++, Java, C#, Objective-C) all have static typing, which means that the editor built into an IDE can provide information about variables and functions. When editing a program written in one of the interpreted languages, on the other hand, one does not have that help from the editor. The interpreted languages (Perl, Python, PHP, and JavaScript) have dynamic typing, which means that the type of a variable (or function) is not constant but can change as the program runs.
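A minimal sketch of what dynamic typing means in practice, using Python (one of the interpreted languages named above):

```python
# In Python, a name is bound to a value; the type travels with the
# value, not the name, so the same name can be re-bound to a value
# of a different type while the program runs.
x = 1
assert type(x) is int

x = "hello"          # the same name now refers to a string
assert type(x) is str

# A statically typed language (C, Java, C#) would reject the second
# assignment at compile time; here it is perfectly legal.
```

This is exactly why the editor can offer less help: in general it cannot know what type `x` holds at any given line until the program actually runs.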
Switching from an "expensive" programming language (let's say C++) to a "reduced cost" programming language (perhaps Python) is not always possible. Programs written in C++ perform better. (On one project, the C++ program ran for several hours; the equivalent program in Perl ran for several days.) C and C++ let one have access to the underlying hardware, something that is not possible in Java or C# (at least not without some add-in trickery, usually involving... C++.)
The line between "cost of programming" and "best language" quickly blurs, and nailing down the costs for the different dimensions of programming (program design, speed of coding, speed of execution, ability to control hardware) gets in our way.
In the end, I find that it is easy to rank languages in the order of my preference rather than in an unbiased scheme. And even my preferences are subject to change, given the nature of the project. (Is there existing code? What are other team members using? What performance constraints must we meet?)
Reducing the cost of programming is really about trade-offs. What capabilities do we desire, and what capabilities are we willing to cede? To switch from C++ to C# may mean faster development but slower performance. To switch from PHP to Java may mean better organization of code through classes but slower development. What is it that we really want?
Wednesday, March 19, 2014
The fecundity of programming languages
Some programming languages are more rigorous than others. Some programming languages are said to be more beautiful than others. Some programming languages are more popular than others.
And some programming languages are more prolific than others, in the sense that they are the basis for new programming languages.
Algol, for example, influenced the development of Pascal and C, which in turn influenced Java, C# and many others.
FORTRAN influenced BASIC, which in turn gave us CBASIC, Visual Basic, and True Basic.
The Unix shell led to Awk and Perl, which influenced Python and Ruby.
But COBOL has had little influence on languages. Yes, it has been revised, including an object-oriented version. Yes, it guided the PL/I and ABAP languages. But outside of those business-specific languages, COBOL has had almost no effect on programming languages.
Why?
I'm not certain, but I have two ideas: COBOL was an early language, and it was designed for commercial uses.
COBOL is one of the earliest languages, dating back to the 1950s. Other languages of the time include FORTRAN and LISP (and oodles of forgotten languages like A-0 and FLOWMATIC). We had no experience with programming languages. We didn't know what worked and what didn't work. We didn't know which language features were useful to programmers. Since we didn't know, we had to guess.
For a near-blind guess, COBOL was pretty good. It has been useful in close to its original form for decades, a shark in the evolution of programming languages: so well adapted to its niche that it scarcely needed to change.
The other reason we didn't use COBOL to create other languages is that it was commercial. It was designed for business transactions. While it ran on general-purpose computers, COBOL was specific to financial applications, and the people who would tinker and build new languages were working in other fields and with computers other than business mainframes.
The tinkerers were using minicomputers (and later, microcomputers). These were not in the financial setting but in universities, where people were more willing to explore new languages. Minicomputers from DEC were often equipped with FORTRAN and BASIC. Unix computers were equipped with C. Microcomputers often came with BASIC baked in, because it was easier for individuals to use.
COBOL's success in the financial sector may have doomed it to stagnancy. Corporations (especially banks and insurance companies) lean conservative with technology and programming; they prefer to focus on profits and not research.
I see a similar future for SQL. As a data description and access language, it does an excellent job. But it is very specific and cannot be used outside of that domain. The up-and-coming NoSQL databases avoid SQL in part, I think, because the SQL language is tied to relational algebra and structured data. I see no languages (well, no popular languages) derived from SQL.
I think the languages that will influence or generate new languages will be those which are currently popular, easily learned, and easily used. They must be available to the tinkerers of today; those tinkerers will be writing the languages of the future. Tinkerers have limited resources, so less expensive languages have an advantage. Tinkerers are also a finicky bunch, with only a few willing to work with ornery products (or languages).
Considering those factors, I think that future languages will come from a set of languages in use today. That set includes C, C#, Java, Python, and JavaScript. I omit a number of candidates, including Perl, C++, and possibly your favorite language. (I consider Perl and C++ difficult languages; tinkerers will move to easier languages. I would like to include FORTH in the list, but it too is a difficult language.)
Monday, May 20, 2013
Where do COBOL programmers come from?
In the late Twentieth Century, COBOL was the standard language for business applications. There were a few other contenders (IBM's RPG, assembly language, and DEC's DIBOL) but COBOL was the undisputed king of the business world. If you were running a business, you used COBOL.
If you worked in the data processing shop of a business, you knew COBOL and programmed with it.
If you were in school, you had a pretty good chance of being taught COBOL. Not everywhere, and not during the entire second half of the century. I attended an engineering school; we learned FORTRAN, Pascal, and assembly language. (We also used the packages SPSS and CSMP.)
Schools have, for the most part, stopped teaching COBOL. A few do, but most moved on to C++, or Java, or C#. A number are now teaching Python.
Businesses have lots of COBOL code. Lots and lots of it. And they have no reason to convert that code to C++, or Java, or C#, or the "flavor of the month" in programming languages. Business code is often complex and working business code is precious. One modifies the code only when necessary, and one converts a system to a new language only at the utmost need.
But that code, while precious, does have to be maintained. Businesses change and those changes require fixes and enhancements to the code.
Those changes and enhancements are made by COBOL programmers.
Of which very few are being minted these days. Or for the past two decades.
Which means that COBOL programmers are, as a resource, dwindling.
Now, I recognize that the production of COBOL programmers has not ceased. There are three sources that I can name with little thought.
First are the schools (real-life and on-line) that offer courses in COBOL. Several colleges still teach it, and several on-line colleges offer it.
Second are the offshore programming companies. Talent is available through outsourcing.
Third are existing programmers who learn COBOL. A programmer who knows Visual Basic and C++, for example, may choose to learn COBOL (perhaps through an on-line college).
Yet I believe that, in any given year, the number of new COBOL programmers is less than the number of retiring COBOL programmers. Which means that the talent pool is now at risk, and therefore business applications may be at risk.
For many years businesses relied on the ubiquitous nature of COBOL to build their systems. I'm sure that the managers considered COBOL to be a "safe" language: stable and reliable for many years. And to be fair, it was. COBOL has been a useful language for almost half a century, a record that only FORTRAN can challenge.
The dominance of COBOL drove a demand for COBOL programmers, which in turn drove a demand for COBOL training. Now, competing languages are pulling talent out of the "COBOL pool", starving the training. Can businesses be far behind?
If you are running a business, and you rely on COBOL, you may want to think about the future of your programming talent.
* * * * *
Such an effect is not limited to COBOL. It can happen to any popular language. Consider Visual Basic, a dominant language in Windows shops in the 1990s. It has fallen out of favor, replaced by C#. Or consider C++, which like COBOL has a large base of installed (and working) code. It, too, is falling out of favor, albeit much more slowly than Visual Basic or COBOL.
Monday, October 24, 2011
Steve Jobs, Dennis Ritchie, John McCarthy, and Daniel McCracken
We lost four significant people from the computing world this year.
Steve Jobs needed no introduction. Everyone knew him as that slightly crazy guy from Apple, the one who would show off new products while always wearing a black mock-turtleneck shirt.
Dennis Ritchie was well-known by the geeks. Articles comparing him to Steve Jobs were wrong: Ritchie co-created Unix and C somewhat before Steve Jobs founded Apple. Many languages (C++, Java, C#) are descendants of C. Linux, Android, Apple iOS, and Apple OS X are descendants of Unix.
John McCarthy was known by the true geeks. He did pioneering work in artificial intelligence, and created a language called LISP. Modern languages (Python, Ruby, Scala, and even C# and C++) are beginning to incorporate ideas from the LISP language.
Daniel McCracken is the unsung hero of the group. He is unknown even among true geeks. His work predates the others (except McCarthy), and had a greater influence on the industry than possibly all of them. McCracken wrote books on FORTRAN and COBOL, books that were understandable and comprehensive. He made it possible for the very early programmers to learn their craft -- not just the syntax but the craft of programming.
The next time you write a "for" loop with the control variable named "i", or see a "for" loop with the control variable named "i", you can thank Daniel McCracken. It was his work that set that convention and taught the first set of programmers.
Thursday, January 20, 2011
Fortran in any language
One of the witty remarks about programming goes: I can write Fortran in any language!
The comment is usually made during a discussion of programming languages, usually new languages. It is generally used to indicate the ability to limit the use of a new language to the features of an older language, such as using a new C++ compiler but writing the constructs of C. Since C++ is very nearly a superset of C, almost all C programs work. One can avoid the hard work of learning C++ and write the comfortable syntax of C while claiming to write C++ code. (One can make the claim, but savvy geeks will quickly spot the fraud.)
While witty, the comment is not quite true. And I think that it is a bit unfair that we keep picking on Fortran.
The idea, perhaps, is that Fortran is so basic, so rudimentary, so primitive, that every language has its features (plus a whole lot more that makes it a different language). Thus, BASIC is Fortran with better looping and input-output, and Pascal is Fortran with pointers and better structure, and one can write "primitive" Fortran-like programs in either BASIC or Pascal. The comment "I can write Fortran in any language" is a condensation of "I can write programs in any language that use a limited subset of that language which is very close to Fortran".
I think that this verbal abuse of Fortran is a bit undeserved. Fortran may be many things, and it may even be primitive, but it is not the parent object of modern programming languages, with descendants that have everything of Fortran and extra bits. Fortran had a lot that other languages abandoned.
For example, the early versions of FORTRAN (FORTRAN II and FORTRAN IV) did not require variable declarations. Pascal and C (and Java and C#) require all variables to be declared in advance. Python and Ruby have gotten away from this, returning to the original FORTRAN style. Perl, oddly, does not require declarations unless you use the 'strict' module, which changes Perl to require them.
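Early FORTRAN did not merely skip declarations; it inferred a type from the first letter of a name: I through N meant INTEGER, everything else REAL. A sketch of that rule, written here in Python for illustration (the function name is my own, not part of any library):

```python
def fortran_implicit_type(name):
    """Return the type early FORTRAN would assign to an undeclared
    variable: names beginning with I..N are INTEGER, the rest REAL."""
    return "INTEGER" if name[0].upper() in "IJKLMN" else "REAL"

# The famous integer loop counters fall out of this rule:
assert fortran_implicit_type("i") == "INTEGER"
assert fortran_implicit_type("n") == "INTEGER"
assert fortran_implicit_type("x") == "REAL"     # 'x' is outside I..N
```

The rule meant a typo silently created a fresh variable of a guessed type, which is part of why later languages demanded declarations.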
FORTRAN also had the GOTO statement, which was kept in C and even in C++ (in a reduced form). Pascal eliminated the GOTO, and it is not to be found in Java or other modern languages.
One interesting (and mind-bending) construct in FORTRAN was the arithmetic IF. Instead of the logical IF (IF (condition) THEN ... ELSE ...), FORTRAN used an arithmetic expression construct: IF (expression) LABEL1, LABEL2, LABEL3. Execution was routed to one of the labels based on the value of the expression: negative values to the first label, zero values to the second, and positive values to the third. Only with the introduction of FORTRAN 77 did we see the IF...THEN...ELSE construct.
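The arithmetic IF is easier to see with a sketch. Here its behavior is emulated in Python (the function and the label values are mine, purely for illustration):

```python
def arithmetic_if(expression, neg_label, zero_label, pos_label):
    """Emulate FORTRAN's 'IF (expression) L1, L2, L3': branch to the
    first label when the expression is negative, the second when it is
    zero, and the third when it is positive."""
    if expression < 0:
        return neg_label
    if expression == 0:
        return zero_label
    return pos_label

# Roughly what 'IF (X - 10) 100, 200, 300' did in FORTRAN:
x = 7
assert arithmetic_if(x - 10, 100, 200, 300) == 100   # 7 - 10 is negative
```

Note that the construct is a three-way branch on sign, with no way to fall through: every arithmetic IF had to name a destination for all three cases.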
FORTRAN has changed over the years. It started with a pretty good idea of a high-level language, and morphed as we figured out what we really wanted in a language. It's not C# or Ruby, and it won't be. But it has changed more than any other language (with the possible exception of Visual Basic).
Its resilience is a lesson to us all.