Monday, September 26, 2022

The end of C++

The language wars are back!

Well, one war -- between C++ and Rust.

Perhaps "war" is too strong a word. A better description is "a discussion".

Programmers do like to participate in language wars. Or they used to, back in the days before open source and the internet. Back then, a programmer worked with the language chosen by his employer, or a language for which he (and programmers were overwhelmingly men) had spent money to acquire the compiler or interpreter. One's programming language was fixed, either by company mandate or by finances. That constraint bred a false pride in one's programming language.

Open source and the internet made it possible for programmers to switch languages easily. They could try a new language and change if they found it better than their current one. With the ability to change, programmers didn't need false pride, and didn't have to argue the merits of a language that they were most likely unhappy with. The language wars faded.

Until now. There were two events this past week, both concerning programming languages.

The first involved the Linux kernel. Linus Torvalds, the chief maintainer for the Linux kernel, announced that he would allow code written in Rust (an up-and-coming programming language) to be part of the kernel. (The kernel, up to now, has been written exclusively in C.) This announcement angered the C++ advocates, who would have preferred that language. Various arguments bounced across the internet, extolling the virtues of each. (Mostly "Rust is a safe language, designed to prevent many mistakes that can happen in C and in C++." and "C++ is a time-tested language with a mature toolset and a large base of experienced developers.")

It wasn't a war, or even a battle, but a discussion with lots of emotion.

The second event was an announcement from Microsoft's CTO for Azure. He stated that organizations and individuals should stop choosing C++ for new projects, and instead pick other languages. (I don't think he listed the languages, but I suspect he would prefer languages supported by the Azure platform.)

That announcement received much less interest. But still, it counts as a volley in the language disputes.

(I suspect that the Microsoft CTO is right, but for a different reason: the size of applications. C++ was designed for large systems, and today's cloud-based services are much smaller. They don't need C++; they need a language larger than C, smaller than C++, and safer than both.)

I find the timing of these two announcements interesting. It may be that we are seeing the beginning of the end for C++.

It may be that historians, in some distant future, draw a line and say "Here, here in 2022, is where C++ began its decline. This is the time that the IT world started to move away from C++."

I don't expect C++ to disappear. Some programming languages have "disappeared" in that they are used for nothing outside of a few museum exhibits and systems run by die-hard fans. Programming languages such as FLOW-MATIC, Neliac, BASIC, and Modula-2 are all but unused in the modern world.

Yet other old languages continue to be used: COBOL, FORTRAN, and even RPG are running systems today. I expect C++ to join their ranks. It will remain, it will continue to do useful work, and it will appear in popularity surveys. But it won't be at the top.


Friday, May 17, 2019

Procedures and functions are two different things

The programming language Pascal had many good ideas. Many of those ideas have been adopted by modern programming languages. One idea that hasn't been adopted is the separation of functions and procedures.

Some definitions are in order. In Pascal, a function is a subroutine that accepts input parameters, can access variables in its scope (and containing scopes), performs some computations, and returns a value. A procedure is similar: it accepts input parameters, can access variables in its scope and containing scopes, performs calculations, and ... does not return a value.

Pascal has the notion of functions and a separate notion of procedures. A function is a function and a procedure is a procedure, and the two are different. A function can be used (in early Pascal, must be used) in an expression. It cannot stand alone.

A procedure, in contrast, is a computational step in a program. It cannot be part of an expression. It is a single statement, although it can be part of an 'if' or 'while' statement block.

Functions and procedures have different purposes, and I believe that the creators of Pascal envisioned functions to be unable to change variables outside of themselves. Procedures, I believe, were intended to change variables outside of their immediate scope. In C++, a Pascal-style function would be a function that is declared 'const', and a procedure would be a function that returns 'void'.
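
To make that mapping concrete, here is a minimal sketch in C++ (my own example, not from any particular project). The 'const' member function plays the role of a Pascal function; the 'void' member function plays the role of a procedure.

    #include <iostream>

    // A minimal sketch of the function/procedure split, in C++.
    // The class and its names (Account, balance, deposit) are invented
    // for illustration.
    class Account {
    public:
        explicit Account(double opening) : balance_(opening) {}

        // "Function" in the Pascal sense: declared 'const', it computes
        // and returns a value, and changes nothing.
        double balance() const { return balance_; }

        // "Procedure" in the Pascal sense: it returns 'void' and exists
        // only for its side effect on the object's state.
        void deposit(double amount) { balance_ += amount; }

    private:
        double balance_;
    };

    int main() {
        Account account(100.0);
        account.deposit(50.0);                   // a procedure stands alone as a statement
        std::cout << account.balance() << "\n";  // a function appears inside an expression
        return 0;
    }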

This arrangement is different from the C idea of functions. C combines the idea of function and procedure into a single 'function' construct. A function may be designed to return a value, or it may be designed to return nothing. A function may change variables outside of its scope, but it doesn't have to. (It may or may not have "side effects".)

In the competition among programming languages, C won big early on, and Pascal (or rather, the ideas in Pascal) has gained acceptance slowly. The C notion of function has been carried forward by other popular languages: C++, Java, C#, Python, Ruby, and even Go.

I remember quite clearly learning about Pascal (many years ago) and feeling that C was superior to Pascal due to its single approach. I sneered (mentally) at Pascal's split between functions and procedures.

I have come to regret those feelings, and now see the benefit of separating functions and procedures. When building (or maintaining) large-ish systems in modern languages (C++, C#, Java, Python), I have created functions that follow the function/procedure split. These languages force one to write functions -- there is no construct for a procedure -- yet I designed some functions to return values and others to not return values. The value-returning functions I made 'const' when possible, and avoided side effects. The functions with side effects I designed to not return values. In sum, I built functions and procedures, although the compiler uses only the 'function' construct.

The future may hold programming languages that provide functions and procedures as separate constructs. I'm confident that we will see languages that have these two ideas. Here's why:

First, there is the now-popular class of programming languages called "functional languages". These include ML, Erlang, Haskell, and F#, to name a few. These functional languages use Pascal's original idea of the function: a block of code that performs a calculation, has no side effects, and returns a value. Language designers have already re-discovered the idea of the "pure function".

Second, most ideas from Pascal have been implemented in modern languages. Bounds-checking for arrays. Structured programming. Limited conversion of values from one type to another. The separation of functions and procedures is one more of these ideas.

The distinction between functions and procedures is one more concept that Pascal got right. I expect to see it in newer languages, perhaps over the next decade. The enthusiasts of functional programming will realize that pure functions are not sufficient and that they need procedures. We'll then see variants of functional languages that include procedures, with purists holding on to procedure-less languages. I'm looking forward to a formal division of labor between functions and procedures; it has worked well for me, and formal recognition would help me convey the idea to other programmers.

Sunday, August 17, 2014

Reducing the cost of programming

Different programming languages have different capabilities. And not surprisingly, different programming languages have different costs. Over the years, we have found ways of reducing those costs.

Costs include infrastructure (disk space for the compiler, memory) and programmer training (how to write programs, how to compile, how to debug). Notice that the load on the programmer can be divided into three parts: infrastructure (editor, compiler), housekeeping (declarations, memory allocation), and business logic (the code that gets stuff done).

Symbolic assembly code was better than machine code. In machine code, every instruction and memory location must be laid out by the programmer. With a symbolic assembler, the computer did that work.

COBOL and FORTRAN reduced cost by letting the programmer not worry about the machine architecture, register assignment, and call stack management.

BASIC (and time-sharing) made editing easy, eliminated compiling, and made running a program easy. Results were available immediately.

Today we are awash in programming languages. The big ones today (C, Java, Objective-C, C++, BASIC, Python, PHP, Perl, and JavaScript -- according to TIOBE) are all good at different things. That is perhaps not a coincidence. People pick the language best suited to the task at hand.

Still, it would be nice to calculate the cost of the different languages. Or if numeric metrics are not possible, at least rank the languages. Yet even that is difficult.

One can easily state that C++ is more complex than C, and therefore conclude that programming in C++ is more expensive than C. Yet that's not quite true. Small programs in C are easier to write than equivalent programs in C++. Large programs are easier to write in C++, since the ability to encapsulate data and group functions into classes helps one organize the code. (Where 'small' and 'large' are left to the reader to define.)
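
As a rough illustration of what that encapsulation buys (the class and its names here are hypothetical), a C++ class keeps the data and every function allowed to touch it in one place:

    #include <string>
    #include <vector>

    // A hypothetical sketch of the organization that classes provide:
    // the data and every function allowed to touch it live in one place.
    class Inventory {
    public:
        void add(const std::string& name, int count) {
            names_.push_back(name);
            counts_.push_back(count);
        }

        int total() const {
            int sum = 0;
            for (int count : counts_) sum += count;
            return sum;
        }

    private:
        // In a C program these would be globals (or passed to every
        // function), and any code anywhere could modify them.
        std::vector<std::string> names_;
        std::vector<int> counts_;
    };

    int main() {
        Inventory parts;
        parts.add("bolt", 100);
        parts.add("nut", 80);
        return parts.total() == 180 ? 0 : 1;  // exits 0 when the total is as expected
    }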

Some languages are compiled and some are interpreted, and one can argue that a separate compile step is an expense. (It certainly seems like an expense when I am waiting for the compiler to finish.) Yet the compiled languages (C, C++, Java, C#, Objective-C) all have static typing, which means that the editor built into an IDE can provide information about variables and functions. When editing a program written in one of the interpreted languages, one does not have that help from the editor. The interpreted languages (Perl, Python, PHP, and JavaScript) have dynamic typing, which means that the type of a variable (or function) is not fixed but can change as the program runs.
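
Here is a small, invented example of what static typing gives the editor. Because every name has a declared type, the IDE can flag a mistaken call while you type it; in a dynamically typed language, the equivalent mistake surfaces only when the line runs.

    #include <iostream>
    #include <string>

    // An invented example: with static typing, the types below are fixed
    // at compile time, so the editor can check every call as it is typed.
    double average(double total, int count) {
        return total / count;
    }

    int main() {
        std::string label = "scores";
        double avg = average(87.5, 2);           // fine: the types match the signature
        std::cout << label << ": " << avg << "\n";

        // The next line would be rejected at compile time -- an IDE flags
        // it while you type. In a dynamically typed language the same
        // mistake surfaces only when the line runs.
        // avg = average(label, 2);              // error: no conversion from string

        return 0;
    }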

Switching from an "expensive" programming language (let's say C++) to a "reduced cost" programming language (perhaps Python) is not always possible. Programs written in C++ perform better. (On one project, the C++ program ran for several hours; the equivalent program in Perl ran for several days.) C and C++ let one have access to the underlying hardware, something that is not possible in Java or C# (at least not without some add-in trickery, usually involving... C++.)

The line between "cost of programming" and "best language" quickly blurs, and the different dimensions of programming (program design, speed of coding, speed of execution, ability to control hardware) make it hard to nail down the costs.

In the end, I find that it is easy to rank languages in the order of my preference rather than in an unbiased scheme. And even my preferences are subject to change, given the nature of the project. (Is there existing code? What are other team members using? What performance constraints must we meet?)

Reducing the cost of programming is really about trade-offs. What capabilities do we desire, and what capabilities are we willing to cede? To switch from C++ to C# may mean faster development but slower performance. To switch from PHP to Java may mean better organization of code through classes but slower development. What is it that we really want?

Monday, May 19, 2014

The shift to cloud is bigger than we think

We've been using operating systems for decades. While they have changed over the years, they have offered a consistent set of features: time-slicing of the processor, memory allocation and management, device control, file systems, and interrupt handling.

Our programs ran "under" (or "on top of") an operating system. Our programs were also fussy -- they would run on one operating system and only that operating system. (I'm ignoring the various emulators that have come and gone over time.)

The operating system was the "target", it was the "core", it was the sun around which our programs orbited.

So it is rather interesting that the shift to cloud computing is also a shift away from operating systems.

Not that cloud computing is doing away with operating systems. Cloud computing coordinates the activities of multiple, usually virtualized, systems, and those systems run operating systems. What changes in cloud computing is the programming target.

Instead of a single computer, a cloud system is composed of multiple systems: web servers, database servers, and message queues, typically. While those servers and queues must run on computers (with operating systems), we don't care about them. We don't insist that they run any specific operating system (or even use a specific processor). We care only that they provide the necessary services.

In cloud computing, the notion of "operating system" fades into the infrastructure.

As cloud programmers, we don't care if our web server is running Windows. Nor do we care if it is running Linux. (The system administrators do care, but I am taking a programmer-centric view.) We don't care which operating system manages our message queues.

The level of abstraction for programmers has moved from operating system to web services.
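
As a sketch of that new target, consider the fragment below, which uses the (real) libcurl library against a made-up URL. It names a service endpoint -- not a machine, an operating system, or a processor -- on either end of the request.

    #include <cstdio>
    #include <curl/curl.h>

    // A sketch of programming against a service rather than an operating
    // system. libcurl is a real library; the URL is made up.
    int main() {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL* curl = curl_easy_init();
        if (curl == nullptr) return 1;

        // We name a service endpoint -- not a machine, an OS, or a processor.
        curl_easy_setopt(curl, CURLOPT_URL, "https://example.com/api/status");
        CURLcode result = curl_easy_perform(curl);  // the response body goes to stdout by default
        if (result != CURLE_OK)
            std::fprintf(stderr, "request failed: %s\n", curl_easy_strerror(result));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return result == CURLE_OK ? 0 : 1;
    }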

That is a significant change. It means that programmers can focus on a higher level of work.

Hardware-tuned programming languages like C and C++ will become less important. Not completely forgotten, but used only by the specialists. Languages such as Python, Ruby, and Java will be popular.

Operating systems will be less important. They will be ignored by the application level programmers. The system architects and sysadmins, who design and maintain the cloud infrastructure, will care a lot about operating systems. But they will be a minority.

The change to services is perhaps not surprising. We long ago shifted away from processor-specific code, burying that work in our compilers. COBOL and FORTRAN, the earliest languages, were designed to run on different processors. Microsoft insulated us from the Windows API with MFC and later the .NET framework. Java separated us from the processor with its virtual machine. Now we take the next step and bury the operating system inside of web services.

Operating systems won't go away. But they will become less visible, less important in conversations and strategic plans. They will be more of a commodity and less of a strategic advantage.

Tuesday, April 8, 2014

Java and C# really derive from Pascal

In the history of programming, the "C versus Pascal" debate was a heated and sometimes unpleasant discussion on language design. It was fought over the capabilities of programming languages.

The Pascal side advocated restrictive code, what we today call "type safe" code. Pascal was designed as a teaching language, a replacement for BASIC that contained the ideas of structured programming.

The C side advocated liberal code, what we today call "unsafe code". C was designed not to teach but to get the job done, specifically systems programming jobs that required access to the hardware.

The terms "type safe" and "unsafe code" are telling, and they give away the eventual resolution. C won over Pascal in the beginning, at kept its lead for many years, but Pascal (or rather the ideas in Pascal) have been gaining ground. Even the C and C++ standards have been moving towards the restrictive design of Pascal.

Notable ideas in Pascal included:

  • Structured programming (blocks, 'while' and 'repeat' loops, 'case' selection, limited goto)
  • Array data type
  • Array index checking at run-time
  • Pointer data type
  • Strong typing, including pointers
  • Overflow checking on arithmetic operations
  • Controlled conversions from one type to another
  • A constant qualifier for variables
  • Standard features across implementations

Notable ideas in K&R C:

  • Structured programming (blocks, 'while' and 'do-while' loops, 'switch/case' flows, limited goto)
  • Array data type (sort of -- really a syntactic trick involving pointers)
  • No checking of array index (at compile-time or run-time)
  • Pointer data type
  • Strong typing, but not for pointers
  • No overflow checking
  • Free conversions from one type to another
  • No 'const' qualifier
  • Many features were implementation-dependent

For programmers coming from BASIC (or FORTRAN), the structured programming concepts, common to both C and Pascal, were appealing. Yet the other aspects of the two languages were polar opposites.

It's hard to define a clear victor in the C/Pascal war. Pascal got a boost with the UCSD p-System and a large boost with the Turbo Pascal IDE. C was big in the Unix world and also big for programming Windows. Today, Pascal is viewed as a legacy language while C and its derivatives C++, Java, and C# enjoy popularity.

But if C won in name, Pascal won in spirit. The early, liberal K&R C has been "improved" with later standards that limit the ability to implicitly convert data types. K&R C was also enhanced with the 'const' keyword for variables. C++ introduced classes which allow programmers to build their own data types. So do Java and C#, and they eliminate pointers, check array indexes, and standardize operations across platforms. Java and C# are closer to the spirit of Pascal than C.
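
To see how far C++ itself has drifted toward Pascal, here is a small sketch (my own example): a 'const' value, a run-time-checked array access, and a conversion that must be spelled out.

    #include <iostream>
    #include <stdexcept>
    #include <vector>

    // My own example of Pascal's ideas resurfacing in modern C++:
    // a constant, a checked array access, and an explicit conversion.
    int main() {
        const double rate = 0.05;          // 'const': any write is a compile-time error

        std::vector<int> scores = {90, 85, 77};
        try {
            std::cout << scores.at(10) << "\n";   // .at() checks the index at run time
        } catch (const std::out_of_range& e) {
            std::cout << "index checked and rejected: " << e.what() << "\n";
        }

        int cents = static_cast<int>(rate * 100.0);  // the conversion must be spelled out
        std::cout << cents << "\n";
        return 0;
    }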

Yes, there are differences. Java and C# use braces to define blocks, where Pascal used 'BEGIN' and 'END'. Pascal declares variables with the name-and-then-type sequence, while C, Java, and C# use the type-and-then-name sequence. But if you look at the features, especially those Pascal features criticized as reducing performance, you see them in Java and C#.

We had many debates about the C and Pascal programming languages. In the end, it was not the "elegance" of a language or the capabilities of the IDE that solved the argument. Advances in technology neutralized many of our objections. Faster processors and improvements in compilers eliminated the need for speed tricks and allowed for the "performance killing" features in Pascal. And without realizing it, we adopted them, slowly, quietly, and with new names. We didn't adopt the name Pascal, we didn't adopt the syntax of Pascal, but we did adopt the features of Pascal.

Wednesday, March 19, 2014

The fecundity of programming languages

Some programming languages are more rigorous than others. Some programming languages are said to be more beautiful than others. Some programming languages are more popular than others.

And some programming languages are more prolific than others, in the sense that they are the basis for new programming languages.

Algol, for example, influenced the development of Pascal and C, which in turn influenced Java, C# and many others.

FORTRAN influenced BASIC, which in turn gave us CBASIC, Visual Basic, and True Basic.

The Unix shell led to Awk and Perl, which influenced Python and Ruby.

But COBOL has had little influence on languages. Yes, it has been revised, including an object-oriented version. Yes, it guided the PL/I and ABAP languages. But outside of those business-specific languages, COBOL has had almost no effect on programming languages.

Why?

I'm not certain, but I have two ideas: COBOL was an early language, and it was designed for commercial uses.

COBOL is one of the earliest languages, dating back to the 1950s. Other languages of the time include FORTRAN and LISP (and oodles of forgotten languages like A-0 and FLOW-MATIC). We had no experience with programming languages. We didn't know what worked and what didn't work. We didn't know which language features were useful to programmers. Since we didn't know, we had to guess.

For a near-blind guess, COBOL was pretty good. It has been useful in close to its original form for decades, a shark in the evolution of programming languages.

The other reason we didn't use COBOL to create other languages is that it was commercial. It was designed for business transactions. While it ran on general-purpose computers, COBOL was specific to financial applications, and the people who would tinker and build new languages were working in other fields, on computers other than business mainframes.

The tinkerers were using minicomputers (and later, microcomputers). These were not in the financial setting but in universities, where people were more willing to explore new languages. Minicomputers from DEC were often equipped with FORTRAN and BASIC. Unix computers were equipped with C. Microcomputers often came with BASIC baked in, because it was easier for individuals to use.

COBOL's success in the financial sector may have doomed it to stagnancy. Corporations (especially banks and insurance companies) lean conservative with technology and programming; they prefer to focus on profits and not research.

I see a similar future for SQL. As a data description and access language, it does an excellent job. But it is very specific and cannot be used outside of that domain. The up-and-coming NoSQL databases avoid SQL in part, I think, because the SQL language is tied to relational algebra and structured data. I see no languages (well, no popular languages) derived from SQL.

I think the languages that will influence or generate new languages will be those which are currently popular, easily learned, and easily used. They must be available to the tinkerers of today; those tinkerers will be writing the languages of the future. Tinkerers have limited resources, so less expensive languages have an advantage. Tinkerers are also a finicky bunch, with only a few willing to work with ornery products (or languages).

Considering those factors, I think that future languages will come from a set of languages in use today. That set includes C, C#, Java, Python, and JavaScript. I omit a number of candidates, including Perl, C++, and possibly your favorite language. (I consider Perl and C++ difficult languages; tinkerers will move to easier languages. I would like to include FORTH in the list, but it too is a difficult language.)

Monday, October 24, 2011

Steve Jobs, Dennis Ritchie, John McCarthy, and Daniel McCracken

We lost four significant people from the computing world this year.

Steve Jobs needed no introduction. Everyone knew him as that slightly crazy guy from Apple, the one who would show off new products while always wearing a black mock-turtleneck shirt.

Dennis Ritchie was well-known by the geeks. Articles comparing him to Steve Jobs were wrong: Ritchie co-created Unix and C somewhat before Steve Jobs founded Apple. Many languages (C++, Java, C#) are descendants of C. Linux, Android, Apple iOS, and Apple OS X are descendants of Unix.

John McCarthy was known by the true geeks. He did foundational work in artificial intelligence (he coined the term) and created a language called LISP. Modern languages (Python, Ruby, Scala, and even C# and C++) are beginning to incorporate ideas from the LISP language.

Daniel McCracken is the unsung hero of the group. He is unknown even among true geeks. His work predates that of the others (except McCarthy), and it may have had a greater influence on the industry than all of theirs. McCracken wrote books on FORTRAN and COBOL, books that were understandable and comprehensive. He made it possible for the very early programmers to learn their craft -- not just the syntax but the craft of programming.

The next time you write a "for" loop with the control variable named "i", or see one written that way, you can thank Daniel McCracken. It was his work that set that convention and taught the first generation of programmers.