The C and C++ languages lack one utility that is found in many other languages: a package manager. Will they ever have one?
The biggest challenge to a package manager for C or C++ is not the package manager. We know how to build them, how to manage them, and how to maintain a community that uses them. Perl, Python, and Ruby have package managers. Java has one (sort of). C# has one. JavaScript has several! Why not C and C++?
The issue isn't in the C and C++ languages. Instead the issue is in the preprocessor, an external utility that modifies C or C++ code before the compiler does its work.
The problem with the preprocessor is that it can change just about any token in the code to something else, including statements which would be used by package managers. The preprocessor can change "do_this" to "do_that" or change "true" to "TRUE" or change "BEGIN" to "{".
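A small sketch of that kind of substitution follows. The macro and function names (BEGIN, END, do_this, do_that, call_do_this) are made up for illustration; the point is only that the compiler never sees the tokens the programmer typed.

```c
/* Hypothetical macros, for illustration only. */
#define BEGIN {
#define END }
#define do_this do_that

/* BEGIN and END are replaced with braces before compilation. */
static int do_that(void)
BEGIN
    return 42;
END

/* The source says do_this(), but the compiler is handed do_that(). */
int call_do_this(void)
BEGIN
    return do_this();
END
```

A tool that reads the source text (as a package manager would) sees "do_this" and "BEGIN"; the compiler sees neither.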
The idea of a package manager for C and C++ has been discussed, and someone (I forget the person now) listed a number of questions that the preprocessor raises for a package manager. I won't repeat the list here, but they were very good questions.
To me, it seems that a package manager and a preprocessor are incompatible. If you have one, you cannot have the other. (At least, not with any degree of consistency.)
So I started thinking... what if we eliminate the C/C++ preprocessor? How would that change the languages?
Let's look at what the preprocessor does for us.
For starters, it is the mechanism to include headers in programs. The "#include" lines are handled by the preprocessor, not the compiler. (When C was first designed, a preprocessor was considered a "win", as it separated some tasks from the compiler and followed the Unix philosophy of separation of duties.) We still need a way to include definitions of constants, functions, structures, and classes, so we need a replacement for the #include command.
A side note: C and C++ standards wonks will know that the standards do not require a separate preprocessor to handle "#include" lines; the compiler itself may do so. The standards dictate only that after certain lines (such as #include <string>) the compiler must exhibit certain behaviors. But this bit of arcane knowledge is not important to the general idea of eliminating the preprocessor.
The preprocessor allows for conditional compilation. It allows for "#if/#else/#endif" blocks that can be conditionally compiled, based on what follows the "#if". Conditional compilation is extremely useful on software that has multiple targets, such as the Linux kernel (which targets many different processors).
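As a rough illustration, here is a sketch of conditional compilation that picks a label for the target processor. The function name target_name and the macro TARGET_NAME are invented for this example, and the tests rely on __x86_64__ and __aarch64__, which GCC and Clang predefine; other compilers may use different names.

```c
/* Exactly one of these branches survives preprocessing;
   the compiler never sees the others. */
#if defined(__x86_64__)
#define TARGET_NAME "x86-64"
#elif defined(__aarch64__)
#define TARGET_NAME "arm64"
#else
#define TARGET_NAME "unknown"
#endif

/* Returns the label chosen at compile time. */
const char *target_name(void)
{
    return TARGET_NAME;
}
```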
The preprocessor also allows for macros and substitution of values. It accepts a "#define" line which can change any token into something else. This mechanism was traditionally used to implement "max()" and "min()" as macros rather than as true functions.
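For reference, a common form of the "max()" macro looks something like the sketch below (written here as MAX; the helper function name is made up for the example). Because the macro is pure text substitution, an argument with a side effect is evaluated twice, which a real function would not do.

```c
/* The classic macro: each argument is pasted in wherever it appears. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Expands to (((*i)++) > (j) ? ((*i)++) : (j)).
   When *i is the larger value, it is incremented twice:
   once in the comparison and once in the selected branch. */
int max_with_side_effect(int *i, int j)
{
    return MAX((*i)++, j);
}
```

Starting with i equal to 5 and j equal to 3, the call returns 6 and leaves i at 7, where a function version of max would return 5 and leave i at 6.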
All of that would be lost with the elimination of the preprocessor. As all of those features are used on many projects, they would all have to be replaced by some form of extension to the compiler. The compiler would have to read the included files, and would have to compile (or not compile) conditionally-marked code.
Such a change is possible, but not easy. It would probably break a lot of existing code -- perhaps all nontrivial C and C++ programs.
Which means that removing the preprocessor from C and C++ and replacing it with something else changes the languages. They would no longer be C and C++, but different languages, deserving of different names.
So in one sense you can remove the preprocessor from C and C++, but in another sense you cannot.
Tuesday, October 9, 2018
Wednesday, March 7, 2012
The C preprocessor is a thing of beauty
Converting programs from one language to another can be easy or difficult, depending on the languages. More specifically, the ease of conversion depends on the commonality of language features.
For example, consider FORTRAN IV and C. Converting a FORTRAN IV program to C is easy; converting a C program to FORTRAN IV is hard. The FORTRAN constructs of subroutines and functions are easily handled in C, as are the data types of integer, real, and character. C has constructs that are unavailable in FORTRAN IV: pointers, dynamic memory, and recursion. The FORTRAN language elements of IF and DO are available in C, but the C elements of 'while' and 'longjmp' are not available in FORTRAN. (We have to shoe-horn the non-FORTRAN aspects of C into FORTRAN.)
When converting a program from one language to another, the difficulties of conversion are the language-specific features of the 'origin' language -- the things that the 'destination' language cannot handle.
In the past, I have considered the C preprocessor to be a hideous thing, an orc-like programming language that has no manners. But I have been wrong.
The C preprocessor (which is the same as the C++ preprocessor) has some very interesting constructs. It is a language with concepts found only in advanced programming languages. A macro has access to the caller's scope, something that exists in the lambdas of LISP. It can use a parameter as a substitution token, or it can substitute the text of the argument into the program as a string ("stringification").
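Both ideas fit in a few lines. This is only a sketch, and the names (STR, ADD_TO_TOTAL, sum_three, expr_text) are invented for the example.

```c
/* Stringification: #x turns the macro argument into a string literal. */
#define STR(x) #x

/* This macro never declares `total`; the name is resolved in whatever
   scope the macro is expanded in, that is, the caller's scope. */
#define ADD_TO_TOTAL(n) (total += (n))

int sum_three(void)
{
    int total = 0;      /* the caller's local variable */
    ADD_TO_TOTAL(1);
    ADD_TO_TOTAL(2);
    ADD_TO_TOTAL(3);
    return total;       /* 1 + 2 + 3 */
}

const char *expr_text(void)
{
    return STR(a + b);  /* the text "a + b", not a sum */
}
```

Note that neither a nor b needs to exist for STR(a + b) to compile; the preprocessor works on the text, not on its meaning.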
These features of the C preprocessor make conversion to other languages difficult. We think of the C preprocessor as an ugly beast not because it is ugly, but because we cannot easily convert its code to another language. The preprocessor language is beautiful -- simple, elegant, and powerful. We hate it because we cannot think of it in the terms of languages that we know and love.
It may be some time before mainstream languages have the features of the C preprocessor. (I'm considering LISP outside of the mainstream, at least for now.) Until such a time, the advanced language of the preprocessor will pose conversion challenges to programmers.