Monday, December 21, 2020

Measuring the success of .NET

Microsoft released .NET (and the language C#) in the early 2000s. With almost 20 years of experience, can we say that .NET has been a success?

Microsoft, I am certain, will claim that .NET has been a success and continues to be a success. But they have a vested interest in the success of .NET, so their opinion is possibly biased.

It strikes me that Apple's move to ARM processors has provided us with a measure for the success of .NET. How can that be? Well, let's look at programming languages.

Programming languages come in three flavors: compiled, interpreted, and byte-code. Each has a different way of executing instructions to perform the desired calculations.

Compiled languages convert source code into machine code, and that machine code is directly executable by the processor. In today's world, that means (usually) that the compiler produces code for the Intel x86 processor, although other processors can be the "target". Once compiled, the executable form of the program is usable only on a system with an appropriate processor. Code compiled for the Intel x86 runs only on the Intel x86 (or a compatible processor such as one from AMD). Notably, the compiled code cannot be used on a different processor, such as the ARM processors. (Apple can run Intel code on its new ARM processors because it emulates the Intel processor. The emulator pretends to be an Intel processor, and the code doesn't know that it is running on an emulator.)

Interpreted languages take a different approach. Instead of converting the source code to executable code, the interpreter parses each line of source code and executes it directly. The importance of this is that a program written in an interpreted language can be run on any processor, provided you have the interpreter.

BASIC, one of the earliest interpreted languages, ran on different processors, including mainframes, minicomputers, and microcomputers. All of them could run the same program, without changes, thus a BASIC program was quite "portable".

In between compiled languages and interpreted languages are the byte-code languages, which are a little bit of interpreter and a little bit of compiler. Byte-code languages are compiled, but not to a specific processor. Or rather, they are compiled to an imaginary processor, one that does not exist. The code produced by the compiler (often called "byte code") is interpreted by a small run-time system. The idea is that the run-time system can be created for various processors, and the same byte-code can run on any of them.
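To make the idea concrete, here is a minimal sketch in Python of a byte-code language. Everything in it is invented for illustration: the opcodes PUSH, ADD, and MUL, the functions compile_rpn and run, and the postfix source syntax are assumptions, not the format of any real run-time system. The "compiler" produces bytes for an imaginary stack processor, and the "run-time system" is a small loop that interprets those bytes.

```python
# Toy illustration only: a made-up instruction set, not any real VM's format.
PUSH, ADD, MUL = 0, 1, 2  # hypothetical opcodes for an imaginary processor


def compile_rpn(source):
    """'Compile' a postfix expression such as '2 3 + 4 *' into byte code.

    Operands must fit in one byte (0..255) in this sketch.
    """
    code = []
    for token in source.split():
        if token == "+":
            code.append(ADD)
        elif token == "*":
            code.append(MUL)
        else:
            code.extend([PUSH, int(token)])
    return bytes(code)


def run(code):
    """The 'run-time system': interpret byte code on a small stack machine."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])  # the operand follows the opcode
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op}")
    return stack.pop()


program = compile_rpn("2 3 + 4 *")
print(run(program))  # (2 + 3) * 4
```

The bytes in program contain nothing specific to Intel or ARM. Only the run function would have to exist on each processor, which is the portability argument in a nutshell.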

Java uses the byte-code approach, as do C#, VB.NET (all the .NET languages, actually), and the languages Python, Ruby, Perl, and Raku (and a bunch more).
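Python makes this easy to see for yourself. The short sketch below assumes the standard CPython implementation and uses the standard library's dis module to show the byte code behind an ordinary function. The function add is just a made-up example, and the exact instructions printed vary from one Python version to the next.

```python
import dis


def add(a, b):
    """A trivial function, used only to have something to inspect."""
    return a + b


# The compiled form of the function is a plain sequence of bytes.
bytecode = add.__code__.co_code
print(type(bytecode).__name__, len(bytecode))

# dis prints a human-readable listing of those byte-code instructions.
dis.dis(add)
```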

The languages C, C++, Objective-C, Go, and Swift are compiled to executable code.

I can think of no programming languages that use the pure interpreted approach, at least not nowadays. The two languages that used an interpreted approach were BASIC and FORTH, and there were some implementations of BASIC that were byte-code and some even were fully compiled. But that's not important in 2020.

What is important in 2020 is that Apple is moving from Intel processors to ARM processors, and Microsoft may be considering a similar move. Of the two companies, Microsoft may be in the better position.

Apple uses the languages Objective-C and Swift for its applications, and encourages third-party developers to use those languages. Both of those languages are compiled, so moving from Intel processors to ARM processors means that programs originally compiled for Intel processors must be recompiled to run on the ARM processors.

Unless, of course, you run the Intel code on an ARM processor with an emulator. Not every application is ready to be recompiled, and not every third-party developer is ready to recompile their applications, so the emulator is an easy way to move applications to the ARM processors. The emulator is important to Apple, and I'm sure that they have spent a lot of time and effort on it. (And probably will spend more time and effort in the next year or two.)

Microsoft, in contrast, uses a large set of languages and encourages third-party developers to use its .NET framework. That puts it in a different position than Apple. While Apple's applications must be recompiled, Microsoft's applications need not be. Microsoft can provide an ARM-based Windows and an ARM-based .NET framework, and all of the applications written in .NET will run. (And they will run without an emulator.)

Microsoft has a nice, simple path to move from Intel to ARM. Alas, the world is not so simple, and Microsoft's situation is not so simple.

The migration from Intel to ARM is easy (and low-risk) only for applications that are written completely in .NET. But applications are not always written completely in .NET. Sometimes, applications are written partially in .NET and partially in "native code" (code for the underlying processor). There are two reasons for such an approach: performance and legacy code. Native code tends to run faster than .NET byte-code, so applications that require high performance tend to be written in native code.

The other reason for native code is legacy applications. Large applications written prior to the introduction of .NET (and yes, that is twenty years ago) were written in a language that compiled to Intel code, typically C or C++. Converting that code from C or C++ to a .NET language (C#, VB.NET, or the not-quite-C++ that was "managed C++") was a large effort, and entailed a large risk. Better to avoid the effort and avoid the risk, and maintain the application in its original language.

With all of that as a prelude (and I admit it is a long prelude), we can now look at the success of .NET.

If Microsoft releases a version of Windows for the ARM processor, we can see how many applications can migrate from Intel-based Windows to ARM-based Windows, and how many applications cannot. The applications that can move will be those applications written completely in .NET. The applications that cannot migrate (or that need an emulator) are those applications that are written all or partially in native code.

The proportion of total (or pure) .NET applications can be a measure of the success of .NET. The more applications that use .NET (and no native code), the more success we can attribute to .NET.

The proportion of native-code applications (partial or total), on the other hand, indicates a failure of .NET. It indicates choices made to not use .NET and to use a different platform (Windows API, MFC, etc.) instead.

That is my measurement of the success of .NET: How many applications can move from Intel to ARM without recompiling. If Microsoft announces Windows for ARM, let's see which applications can move immediately and without the assistance of an emulator.

