Thursday, June 16, 2022

Consolidation of processors, and more

We're in an age of consolidation. PCs are moving to the ARM processor as a standard. Apple has already replaced their entire line with ARM-based processors. Microsoft has built ARM-based laptops. The advantages of ARM (lower production cost, lower power consumption, less heat) make such a move worthwhile.

If this consolidation extends to all manufacturers, then we would see a uniform processor architecture, something that we have not seen in the PC era. While the IBM PC set a standard with the Intel 8088 processor, other computers at the time used other processors, mostly the Zilog Z-80 and the MOS Technology 6502. When Apple shifted to the Macintosh line, it changed to the Motorola 68000 processor.

Is consolidation limited to processors?

There are, today, four major operating systems: Windows, macOS, z/OS, and Linux. Could we see a similar consolidation among operating systems? Microsoft is adding Linux to Windows with WSL, which melds Linux into Windows. Apple's macOS is based on BSD Unix, which is not that far from Linux. IBM's mainframes run Linux in virtual machines alongside z/OS. IBM might, one day, replace z/OS with Linux; they certainly have the ability to build such a replacement.

If both of these consolidations were to occur, then we would see a uniform processor architecture and a uniform operating system, something that has not occurred in the computing age.

(I'm not so dreamy-eyed that I believe this would happen. I expect Microsoft, Apple, and IBM to keep some degree of proprietary extensions to their systems. But let's dream a little.)

What effect would a uniform processor architecture and uniform operating system have on programming languages?

At first glance, one might think that there would be no effect. Programming languages are different things from processors and operating systems, handling different tasks. Different programming languages are good at different things, and we want to do different things, so why not keep different programming languages?

It is true that different programming languages are good at different things, but that doesn't mean that each and every programming language has unique strengths. Several programming languages have capabilities that overlap, some in multiple areas, and some almost completely. C# and VB.NET, for example. Or C# and Java, two object-oriented languages that are good for large-scale projects.

With a single processor architecture and a single operating system, Java loses one of its selling points. Java was designed to run on multiple platforms. Its motto was "Write Once, Run Anywhere." In the mid 1990s, such a goal made sense. There were different processors and different operating systems. But with a uniform architecture and uniform operating system, Java loses that point. The language remains a solid performer, so the loss is not fatal. But the argument for Java weakens.

A pair of overlapping languages is VB.NET and C#. Both are made by Microsoft, and both are made for Windows. Or were made for Windows; they are now available on multiple platforms. They overlap quite a bit. Do we need both? Anything one can do in C# one can also do in VB.NET, and the reverse is true. There is some evidence that Microsoft wants to drop VB.NET -- although there is also evidence that developers want to keep programming in VB.NET. That creates tension for Microsoft.

I suspect that specialty languages such as SQL and JavaScript will remain. SQL has embedded itself in databases, and JavaScript has embedded itself in web browsers.

What about other popular languages? What about COBOL, and FORTRAN, and Python, and R, and Delphi (which oddly still ranks high in the Tiobe index)?

I see no reason for any of them to go away. Each has a large base of existing code; converting those programs to another language would be a large effort with little benefit.

And I think that small, niche languages will remain. Programming languages such as AWK will remain because they are small, easy to use, good at what they do, and they can be maintained by a small team.

The bottom line is that the decision is not practical and logical, but emotional. We have multiple programming languages not because different languages are good at different things (although they are) but because we want multiple programming languages. Programmers become comfortable with programming languages; different programmers choose different programming languages.

Wednesday, June 1, 2022

Ideas for Machine Learning

A lot of what is called "AI" is less "Artificial Intelligence" and more "Machine Learning". The differences between the two are technical and rather boring, so few people talk about them. From a marketing perspective, "Artificial Intelligence" sounds better, so more people use that term.

But whichever term you use, I think we can agree that the field of "computers learning to think" has yielded dismal results. Computers are fast and literal, good at numeric computation and even good at running video games -- but poor at anything resembling thought.

It seems to me that our approach to Machine Learning is not the correct one. We've been at it for decades, and our best systems suffer from fragility, providing wildly different answers for similar inputs.

That approach (from what I can tell) is to build a Machine Learning system, train it on a large set of inputs, and then have it match other inputs to the training set. The approach tries to match similar aspects to similar aspects.

I have two ideas for Machine Learning, although I suspect that they will be rejected by the community.

The first idea is to change the basic mechanism of Machine Learning. Instead of matching similar inputs, design systems which minimize errors. That is, balance the identification of objects with the identification of errors.

This is a more complex approach, as it requires some basic knowledge of an object (such as a duck or a STOP sign) and then requires analyzing aspects and classifying them as "matching", "close match", "loose match", or "not a match". I can already hear the howls of practitioners at the prospect of switching their mechanisms to something more complex.
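A minimal sketch of this idea in Python. The thresholds and category names here are my own assumptions for illustration, not an established scheme:

```python
def classify_match(similarity: float) -> str:
    """Map a similarity score (0.0 to 1.0) to a match category."""
    if similarity >= 0.9:
        return "matching"
    if similarity >= 0.7:
        return "close match"
    if similarity >= 0.4:
        return "loose match"
    return "not a match"

def identify(candidates: dict[str, float]) -> tuple[str, str]:
    """Return the best candidate and its category -- including 'not a match',
    an answer that a pure nearest-neighbor matcher would never report."""
    label, score = max(candidates.items(), key=lambda kv: kv[1])
    return label, classify_match(score)

# A weak best match is reported as such, rather than asserted confidently:
result = identify({"duck": 0.45, "stop sign": 0.2})   # ("duck", "loose match")
```

The point of the sketch is the extra output channel: the system balances identification with an explicit admission of how good the identification is.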

But as loud as those complaints may be, they will be a gentle whisper compared to the reaction to my second idea: switch from 2-D photographs to stereoscopic photographs.

Stereoscopic photographs are pairs of photographs of an object, taken by two cameras some distance apart. By themselves they are simple photographs. Together, they allow for the calculation of the depth of objects. (Anyone who has used an old View-Master to look at a disk of transparencies has seen the effect.)

A stereoscopic photograph should allow for better identification of objects, because one can tell that items in the photograph are in the same plane or different planes. Items in different planes are probably different objects. Items in the same plane may be the same object, or may be two objects in close proximity. It's not perfect, but it is information.
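The underlying geometry is simple: for two parallel cameras, an object's distance is inversely proportional to its disparity, the horizontal shift of a point between the two images (depth = focal length × baseline / disparity). A small sketch, assuming an idealized camera pair:

```python
def depth_from_disparity(focal_length_px: float,
                         baseline_m: float,
                         disparity_px: float) -> float:
    """Distance to a point, given camera geometry and measured disparity."""
    if disparity_px <= 0:
        raise ValueError("zero disparity: object at infinity or mismatched points")
    return focal_length_px * baseline_m / disparity_px

# Two points with very different disparities lie in different planes,
# and so are probably different objects:
near = depth_from_disparity(700.0, 0.1, 35.0)   # 2.0 meters
far  = depth_from_disparity(700.0, 0.1, 7.0)    # 10.0 meters
```

The camera numbers are invented for the example; the relationship itself is standard stereo geometry.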

The objections are, of course, that the entire corpus of inputs must be rebuilt. All of the 2-D photographs used to train ML systems are now invalid. Worse, a new collection of stereoscopic photographs must be taken (not an easy task), stored, classified, and vetted before they can be used.

I recognize the objections to my ideas. I understand that they entail a lot of work.

But I have to ask: is the current method getting us what we want? Because if it isn't, then we need to do something else.

Friday, May 27, 2022

The promise of Windows

Windows promised to run on various hardware, and to support different hardware platforms. This was a welcome promise, especially for those of us who liked computers other than the IBM PC.

At the time Windows was introduced, the IBM PC design was popular, but not universal. Some manufacturers had their own designs for PCs, different from the IBM PC. Those other PCs ran some software that was designed for IBM PCs, but not all software. The Victor 9000 and the Zenith Z-100 were PCs that saw modest popularity, running MS-DOS but with different specifications for keyboards, video, and input-output ports.

Some software was released in multiple versions, or included configuration programs, to match the different hardware. Lotus 1-2-3 had packages specific to the Z-100; WordStar came with a setup program to define screen and keyboard functions.

Buying hardware and software was a big effort. One had to ensure that the software ran on the hardware (or could be configured for it) and that the hardware supported the software.

Windows promised to simplify that effort. Windows would act as an intermediary, allowing any software (if it ran on Windows) to use any hardware (if Windows ran on it). Microsoft released Windows for different hardware platforms (including the Zenith Z-100). The implications were clear: Windows could "level the playing field" and make those other PCs (the ones not compatible with the IBM PC) useful and competitive.

That promise was not fulfilled. Windows ran on various computing hardware, but the buyers were trained to look for IBM PCs or compatibles, and they stayed with IBM PCs and compatibles. It didn't matter that Windows ran on different computers; people wanted IBM PCs, and they bought IBM PCs. The computers that were different were ignored and discontinued by their manufacturers.

And yet, Windows did keep its promise of separating software from hardware and allowing programs to run on different hardware. We can look at the history of Windows and see its growth over time, and the different hardware that it supported.

When USB was introduced, Windows supported it. (The implementation was rough at first, but Microsoft improved it.)

As displays and display adapters improved, Windows supported them. One can attach almost any display, and almost any display adapter, to a PC, and Windows can use them.

Printers and scanners have the same story. Windows supported lots of printers: laser printers, inkjets, and dot-matrix printers.

Much of this success is due to Microsoft and its clear specifications for adapters, displays, printers, and scanners. But those specifications allowed for growth and innovation.

Microsoft supported different processors, too. Windows ran on Intel's Itanium processors and on DEC's Alpha processors. Even now, Microsoft supports ARM processors.

Windows did keep its promise, albeit in a way that we were not expecting.


Thursday, May 19, 2022

More than an FPU, less than a GPU

I think that there is an opportunity to enhance, or augment, the processing units in our current PCs.

Augmenting processors is not a new idea. Intel supplied numeric coprocessors for its 8086, 80286, and 80386 processors. These coprocessors performed numeric computations that were not natively available on the main processor. The main processor could perform the calculation, but the coprocessors were designed for numeric functions and performed the work much faster.

(Intel did not invent this idea. Coprocessors were available on minicomputers before PCs existed, and on mainframes before minicomputers existed.)

A common augmentation to processors is the GPU. Today's GPUs are a combination of video adapter and numeric processor. They drive a video display, and they also perform graphics-oriented processing. They often use more power than the CPU, and perform more computations than the CPU, so one could argue that the GPU is the main processor, and the CPU is an auxiliary processor that handles input and output for the GPU.

Let's consider a different augmentation to our PC. We can keep the CPU, memory, and input-output devices, but let's replace the GPU with a different processor. Instead of a graphics-oriented processor that drives a video display, let's imagine a numeric-oriented processor.

This numeric-oriented processor (let's call it a NOP*) is different from the regular GPU. A GPU is designed, in part, to display images, and our NOP does not need to do that. The NOP waits for a request from the CPU, acts on it, and provides a result. No display is involved. The request that we send to the NOP is a program, but to avoid confusion with our main program, let's call it a 'subprogram'.

A GPU does the same, but also displays images, and therefore has to operate in real-time. Therefore, we can relax the constraints on our NOP. It has to be fast (faster at computations than the CPU) but it does not have to be as fast as a GPU.

And as a NOP does not connect to a display, it does not need a port for the display, nor does it need to be positioned in a PC slot for external access. It can live entirely inside the PC, much like the memory modules attached to the motherboard.

Also, our NOP can contain multiple processors. A GPU contains a single processor with multiple cores. Our NOP could have a single processor with multiple cores, or it could have multiple processors, each with multiple cores.

A program with multiple threads runs on a single processor with multiple cores, so a NOP with one processor can run one 'subprogram' that we assign to it. A NOP with multiple processors (each with multiple cores) could run multiple 'subprograms' at a time. A simple NOP could have one processor, and a high-end NOP could have multiple processors.

Such a device is quite possible with our current technology. The question is, what do we do with it?

The use of a GPU is obvious: driving a video display.

The use of a NOP is... less obvious.

It's less obvious because a NOP is a form of computing that is different from our usual form of computing. Our typical program is a linear thing, a collection of instructions that are processed in sequence, starting with the first and ending with the last. (Our programs can have loops and conditional branches, but the idea is still linear.)

Our NOP is capable of performing multiple calculations at the same time (multiple cores) and performing those calculations quickly. To use a NOP, one must have a set of calculations that can run in parallel (not dependent on each other) and with a large set of data. That's a different type of computing, and we're not accustomed to thinking in those terms. That's why the use of a NOP is not obvious.

There are several specialized applications that could use a NOP. We could analyze new designs for automobiles or airplanes, simulate protein folding, explore new drugs, and analyze buildings for structural integrity. But these are all specialized applications.

To justify a NOP as part of a regular PC, we need an application for the average person. Just as the spreadsheet was the compelling application for early PCs, and GPS and online maps were the compelling app for cell phones, we need a compelling app for a NOP.

I don't know what such an application would be, but I know something of what it would look like. A NOP, like a GPU, has a processor with many cores. Today's GPUs can have in excess of 6000 cores (far more than today's CPUs, with their puny dozen or two). But cores don't automatically make your program run faster. Cores allow a program with multiple threads to run faster.

Therefore, a program that takes advantage of a NOP would use multiple threads -- lots of them. It would perform numerical processing (of course) and it would operate on lots of data. The CPU would act as a dispatcher, sending "jobs" to the NOP and coordinating the results.
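The dispatcher pattern above can be sketched in Python. A thread pool stands in for the NOP here; a real NOP (or a process pool) would give true parallelism, which CPython threads do not, because of the GIL. The sum-of-squares workload is an arbitrary stand-in for real numeric work:

```python
from concurrent.futures import ThreadPoolExecutor

def numeric_job(chunk: list[float]) -> float:
    # An independent calculation: no shared state with the other jobs.
    return sum(x * x for x in chunk)

def dispatch(data: list[float], jobs: int = 4) -> float:
    # The CPU's role: split the work, send "jobs" out, coordinate the results.
    size = (len(data) + jobs - 1) // jobs
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=jobs) as pool:   # stand-in for the NOP
        results = pool.map(numeric_job, chunks)
    return sum(results)

total = dispatch([float(x) for x in range(1000)])   # sum of squares 0..999
```

The shape is what matters: many independent jobs, lots of data, and a CPU that coordinates rather than computes.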

If our PCs had NOPs built in, then creative users would make programs to use them. But our PCs don't have NOPs built in, and NOPs cannot be added, either. The best we can do is use a GPU to perform some calculations. Many PCs do have GPUs, and some are used for numeric-oriented programming, but the compelling application has not emerged.

The problem is not one of technology but one of vision and imagination.

- - - - -

* I am quite aware that the term NOP is used, in assembly language, for the "no operation" instruction, an instruction that literally does nothing.

Thursday, May 5, 2022

C but for Intel X86

Let's step away from current events in technology, and indulge in a small reverie. Let's play a game of "what if" with past technology. Specifically the Intel 8086 processor.

I will start with the C programming language. C was designed for the DEC PDP-7 and PDP-11 processors. Those processors had some interesting features, and the C language reflects that. One example is the increment and decrement operators (the '++' and '--' operators), which map closely to the autoincrement and autodecrement addressing modes of the PDP-11.

Suppose someone had developed a programming language for the Intel 8086, just as Kernighan and Ritchie developed C for the PDP-11. What would it look like?

Let's also suppose that this programming language was developed just as the 8086 was designed, or shortly thereafter. (The 8086 was released in 1978.) We're looking at the 8086 (or the 8088, which has the identical instruction set) and not the later processors.

The computing world in 1978 was quite different from today. Personal computers were just entering the market. Apple and Commodore had computers that used the 6502 processor; Radio Shack had computers that used the Z-80.

Today's popular programming languages didn't exist. There was no Python, no Ruby, no Perl, no C#, and no Java. The common languages were COBOL, FORTRAN, BASIC, and assembler. Pascal was available, for some systems. Those would be the reference point for a new programming language. (C did exist, but only in the limited Unix community.)

Just as C leveraged features of the PDP-7 and PDP-11 processors, our hypothetical language would leverage features of the 8086. What are those features?

One feature that jumps out is text strings. The 8086 has string instructions (MOVS, CMPS, SCAS, and the REP prefixes) for moving, comparing, and scanning text. It seems reasonable that an 8086-centric language would support strings. They might even be a built-in type for the language. (Strings do require memory management, but that is a feature of the run-time library, not the programming language itself.)

The 8086 supports BCD (binary-coded decimal) arithmetic. BCD math is rare today, but it was common on IBM mainframes and a common way to encode data for exchange with other computers.

The 8086 had a segmented architecture, with four different segments (code, data, stack, and "extra"). Those four segments map well to Pascal's organization of code, static data, stack, and heap. (C and C-derivatives use the same organization.) A language could support dynamic allocation of memory and recursive functions (two things that were not available in COBOL, FORTRAN, or BASIC). And it could also support a "flat" organization like those used in COBOL and FORTRAN, in which all variables and functions are laid out at link time and fixed in position.

There would be no increment or decrement operators.

Who would build such a language? I suppose a natural choice would be Intel, as they knew the processor best. But then, maybe not, as they were busy with hardware design, and had no operating system on which to run a compiler or interpreter.

The two big software houses for small systems (at the time) were Microsoft and Digital Research. Both had experience with programming languages. Microsoft provided BASIC for many different computer systems, and also had FORTRAN and COBOL. Digital Research provided CBASIC (a compiled BASIC) and PL/M (a derivative of IBM's PL/I).

IBM would probably not create a language for the 8086. They had no offering that used that processor. The IBM PC would arrive in 1981, and IBM didn't consider it a serious computer -- at least not until people started buying them in large quantities.

DEC, at the time successful with minicomputers, also had no offering that used the 8086. DEC offered many languages, but used their own processors.

Our language may have been developed by a hardware vendor, such as Apple or Radio Shack, but they, like Intel, were busy with hardware and did very little in terms of software.

So it may have been either Microsoft or Digital Research. Both companies were oriented for business, so a language developed by either of them would be oriented for business. A new language for business might be modelled on COBOL, but COBOL didn't allow for variable-length strings. FORTRAN was oriented for numeric processing, and it didn't handle strings either. Even Pascal had difficulty with variable-length strings.

My guess is that our new language would mix elements of each of the popular languages. It would be close to Pascal, but with more flexibility for text strings. It would support BCD numeric values, not only in calculations but also in input-output operations. And it would be influenced by COBOL's verbose style.

We might see something like this:

PROGRAM EXAMPLE
DECLARE
  FILE DIRECT CUSTFILE = "CUST.DAT";
  RECORD CUSTREC
    BCD ACCTNUM PIC 9999,
    BCD BALANCE PIC 9999.99,
    STRING NAME PIC X(20),
    STRING ADDRESS PIC X(20);
  FILE SEQUENTIAL TRANSFILE = "TRANS.DAT";
  RECORD TRANSREC
    BCD ACCTNUM PIC 9999,
    BCD AMOUNT PIC 999.99;
  STRING NAME;
  BCD PAYMENT, TAXRATE;
PROCEDURE INIT
  OPEN CUSTFILE, TRANSFILE;
PROCEDURE MAINLINE
  WHILE NOT END(TRANSFILE) BEGIN
    READ TRANSFILE TO TRANSREC;
  END;
CALL INIT;
CALL MAINLINE;

and so forth. This borrows heavily from COBOL; it could equally borrow from Pascal.

It may have been a popular language. Less verbose than COBOL, but still able to process transactions efficiently. Structured programming from Pascal, but with better input-output. BCD data for efficient storage and data transfer to other systems.

It could have been a contender. In an alternate world, we could be using programming languages derived not from C but from this hybrid language. That might solve some problems (such as buffer overflows) but maybe give us others. (What problems, you ask? I don't know. I just now invented the language; I'll need some time to find the problems.)


Sunday, May 1, 2022

Apple wants only professional developers

Apple got itself some news this past week: It sent intimidating letters to the developers of older applications. Specifically, Apple threatened to expel old apps -- apps that had not been updated in three years -- from the iTunes App Store. (Or is it the Apple App Store?)

On the surface, this seems a reasonable approach. Apple has received complaints about the organization of the App Store, and the ability (or lack thereof) to find specific apps. By eliminating older apps, Apple can reduce the number of apps in the store and improve the user experience.

But Apple showed a bit of its thinking when it sent out the notice. It specified a grace period of 30 days. If the developers of an app submitted a new version, the app could remain in the App Store.

The 30-day period caught my attention. I think it shows a lack of understanding on Apple's part. It shows that Apple thinks anyone can rebuild and resubmit their app in less than one month.

For professional teams, this seems a reasonable limit. Companies that develop apps should be familiar with the latest app requirements, have the latest tools, and have processes to release a new version of their app. Building apps is a full-time job, and they should be ready to go.

Individuals who build apps "for fun" are in a different situation. For them, building apps is not a full-time job. They probably have a different full-time job, and building apps is a side job. They don't spend all day working on app development, and they probably don't have the latest tools. (They may not even have the necessary equipment to run the compilers and packager necessary to meet Apple's requirements.) For them, the 30-day period is an impossible constraint.

Apple, in specifying that limit, showed that it does not understand the situation of casual developers. (Or, more ominously, deliberately chose to make life difficult for casual developers. I see no evidence of hostility, so I will credit this limit to ignorance.)

In either case (ignorance or malice) Apple thinks -- apparently -- that developers can spin out new versions of their apps quickly. In the long term, this will become true. Casual developers will give up on Apple and stop developing their apps. (They may also drop Apple's products. Why buy a phone that won't run their app?)

As casual developers leave the field, only the serious developers will remain. Those will be the commercial developers, or the professional developers who are paid by corporations.

That sets the Apple environment as a commercial platform, one that serves commercially-developed apps for commercial purposes. The Apple platform will have online banking, email, commercial games, video streaming, location tracking for auto insurance, and other apps in the realm of commerce. But it will have very few fun apps, very few apps built by individuals for enjoyment. Every app will have a purpose, and that purpose will be to make money, either directly or indirectly.

Yet it doesn't have to be this way. Apple has an alternative, if they want it.

Right now, the Apple App Store is a single store. Apple could change that, splitting the App Store into multiple stores, or a single store with sections. One section could be for the commercial apps and another section for the amateur apps. The commercial section will have the tighter restrictions that Apple wants to ensure a good experience for users (updated apps, recent APIs, etc.) and the amateur section will hold the casually-built apps. But the two sections are not equal: apps in the commercial section are allowed to use API calls for payments, and apps in the amateur section are not. (Apple could limit other APIs too, such as biometrics or advertising.)

A two-tiered approach to the App Store gives the developers of iOS apps a choice: play in the pro league, or play in the amateur league. It may be an approach worth considering.

Wednesday, April 20, 2022

Advances in programming come from restraints

Advances in programming come from, to a large extent, advances in programming languages. And those advances in programming languages, unlikely as it seems, are mostly not expanded features but restrictions.

That advances come from restrictions seems counter-intuitive. How do fewer choices make us better programmers?

Let's look at some selected changes in programming languages, and how they enabled better programming.

The first set of restrictions was structured programming. Structured programming introduced the concepts of the IF/THEN/ELSE statement and the WHILE loop. More importantly, structured programming banished the GOTO statement (and its cousin, the IF/GOTO statement). This restriction was an important advancement for programming.

A GOTO statement allows for arbitrary flows of control within programs. Structured programming's IF/THEN/ELSE and WHILE statements (and WHILE's cousin, the FOR statement) force structure onto programs. Arbitrary flows of control were no longer possible.

The result was programs that were harder to write but easier to understand, easier to debug, and easier to modify. Structured programming -- the loss of GOTO -- was an advancement in programming.

A similar advance occurred with object-oriented programming. Like structured programming, object-oriented programming was a set of restrictions coupled with a set of new features. In object-oriented programming, those restrictions were encapsulation (hiding data within a class) and the limiting of functions (requiring an instance of the class to execute). Data encapsulation protected data from arbitrary changes; one had to go through functions (in well-designed systems) to change the data. Instance functions were limited to executing on instances of the class, which meant that one had to *have* an instance of the class to call the function. Functions could not be called at arbitrary points in the code.
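A small illustration of those two restrictions. (Python enforces encapsulation only by convention, the leading underscore, so treat this as a sketch of the idea rather than strict enforcement.)

```python
class Account:
    def __init__(self, balance: float) -> None:
        self._balance = balance      # encapsulated: not part of the public API

    def deposit(self, amount: float) -> None:
        if amount <= 0:              # the class, not the caller, guards the data
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self) -> float:
        return self._balance

acct = Account(100.0)
acct.deposit(25.0)     # behavior is reached through an instance...
# ...and the data cannot be changed arbitrarily: every change to the
# balance passes through deposit() and its validation.
```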

Both structured programming and object-oriented programming advanced the state of the art for programming. They did it by restricting the choices that programmers could make.

I'm going to guess that future advancements in programming will also come from restrictions in new programming languages. What could those restrictions be?

I have a few ideas.

One idea is immutable objects. This idea has been tested in the functional programming languages. Those languages often have immutable objects, objects which, once instantiated, cannot change their state.

In today's object-oriented programming languages, objects are often mutable. They can change their state, either through functions or direct access of member data.

Functional programming languages take a different view. Objects are immutable: once formed they cannot be changed. Immutable objects enforce discipline in programming: you must provide all of the ingredients when instantiating an object; you cannot partially initialize an object and add things later.

I would like to see a programming language that implements immutable objects. But not perfectly -- I want to allow for some objects that are not immutable. Why? Because the shift to "all objects are immutable" is too much, too fast. My preference is for a programming language to encourage immutable designs and require extra effort to design mutable objects.
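Python's frozen dataclasses give a feel for this idea: the object cannot be changed after construction, and "modification" takes deliberate extra effort, namely building a new object.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)          # all ingredients supplied at instantiation
# p.x = 5.0                  # would raise FrozenInstanceError: no mutation
q = replace(p, x=5.0)        # "changing" p means constructing a new Point
```

A language built around this principle would flip the defaults: every class frozen unless the programmer goes out of the way to declare it mutable.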

A second idea is a limit to the complexity of expressions.

Today's programming languages allow for any amount of complexity in an expression. Expressions can be simple (such as A + 1) or complex (such as A + B/C - sqr(B + 7) / 2), or worse.

I want expressions to be short and simple. This means breaking a complex expression into multiple statements. The only language I know of that placed restrictions on expressions was early FORTRAN, and then only for array subscripts. (The allowed subscript form was c*v+k, where c and k were integer constants, v an index variable, and the parts optional.)

Perhaps we could design a language that limited the number of operations in an expression. Simpler expressions are, well, simpler, and easier to understand and modify. Any expression that contained more than a specific number of operations would be an error, forcing the programmer to refactor the expression.
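Such a checker can be sketched with Python's ast module; the limit of 3 is an arbitrary choice for illustration.

```python
import ast

MAX_OPERATIONS = 3   # arbitrary limit for illustration

def operation_count(expression: str) -> int:
    """Count the operators and calls in an expression's syntax tree."""
    tree = ast.parse(expression, mode="eval")
    return sum(isinstance(node, (ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Call))
               for node in ast.walk(tree))

def check(expression: str) -> None:
    """Reject any expression past the limit, forcing a refactor."""
    count = operation_count(expression)
    if count > MAX_OPERATIONS:
        raise SyntaxError(f"{count} operations; limit is {MAX_OPERATIONS}")

check("a + 1")                        # fine: 1 operation
# check("a + b/c - sqr(b + 7) / 2")   # rejected: 6 operations
```

In a real compiler this check would live in the front end and the error would point at the offending expression; the counting logic would be the same.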

A third idea is limits on the size of functions and classes. Large functions and large classes are harder to understand than small ones. Most programming languages have style-checkers, and most style-checkers issue warnings for long functions or for classes with lots of functions.

I want to strengthen those warnings and change them to errors. A function that is too long (I'm not sure how long is too long, but that's another topic) is an error -- and the compiler or interpreter rejects it. The same applies to a class: too many data members, or too many functions, and you get an error.

But like immutable objects, I will allow for some functions to be larger than the limit, and some classes to be more complex than the limit. I recognize that some classes and functions must break the rules. (But the mechanism to allow a function or class to break the rules must be a nuisance, more than a simple '@allowcomplex' attribute.)

Those are the restrictions that I think will help us advance the art of programming. Immutable objects, simple expressions, and small functions and classes.

Of these ideas, I think the immutable objects will be the first to enter mainstream programming. The concept has been implemented, some people have experience with it, and the experience has been positive. New languages that combine object-oriented programming with functional programming (much like Microsoft's F#, which is not so new) will allow more programmers to see the benefits of immutable objects.

I think programming will be better for it.