Monday, July 22, 2024

CrowdStrike, Windows blue screens, and the future

A small defect in CrowdStrike, a Windows security application, has caused a widespread failure of thousands, perhaps millions, of PCs running Windows.

Quite a few folks have provided details about the problem and how it happened, so I won't repeat them here.

Instead, I have some ideas about what will happen next: what will happen at Microsoft, and what will happen at all of the companies that use CrowdStrike.

Microsoft long ago divided Windows into two spaces: one space for user programs and another space for system processes (user mode and kernel mode, in Microsoft's terminology). The system space includes device drivers.

Applications in the user space can do some things, but not everything. They cannot, for example, interact directly with devices, nor can they access memory outside of their assigned range of addresses. If they do attempt to perform a restricted function, Windows stops the program -- before it causes harm to Windows or another application.

User-space applications cannot cause a blue screen of death.

If an error in CrowdStrike caused a blue screen of death (BSOD), then CrowdStrike must run in the system space. This makes sense, as CrowdStrike must access a lot of things to identify attacks, things normal applications do not look at. CrowdStrike runs with elevated privileges.

I'm guessing that Microsoft, as we speak, is thinking up ways to restrict third-party applications that must run with elevated privileges such as CrowdStrike. Microsoft won't force CrowdStrike into the user space, but Microsoft also cannot allow CrowdStrike to live in the system space where it can damage Windows. We'll probably see an intermediate space, one with more privileges than user-space programs but not all the privileges of system-space applications. Or perhaps application spaces with tailored privileges, each specific to the target application.

The more interesting future is for companies that use Microsoft Windows and applications such as CrowdStrike.

These companies are -- I imagine -- rather disappointed with CrowdStrike. So disappointed that they may choose to sue. I expect that management at several companies is already talking with legal counsel.

A dispute with CrowdStrike will be handled as a contract dispute. But I'm guessing that CrowdStrike, like most tech companies, specified arbitration in its contracts and limited damages to the cost of the software.

Regardless of contract terms, if CrowdStrike loses, it could face severe financial hardship. But even if it prevails, it could face a difficult future: some number of clients will move to other providers, which will reduce CrowdStrike's income.

Other companies will start looking seriously at the contracts from suppliers, and start making adjustments. They will want the ability to sue in court, and they will want damages if the software fails. When the maintenance period renews, clients will want a different set of terms, one that imposes risk upon CrowdStrike.

CrowdStrike will have a difficult decision: accept the new terms or face further loss of business.

This won't stop at CrowdStrike. Client companies will review terms of contracts with all of their suppliers. The "CrowdStrike event" will ripple across the industry. Even companies like Adobe will see pushback to their current contract terms.

Supplier companies that agree to changes in contract terms will have to improve their testing and deployment procedures. Expect to see a wave of interest in process management, testing, verification, static code analysis, and code execution coverage. And, of course, consulting companies and tools to help in those efforts.

Client companies may also review the licenses for open source operating systems and applications. They may also attempt to push risk onto the open source projects. This will probably fail; open source projects make their software available at no cost, so users have little leverage. A company can choose to replace Python with C#, for example, but the threat of "we will stop using your software and pay you nothing instead of using your software and paying you nothing" has little weight.

Therefore the shift in contract terms will occur in the commercial space and not in open source, at least not at first. That may change in the future, as the new terms become the norm in the commercial space.

Thursday, June 6, 2024

What to do with an NPU

Microsoft announced the "Copilot+ PC", a new standard for hardware. It includes a powerful Neural Processing Unit (NPU) along with the traditional (yet also powerful) CPU and GPU. The purpose of this NPU is to support Microsoft's Copilot+ features, which use "multiple state-of-the-art AI models ... to unlock a new set of experiences that you can run locally". It's clear that Microsoft will add generative AI to Windows and Windows applications. (It's not so clear that customers want generative AI or "a new set of experiences" on their PCs, but that is a different question.)

Let's put Windows to the side. What about Linux?

Linux is, if I may use the term, a parasite. It runs on hardware designed for other operating systems (such as Windows, macOS, or even z/OS). I fully expect that it will run on these new "Copilot+ PCs", and when running, it will have access to the NPU. The question is: will Linux use that NPU for anything?

I suppose that before we attempt an answer, we should review the purpose of an NPU. A Neural Processing Unit is designed to perform calculations for a neural network. A neural network is a collection of nodes with connections between nodes. It has nothing to do with LANs or WANs or telecommunication networks.

The calculations of a neural network -- mostly multiply-and-accumulate operations on large arrays -- can be performed on a traditional CPU, but they are a poor match for the typical CPU. The calculations are a better match for a GPU, which is why so many people ran neural networks on GPUs: they performed better than CPUs.

NPUs perform a specialized set of computations, and they are better at those calculations than GPUs (and much better than CPUs). So if we have a neural network, its calculations will run fastest on an NPU.

One application that uses those computations is the AI that we hear about today. And it may be that Linux, when detecting an NPU, will route computations to it, and those computations will be for artificial intelligence.

But Linux doesn't have to use an NPU for generative AI, or other commercial applications of AI. A neural network is, at its essence, a pattern-matching mechanism, and while AI as we know it today is a pattern-matching application (and therefore well served by NPUs), it is not the only pattern-matching application. It is quite possible (and I think probable) that the open-source community will develop non-AI applications that take advantage of the NPU.

I suspect that this development will happen in Linux and in the open source community, and not in Windows or the commercial market. Those markets will focus on the AI that is being developed today. The open source community will drive the innovation of neural network applications.

We are early in the era of neural networks. So early that I think we have no good understanding of what they can do, what they cannot do, and which of those capabilities match our personal or business needs. We have yet to develop the "killer app" of AI, the equivalent of the spreadsheet. "VisiCalc" made it obvious that computers were useful; once we had seen it, we could justify the purchase of a PC. We have yet to find the "killer app" for AI.


Thursday, May 16, 2024

Apple's Pentium moment

In the 1990s, as the market shifted from the 80486 processor to the newer Pentium processor, Intel had a problem. On some Pentium processors, the floating-point division instruction returned incorrect results for certain operands. It was called the "FDIV bug". What made this a problem was that the error was detected only after a significant number of Pentium processors had been sold inside PCs.

Now that Apple is designing its own processors (not just the M-series for Mac computers but also the A-series for phones and tablets), Apple faces the risk of a similar problem.

It's possible that Apple will have a rather embarrassing problem with one of its processors. The question is: how will Apple handle it?

In my not-so-happy prediction, the problem will be more than an exploit that allows data to be extracted from the protected vault in the processor (the Secure Enclave), or memory to be read across processes. It will be more severe. It will be a problem with the instruction set, much like Intel's FDIV problem.

If we assume that the situation will be roughly the same as the Intel problem, then we will see:

- A new processor (or a set of new processors) from Apple
- These processors will have been released; they will be in at least one product and perhaps more
- The problem will be rare, but repeatable: if one executes a specific sequence of instructions, one can see the problem

Apple may be able to correct the problem with a software update. If it can, then Apple's course is easy: an apology and an update. Apple may take some minor damage to its reputation, which will fade over time.

Or maybe the problem cannot be fixed with an update. The error might be "hard-coded" into the chip. Apple now has a few options, all of them bad but some less bad than others.

It can fix the problem, build a new set of processors, and then assemble new products and offer free replacements. Replacing the defective units is expensive for Apple, in the short term. It probably creates the most customer loyalty, which can improve revenue and profits in the longer term.

Apple could build a new set of products and instead of offering free replacements, offer high trade-in values for the older units. Less expensive in the short term, but less loyalty moving forward.

I'm not saying that this will happen. I'm saying that it may happen. I have no connection with Apple (other than as a customer) and no insight into their design process and quality assurance procedures.

Intel, when faced with the FDIV bug, handled it poorly. Yet Intel survives today, so its response was not fatal. Let's see what Apple does.

Sunday, May 5, 2024

iPad gets an M4

There is a lot of speculation about Apple's forthcoming announcement. Much of it has to do with new models of the iPad and the use of a new M4 processor. Current models of iPad have M2 processors; Apple has not released an M3 iPad. People have tried to suss out the reason for Apple making such a jump.

Here's my guess: Apple is using the M4 processors because it has to. Or rather, using M3 processors in the new iPads has a cost that Apple doesn't want to pay.

I must say here that I am not employed by Apple, or connected with Apple, or with any of its suppliers. I have no contacts, no inside information. I'm looking at publicly available information and my experience with inventory management (which itself is quite limited).

My guess is based on the process of manufacturing processors. They are made in large batches, the larger the better ('better' as in 'lower unit costs').

Apple has a stock of M3 processors on hand, and possibly some outstanding orders for additional processors.

Apple also has projections for the sales of its various products, and therefore projections for the reduction of its inventory and the allocation of future orders. I'm pretty sure that Apple has gotten good at making these projections. It has projections for MacBooks, iMacs, iPhones, and iPads.

My guess is that Apple has enough M3 processors (on hand or in future orders) for the projected sales of MacBooks and iMacs, and that it does not have enough M3 processors for the sale of MacBooks, iMacs, and iPads.

Apple could increase its orders for M3 processors. But my second guess is that the minimum order quantity is much larger than the projected sales of iPads. (The iPad models have low sales numbers.) Therefore, ordering M3 processors for iPads means ordering a lot of M3 processors -- many more than are needed for iPad sales, and probably more than the MacBook and iMac lines need as well. (The MacBooks and iMacs will switch to M4 processors soon, possibly in September.)

Apple doesn't want to over-order M3 processors and pay for processors that it will never use. Nor does it want to order a small batch, with the higher unit cost.

So instead, Apple puts M4 processors in iPads. The M4 production batches are just starting, and Apple can expect a number of future batches. Diverting a small number of M4 processors to the iPad is the least cost option here.

That's my guess at the reason for M4 processors in iPads. Not because Apple wants to use AI on the iPads, or make the iPad a platform suitable for development, or switch iPads to macOS. The decision is not driven by features, but by inventory costs.


Tuesday, April 23, 2024

Apple is ready for AI

I have been critical of Apple, and more specifically its designs with the M-series processors. My complaint is that the processors are too powerful, that even the simplest M1 processor is more than capable of handling tasks of an average user. (That is, someone who browses the web, reads and sends e-mail, and pays bills.)

The arrival of "AI" has changed my opinion. The engines that we call "artificial intelligence" require a great deal of processing, memory, and storage, which is just what the M-series processors have. Apple is ready to deploy AI on its next round of computers, powered by M4 processors. Those processors, merely speculative today, will most likely arrive in 2025 with companion hardware and software that includes AI-driven features.

Apple is well positioned for this. Their philosophy is to run everything locally. Applications run on the Mac, not in the cloud. Apps run on iPhones and iPads, not in the cloud. Apple can sell the benefits of AI combined with the benefits of privacy, as nothing travels across the internet.

This is different from the Windows world, which has seen applications and apps come to rely on resources in the cloud. Microsoft Office has been morphing, slowly, into a cloud-based application. (There is a version one can install on a local PC, but I suspect that parts of it use cloud-based resources.)

I'm not sure how Microsoft and other application vendors will respond. Will they shift back to local processing? (Such a move would require a significant increase in processing power on the PC.) Will they continue to move to the cloud? (That will probably require additional security, and marketing, to convince users that their data is safe.)

Microsoft's response may be driven by the marketing offered by Apple. If Apple stresses privacy, Microsoft will (probably) counter with security for cloud-based applications. If Apple stresses performance, Microsoft may counter with cloud-based data centers and distributed processing.

In any case, it will be interesting to see the strategies that both companies use.

Tuesday, April 2, 2024

WFH and the real estate crisis

Over the past decades, companies (that is, employers) have shifted responsibilities (and risk) to their employees.

Employer companies replaced company-funded (and company-managed) pension plans with employee-funded (and employee-managed) 401(k) retirement plans.

Employer companies have shifted the cost of medical insurance to employees. The company-run (and company-funded) benefit plan is a thing of the past. Today, the hiring process includes a form for the employee to select insurance options and authorize payment via payroll deduction.

Some companies have shifted other forms of risk to employees. Restaurants and fast-food companies, subject to large swings in demand during the day, have shifted staffing risk to employees. Instead of a steady forty-hour workweek with eight-hour shifts, employers now schedule employees with "just in time" methods, informing employees of their schedule one day in advance. Employees cannot plan for many activities, as they may (or may not) be scheduled to work in any future day.

In all of these changes, the employer shifted the risk to the employees.

Now we come to another form of risk: real estate. It may seem strange that real estate could be a risk. And strictly speaking, it isn't; the risk lies in the loans companies take out to buy real estate.

Many companies can no longer afford the loans on their buildings. Here's why: a sizable number of companies have allowed employees to work from home (or from locations of their own choosing), away from the office. As a result, those companies need less office space than they needed in the past, so they rent less space.

It's not the tenant companies that have the risk of real estate loans -- it's the landlord companies. They made the loans and purchased the buildings.

But risk is risk, and it won't take long for landlord companies to shift this risk away from themselves. But this shift won't be easy, and it won't be like the previous shifts.

A building involves two (or perhaps more) companies: one that owns the building (the landlord company) and a second (the tenant company) that leases space within it. (The owner could be a group of companies in a joint effort, and a large building could have more than one tenant.)

But notice that this risk has two levels of corporations: the landlord company and the tenant company. The landlord company has employees, but they are not the employees who work in the building. Shifting the risk to them makes no sense.

The employees who work in the building are employees of the tenant company, and they have no direct connection to the landlord company. The landlord company cannot shift the risk to them, either.

Thus, the shift of risk (if a shift does occur) must move between the two companies. For the risk of real estate, the landlord company must shift the risk to its tenant companies.

That shift is difficult. It occurs not between an employer and employee, but between two companies. Shifting risk from employer to employee is relatively easy, due to the imbalance of power between the two. Shifting risk between companies is difficult: the tenant company can hire lawyers and fight the change.

If the landlord company is successful, and does manage to shift the risk to its tenant companies, then one might assume that a tenant company would want to shift the risk to its own employees. That shift is also difficult, because there is little to change in the employment arrangement. Medical benefits and pension plans were easy to change, because employees were receiving something. With the risk of building ownership (or, more specifically, the risk of a lower value for the building), the employees currently receive... nothing. They have no share in the building, no part of the revenue, no interest in the transaction.

Savvy readers will have already thought of other ways of hedging the risk of real estate loans (or the risk of reduced demand for real estate). There are other ways; most involve some form of insurance. With them, the landlord company purchases a policy or some other instrument. The risk is shifted to a third company (the insurer) with payments.

I expect that the insurance option will be the one adopted by most companies. It works, it follows existing patterns of business, and it offers predictable payments to mitigate risk.

Sometimes you can shift risk to employees. Sometimes you can't.

Thursday, March 7, 2024

The doom of C++

The C++ programming language is successful. It has been popular for decades. It has been used in large, important systems. It has grown over the years, adding features that keep it competitive with other languages.

And it may be on its way out.

Other languages have challenged C++. Java was an early challenger, back in the 1990s. The latest challenger is Rust, and there is a good case for it. But I think the demise of C++ will not be caused by another language, but by the C++ language itself.

Or more specifically, changes to the language.

Consider a change to the C++ standard: variadic parameters for template functions, introduced in C++11 and extended with constrained parameter packs in C++20.

I understand the need for variable -- excuse me, variadic -- parameters.

And I understand the desire by the C++ committee to minimize changes to the C++ syntax and grammar rules.

But the resulting code looks like this (with the includes and a caller added so that it actually compiles, under C++20):

#include <cctype>
#include <concepts>
#include <tuple>

void dummy(auto&&...) {}

template<std::same_as<char> ...C>
void
expand(C...c)
{
  std::tuple<C...> tpl(c...);

  const char msg[] = { C(std::toupper(c))..., '\0' };
  dummy(msg, c...);
}

int main() { expand('h', 'i'); }

This code is not easy to read. In fact, I find it a candidate for the International Obfuscated C Code Contest.

This enhancement is not alone. Most changes to the C++ specification over the past two decades have made C++ harder to read. The result is that we have lost the readability of the original C and of the early C++ syntax. This is a problem.

While it was possible to write obscure and difficult-to-read C code (and C++ code), it wasn't inevitable. It was, with care, possible to write code that was readable. Not merely readable by those who knew C or C++, but by almost anyone familiar with a programming language or the concepts of programming. (Although the concept of pointers was necessary.)

The changes to the C++ standard have resulted in code that is, in a word, ugly. That ugliness is now inevitable -- one cannot avoid it.

Programmers dislike two types of programming languages: wordy (think COBOL) and ugly (what C++ is becoming).

(Programmers also dislike programming languages that use the ':=' operator. That may fall under the category of 'ugly'.)

The demise of C++ won't be due to some new language, or government dictates to use memory-safe programming languages, or problems in applications.

Instead, C++ will be abandoned because it will be considered "uncool". As the C++ standards committee adds new features, it maintains compatibility at the cost of increasing ugliness of code. It is a trade-off that has long-term costs. Those costs may result in the demise of C++.