Sunday, September 8, 2024

Agile, Waterfall, and Risk

For some years (decades, really), software development has used an agile approach to project management. The Agile method uses short iterations that each focus on a single feature, with the entire team reviewing progress and selecting the feature for the next iteration. Over time, a complete system evolves. The advantage is that the entire team (programmers, managers, salespersons, etc.) learns about the business problem, the functions of the system, and the capabilities of the team. The team can change course (hence the name "agile") as it develops each feature.

Prior to Agile, for some years (decades, really), software development used the "waterfall" approach to project management. The Waterfall method starts with a set of requirements and a schedule, and moves through different phases for analysis, design, coding, testing, and deployment. The important aspect is the schedule. The Waterfall method promises to deliver a complete system on the specified date.

This last aspect of Waterfall is quite different from Agile. The Agile method makes no promise to deliver a completed system on a specific date. It does promise that each iteration ends with a working system that implements the features selected by the team. Thus, a system developed with Agile is always working -- although incomplete -- whereas a system developed with Waterfall is not guaranteed to work until the delivery date.

(It has been observed that while the Waterfall method promises a complete, working system on the specified delivery date, it is quite poor at keeping that promise. Many projects overrun both schedule and budget.)

Here is where risk comes into play.

With Agile, the risk is shared by the entire team, and key among those sharing it are the developers and managers. An agile project has no specified delivery date, but more often than not senior managers (those above the agile-involved managers) have a date in mind. (And probably a budget, too.) Agile projects can easily overrun these unstated expectations. When they do, the agile-involved managers are part of the group held responsible for the failure. Managers bear some of the risk.

But look at the Waterfall project. When a waterfall project fails (that is, runs over schedule or budget), the managers have a way to distance themselves from the failure. They can say (honestly) that they provided the developers with a list of requirements and a schedule (and a budget) and that the developers failed to meet the "contract" of the waterfall project. Managers can deflect the risk to the development team.

(For some reason, we rarely question the feasibility of the schedule, or the consistency and completeness of the requirements, or the budget assigned to the project. These are considered "good", and any delay or shortcoming is therefore the fault of the developers.)

Managers want to avoid risk -- or at least transfer it to another group. Therefore, I predict that in the commercial space, projects will slowly revert from Agile methods to Waterfall methods.

Thursday, August 1, 2024

Google search is broken, and we all suffer

Google has a problem. That problem is web search.

Google, the long-time leader in web search, recently modified its techniques to use artificial intelligence (AI). The attempt at AI-driven search has led to embarrassing results. One person asked how to keep cheese on pizza, and Google suggested using glue. Another asked about cooking spaghetti, and Google recommended gasoline.

The problem was that Google pointed its AI search engine at the entire web, absorbing posts from various sources. Some of those posts contained text that was a joke or sarcastic. A human would be able to tell that the entries were not to be used in search results, but Google's algorithm isn't human.

Google has rediscovered the principle of "garbage in, garbage out".

One might think that Google could simply "pull the plug" on the AI search and revert to the older mechanisms it used in the past. But here too Google has a problem: the old search algorithms don't work anymore.

Google started with a simple algorithm for search: rank pages by the links pointing to them (the idea behind PageRank). This was a major leap forward in search; previous attempts were curated by hand. Over the years, web designers have "gamed" the Google web crawler to move their web pages up in the results, and Google has countered with changes to its algorithm. The battle continues; there are companies that help with "Search Engine Optimization" or "SEO". Those optimizing companies have gotten quite good at tweaking web sites to appear high in search results. But the battle is lost. Despite Google's size (and clever employees), the SEO companies have won, and the old-style Google search no longer shows meaningful results but mostly advertisement links.
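The link-and-rank idea is easy to sketch as a simplified PageRank iteration. This is a toy version for illustration only -- the pages, links, and damping value below are invented, and Google's production ranking uses far more signals than this:

```python
# Simplified PageRank: a page's rank comes from the pages that link
# to it, weighted by those pages' own ranks. Illustrative only.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, outgoing in links.items():
            if not outgoing:
                continue
            # Each page passes a share of its rank to the pages it links to.
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank

# A tiny hypothetical web: two pages link to "home".
web = {"home": ["about"], "about": ["home"], "blog": ["home"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))  # "home" collects the most link weight
```

SEO, in these terms, is the business of manufacturing links and page structures that inflate the "share" flowing into a client's page -- which is exactly the arms race described above.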

SEO has changed Google search from a generic search engine into a sales lead tool. If you want to purchase something, Google is a great way to find a good price. But if you want something else, Google is much less useful than it used to be. It is no longer a tool for answers to general questions.

That means that search, for the internet, is broken.

It's not completely broken. In fact, "broken" is too strong a word for the situation. Better choices might be "damaged" or "compromised", or even "inconsistent". Some searches work, and others don't.

Broken, damaged, or inconsistent, Google's search engine has suffered. Its reputation is reduced, and fewer people use it. That's a problem for Google, because the search results page is where Google displays advertisements, and advertisements are Google's major source of income.

A broken Google search is a problem for us all, in two ways.

First, with Google search broken, we (all) must now find alternative means of answering questions. AI might help for some -- although I don't recommend it for recipes -- and that can be a partial replacement. Other search engines (Bing, Yahoo) may work for now, but I expect that they will succumb to the same SEO forces that broke Google. With no single reliable source of information, we must now turn to multiple sources (Stack Exchange, Red Hat's web pages, and maybe the local library), which means more work for us.

Second, the defeat of the Google whale by the SEO piranhas is another example of "this is why we cannot have nice things". It is the tragedy of the commons, with individuals acting selfishly and destroying a useful resource. Future generations will look back, possibly in envy, at the golden age of Google and a single source of reliable information.

Monday, July 22, 2024

CrowdStrike, Windows blue screens, and the future

A small problem with CrowdStrike, a Windows security application, has caused a widespread problem with thousands, perhaps millions, of PCs running Windows.

Quite a few folks have provided details about the problem, and how it happened.

I won't repeat that analysis here. Instead, I have some ideas about what will happen next: what will happen at Microsoft, and what will happen at all of the companies that use CrowdStrike.

Microsoft long ago divided Windows into two spaces: one space for user programs and another space for system processes. The system space includes device drivers.

Applications in the user space can do some things, but not everything. They cannot, for example, interact directly with devices, nor can they access memory outside of their assigned range of addresses. If they do attempt to perform a restricted function, Windows stops the program -- before it causes harm to Windows or another application.

User-space applications cannot cause a blue screen of death.

If an error in CrowdStrike caused a blue screen of death (BSOD), then CrowdStrike must run in the system space. This makes sense, as CrowdStrike must access a lot of things to identify attacks, things normal applications do not look at. CrowdStrike runs with elevated privileges.

I'm guessing that Microsoft, as we speak, is thinking up ways to restrict third-party applications that must run with elevated privileges such as CrowdStrike. Microsoft won't force CrowdStrike into the user space, but Microsoft also cannot allow CrowdStrike to live in the system space where it can damage Windows. We'll probably see an intermediate space, one with more privileges than user-space programs but not all the privileges of system-space applications. Or perhaps application spaces with tailored privileges, each specific to the target application.

The more interesting future is for companies that use Microsoft Windows and applications such as CrowdStrike.

These companies are -- I imagine -- rather disappointed with CrowdStrike. So disappointed that they may choose to sue. I expect that management at several companies are already talking with legal counsel.

A dispute with CrowdStrike will be handled as a contract dispute. But I'm guessing that CrowdStrike, like most tech companies, specified arbitration in its contracts, and limited damages to the cost of the software.

Regardless of contract terms, if CrowdStrike loses, they could be in severe financial hardship. But if they prevail, they could also face a difficult future. Some number of clients will move to other providers, which will reduce CrowdStrike's income.

Other companies will start looking seriously at the contracts from suppliers, and start making adjustments. They will want the ability to sue in court, and they will want damages if the software fails. When the maintenance period renews, clients will want a different set of terms, one that imposes risk upon CrowdStrike.

CrowdStrike will have a difficult decision: accept the new terms or face further loss of business.

This won't stop at CrowdStrike. Client companies will review terms of contracts with all of their suppliers. The "CrowdStrike event" will ripple across the industry. Even companies like Adobe will see pushback to their current contract terms.

Supplier companies that agree to changes in contract terms will have to improve their testing and deployment procedures. Expect to see a wave of interest in process management, testing, verification, static code analysis, and code execution coverage. And, of course, consulting companies and tools to help in those efforts.

Client companies may also review the licenses for open source operating systems and applications. They may also attempt to push risk onto the open source projects. This will probably fail; open source projects make their software available at no cost, so users have little leverage. A company can choose to replace Python with C#, for example, but the threat of "we will stop using your software and pay you nothing instead of using your software and paying you nothing" has little weight.

Therefore, this shift in contract terms will occur in the commercial space, not in open source -- at least not at first. Open source may change in the future, as the new terms in the commercial space become the norm.

Thursday, June 6, 2024

What to do with an NPU

Microsoft announced "Copilot PC", a new standard for hardware. It includes a powerful Neural Processing Unit (NPU) along with the traditional (yet also powerful) CPU and GPU. The purpose of this NPU is to support Microsoft's Copilot+, an application that uses "multiple state-of-the-art AI models ... to unlock a new set of experiences that you can run locally". It's clear that Microsoft will add generative AI to Windows and Windows applications. (It's not so clear that customers want generative AI or "a new set of experiences" on their PCs, but that is a different question.)

Let's put Windows to the side. What about Linux?

Linux is, if I may use the term, a parasite. It runs on hardware designed for other operating systems (such as Windows, macOS, or even z/OS). I fully expect that it will run on these new "Copilot+ PCs", and when running, it will have access to the NPU. The question is: will Linux use that NPU for anything?

I suppose that before we attempt an answer, we should review the purpose of an NPU. A Neural Processing Unit is designed to perform calculations for a neural network. A neural network is a collection of nodes with connections between nodes. It has nothing to do with LANs or WANs or telecommunication networks.

The calculations of a neural network can be performed on a traditional CPU, but they are a poor match for the typical CPU. The calculations are a better match for a GPU, which is why so many people ran neural networks on them -- GPUs performed better than CPUs.

Neural Processing Units perform a specialized set of computations. NPUs are better at those calculations than GPUs (and much better than CPUs), so if we have a neural network, its calculations will run fastest on an NPU.

One application that uses those computations is the AI that we hear about today. And it may be that Linux, when detecting an NPU, will route computations to it, and those computations will be for artificial intelligence.

But Linux doesn't have to use an NPU for generative AI, or other commercial applications of AI. A Neural Network is, at its essence, a pattern-matching mechanism, and while AI as we know it today is a pattern-matching application (and therefore well-served by NPUs), it is not the only pattern-matching application. It is quite possible (and I think probable) that the open-source community will develop non-AI applications that take advantage of the NPU.

I suspect that this development will happen in Linux and in the open source community, and not in Windows or the commercial market. Those markets will focus on the AI that is being developed today. The open source community will drive the innovation of neural network applications.

We are early in the era of neural networks. So early that I think we have no good understanding of what they can do, what they cannot do, and which of those capabilities match our personal or business needs. We have yet to develop the "killer app" of AI, the equivalent of the spreadsheet. "VisiCalc" made it obvious that computers were useful; once we had seen it, we could justify the purchase of a PC. We have yet to find the "killer app" for AI.


Thursday, May 16, 2024

Apple's Pentium moment

In the 1990s, as the market shifted from the 80486 processor to the newer Pentium processor, Intel had a problem. On some Pentium processors, the floating-point divide instruction returned incorrect results for certain operands. It was called the "FDIV bug". What made this a problem was that the error was detected only after a significant number of Pentium processors had been sold inside PCs.
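The best-known trigger for the FDIV bug makes the defect concrete. On a correct floating-point unit, 4195835 / 3145727 is about 1.3338204; the flawed Pentium returned about 1.3337391, so the residual below came out as 256 rather than (essentially) zero:

```python
# The canonical FDIV trigger case. A flawed Pentium returned a
# quotient that was wrong in the fourth decimal place, making the
# residual 256 instead of (essentially) zero.
x, y = 4195835.0, 3145727.0
quotient = x / y
residual = x - quotient * y
print(quotient, residual)  # ~1.3338204 and ~0.0 on a correct FPU
```

An error this small and this data-dependent is exactly why the bug escaped testing: it appears only for a rare set of operand bit patterns, yet once known, it is perfectly repeatable.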

Now that Apple is designing its own processors (not just the M-series for Mac computers but also the A-series for phones and tablets), Apple faces the risk of a similar problem.

It's possible that Apple will have a rather embarrassing problem with one of its processors. The question is: how will Apple handle it?

In my not-so-happy prediction, the problem will be more than an exploit that allows data to be extracted from the protected vault in the processor, or memory to be read across processes. It will be more severe. It will be a problem with the instruction set, much like Intel's FDIV problem.

If we assume that the situation will be roughly the same as the Intel problem, then we will see:

- A new processor (or a set of new processors) from Apple
- These processors will have been released; they will be in at least one product and perhaps more
- The problem will be rare, but repeatable. If one executes a specific sequence of operations, one can reproduce the problem

Apple may be able to correct the problem with an update. If it can, then Apple's course is easy: an apology and an update. Apple may take some minor damage to its reputation, which will fade over time.

Or maybe the problem cannot be fixed with an update. The error might be "hard-coded" into the chip. Apple now has a few options, all of them bad but some less bad than others.

It can fix the problem, build a new set of processors, and then assemble new products and offer free replacements. Replacing the defective units is expensive for Apple, in the short term. It probably creates the most customer loyalty, which can improve revenue and profits in the longer term.

Apple could build a new set of products and instead of offering free replacements, offer high trade-in values for the older units. Less expensive in the short term, but less loyalty moving forward.

I'm not saying that this will happen. I'm saying that it may happen. I have no connection with Apple (other than as a customer) and no insight into their design process and quality assurance procedures.

Intel, when faced with the FDIV bug, handled it poorly. Yet Intel survives today, so its response was not fatal. Let's see what Apple does.

Sunday, May 5, 2024

iPad gets an M4

There is a lot of speculation about Apple's forthcoming announcement. Much of it has to do with new models of the iPad and the use of a new M4 processor. Current models of iPad have M2 processors; Apple has not released an M3 iPad. People have tried to suss out the reason for Apple making such a jump.

Here's my guess: Apple is using the M4 processors because it has to. Or rather, using M3 processors in the new iPads has a cost that Apple doesn't want to pay.

I must say here that I am not employed by Apple, or connected with Apple, or with any of its suppliers. I have no contacts, no inside information. I'm looking at publicly available information and my experience with inventory management (which itself is quite limited).

My guess is based on the process of manufacturing processors. They are made in large batches, the larger the better ('better' as in 'lower unit costs').

Apple has a stock of M3 processors on hand, and possibly some outstanding orders for additional processors.

Apple also has projections for the sales of its various products, and therefore projections for the reduction of its inventory and the allocation of future orders. I'm pretty sure that Apple has gotten good at making these projections. It has projections for MacBooks, iMacs, iPhones, and iPads.

My guess is that Apple has enough M3 processors (on hand or in future orders) for the projected sales of MacBooks and iMacs, and that it does not have enough M3 processors for the sale of MacBooks, iMacs, and iPads.

Apple could increase its orders for M3 processors. But my second guess is that the minimum order quantity is much larger than the projected sales of iPads. (The iPad models have low sales numbers.) Therefore, ordering M3 processors for iPads means ordering a lot of M3 processors. Many more processors than are needed for iPad sales, and probably for the MacBook and iMac line. (The MacBooks and iMacs will switch to M4 processors soon, possibly in September.)

Apple doesn't want to over-order M3 processors and pay for processors that it will never use. Nor does it want to order a small batch, with the higher unit cost.

So instead, Apple puts M4 processors in iPads. The M4 production batches are just starting, and Apple can expect a number of future batches. Diverting a small number of M4 processors to the iPad is the least cost option here.
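The trade-off can be made concrete with a back-of-envelope calculation. Every number below is invented for illustration -- Apple's actual volumes, minimum order quantities, and unit costs are not public:

```python
# Back-of-envelope sketch of the inventory argument, with entirely
# hypothetical numbers: a large minimum order of M3s that mostly sit
# unused, versus diverting M4s from batches already in production.

def cost_per_used_chip(order_size, unit_cost, chips_actually_needed):
    """Effective cost per chip that ends up in a shipped product."""
    return order_size * unit_cost / chips_actually_needed

# Hypothetical: iPads need 2M chips, but the M3 minimum order is 10M.
m3_option = cost_per_used_chip(order_size=10_000_000, unit_cost=50,
                               chips_actually_needed=2_000_000)

# Hypothetical: M4s cost a bit more per unit, but every chip diverted
# from an ongoing production run gets used.
m4_option = cost_per_used_chip(order_size=2_000_000, unit_cost=60,
                               chips_actually_needed=2_000_000)

print(m3_option, m4_option)  # 250.0 vs 60.0 per chip actually used
```

With numbers anywhere in this neighborhood, the "newer, more expensive" M4 is the cheaper choice, because the unused portion of a minimum M3 order dominates the cost.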

That's my guess at the reason for M4 processors in iPads. Not because Apple wants to use AI on the iPads, or make the iPad a platform suitable for development, or switch iPads to macOS. The decision is not driven by features, but by inventory costs.


Tuesday, April 23, 2024

Apple is ready for AI

I have been critical of Apple, and more specifically its designs with the M-series processors. My complaint is that the processors are too powerful, that even the simplest M1 processor is more than capable of handling tasks of an average user. (That is, someone who browses the web, reads and sends e-mail, and pays bills.)

The arrival of "AI" has changed my opinion. The engines that we call "artificial intelligence" require a great deal of processing, memory, and storage, which is just what the M-series processors have. Apple is ready to deploy AI on its next round of computers, powered by M4 processors. Those processors, merely speculative today, will most likely arrive in 2025 with companion hardware and software that includes AI-driven features.

Apple is well positioned for this. Their philosophy is to run everything locally. Applications run on the Mac, not in the cloud. Apps run on iPhones and iPads, not in the cloud. Apple can sell the benefits of AI combined with the benefits of privacy, as nothing travels across the internet.

This is different from the Windows world, which has seen applications and apps rely on resources in the cloud. Microsoft Office has been slowly morphing into cloud-based applications. (There is a version one can install on a local PC, but I suspect that parts of that use cloud-based resources.)

I'm not sure how Microsoft and other application vendors will respond. Will they shift back to local processing? (Such a move would require a significant increase in processing power on the PC.) Will they continue to move to the cloud? (That will probably require additional security, and marketing, to convince users that their data is safe.)

Microsoft's response may be driven by the marketing offered by Apple. If Apple stresses privacy, Microsoft will (probably) counter with security for cloud-based applications. If Apple stresses performance, Microsoft may counter with cloud-based data centers and distributed processing.

In any case, it will be interesting to see the strategies that both companies use.