Showing posts with label artificial intelligence. Show all posts

Tuesday, March 3, 2026

AI and the mortgage debt crisis of 2008

In 2008, investment banks saw tremendous losses caused by defaults on mortgages. It wasn't just mortgages; investment companies had bundled and repackaged mortgage loans into securities and sold those securities to other investors. The demand for these mortgage-backed securities was high (they paid good interest) and that demand spurred demand for mortgages, which spurred banks to offer (and originate) mortgages to a large number of people including a large number of people whom they would normally not give mortgage loans. The problem came when interest rates rose, causing mortgage payments to increase (many were adjustable-rate mortgages), and many mortgage holders could not afford the higher payments. They defaulted on the loans, which triggered failures through the entire chain of investments.

The end products, the mortgage-backed securities, were supposedly top quality. The mortgages upon which they were based were not; the investment bankers had convinced themselves that a combination of mixed-grade mortgages could support a top-grade investment product.

It was a system that worked, until it didn't.

What does this have to do with AI? Keep in mind the notion of building top-grade products from a composite of mixed-grade products.

AI -- at least AI for programming -- works by building a large dataset of programs and then using that dataset to generate requested programs. The results are, in a sense, averages of certain selected items in the provided data (the "training data").

The quality of the output depends on the quality of the input. If I train an AI model on a large set of incorrect programs, the results will match those flawed programs. By training on large sets of programs, AI providers are betting on the "knowledge of the masses"; they assume that a very large collection of programs will be mostly correct. Scanning open source repositories is a common way to build such datasets. Companies with large datasets of their own (such as Microsoft) can use those private datasets for training an AI model.

I think that averaging to correctness works for most requests, but not necessarily for all requests.

I expect that simpler code is more available in code repositories, and complex and domain-specific code is less common. We can see lots and lots of "hello, world" programs, in almost any programming language. We can see lots of simple classes for a customer address (again, in almost any programming language).

We don't see lots of code for obscure applications, or very large applications. There are few publicly available applications to run oil rigs, for example. Or large, multinational accounting systems. Or perhaps even control software for a consumer-grade microwave oven.

There may be a few large, complex programs available in AI training data. But a few (or one) is not drawing on "the knowledge of the masses". It is not averaging a large set of mostly right code into a correct set of code.

Here we can see the parallel of AI for coding to the mortgage securities industry. The latter built (what it thought were) top-grade investment products from mixed-grade mortgages. The former is building (what it and users think are) quality code from mixed-grade existing code.

But I won't be surprised to learn that AI coding models work for small, simple code and fail for large, complex code.

In other words, AI coding works -- until it doesn't.

Monday, July 14, 2025

AI and programmer productivity

In the sixty years of IT, we have seen a number of productivity tools and techniques. Now we're looking at "artificial intelligence" as a way to improve the productivity of programmers. Google, Microsoft, and others are pushing their AI tools upon developers.

Will AI really work? Will it improve programmer productivity? This is not the industry's first attempt to improve the productivity of programmers. Let's look at the history of programming and some of the ways we have improved (or attempted to improve) productivity.

Start with hardware: The first electronic computers were "programmed" by wiring. That is, the hardware was built to perform specific calculations. When you wanted a different calculation, you had to rewire the computer. This was the first form of programming. It's not really an improvement, but we have to start somewhere. Why not at the beginning?

Plug boards: The first productivity improvement was the "plug board". It was a physical board that plugged into the computer and held the wiring for a specific problem. To change a computer's program, one could easily remove one plug board and install a different one. The computer became a general calculation device and the plug boards held the specific calculations - one could call them "programs".

Programs in memory: The next advance was changing the program from wiring (or plug boards) into values stored in the computer. A program consisting of numeric codes could be loaded into memory and then executed. No more plug boards! Programs could be loaded via switches on the computer's front panel, or from prepared paper tapes or punch cards.

Assemblers: But creating the long lists of numbers was tedious. Each operation required its own numeric code, such as 1 to add a number and 5 to store a number to memory. Programmers had to first decide on the sequence of operations and then convert those operations to numeric values. To help programmers (to improve productivity) we invented the assembler. The assembler was a program that converted text op-codes into the numeric values that are executed by the computer. (The assembler was also a program that created another program!) Each computer model had its own set of numeric codes and its own assembler.

The first assemblers converted text operation codes to the proper numeric values. But programs are more than just operation codes. Many operations need additional information, such as a numeric constant (for the operation "add 1 to the accumulator") or a memory address (for the operation "store the value in the accumulator to memory location 1008"). It made sense to use names instead of the raw values, so we could write "store accumulator into location named 'total'" as STA TOTAL instead of STA 1008.

Symbols provided two benefits. First, referring to a memory location as "TOTAL" instead of its numeric address made the programs more readable. Second, as the program was revised, the location of TOTAL changed (it was 1008 in the first version, then 1010 in the second version because we needed some memory for other values, and 1015 in a third version). As the real address of TOTAL moved, the symbolic assembler kept up with the changes and the programmer didn't have to worry about them.
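That bookkeeping is easy to see in a toy two-pass assembler. The instruction set below (LDA, ADD, STA, HLT, DAT, with the operand address packed into the low digits) is invented here for illustration; real assemblers are far richer, but the structure -- one pass to locate symbols, one pass to emit code -- is the same.

```python
OPCODES = {"LDA": 1, "ADD": 2, "STA": 5, "HLT": 9}  # mnemonic -> numeric code

def assemble(lines):
    # Pass 1: walk the program, assigning an address to each line
    # and recording where each labelled location falls.
    symbols, address = {}, 0
    for line in lines:
        first = line.split()[0]
        if first.endswith(":"):
            symbols[first[:-1]] = address
        address += 1

    # Pass 2: emit numeric code, replacing symbolic names with addresses.
    code = []
    for line in lines:
        parts = line.split()
        if parts[0].endswith(":"):          # drop the label; pass 1 recorded it
            parts = parts[1:]
        op = parts[0]
        operand = parts[1] if len(parts) > 1 else None
        if op == "DAT":                     # reserve a data cell
            code.append(0)
        else:
            addr = symbols[operand] if operand else 0
            code.append(OPCODES[op] * 100 + addr)
    return code

program = ["LDA ONE", "ADD ONE", "STA TOTAL", "HLT",
           "ONE: DAT", "TOTAL: DAT"]
print(assemble(program))   # [104, 204, 505, 900, 0, 0]
```

Insert a new instruction before ONE and both data cells shift; the programmer still writes STA TOTAL, and pass 1 quietly recomputes the address.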

Each of those techniques improved productivity and eased the jobs of programmers. But we didn't stop there.

Programming languages: After assemblers, we invented the notion of programming "languages". There were many languages in those early days; Fortran and Cobol are two that we still use today, albeit with enhancements and changes to syntax. We called these "high level" languages to distinguish them from the "low level" assemblers.

We created compilers to convert programs written in high level languages into either machine code or into assembly code (which could then be converted to machine code by the assemblers). We still use high level languages today. We have Rust and Go and C++, which all follow the same process as the early compilers.

But after the invention of programming languages, things changed. The ideas and techniques for improving productivity focussed on the programmers and how they used the compilers, how they stored "source code" (the text programs), and how they interacted with the computer.

Structured programming: Structured programming was not a new language, or a new compiler or assembler, or even a program. It was a set of techniques to write programs that could be better understood by the original author and others. We had decided that programs were hard to read because the sequence of execution was hard to follow, and it was hard to follow because of the GOTO operation, which changed the sequence of control to another part of the program. Avoiding the GOTO became a goal of "good programming". GOTO statements were replaced with IF/THEN/ELSE, WHILE, and SWITCH/CASE statements. These rules were enforced in Pascal (which had a constrained GOTO) and implemented in PL/I, C, and C++ (but they still allowed unconstrained GOTO).

The IDE (integrated development environment): Prior to the IDE, work on programs was divided between a text editor and the compiler. You would run the editor, make changes, save the file, and exit the text editor. Then you would run the compiler, get a list of errors, and note them. Then run the editor again and fix the errors, then run the compiler again. The development process consisted of alternately running the editor and the compiler. The IDE combined the editor and compiler into a single program, so you could edit and compile quickly. Popularized by Turbo Pascal in the 1980s, IDEs had existed earlier in the UCSD p-System. One could even say that BASIC was the first IDE, as it let one edit a program, run the program, and diagnose errors without leaving BASIC.

Fourth generation languages: Higher-level and more abstract than the third generation languages (Cobol, Fortran). SQL is a fourth-generation language, probably the only one we use today. The others were discarded due to poor performance and an inability to handle low-level operations.

Program generators: If compilers take source code and convert it to assembly language (and some of them did), then we can apply the same trick: create a configuration file and feed it into a program generator, which generates a program in a third generation language (which can then be compiled as usual). Program generators failed not because they couldn't do the job, but because these very high level languages were very limited in their capabilities. As one colleague said, "program generators do what they do very well, but nothing beyond that".

Source control systems: From the first source control system (sccs) to today's modern tool (git) source control kept previous versions of code and allowed programmers to compare the current version to earlier versions of the code. It allowed programmers to commit changes into a central repository and easily share their updates with other members of the team.

UML (Unified Modeling Language): Similar to program generators, UML was a notation for specifying computation. (The 'U' for 'unified' was the result of combining multiple competing modelling notations.) UML wasn't a programming language -- it wasn't fed into a generator which created the program; instead, it was used by human programmers to create the programs in traditional programming languages. UML was more generic than the configuration files for program generators. But it was not adopted by the industry, for reasons of money, time, and politics.

Object-oriented programming: A way to organize source code for large systems. One might say that it was "structured programming, but bigger". Object-oriented programming is one of the biggest successes in programming.

Function points: A way to measure the effort to develop programs from the requirements. Function points were a tool not for programmers but for project managers. They calculated estimates for effort, based on easily identified aspects such as inputs, processing steps, and outputs. This was advertised as an improvement over the previous method of intuition or just plain guessing. Function points were unbiased and not overly optimistic, and the approach should have been welcomed by managers. Yet managers eschewed function points. There were challenges (tools were not available for all languages, or were expensive) but I believe that the real reason was that the estimates provided by the tools were often higher than managers wanted. Managers did not like the estimates from the function point reports, and reverted to the older technique of guessing (which could give a number that managers did like).

Looking back, we can see that we have tried various ideas for improving productivity. Many succeeded, some did not.

But from what I've seen, AI seems to be closest to the fourth generation languages and program generators of the past. It creates programs from a specification (an input prompt). Compared to the program generators of the 1970s and 1980s, today's AI tools are much more sophisticated and can generate many more types of programs. Yet they are still limited to the input data used to train them, and AI can go only so far in creating programs. I expect that we will quickly find those limits and become disappointed with AI.

I suspect that AI has a place in programming, probably with junior developers, as aides to develop simple programs. I have yet to be convinced that AI will handle the creation of large-scale, complex systems -- at least not today's version of AI. Future versions of AI may be able to generate large, complex applications; I will wait and see.

Tuesday, July 1, 2025

A lesson from the 1960s

The recent push for AI (artificial intelligence) is forcing us to learn a lesson -- or rather, re-learn a lesson that we learned back in the early days of computing.

In the 1960s, when computers were the shiny new thing, we adored computers. They were superior at computing (compared to us humans) and could calculate much faster and more accurately than us. Computers became the subject of books, movies, magazine articles, and even television programs. They were depicted as large, efficient, and always correct (or so we thought).

We trusted computers. That was the first phase of our relationship.

Yet after some time, we learned that computers were not infallible. They could "make mistakes". Many problems within organizations were blamed on "computer error". It was a convenient excuse, and one that was not easily challenged. They became scapegoats. That was the second phase of our relationship.

Given more time, we realized that computers were tools, and like any tools, they could be used or misused, that they were good at some tasks and not others. We also learned that the quality of a computer's output depended on two things: the quality of the program and the quality of the data. Both had to be correct for the results to be correct. Relatively few people worked on the programs; more people worked on the data being fed into computers. 

This was the third phase of our relationship with computers. We recognized that their output was based on the input. We began to check our input data. We began to select sources for our data based on the quality of the data. We even invented a saying: "Garbage in yields garbage out".

That was the 1960s.

Fast-forward to the 2020s. Look carefully at our relationship with AI and see how it matches that first phase of the 1960s relationship with computers. AI is the shiny new thing. We adore it. We trust it.

We don't recognize that it is a tool, and like any tool it is good at some things and not others. We don't recognize that the quality of its output depends on the quality of its input.

We build large language models and train them on any data that we can find. We don't curate the data. We don't ensure that it is correct.

The rule from the 1960s still holds. Garbage in yields garbage out. We have to re-learn that rule.


Wednesday, January 8, 2025

The missing conversation about AI

For Artificial Intelligence (AI) -- or at least the latest fad that we call "AI" -- I've seen lots of announcements, lots of articles, lots of discussions, and lots of advertisements. All of them -- and I do mean "all" -- fall into the category of "hype". I have yet to see a serious discussion or article on AI.

Here's why:

In business -- and in almost every organization -- there are four dimensions for serious discussions. Those dimensions are: money, time, risk, and politics. (Politics internal to the organization, or possibly with external suppliers or customers; not national-level politics.)

Businesses don't care if an application is written in Java or C# or Rust. They *do* care that the application is delivered on time, that the development cost was reasonably close to the estimated cost, and that the application runs as expected with no ill effects. Conversations about C++ and Rust are not about the languages but about the risks of applications written in those languages. Converting from C++ to Rust is about the cost of conversion, the time it takes, opportunities lost during the conversion, and reduction of risk due to memory leaks, invalid access, and other exploits. The serious discussion ignores the issues of syntax and IDE support (unless one can tie them to money, time, or risk).

With AI, I have not seen a serious discussion about money, for either the cost to implement AI or the reduction in expenditures, other than speculation. I have not seen anyone list the time it took to implement AI with any degree of success. I have yet to see any articles or discussions about the risks of AI and how AI can provide incorrect information that seems, at first glance, quite reasonable.

These are the conversations about AI that we need to have. Without them, AI is merely a shiny new thing that has no clearly understood benefits and no place in our strategies or tactics. Without them, we do not understand the true costs to implement AI and how to decide when and where to implement it. Without them, we do not understand the risks and how to mitigate them.

The first rule of investment is: If you don't understand an investment instrument, then don't invest in it.

The first rule of business management is: If you don't understand a technology (how it can help you, what it costs, and its risks), then don't implement it. (Other than small, controlled research projects to learn about it.)

It seems to me that we don't understand AI, at least not well enough to use it for serious tasks.

Thursday, June 6, 2024

What to do with an NPU

Microsoft announced "Copilot PC", a new standard for hardware. It includes a powerful Neural Processing Unit (NPU) along with the traditional (yet also powerful) CPU and GPU. The purpose of this NPU is to support Microsoft's Copilot+, an application that uses "multiple state-of-the-art AI models ... to unlock a new set of experiences that you can run locally". It's clear that Microsoft will add generative AI to Windows and Windows applications. (It's not so clear that customers want generative AI or "a new set of experiences" on their PCs, but that is a different question.)

Let's put Windows to the side. What about Linux?

Linux is, if I may use the term, a parasite. It runs on hardware designed for other operating systems (such as Windows, macOS, or even z/OS). I fully expect that it will run on these new "Copilot+ PCs", and when running, it will have access to the NPU. The question is: will Linux use that NPU for anything?

I suppose that before we attempt an answer, we should review the purpose of an NPU. A Neural Processing Unit is designed to perform calculations for a neural network. A neural network is a collection of nodes with connections between nodes. It has nothing to do with LANs or WANs or telecommunication networks.

The calculations of a neural network can be performed on a traditional CPU, but they are a poor match for the typical CPU. The calculations are a better match for a GPU, which is why so many people ran neural networks on them -- GPUs performed better than CPUs.

NPUs are better at the calculations than GPUs (and much better than CPUs), so if we have a neural network, its calculations would run fastest on an NPU. Neural Processing Units perform a specialized set of computations.
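That specialized set of computations is mostly multiply-and-accumulate arithmetic. A single layer of a neural network multiplies an input vector by a weight matrix, adds a bias, and applies a nonlinearity; the sketch below (with toy weights invented for illustration) shows the inner loop that a CPU executes serially and that GPUs and NPUs parallelize.

```python
# One layer of a neural network: y = relu(W x + b).
# NPUs are fast precisely because they run many of these
# multiply-accumulate loops at once, in dedicated hardware.

def dense_layer(weights, bias, x):
    out = []
    for row, b in zip(weights, bias):
        # multiply-accumulate: the operation NPUs are built for
        total = sum(w * xi for w, xi in zip(row, x)) + b
        out.append(max(0.0, total))     # ReLU nonlinearity
    return out

W = [[0.5, -1.0], [1.5, 2.0]]   # toy weights, invented for illustration
b = [0.0, -1.0]
print(dense_layer(W, b, [2.0, 1.0]))   # [0.0, 4.0]
```

A real network stacks many such layers with thousands of rows per matrix, which is why the same code is painfully slow on a CPU and trivial for an NPU.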

One application that uses those computations is the AI that we hear about today. And it may be that Linux, when detecting an NPU, will route computations to it, and those computations will be for artificial intelligence.

But Linux doesn't have to use an NPU for generative AI, or other commercial applications of AI. A neural network is, at its essence, a pattern-matching mechanism, and while AI as we know it today is a pattern-matching application (and therefore well-served by NPUs), it is not the only pattern-matching application. It is quite possible (and I think probable) that the open-source community will develop non-AI applications that take advantage of the NPU.

I suspect that this development will happen in Linux and in the open source community, and not in Windows or the commercial market. Those markets will focus on the AI that is being developed today. The open source community will drive the innovation of neural network applications.

We are early in the era of neural networks. So early that I think we have no good understanding of what they can do, what they cannot do, and which of those capabilities match our personal or business needs. We have yet to develop the "killer app" of AI, the equivalent of the spreadsheet. "VisiCalc" made it obvious that computers were useful; once we had seen it, we could justify the purchase of a PC. We have yet to find the "killer app" for AI.


Tuesday, April 23, 2024

Apple is ready for AI

I have been critical of Apple, and more specifically its designs with the M-series processors. My complaint is that the processors are too powerful, that even the simplest M1 processor is more than capable of handling tasks of an average user. (That is, someone who browses the web, reads and sends e-mail, and pays bills.)

The arrival of "AI" has changed my opinion. The engines that we call "artificial intelligence" require a great deal of processing, memory, and storage, which is just what the M-series processors have. Apple is ready to deploy AI on its next round of computers, powered by M4 processors. Those processors, merely speculative today, will most likely arrive in 2025 with companion hardware and software that includes AI-driven features.

Apple is well positioned for this. Their philosophy is to run everything locally. Applications run on the Mac, not in the cloud. Apps run on iPhones and iPads, not in the cloud. Apple can sell the benefits of AI combined with the benefits of privacy, as nothing travels across the internet.

This is different from the Windows world, which has seen applications and apps rely on resources in the cloud. Microsoft Office has been morphing, slowly, into a cloud-based application. (There is a version one can install on a local PC, but I suspect that parts of that version use cloud-based resources.)

I'm not sure how Microsoft and other application vendors will respond. Will they shift back to local processing? (Such a move would require a significant increase in processing power on the PC.) Will they continue to move to the cloud? (That will probably require additional security, and marketing, to convince users that their data is safe.)

Microsoft's response may be driven by the marketing offered by Apple. If Apple stresses privacy, Microsoft will (probably) counter with security for cloud-based applications. If Apple stresses performance, Microsoft may counter with cloud-based data centers and distributed processing.

In any case, it will be interesting to see the strategies that both companies use.

Tuesday, November 28, 2023

Today's AI means QA for data

Some time ago, I experimented with n-grams. N-grams are a technique that reads an existing text and produces a second text that is similar but not the same. It splits the original text into pieces; for 2-grams it uses two letters, for 3-grams it uses three letters, etc. It computes the frequency of each combination of letters and then generates new text, selecting each letter based on the frequency of occurrence after a set of letters.

For 2-grams, the word 'every' is split into 'ev', 've', 'er', and 'ry'. When generating text, the program sees that 'e' is followed by either 'v' or 'r' and builds text with that same pattern. That's with an input of one word. With a larger input, the letter 'e' is followed by many different letters, each with its own frequency.

Using a program (in C, I believe) that read text, split it into n-grams, and generated new text, I experimented with names of friends. I gave the program a list of names and the program produced a list of names that were recognizable as names, but not the names of the original list. I was impressed, and considered it pretty close to magic.
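That technique fits in a few lines. The sketch below (in Python, not the original C) builds the 2-gram table and then generates new text by repeatedly sampling a letter that was observed to follow the current one; keeping duplicate entries in the lists is what preserves the frequencies.

```python
import random
from collections import defaultdict

def train(text, n=2):
    # Map each (n-1)-letter prefix to every letter observed after it.
    followers = defaultdict(list)
    for i in range(len(text) - n + 1):
        prefix, nxt = text[i:i + n - 1], text[i + n - 1]
        followers[prefix].append(nxt)   # duplicates preserve frequency
    return followers

def generate(followers, length=10):
    prefix = random.choice(list(followers))
    out = prefix
    for _ in range(length):
        choices = followers.get(out[-len(prefix):])
        if not choices:                 # dead end: nothing ever followed this
            break
        out += random.choice(choices)   # sample by observed frequency
    return out

# The word 'every' yields the 2-grams 'ev', 've', 'er', 'ry':
# 'e' is followed by 'v' or 'r', just as described above.
names = train("anna bella carla daniela ")
print(generate(names))
```

Feed it a list of names and it emits name-shaped strings that appear in no input list, which is exactly the "pretty close to magic" effect.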

It strikes me that the AI model ChatGPT uses a similar technique, but with words instead of individual letters. Given a large input, or rather, a condensation of frequencies of words (the 'weights') it can generate text using the frequencies of words that follow other words.

There is more to ChatGPT, of course, as the output is not simply random text but text about a specified topic. But let's focus on the input data, the "training text". That text is half of what makes ChatGPT possible. (The other half being the code.)

The training text enables, and also limits, the text generated by ChatGPT. If the training text (used to create the weights) were limited to Shakespeare's plays and sonnets, for example, any output from ChatGPT would strongly resemble Shakespeare's work. Or if the training were limited to the Christian Bible, then the output would be in the style of the Bible. Or if the training text were limited to lyrics of modern songs, then the output would be... you get the idea.

The key point is this: The output of ChatGPT (or any current text-based AI engine) is defined by the training text.

Therefore, any user of text-based AI should understand the training text for the AI engine. And this presents a new aspect of quality assurance.

For the entire age of automated data processing, quality assurance has focussed on code. The subject of scrutiny has been the program. The input data has been important, but generally obtained from within the organization or from reputable sources. It was well understood and considered trustworthy.

And for the entire age of automated data processing, the tests have been pointed at the program and the data that it produces. All of the procedures for tests have been designed for the program and the data that it produces. There was little consideration to the input data, and almost no tests for it. (With the possible exception of completeness of input data, and input sets for unusual cases.)

I think that this mindset must change. We must now understand and evaluate the data that is used to train AI models. Is the data appropriate for our needs? Is the data correct? Is it marked with the correct metadata?

With a generally-available model such as ChatGPT, where one does not control the training data, nor does one have visibility into the training data, such analyses are not possible. We have to trust that the administrators of ChatGPT have the right data.

Even with self-hosted AI engines, where we control the training data, the effort is significant. The work includes collecting the data, verifying its provenance, marking it with the right metadata, updating it over time, and removing it when it is no longer appropriate.
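What might that curation look like in practice? Here is one possible shape for a curated record and the librarian's review pass. The field names and the license list are invented for illustration; they are not a standard schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class TrainingDocument:
    text: str
    source: str               # provenance: where the text came from
    collected: date           # when it entered the corpus
    license: str              # may we legally train on it?
    expires: Optional[date]   # when it should be reviewed or removed

ALLOWED_LICENSES = {"public-domain", "cc-by", "internal"}

def still_usable(doc, today):
    # The librarian's pass: drop unlicensed or expired material.
    if doc.license not in ALLOWED_LICENSES:
        return False
    return doc.expires is None or doc.expires > today

corpus = [
    TrainingDocument("...", "project wiki", date(2022, 1, 5), "internal", None),
    TrainingDocument("...", "scraped forum", date(2020, 3, 9), "unknown", None),
]
usable = [d for d in corpus if still_usable(d, date(2023, 6, 1))]
print(len(usable))   # 1 -- only the documented, licensed record survives
```

The point is not this particular schema but that every record carries provenance and a review date, so removal is as routine as addition.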

It strikes me that the work is somewhat similar to that of a librarian, managing books in a library. New books must be added (and catalogued), old books must be removed.

Perhaps we will see "Data Librarian" as a new job title.

Thursday, August 3, 2023

We'll always have source code

Will AI change programming? Will it eliminate the need for programmers? Will we no longer need programs, or rather, the source code for programs? I think we will always have source code, and therefore always have programmers. But perhaps not as we think of programmers and source code today.

But first, let's review the notions of computers, software, and source code.

Programming has been with us almost as long as we have had electronic computers.

Longer than that, if we include the punch cards used by the Jacquard loom. But let's stick to electronic computers and the programming of them.

The first digital electronic computers were built in the 1940s. They were programmed not by software but by wires -- connecting various wires to various points to perform a specific set of computations. There was no concept of a program -- at least not one for the computer. There were no programming languages and there was no notion of source code.

The 1950s saw the introduction of the stored-program computer. Instead of wiring plug-boards, program instructions were stored in cells inside the computer. We call these instructions "machine code". When programming a computer, machine code is slightly more convenient than wiring plug-boards, but not by much. Machine code consists of a number of instructions, each of which resides at a distinct, sequential location in memory. The processor executes the program by simply reading one instruction from a starting location, executing it, and then reading the next instruction at the next memory address.

Building a program in machine code took a lot of time and required patience and attention to detail. Changing a program often meant inserting instructions, which meant that the programmer had to recalculate all of the destination addresses for loops, branches, and subroutines. With stored-program computers, there was the notion of programming, but not the notion of source code.

Source code exists to be processed by a computer and converted into machine code. We first had source code with symbolic assemblers. Assemblers were (and still are) programs that read a text file and generate machine code. Not just any text file, but a text file that follows specific rules for content and formatting, and specifies a series of machine instructions but as text -- not as numbers. The assembler did the grunt work of converting "mnemonic" codes to numeric machine codes. It also converted numeric and text data to the proper representation for the processor, and calculated the destinations for loops, branches, and subroutines. Revising a program written in assembly language was much easier than revising machine code.

Later languages such as FORTRAN and COBOL converted higher-level text into machine code. They, too, had source code.

Early C compilers converted code into assembly code, which then had to be processed by an assembler. This last sequence looked like this:

    C source code --> [compiler] --> assembly source code --> [assembler] --> machine code

I've listed both the C code and the assembly code as "source code", but in reality only the C code is the source code. The assembly code is merely an intermediate form of the code, something generated by machine and later read by machine.

A better description of the sequence is:

    C source code --> [compiler] --> assembly code --> [assembler] --> machine code

I've changed the "assembly source code" to "assembly code". The adjective "source" is not really correct for it. The C program (at the left) is the one and only source.

Later C compilers omitted this intermediate step and generated machine code directly. The sequence became:

    C source code --> [compiler] --> machine code

Now let's consider AI. (You didn't forget about AI, did you?)

AI can be used to create programs in two ways. One is to enhance a traditional programming IDE with AI, and thereby assist the programmer as he (or she) is typing. That's no different from our current process; all we have done is made the editor a bit smarter.

The other way is to use AI directly and ask it to create the program. In this method, a programmer (or perhaps a non-programmer) provides a prompt text to an AI engine and the AI engine creates the entire program, which is then compiled into machine code. The sequence looks like this:

    AI prompt text --> [AI engine] --> source code --> [compiler] --> machine code

Notice that the word "source" has sneaked back into the middle of the stream. The term doesn't belong there; that code is intermediate and not the source. A better description is:

    Source AI prompt text --> [AI engine] --> intermediate code --> [compiler] --> machine code

This description puts the "source" back on the first step of the process. That prompt text is the true source code. One may argue that a prompt text is not really source code, that it is not specific enough, or not Turing-complete, or not formatted like a traditional program. I think that it is the source code. It is created by a human and it is the text used by the computer to generate the machine code that we desire. That makes it the source.

Notice that in this new process with AI, we still have source code. We still have a way for humans to instruct computers. I've been writing about source code as if it were written text. Source code has always been written (or typed, or keypunched) in the past. It is possible that future systems will recognize human speech and build programs from it (much like on several science fiction TV programs). If so, those spoken words will be the source code.

AI may change the programming world. It may upend the industry. It may force many programmers to learn new skills, or to retire. But humans will always want to express their desires to computers. The way they express them may be through text, or through speech, or (in some far-off day) through direct neural links. Those thoughts will be source code, and we will always have it. The people who create that source code are programmers, so we will always have them.

We will always have source code and programmers, but source code and programming will change over time.

Thursday, July 20, 2023

Hollywood's blind spot

Hollywood executives are probably correct in that AI will have a significant effect on the movie industry.

Hollywood executives are probably underestimating the effect that AI will have on the movie industry.

AI, right now, can create images. Given some prompting text, an AI engine can form an image that matches the description in the text. The text can be simple, such as "a zombie walking in an open field", or it can be more complex.

It won't be long before AI can make not a single image but a video. A video is nothing more than a collection of images, each different from the previous in minor ways. When played back at 24 frames per second, the human mind perceives the images not as individual images but as motion. (This is how movies on film work, and how movies on video tape work.) I'm sure people are working on "video from AI" right now -- and they may already have it.

A movie is, essentially, a collection of short videos. If AI can compose a single video, then AI can compose a collection of videos. The prompting text for a movie might resemble a traditional movie script -- with some formatting changes and additional information about costumes, camera angles, and lighting.

Thus, with enough computing power, AI can start with an enhanced, detailed script and render a movie. Let's call this a "script renderer".

A script renderer makes the process of moviemaking cheap and fast. It is the word processor of the twenty-first century. And just as word processors upended the office jobs of the twentieth century, the script renderer will upend the movie jobs of this century. Word processors (the software on commonplace computers) replaced people and equipment: secretaries, proofreaders, typewriters, carbon paper, copy machines, and Wite-out erasing fluid.

Script renderers (okay, that's a clumsy term and we'll probably invent something better) will do similar things for movies. If an AI can make a movie from a script, then movie makers don't need equipment (cameras, lights, costumes, sets, props, microphones) and the people who handle that equipment. It may be possible for a single individual to write a script, send it through a renderer, and get a movie. What's more, just as word processors let one print a document, review it, make changes, and print it again, a script renderer will let one render a movie, view it, make changes, and render it again -- perhaps all in a few hours.

Hollywood executives, if they have seen this far ahead, may be thinking that their studios will be much more profitable. They won't need to pay actors, or camera operators, or build sets, or ... lots of other things. All of those expenses disappear, but the revenue from the movies remains.

But here's what they don't see: Making a movie will simply be a matter of computing power. Anyone with a computer and access to a sufficiently powerful AI will be able to convert a script into a movie.

Today, anyone can start a newsletter. Or print invitations to a party. Or their own business cards.

Tomorrow, anyone will be able to make a movie. It won't be easy; one still needs a script with the right details, and one should have a compelling story and good dialog. But it will be much easier than it is today.

And create movies they will. Not just movies, but TV episodes, mini-series, and perhaps even shorts like the old Flash Gordon serials.

I suspect that the first wave of "civilian movies" will be built on existing materials. Fans of old "Star Trek" shows will create new episodes with new stories but using the likenesses of the original actors. The studios will sue, of course, but it won't be a simple case of copyright infringement. The owners of the old shows will have to build a case on different grounds. (They will probably prevail, if only because the amateurs cannot pay the court costs.)

The second wave will be different. It will be new material, away from the copyrighted and trademarked properties. But it will still be amateurish, with poor dialog and awkward pacing.

The third wave of non-studio movies will be better, and will be the real threat to today's movie studios. These movies will have higher quality, and will obtain some degree of popularity. That will get the attention of Hollywood executives, because now these "civilian" movies will compete with "real" movies.

Essentially, AI removes the moat around movie studios. That moat is the equipment, sound stages, and people needed to make a movie today. When the moat is gone, lots of people will be able to make movies. And lots will.


Wednesday, June 1, 2022

Ideas for Machine Learning

A lot of what is called "AI" is less "Artificial Intelligence" and more "Machine Learning". The differences between the two are technical and rather boring, so few people talk about them. From a marketing perspective, "Artificial Intelligence" sounds better, so more people use that term.

But whichever term you use, I think we can agree that the field of "computers learning to think" has yielded dismal results. Computers are fast and literal: good at numeric computation and even good at running video games, but not at thinking.

It seems to me that our approach to Machine Learning is not the correct one. We've been at it for decades, and our best systems suffer from fragility, providing wildly different answers for similar inputs.

That approach (from what I can tell) is to build a Machine Learning system, train it on a large set of inputs, and then have it match new inputs against the training set, pairing similar aspects with similar aspects.

I have two ideas for Machine Learning, although I suspect that they will be rejected by the community.

The first idea is to change the basic mechanism of Machine Learning. Instead of matching similar inputs, design systems which minimize errors. That is, balance the identification of objects with the identification of errors.

This is a more complex approach. It requires some basic knowledge of an object (such as a duck or a STOP sign), and then it requires analyzing aspects and classifying them as "matching", "close match", "loose match", or "not a match". I can already hear the howls of practitioners at the suggestion of switching their mechanisms to something more complex.
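The tiered-matching idea can be sketched in miniature. The aspect names, the scoring rule, and the tier thresholds below are all invented for illustration, not a proposal for real values:

```python
# Sketch of tiered matching: score how well an observed object's
# aspects agree with a known reference, then classify the score.
# Thresholds and aspects are arbitrary choices for illustration.

def match_tier(reference, observed):
    """Compare aspect-by-aspect and return a match tier."""
    shared = set(reference) & set(observed)
    if not shared:
        return "not a match"
    agreement = sum(reference[k] == observed[k] for k in shared) / len(reference)
    if agreement >= 0.9:
        return "matching"
    if agreement >= 0.6:
        return "close match"
    if agreement >= 0.3:
        return "loose match"
    return "not a match"

stop_sign = {"shape": "octagon", "color": "red", "text": "STOP"}
seen = {"shape": "octagon", "color": "red", "text": "ST0P"}  # misread text
print(match_tier(stop_sign, seen))  # close match
```

The point of the sketch is the output: a graded answer with room for "close but not certain", rather than a single forced identification.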

But as loud as those complaints may be, they will be a gentle whisper compared to the reaction to my second idea: switch from 2-D photographs to stereoscopic photographs.

Stereoscopic photographs are pairs of photographs of an object, taken by two cameras some distance apart. By themselves they are simple photographs. Together, they allow for the calculation of depth of objects. (Anyone who has used an old "View-Master" to look at a disk of transparencies has seen the effect.)

A stereoscopic photograph should allow for better identification of objects, because one can tell that items in the photograph are in the same plane or different planes. Items in different planes are probably different objects. Items in the same plane may be the same object, or may be two objects in close proximity. It's not perfect, but it is information.
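The depth calculation behind a stereo pair is simple triangulation: depth is the focal length times the camera baseline, divided by the disparity (how far a feature shifts between the two images). The focal length and baseline below are made-up values for illustration:

```python
# Depth from stereo disparity: depth = focal_length * baseline / disparity.
# Features with similar disparity lie at similar depths (the same "plane").

def depth_cm(focal_length_px, baseline_cm, disparity_px):
    """Triangulate depth from the pixel shift between the two views."""
    return focal_length_px * baseline_cm / disparity_px

FOCAL = 800.0    # focal length in pixels (illustrative value)
BASELINE = 6.5   # camera separation in cm, roughly human eye spacing

# A feature that shifts 40 px between views is much closer
# than one that shifts only 5 px.
print(depth_cm(FOCAL, BASELINE, 40.0))  # 130.0 (cm)
print(depth_cm(FOCAL, BASELINE, 5.0))   # 1040.0 (cm)
```

Grouping pixels by similar computed depth is exactly the same-plane/different-plane information described above.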

The objections are, of course, that the entire corpus of inputs must be rebuilt. All of the 2-D photographs used to train ML systems are now invalid. Worse, a new collection of stereoscopic photographs must be taken (not an easy task), stored, classified, and vetted before they can be used.

I recognize the objections to my ideas. I understand that they entail a lot of work.

But I have to ask: is the current method getting us what we want? Because if it isn't, then we need to do something else.

Monday, February 15, 2021

Linked lists, dictionaries, and AI

When I was learning the craft of programming, I spent a lot of time learning about data structures (linked lists, trees, and other things). How to create them. How to add a node. How to remove a node. How to find a node. There was a whole class in college about data structures.

At the time, everyone learning computer science learned those data structures. Those data structures were the tools to use when designing and building programs.

Yet now in the 21st century, we don't use them. (At least not directly.)

We use lists and dictionaries. Different languages use different names. C++ calls them 'vectors' and 'maps'. Perl calls them 'lists' and 'hashes'. Ruby calls them ... you get the idea. The names are not important.

What is important is that these data structures are the ones we use. Every modern language implements them. And I must admit that lists and dictionaries are much easier to use than linked lists and balanced trees.
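The difference in effort is easy to show. Here is a front-insert into a hand-built linked list next to the same operation with a built-in list (Python here, but the contrast holds in any modern language):

```python
# The old way: a hand-built singly linked list.
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def ll_insert_front(head, value):
    """Insert a node at the front; returns the new head."""
    return Node(value, head)

def ll_to_list(head):
    """Walk the chain of nodes and collect the values."""
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out

head = None
for v in (3, 2, 1):
    head = ll_insert_front(head, v)
print(ll_to_list(head))  # [1, 2, 3]

# The modern way: the built-in list does the bookkeeping for us.
items = [1, 2, 3]
items.insert(0, 0)
print(items)  # [0, 1, 2, 3]
```

All the node classes, pointer juggling, and traversal loops collapse into a single method call.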

Lists and dictionaries did not come for free, though. They cost more in terms of both execution time and memory. Yet we, as an industry, decided that the cost of lists and dictionaries was worth the benefit (which was less time and effort to write programs).

What does this have to do with AI?

It strikes me that AI is in a phase equivalent to the 'linked list' phase of programming.

Just as we were convinced, some years ago, that linked lists and trees were the key to programming, we are (today) convinced that our current techniques are the key to AI.

It would not surprise me to find that, in five or ten years, we are using completely different tools for AI.

I don't know what those new tools will be. (If I did, I would be making a small fortune implementing and selling them.)

But just as linked lists and trees morphed into lists and dictionaries with the aid of faster processors and more memory, I think AI tools of today will morph into the tools of tomorrow with better hardware. That better hardware might be faster processors and more memory, or it might be advanced network connections and coordination between processes on different computers, or it might even be better data structures. (The last, technically, is of course not hardware.)

Which doesn't mean we should stop work on AI. It doesn't mean that we should all just sit around and wait for better tools for AI to appear. (If no one is working on AI, then no one will have ideas for better tools.)

We should continue to work on AI. But just as we replaced the code that used older data structures with code that used newer data structures, we should expect to replace early AI techniques with later AI techniques. In other words, the things that we build in AI will be temporary. We can expect to replace them with better tools, better models -- and perhaps not that far off in the future!


Tuesday, January 29, 2019

Intelligence, real and artificial

We humans have been working on artificial intelligence for a long time. At least fifty years, by my count, and through most of that time, true artificial intelligence has been consistently "twenty years away".

Of course, one should not talk solely about artificial intelligence. We humans have what we style as real intelligence. Perhaps the term "organic intelligence" is more appropriate, as human intelligence evolved organically over the ages. Let's not argue too much about terms.

The human mind is a strange thing. It is the only thing in the universe that can examine itself. (At least, it's the only one that we humans know about.)

There are many models of the human mind. We have studied the anatomy, the physiology, the chemistry, ... and we still understand little about how it works. Freud studied the human psyche (close enough to the mind for this essay) and Skinner studied animal behaviors with reward systems, and we still know little about the mind.

But there is one model that strikes me as useful when developing artificial intelligence: the notion of the human brain as two different but connected processors.

In this model, humans have not one but two processors: one slow and linear, the other fast and parallel. The slow, linear processor gives us analytical thought, math, logic, and language. The parallel side gives us intuition, using a pattern-matching system.

The logical side is easy for us to examine. It is linear and relatively slow, and since it has language, it can talk to us. We can follow a chain of reasoning and understand how we arrive at an answer. (We can also explain our reasoning to another person, or write it down.)

The intuitive side is difficult to examine. It is parallel and relatively fast, and since it does not have language, it cannot explain how it arrives at an answer. We don't know why we get the results we get.

From an evolution point of view, it is easy to see how we developed the intuitive (pattern-matching) side. Our ancestors were successful when they identified a rabbit (and ate it) and identified a tiger (and ran away from it). Pattern matching is quite useful for survival.

It is less clear how we evolved the linear-logical side of our brain. Slow, analytic thought may be helpful for survival, but perhaps not as helpful as avoiding tigers. Communication is clearly a benefit when living in groups. No matter how it arose, we have it.

These two sides make up our brain. (Yes, I am aware that there are various levels of the brain, all the way down to the brain stem, but bear with me.)

Humans are successful, I believe, because we have both the logical and the intuitive processors. We use both in our everyday life, from recognizing other humans and breakfast cereals to thinking about business strategies and algebra homework. We pick the right processor for the problem at hand.

Now let's shift from our human intelligence to ... not artificial intelligence but computer intelligence, such as it is.

Traditional computing is our logical, math-oriented brain with a turbocharger. Computers are fast, and can perform calculations rapidly and reliably, but they don't have the "common sense", or the intuition, that we humans use. And although they are fast, we can examine the program and understand how a computer arrives at a result.

Artificial intelligence, on the other hand, corresponds to the intuitive side of human intelligence. It can solve problems (relatively quickly), often through pattern-matching techniques. And, just as the human intuitive, pattern-matching brain cannot explain how it arrives at a result, neither can artificial intelligence systems. We cannot simply examine the program and look at some variables to understand how the result was determined.

So now we have two artificial systems, one logical and one intuitive. These two types of "intelligence" mirror the two types in humans.

The real advance will be to combine the traditional computing systems (the logical systems) with artificial intelligence (the pattern-matching systems), just as our brains combine the two. Bringing the two disparate systems into one will be necessary for true, Skynet-class, Forbin-class, HAL-9000-class, artificial intelligence.

I expect that joining the two will be quite the challenge. We understand little about our human brains and how the logical and intuitive processors coordinate their work. Getting the logical and intuitive computer systems to work together will be (I think) a long effort.

But when we get it -- watch out!

Thursday, February 23, 2017

The (possibly horrifying) killer app for AI

The original (and so far only) "killer app" was the spreadsheet. The specific spreadsheet was VisiCalc (or Lotus 1-2-3, depending on who you ask) and it was the compelling reason to get a personal computer.

We may see a killer app for AI, and from a completely unexpected direction: performance reviews.

Employee performance reviews, in large companies, often work as follows: each employee is rated on a number of items, frequently from 1 to 5 and sometimes as "meets expectations" or "needs improvement". Items range from meeting budgets and delivery dates to soft skills such as communication and leadership.

HR works to ensure that performance reviews are administered fairly, which means as consistently as possible, which often means "one size fits all". Everyone in the organization, from the entry-level developer to the vice president of accounting, has the same performance review form and topics. This leads to developers being rated on "meeting budgets" and vice presidents of accounting being rated on "meeting delivery dates".

Just about everyone fears and dislikes the process. Employees dread the annual (or semiannual) review. Managers take no joy in it either.

This is where AI may be attractive.

Instead of a human-driven process, a company may look for an AI-driven process. The human-administered process is rife with potential for inconsistencies (including favoritism) and opens the company to lawsuits. Instead of expending effort to enforce consistent criteria, HR may choose to implement AI for performance reviews. (Managers may have little say in the decision, and many may be secretly relieved at such a change.)

This is a possibly horrifying concept. The mere idea of a computer (which is what AI is, at bottom) rating and ranking employees may be unwelcome among the ranks. The fear of "computer overlords" from the 1960s is still with us, and I suspect few companies would want to be the first to implement such a system.

I recognize that such a system cannot work in a vacuum. It would need input, starting with a list of job responsibilities, assigned tasks and deadlines, and status reports. Early versions will most likely get many things wrong. Over time, I expect they will improve.
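In skeleton form, those inputs might look like the record below. Every field name, task, and rating here is invented for illustration, and the "review" is the simplest possible one: averaging ratings per item on the 1-to-5 scale.

```python
# Sketch of the inputs an AI review system would need, plus the
# simplest possible "review": averaging the ratings per item.
# All names, tasks, and ratings are invented for illustration.
from statistics import mean

employee = {
    "responsibilities": ["maintain billing module", "review pull requests"],
    "tasks": [
        {"name": "ship v2.1", "deadline": "2017-03-01", "completed": "2017-02-27"},
        {"name": "fix audit bug", "deadline": "2017-01-15", "completed": "2017-01-20"},
    ],
    "ratings": {"communication": [4, 3, 4], "meets deadlines": [5, 2]},
}

def summarize(ratings):
    """Average each rated item on the 1-to-5 scale."""
    return {item: round(mean(scores), 1) for item, scores in ratings.items()}

print(summarize(employee["ratings"]))  # {'communication': 3.7, 'meets deadlines': 3.5}
```

A real system would do far more than average numbers, of course -- but even this skeleton shows why the inputs (responsibilities, tasks, deadlines, status) must exist before any AI can evaluate them.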

Should we move to AI for performance reviews, I have some observations.

First, AI performance review systems may move outside of companies. Just as payroll processing is often outsourced, performance review systems might be outsourced too. The driver is risk avoidance: a company that builds its own performance-review AI may build in subtle discrimination against women or minorities. An external supplier would have to warrant that its system conforms to anti-discrimination laws -- a benefit to the client company.

Second, automating performance reviews could mean more frequent reviews, and more frequent feedback to employees. The choice of annual as a frequency for performance reviews is driven, I suspect, by two factors. First, reviews are needed to justify changes in compensation. Second, they are expensive to administer. The former mandates at least one per year; the latter discourages anything more frequent.

But automating performance reviews should reduce effort and cost. Or at least reduce the marginal cost for reviews beyond the annual review.

Another result of more frequent performance reviews? More frequent information to management about the state of their workforce.

In sum, AI offers a way to reduce cost and risk in performance reviews. It also offers more frequent feedback to employees and more frequent information to management. I see advantages to the use of AI for this despised task.

Now all we need to do is bell the cat.

Wednesday, December 14, 2016

Steps to AI

The phrase "Artificial Intelligence" (AI) has been used to describe computer programs that can perform sophisticated, autonomous operations, and it has been used for decades. (One wag puts it as "artificial intelligence is twenty years away... always".)

Along with AI we have the term "Machine Learning" (ML). Are they different? Yes, but the popular usages make no distinction. And for this post, I will consider them the same.

Use of the term waxes and wanes. The AI term was popular in the 1980s and it is popular now. One difference between the 1980s and now: we may have enough computing power to actually pull it off.

Should anyone jump into AI? My guess is no. AI has preconditions, things you should be doing before you start with a serious commitment to AI.

First, you need a significant amount of computing power. Second, you need a significant amount of human intelligence. With AI and ML, you are teaching the computer to make decisions. Anyone who has programmed a computer can tell you that this is not trivial.

It strikes me that the necessary elements for AI are very similar to the necessary elements for analytics. Analytics is almost the same as AI -- analyzing large quantities of data -- except it uses humans to interpret the data, not computers. Analytics is the predecessor to AI. If you're successful at analytics, then you are ready to move on to AI. If you haven't succeeded at analytics (or even attempted it), you're not ready for AI.

Of course, one cannot simply jump into analytics and expect to be successful. Analytics has its own prerequisites. Analytics needs data, the tools to analyze the data and render it for humans, and smart humans to interpret the data. If you don't have the data, the tools, and the clever humans, you're not ready for analytics.

But we're not done with levels of prerequisites! The data for analytics (and eventually AI) has its own set of preconditions. You have to collect the data, store the data, and be able to retrieve the data. You have to understand the data, know its origin (including the origin date and time), and know its expiration date (if it has one). You have to understand the quality of your data.
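Those preconditions on data can be captured as a simple metadata record. The field names and values in this sketch are invented for illustration:

```python
# Sketch of a metadata record for a dataset: origin, origin date,
# expiration date, and a quality note, as described above.
# All field names and values are invented for illustration.
from datetime import date

dataset_metadata = {
    "name": "customer_orders",
    "origin": "order-entry system export",
    "origin_date": date(2016, 11, 30),
    "expires": date(2017, 11, 30),  # None if the data does not expire
    "quality": "deduplicated; 2% of rows missing postal codes",
}

def is_current(meta, today):
    """A dataset is usable if it has not passed its expiration date."""
    return meta["expires"] is None or today <= meta["expires"]

print(is_current(dataset_metadata, date(2016, 12, 14)))  # True
```

Keeping even this much metadata per dataset answers the questions above -- where the data came from, when, and whether it is still trustworthy -- before any analytics (or AI) consumes it.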

The steps to artificial intelligence are through data collection, metadata, and analytics. Each step has to be completed before you can advance to the next level. (Much like the Capability Maturity Model.) Don't make the mistake of starting a project without the proper experience in place.