Tuesday, July 1, 2025

A lesson from the 1960s

The recent push for AI (artificial intelligence) is forcing us to learn a lesson -- or rather, re-learn a lesson that we learned back in the early days of computing.

In the 1960s, when computers were the shiny new thing, we adored them. They were superior at computing, calculating far faster and more accurately than any human. Computers became the subject of books, movies, magazine articles, and even television programs. They were depicted as large, efficient, and always correct (or so we thought).

We trusted computers. That was the first phase of our relationship.

Yet after some time, we learned that computers were not infallible. They could "make mistakes". Many problems within organizations were blamed on "computer error". It was a convenient excuse, and one that was not easily challenged. Computers became scapegoats. That was the second phase of our relationship.

Given more time, we realized that computers were tools, and like any tool, they could be used or misused; they were good at some tasks and not others. We also learned that the quality of a computer's output depended on two things: the quality of the program and the quality of the data. Both had to be correct for the results to be correct. Relatively few people worked on the programs; many more worked on the data being fed into computers.

This was the third phase of our relationship with computers. We recognized that their output was only as good as their input. We began to check our input data. We began to select data sources based on their quality. We even invented a saying: "Garbage in yields garbage out".

That was the 1960s.

Fast-forward to the 2020s. Look carefully at our relationship with AI and see how closely it matches the first phase of our 1960s relationship with computers. AI is the shiny new thing. We adore it. We trust it.

We don't recognize that it is a tool, and like any tool it is good at some things and not others. We don't recognize that the quality of its output depends on the quality of its input.

We build large language models and train them on any data that we can find. We don't curate the data. We don't ensure that it is correct.

The rule from the 1960s still holds. Garbage in yields garbage out. We have to re-learn that rule.