Showing posts with label optimization. Show all posts

Thursday, February 9, 2023

AI answers may improve traditional search

Isaac Asimov, the writer of science and science fiction, described his experience with publishing houses as a writer. People had warned him to stay away from the publishing world, telling him that it was full of unscrupulous opportunists who would take advantage of him. Yet his experience was a good one; the publishers, editors, and others he worked with were (for the most part) honest, hard-working, and ethical.

Asimov had a conjecture about this. He surmised that for some time prior to his arrival as a writer, the publishing industry did have a large number of unscrupulous opportunists, and they gave the industry a bad reputation. He further theorized that by the time he started as an author, those individuals had moved on to a different industry; not because of his arrival, but because there was a newer, larger, and more lucrative industry in which to take advantage of people: the movies. Once those opportunists saw that movies were the richer target, they abandoned publishing and left behind the ethical people who really wanted to work there.

I don't recall that Asimov proved his conjecture, but it has a good feel to it.

What does this have to do with software? Well, not much for the programming world, but maybe a lot for the online search world.

Search engines (Google, Bing, DuckDuckGo, and others) make a valiant attempt to provide good results, but web sites use tricks to raise their rankings in the search engines. The result is that today, in 2023, many searches work poorly. Searches to purchase something work fairly well, and some searches for answers (when does the Super Bowl start) tend to be relevant, but many queries return results that are not helpful.

As I see it, web site operators, in their efforts to increase sales, have hired specialists to optimize their ranking in search engines, leading to an endless race of constantly outdoing their competition. The result is that search engines provide little in the way of "organic" lists and too many "sponsored" or "optimized" responses.

The situation with search engines is, perhaps, similar to the pre-Asimov era of publishing: full of bad operators that distort the product.

So what happens with the new AI-driven answer engines?

If people switch from the old search engines to the new answer engines, we can assume that the money will follow. That is, the answer engines will be popular, and lead to lots of ad revenue. When the revenue shifts from search engines to answer engines, the optimizations will also shift to answer engines. Which means that the efforts to game search engines will stop, and search engines can drift back to organic results.

This change occurs only if the majority of users switch to the answer engines. If a sizable number of people stay on the older search engines, then the gains from optimizing results will remain, and the optimization games will continue.

I'm hoping that most people do switch to the new answer engines, and a small number of people -- just enough for search engines to remain in business -- keep using the older engines.

Wednesday, August 28, 2019

Show me the optimizations!

Compilers have gotten good at optimizing code. So good, in fact, that we programmers take optimizations for granted. I don't object to optimizations, but I do think we need to re-think their opaqueness.

Optimizations are, in general, good things. They are changes to the code to make it faster and more efficient, while keeping the same functionality. Many times, they are changes that seem small.

For example, the code:

a = f(r * 4 + k)
b = g(r * 4 + k)

can be optimized to

t1 = r * 4 + k
a = f(t1)
b = g(t1)

The common expression r * 4 + k can be performed once, not twice, which reduces the time to execute. (It does require space to store the result, so this optimization is really a trade-off between time and space. Also, it assumes that r and k remain unchanged between calls to f() and g().)

Another example:

for i = 1 to 40
    a[i] = r * 4 + k

which can be optimized to:

t1 = r * 4 + k
for i = 1 to 40
    a[i] = t1

In this example, the operation r * 4 + k is repeated inside the loop, yet it does not change from iteration to iteration. (It is invariant during the loop.) The optimization moves the calculation outside the loop, which means it is calculated only once.
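Expressed in runnable form (Python here, purely for illustration; f and g are invented stand-ins for the pseudocode's functions), both optimizations preserve the original results:

```python
def f(x):
    return x + 1      # stand-in for the pseudocode's f()

def g(x):
    return x * 2      # stand-in for the pseudocode's g()

r, k = 3, 5

# Un-optimized: the common expression r * 4 + k is evaluated twice.
a1 = f(r * 4 + k)
b1 = g(r * 4 + k)

# Optimized: the common expression is evaluated once into a temporary.
t1 = r * 4 + k
a2 = f(t1)
b2 = g(t1)
assert (a1, b1) == (a2, b2)   # same functionality, less work

# Loop-invariant code motion: hoist the calculation out of the loop,
# so it is computed once instead of 40 times.
a = [0] * 41
t1 = r * 4 + k
for i in range(1, 41):
    a[i] = t1
```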

These are two simple examples. Compilers have made these optimizations for years, if not decades. Today's compilers are much better at optimizing code.

I am less concerned with the number of optimizations, and the types of optimizations, and more concerned with the optimizations themselves.

I would like to see the optimizations.

Optimizations are changes, and I would like to see the changes that the compiler makes to the code.

I would like to see how my code is revised to improve performance.

I know of no compiler that reports, in terms of the original source code, the optimizations it makes. Not Microsoft's compilers, not Intel's, not the open-source compilers. None. And I am not satisfied with that. I want to see the optimizations.

Why do I want to see the optimizations?

First, I want to see how to improve my code. The above examples are trivial, yet instructive. (And I have, at times, written the un-optimized versions of those programs, although on a larger scale and with more variables and calculations to worry about.) Seeing the improvements to the code helps me become a better developer.

Second, I want to see what the compiler is doing. It may be making assumptions that are not true, possibly due to my failure to annotate variables and functions properly. I want to correct those failures.

Third, when reviewing code with other developers, I think we should review not only the original code but also the optimized code. The optimizations may give us insight into our code and data.

It is quite possible that future compilers will provide information about their optimizations. Compilers are sophisticated tools, and they do more than simply convert source code into executable bytes. It is time for them to provide more information to us, the programmers.

Wednesday, January 31, 2018

Optimizing in the wrong direction

Back in the late 2000s, I toyed with the idea of a new version control system. It wasn't git, or even git-like. In fact, it was the opposite.

At the time, version control was centralized. There was a single instance of the repository and you (the developer) had a single "snapshot" of the files. Usually, your snapshot was the "tip", the most recent version of each file.

My system, like other version control systems of the time, was a centralized system, with versions for each file stored as 'diff' packages. That was the traditional approach for version control, as storing a 'diff' was smaller than storing the entire version of the file.

Git changed the approach for version control. Instead of a single central repository, git is a distributed version control system. It replicates the entire repository in every instance and uses a sophisticated protocol to synchronize changes across instances. When you clone a repo in git, you get the entire repository.

Git can do what it does because disk space is now plentiful and cheap. Earlier version control systems worked on the assumption that disk space was expensive and limited. (Which, when SCCS was created in the 1970s, was true.)

Git is also directory-oriented, not file-oriented. Git looks at the entire directory tree, which allows it to optimize operations that move files or duplicate files in different directories. File-oriented version control systems, looking only at the contents of a single file at a time, cannot make those optimizations. That difference, while important, is not relevant to this post.

I called my system "Amnesia". My "brilliant" idea was to remove diffs from the repository over time and thereby use even less disk space. Deletion was automatic, and I let the user specify a set of rules for deletion, so important versions could be saved indefinitely.

My improvement was based on the assumption that disk space was expensive. Looking back, I should have known better. Disk space was not expensive, it was not getting more expensive, and in fact it was getting cheaper.

Anyone looking at this system today would be, at best, amused. Even I can only grin at my error.

I was optimizing, but for the wrong result. The "Amnesia" approach reduced disk space, at the cost of time (it takes longer to compute diffs than it does to store the entire file), information (the removal of versions also removes information about who made the change), and development cost (for the auto-delete functions).

The lesson? Improve, but think about your assumptions. When you optimize something, do it in the right direction.

Sunday, July 9, 2017

Cloud and optimizations

We all recognize that cloud computing is different.

It may be that cloud computing breaks some of our algorithms.

A colleague of mine, a long time ago, shared a story about programming early IBM mainframes. They used assembly language, because code written in assembly executed faster than code written in COBOL. (And for business applications on IBM mainframes, at the time, those were the only two options.)

Not only did they write in assembly language, they wrote code to be fast. That is, they "optimized" the code. One of the optimizations was with the "multiply" instruction.

The multiply instruction does what you think: it multiplies two numbers and stores the result. To optimize it, they wrote the code to place the larger of the two values in one register and the smaller of the two values in the other register. The multiply instruction was implemented as a "repeated addition" operation, so the second register was really a count of the number of addition operations that would be performed. By storing the smaller number in the second register, programmers reduced the number of "add" operations and improved performance.

(Technically inclined folks may balk at the notion of reducing a multiply operation to repeated additions, and observe that it works for integer values but not floating-point values. The technique was valid on early IBM equipment, because the numeric values were either integers or fixed-point values, not floating-point values.)
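The trick can be sketched like this (in Python rather than IBM assembly, and with add-counting instrumentation invented for this illustration):

```python
def multiply_by_addition(value, count):
    """Multiply two non-negative integers by repeated addition.
    Returns the product and the number of add operations used."""
    total = 0
    adds = 0
    for _ in range(count):
        total += value
        adds += 1
    return total, adds

# The product is the same either way, but operand order changes the work:
p1, adds1 = multiply_by_addition(1000, 3)   # smaller value as the count
p2, adds2 = multiply_by_addition(3, 1000)   # larger value as the count
assert p1 == p2 == 3000
assert adds1 == 3 and adds2 == 1000         # 3 adds versus 1000 adds
```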

It was an optimization that was useful at the time, when computers were relatively slow and relatively expensive. Today's faster, cheaper computers can perform multiplication quite quickly, and we don't need to optimize it.

Over time, changes in technology make certain optimizations obsolete.

Which brings us to cloud computing.

Cloud computing is a change in technology. It makes available a variable number of processors.

Certain problems have a large number of possible outcomes, with only certain outcomes considered good. The problems could describe the travels of a salesman, or the number of items in a sack, or playing a game of checkers. We have algorithms to solve specific configurations of these problems.

One algorithm is the brute-force, search-every-possibility method, which does just what you think. While it is guaranteed to find an optimal solution, there are sometimes so many solutions (millions upon millions, or billions, or quintillions) that this method is impractical.

Faced with an impractical algorithm, we invent others. Many are iterative algorithms which start with a set of conditions and then move closer and closer to a solution by making adjustments to the starting conditions. Other algorithms discard certain possibilities ("pruning") which are known to be no better than current solutions. Both techniques reduce the number of tested possibilities and therefore reduce the time to find a solution.

But observe: The improved algorithms assume a set of sequential operations. They are designed for a single computer (or a single person), and they are designed to minimize time.

With cloud computing, we no longer have a single processor. We have multiple processors, each operating in parallel. Algorithms designed to optimize for time on a single processor may not be suitable for cloud computing.

Instead of using one processor to iteratively find a solution, it may be possible to harness thousands (millions?) of cloud-based processors, each working on a distinct configuration. Instead of examining solutions in sequence, we can examine solutions in parallel. The result may be a faster solution to the problem, in terms of "wall time" -- the time we humans are waiting for the solution.
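A toy sketch of the idea (Python, with a thread pool standing in for a fleet of cloud processors; the cost function and candidate grid are invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def score(candidate):
    """Evaluate one configuration; the cost function here is invented."""
    x, y = candidate
    return ((x - 3) ** 2 + (y - 5) ** 2, candidate)

def best_solution(candidates, workers=4):
    # Each worker scores a distinct subset of the configurations.
    # Wall time shrinks as workers are added; the total work does not.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(score, candidates))
    return min(results)

# Examine all 100 configurations in parallel, then compare the results.
candidates = [(x, y) for x in range(10) for y in range(10)]
cost, winner = best_solution(candidates)
```

Note that the final min() over all results is exactly the "comparing the many results" cost mentioned above; it is sequential, and it grows with the number of configurations.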

I recognize that this approach has its costs. Cloud computing is not free, in terms of money or in terms of computing time. Money aside, there is a cost in creating the multiple configurations, sending them to their respective cloud processors, and then comparing the many results. That time is a cost, and it must be included in our evaluation.

None of these ideas are new to the folks who have been working with parallel processing. There are studies, papers, and ideas, most of which have been ignored by mainstream (sequential) computing.

Cloud computing will lead, I believe, to the re-evaluation of many of our algorithms. We may find that many of them have a built-in bias for single-processor operation. The work done in parallel computing will be pertinent to cloud computing.

Cloud computing is a very different form of computing. We're still learning about it. The application of concepts from parallel processing is one aspect of it. I won't be surprised if there are more. There may be all sorts of surprises ahead of us.

Tuesday, September 17, 2013

When programming, think like a computer

When programming, it is best to think like a computer. It is tempting to think like a human. But humans think very differently than computers (if we allow that computers think), and thinking like a human leads to complex programs.

This was brought home to me while reading William Conley's "Computer Optimization Techniques" which discusses the solutions to Integer Programming problems and related problems. Many of these problems can be solved with brute-force calculations, evaluating every possible solution and identifying the most profitable (or least expensive).

The programs for these brute-force methods are short and simple. Even in FORTRAN, they run less than fifty lines. Their brevity is due to their simplicity. There is no clever coding, no attempt to optimize the algorithm. The programs take advantage of the computer's strength of fast computation.
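A brute-force program of the kind described might look like this (sketched in Python rather than FORTRAN; the items, profits, and capacity are invented for illustration, in the style of a knapsack problem):

```python
from itertools import product

# Items as (profit, weight) pairs; values invented for illustration.
items = [(10, 5), (40, 4), (30, 6), (50, 3)]
capacity = 10

best_profit, best_choice = 0, ()
# Evaluate every possible subset -- no cleverness, just fast computation.
for choice in product((0, 1), repeat=len(items)):
    profit = sum(p for (p, w), c in zip(items, choice) if c)
    weight = sum(w for (p, w), c in zip(items, choice) if c)
    if weight <= capacity and profit > best_profit:
        best_profit, best_choice = profit, choice
```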

Humans think very differently. They tire quickly of routine calculations. They can identify patterns and have insights into shortcuts for algorithms. They can take creative leaps to solutions. These are all survival skills, useful for dealing with an uncertain environment and capable predators. But they are quite difficult to encode into a computer program. So hard that it is often more efficient to use brute-force calculations without insights and creative leaps. The time spent making the program "smart" is larger than the time saved by the improved program.

Brute-force is not always the best method for calculations. Sometimes you need a smart program, because the number of computations is staggering. In those cases, it is better to invest the time in improvements. (To his credit, Conley shows techniques to reduce the computations, sometimes by increasing the complexity of the code.)

Computing efficiency (that is, "smart" programs) has been a concern since the first computing machines were made. Efficiency was necessary at first, but the need for it has dropped over time. Mainframe computers became faster, which allowed for "sloppy" programs ("sloppy" meaning "anything less than maximum efficiency").

Minicomputers were slower than mainframes, significantly less expensive, and another step away from the need for optimized, "smart" programs. PCs were another step. Today, smart phones have more computing power than PCs of a few years ago, at a fraction of the price. Cloud computing, a separate branch in the evolution of computing, offers cheap, readily-available computing power.

I won't claim that computing power is (or will ever be) "too cheap to meter". But it is cheap, and it is plentiful. And with cheap and plentiful computing power, we can build programs that use simple methods.

When writing a computer program, think like a computer. Start with a simple algorithm, one that is not clever. Chances are, it will be good enough.

Saturday, January 21, 2012

Premature optimization?

It has been said that premature optimization is the root of all evil. (I read this as a loose translation of "optimizing too early can cause you grief", not as "it makes you a bad person".)

We often optimize our programs and systems. We admire system designers who can build systems that work smoothly and with minimal resources -- that is, systems that are optimized.

But what are optimizations? Most of the time, they are not pure optimizations (which use the smallest amount of system resources) but trade-offs (which use one resource in lieu of another). A simple trade-off optimization is caching: using memory (a cheap resource) to avoid database lookups (an expensive operation).
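A minimal sketch of that trade-off (Python; the slow_lookup function is an invented stand-in for a database query):

```python
cache = {}

def slow_lookup(key):
    # Stand-in for an expensive operation, such as a database query.
    return key * 2

def cached_lookup(key):
    # Spend memory (the cache dictionary) to avoid repeating the
    # expensive operation for keys we have already seen.
    if key not in cache:
        cache[key] = slow_lookup(key)
    return cache[key]

cached_lookup(21)   # miss: performs the slow lookup and stores the result
cached_lookup(21)   # hit: answered from memory
```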

Optimization, or the selection of a specific set of trade-offs, is a good thing as long as the underlying assumptions hold.

Let us consider a long-standing tool in the developer world: version control systems (VCSs). We have used these tools for forty years, starting with SCCS and moving through various generations (RCS, CVS, PVCS, SourceSafe, Subversion, to name a few).

Many version control systems store revisions to files not as whole files but as 'deltas', the changes from one version to another. This decision is a trade-off: spending computation (generating the change list) to reduce disk usage. (The list of differences is often smaller than the revised file.)
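The size difference is easy to demonstrate with a standard diff library (Python's difflib here, on an invented 100-line file):

```python
import difflib

# An invented 100-line "file" and a revision that changes one line.
old = ["line %d\n" % i for i in range(1, 101)]
new = list(old)
new[49] = "line 50, revised\n"

delta = list(difflib.unified_diff(old, new))

# The delta (a few context lines plus the change) is far smaller
# than the full revised file.
delta_size = sum(len(line) for line in delta)
full_size = sum(len(line) for line in new)
assert delta_size < full_size
```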

This trade-off relied on several assumptions:

  • The files stored in the VCS would be text files
  • The changes from one version to another would be a small fraction of the file
  • Disk space was expensive (compared to the user's time)

It turns out that, some forty years later, these assumptions do not always hold. We are using version control systems for more than source code, including files that are not text. (Non-text files are handled poorly by the 'delta' calculation logic, and most VCSs simply give up and store the entire file.) User time is expensive (and getting more so), and disk space is cheap (and also getting more so).

The trade-offs made by version control systems are now working against us. We grumble while our systems generate deltas. We care little that the Microsoft Word document files are stored in their entirety.

The latest version control systems ('git' is an example) do away with the notion of deltas. They store the entire file, with various techniques to compress the file and to de-duplicate data. (We still care about disk usage.)

The notion of storing revisions as deltas was an optimization. It is a notion that we are now moving away from. Was it a premature optimization? Was it a trade-off that we made in error? Is it an example of "the root of all evil"?

I think that the answer is no. At the time, with the technology that we had, using deltas was a good trade-off. It reduced our use of resources, and one can justify the claim of optimization. And most importantly, it worked for a long period of time.

An optimization becomes "bad" when the underlying assumptions fail. At that point, the system is "upside down", or de-optimized. (Some might say "pessimized".) When that happens, we want to re-design the system to use a better technique (and thus reduce our use of resources). The cost of that change is part of the equation, and must be tallied. A long-running optimization with a low cost of change is good; a short-lived optimization (especially one with a high 'fix' cost at the end) is bad.

Optimizations are like leased cars. You can get by for a period of time with lower payments, but in the end you must turn in the car (or purchase it). Knowing the length of the lease and the tail-end costs is important in your decision. Optimizing without knowing the costs, in my mind, is the root of all evil.