
Tuesday, March 22, 2022

Apple has a software problem

Apple has a problem. Specifically, Apple has a software problem. More specifically, Apple has a problem between its hardware and its software. The problem is that Apple's hardware is getting bigger and more powerful at a much faster rate than its software.

Why is better hardware a problem? By itself, better hardware isn't a problem. But hardware doesn't exist by itself -- we use hardware with software. Faster, more powerful hardware requires bigger, more capable software.

One might think that fast hardware is a good thing, regardless of software. Faster hardware runs software faster, right? And faster is always a good thing, right? Well, not always.

For software that operates on a large set of data, such that the user is waiting for the result, yes, faster hardware is better. But faster hardware is only better if the user is waiting. Software that operates in "batch mode" or with limited interaction with the user is improved with better hardware. But software that interacts with the user, software that must wait for the user to do something, isn't necessarily improved.

Consider the word processor, a venerable tool that has been with us since the introduction of the personal computer. Word processors spend most of their time waiting for the user to press a key. This was true even with computers from prior to the IBM PC. In the forty-odd years since, hardware has gotten much, much faster but word processors have not gotten much more complicated. (There was a significant increase in complexity when we shifted from DOS and its character-based display and printing to Windows and graphics-based display and printing, but very little otherwise.)

The user experience for word processors has changed little in that time. Faster hardware has not improved the experience. If we limited computers to word processors, there would be no need for a processor more powerful than the Intel 80386. By extension, if you needed a computer for word processing (and nothing else) today's bottom-of-the-line, cheap, minimal computer would be more than enough for you. There is no point in spending on a premium computer (or even a mediocre one) because the minimal computer can do the job adequately.

The same logic applies to spreadsheets. And e-mail. And web browsers. Computers have gotten faster more quickly than these programs and their data have grown.

The computers I use for my day-to-day and even unusual tasks are old PCs, ranging from five to twenty years in age. All of them are fast enough for what I need to do. An aged Dell Inspiron N5010 runs Windows 10 and lets me use Zoom and Teams to join virtual meetings. I could replace it with a modern laptop, but the experience would be the same! Why should I bother?

A premium computer is needed only for those tasks that perform complex operations on large sets of data. And this is where Apple fails to provide the tools to justify its powerful (and expensive) hardware.

Apple is focused on hardware, and it does a terrific job of designing and manufacturing powerful computers. But software is another story. Apple develops applications and then seems to lose interest in them. It built Pages, Numbers, and Keynote, and has made precious few improvements -- other than recompiling for ARM processors, or adding support for things like AirPlay. It hasn't added features.

The same goes for applications such as GarageBand, iTunes, and FaceTime. Even Xcode.

Apple has even let utilities such as grep and sed in MacOS (excuse me, "mac os". Or is it "macos"?) age with no updates. The corresponding GNU utilities have been modified and improved in various ways, to the point that developers now recommend installing the GNU utilities on Apple computers.

Apple may be waiting for others to build the applications that will take advantage of the latest Mac computers. I'm not sure that many will want to do that.

Building applications to leverage the new Apple processors may seem a no-brainer. But there are a number of disincentives.

First, Apple may build their own version of the application, and compete with the vendor. Independent vendors may be reluctant to enter a market when Apple is a possible competitor.

Second, developing applications to take advantage of the M1 architecture requires a lot of time and effort. The application must be multithreaded -- single-threaded applications cannot fully leverage the multiple cores on the M1. Designing, coding, and testing such an application is a lot of work.

Third, the market is limited. Applications developed to take advantage of the M1 processor are, well, developed for the M1 processor. They won't run on Windows PCs. You can't cross-build to run on Windows PCs, because PC processors are much slower than the Apple processors. The set of potential customers is limited to those who have the upper-end Apple computers. That's a small market (compared to Windows) and the potential revenue may not cover the development costs.

That is an intimidating set of disincentives.

So if Apple isn't building applications to leverage the upper-end Macs, and third-party developers aren't building applications to leverage upper-end Macs, then ... no one is building them! The result is that there are no applications (or very few) to take advantage of the higher-end processors.

I expect sales of the M1-based Macs to be robust, for a short time. But as people realize that their experience has not improved (except, perhaps for "buttery smooth" graphics) they will hesitate for the next round of new Macs (and iPads, and iPhones). As customers weigh the benefits and costs of new hardware, some will decide that their current hardware is good enough. (Just as my 15-year-old PCs are good enough for my simple word processing and spreadsheet needs.) If Apple introduces newer systems with even faster processors, more people will look at the price tags and decide to wait before upgrading.

Apple is set up to learn an important lesson: High-performance hardware is not enough. One needs the software to offer solutions to customers. Apple must work on its software.


Monday, January 3, 2022

The biggest gains of Apple's M processors are behind us

Improvements in hardware are not linear. If we look at the performance of hardware over time, we can see that performance improvements follow a pattern: a sharp rise in performance followed by a period of little improvements. (A graph looks like a staircase, with the pattern of "rise, flat, rise, flat".)

The first implementation of a change provides significant increases. Over time, we refine the improvements and gain additional increases. But those later increases are smaller. Eventually, subsequent refinements provide minimal improvements. So we move on to other ideas.

The history of personal computers follows this pattern. We have made a number of changes to hardware that have improved performance. Each of those changes yielded a large initial gain, and then gradually diminishing improvements.

We have increased the clock speed.

We changed memory technology from "core" memory (ferrous rings) to transistor-based memory (static at first, and then dynamic memory).

We have added caches, to store values in the processor, reducing the dependence on memory. We liked the idea of caches so much that we did it more than once. Processors now have three levels of caches.

We have off-loaded work to (smarter) devices. Devices now have their own processors and can perform tasks independently of the CPU.

We have increased the number of CPU cores, which improves performance for systems with multiple processes and multiple threads. (Which is just about every system we have today.)

Each of these changes improved performance. A large step up at first, and then smaller increases.

Now Apple has used another method to improve performance: Reduce distance between chips with system-on-a-chip designs. The M1 chips include all components of the computer: CPU, memory, storage, GPU, and more.

Yet the overall pattern of improvements will hold with this new design. The first M1 chip will have significant improvements over the older design of discrete components. The M1 Pro and M1 Max will have improvements over the M1 chip, but not larger than the initial M1 gains.

Later chips, such as the M2, M2 Pro, and even the M3, will have gains, but smaller and smaller ones (in terms of percentages) than the previous chips. The performance curve, after a sharp rise with the M1 chip, will flatten. Apple will have entered the "plateau of modest gains" phase.

Apple's M1 chips are nice. They provide good performance. Newer versions will be better: faster, more powerful. But the biggest increases, I think, are already behind us.

Sunday, November 21, 2021

CPU power and developer productivity

Some companies have issued M1-based Macbooks to developers. Their reason? To improve productivity.

Replacing "regular" PCs with M1-based Macbooks is nice, and it certainly provides developers with more CPU power, but does it really increase the productivity of developers?

In a very simple way, yes, it does. The more powerful Macbooks will let developers compile faster, build deployment packages faster, and run tests faster.

But the greater leaps in developer productivity have not come from performing steps faster.

Increases in developer productivity come not from raw CPU power, or more memory, or faster network connections, or higher-resolution displays. Meaningful increases come from better tools, which in turn are built on CPU power, memory, network connections, and displays.

The tools that have helped developers become productive are many. They include programming languages, compilers and interpreters, editors, debuggers, version control systems, automated test systems, and communication systems (such as e-mail or chat or streamed messages like Slack).

We build better tools when we have the computing power to support them. In the early days of computing, prior to the invention of the IDE (integrated development environment), the steps of editing and compiling were distinct and handled by separate programs.

Early editors could not hold the entire text of a long program in memory at once, so they had special commands to "page" through the file. (One opened the editor and started with the first "page" of text, made changes and got things right, and then moved to the next "page". It sounds like the common "page down" operation in modern editors, except that a "page" in the old editors was longer than the screen, and -- note this -- there was no "page up" operation. Paging was a one-way street. If you wanted to go back, you had to page to the end of the file, close the file, and then run the editor again.)

Increased memory ended the need for "page" operations.

The first integrated environment may have been the UCSD P-System, which offered editing, compiling, and running of programs. The availability of 64K of memory made this system possible. Unfortunately, the CPUs at the time could not support the virtual processor (much like the later JVM for Java) and the p-System never achieved popularity.

The IBM PC, with a faster processor and more memory, made the IDE practical. In addition to editing and compiling, there were add-ons for debugging.

The 80386 processor, along with network cards and cables, made Windows practical, which made network applications possible. That platform allowed for e-mail (at least within an organization) and that gave a boost to lots of employees, developers included. Networks and shared file servers also allowed for repositories of source code and version control systems. Version control systems were (and still are) an immense aid to programming.

Increases in computing power (CPU, memory, network, or whatever) let us build tools to be more effective as developers. An increase of raw power, by itself, is nice, but the payoff is in the better tools.

What will happen as a result of Apple's success with its M1 systems?

First, people will adopt the new, faster M1-based computers. This is already happening.

Second, competitors will adopt the system-on-a-chip design. I'm confident that Microsoft is already working on a design (probably multiple designs) for its Surface computers, and as a reference for other manufacturers. Google is probably working on designs for Chromebooks.

Third, once the system-on-a-chip design has been accepted in the Windows environment, people will develop new tools to assist programmers. (It won't be Apple, and I think it won't be Google either.)

Whatever the source, what kinds of tools can we expect? That's an interesting question, and the answer is not obvious. But let us consider a few things:

First, the increase in power is in CPU and GPU capacity. We're not seeing an increase in memory or storage, or in network capacity. We can assume that innovations will be built on processing, not communication.

Second, innovations will probably help developers in their day-to-day jobs. Developers perform many tasks. Which are the tasks that need help? (The answer is probably not faster compiles.)

I have a few ideas that might help programmers.

One idea is to analyze the run-time performance of programs and identify "hot spots" -- areas in the code that take a lot of time. We already have tools to do this, but they are difficult to use and run the system in a slow mode, such that a simple execution can take minutes instead of seconds. (A more complex task can run for an hour under "analyze mode".) A faster CPU can help with this analysis.

Another idea is for debugging. The typical debugger allows a programmer to step through the code one line at a time, and to set breakpoints and run quickly to those points in the code. What most debuggers don't allow is the ability to go backwards. Often, a programmer stepping through the code gets to a point that is known to be a problem, and needs to identify the steps that got the program to that point. The ability to "run in reverse" would let a programmer "back up" to an earlier point, where he (or she) could see the decisions and the data that led to the problem point. Computationally, this is a difficult task, but we're looking at a significant increase in computational power, so why not?

A third possibility is the analysis of source code. Modern IDEs perform some analysis, often marking code that is syntactically incorrect. With additional CPU power, could we identify code that is open to other errors? We have tools to identify SQL injection attacks, and memory errors, and other poor practices. These tools could be added to the IDE as a standard feature.

In the longer term, we may see new programming languages emerge. Just as Java (and later, C#) took advantage of faster CPUs to execute byte-code (the UCSD p-System was a good idea, merely too early) new programming languages may do more with code. Perhaps we will see a shift to interpreted languages (Python and Ruby, or their successors). Or maybe we will see a combination of compiled and interpreted code, with some code compiled and other code interpreted at run-time.

More powerful computers let us do things faster, but more importantly, they let us do things differently. They expand the world of computation. Let's see how we use the new world given to us with system-on-a-chip designs.

Thursday, June 11, 2020

The computer of Linus Torvalds

My experience as a developer ranges from solo artist to member of large, enterprise projects. That experience has given me various insights about hardware, operating systems, programming languages, teamwork, and management.

One observation is about a combination of those aspects, specifically hardware and development teams: The minimum hardware requirements for a system are (most likely) the hardware that the developers are using. If you equip developers with top-of-the-line hardware, the system when delivered will require top-of-the-line hardware to run acceptably. As a corollary, if you equip developers with mid-line hardware, the delivered system will run acceptably on that level of hardware.

Developers may often complain about slow hardware, and point out that top-level hardware is not that expensive, and may actually reduce expenses once you factor in the time to pay developers to wait for slow compiles and tests. That is a valid point, but it loses sight of the larger point of a system that performs for a user with hardware that is less than top-of-the-line.

With fast hardware, developers do not see the performance problems. With slower hardware, developers are aware of performance issues, and build a better system. (Or at least one that runs faster.)

Which brings us to Linus Torvalds, the chief developer for the Linux kernel. More specifically, his computer.

A recent article on slashdot lists the specifications of his new computer. It sounds really nice. Fast. Powerful. And just what will lead Torvalds (and Linux) into the "performance trap". Such a computer will hide performance issues from Linus. That may send Linux into a direction that lets it run well on high-end hardware, and not so well on lower-end hardware or older systems.

With a high-end system to run and test on, Torvalds will miss the feedback when some changes have negative effects on performance on slower hardware. Those changes may work "just fine" on his computer, but not so well on other computers.

I recognize that the development effort of the Linux kernel has a lot of contributors, not all of whom have top-level hardware. Those developers may see performance issues. They may even raise them. But do they have a voice? Will their concerns be heard, and addressed? Or will Torvalds reject the issues as complaints and arrogantly tell those developers to get "real computers" and stop whining? His reputation suggests the latter.

If Torvalds does fall into the "performance trap" it may have significant effects on the future success of Linux. Linux may become "tuned" to high-performance hardware, running acceptably on expensive systems but slow and laggy on cheaper hardware. It may run well on new equipment but poorly on older systems.

That, in turn, may force users of older, slower hardware to re-think their decision to use Linux.


Friday, December 8, 2017

The cult of fastest

In IT, we (well, some of us) are obsessed with speed. The speed-cravers seek the fastest hardware, the fastest software, and the fastest network connections. They have been with us since the days of the IBM PC AT, which ran at 6 MHz, faster than the 4.77 MHz of the IBM PC (and XT).

Now we see speed competition among browsers. First Firefox claims their browser is fastest. Then Google releases a new version of Chrome, and claims that it is the fastest. At some point, Microsoft will claim that their Edge browser is the fastest.

It is one thing to improve performance. When faced with a long-running job, we want the computer to be faster. That makes sense; we get results quicker and we can take actions faster. Sometimes it is reasonable to go to great lengths to improve performance.

I once had a job that compared source files for duplicate code. With thousands of source files, and the need to compare each file against each other file, there were about 1,000,000 comparisons. Each comparison took about a minute, so the total job was projected to run for 1,000,000 minutes -- or about 2 years! I revised the job significantly, using a simpler (and faster) comparison to identify if two files had any common lines of code, and then using the more detailed (and slower) comparison on only those pairs with over 1,000 lines of common code.

Looking for faster processing in that case made sense.

But it is another thing to look for faster processing by itself.

Consider a word processor. Microsoft Word has been around for decades. (It actually started its life in MS-DOS.) Word was designed for systems with much smaller memory and much slower processors, and it still has some of that design. The code for Word is efficient. It spends most of its time not in processing words but in waiting for the user to type a key or click the mouse. Making the code twice as fast would not improve its performance (much), because the slowness comes from the user.

E-mail is another example. Most of the time for e-mail is, like Word, the computer waiting for the user to type something. When an e-mail is sent, the e-mail is passed from one e-mail server to another until it arrives at the assigned destination. Changing the servers would let the e-mail arrive quicker, but it doesn't help with the composition. The acts of writing and reading the e-mail are based on the human brain and physiology; faster processors won't help.

The pursuit of faster processing without definite benefits is, ironically, a waste of time.

Instead of blindly seeking faster hardware and software, we should think about what we want. We should identify the performance improvements that will benefit us. (For managers, this means lower cost or less time to obtain business results.)

Once we insist on benefits for improved performance, we find a new concept: the idea of "fast enough". When an improvement lets us meet a goal (a goal more specific than "go faster"), we can justify the effort or expense for faster performance. But once we meet that goal, we stop.

This is a useful tool. It allows us to eliminate effort and focus on changes that will help us. If we decide that our internet service is fast enough, then we can look at other things such as database and compilers. If we decide that our systems are fast enough, then we can look at security.

Which is not to say that we should simply declare our systems "fast enough" and ignore them. The decision should be well-considered, especially in the light of our competitors and their capabilities. The conditions that let us rate our systems as "fast enough" today may not hold in the future, so a periodic review is prudent.

We shouldn't ignore opportunities to improve performance. But we shouldn't spend all of our effort for them and avoid other things. We shouldn't pick a solution because it is the fastest. A solution that is "fast enough" is, at the end of the day, fast enough.

Sunday, October 29, 2017

We have a problem

The Rust programming language has a problem.

The problem is one of compactness, or the lack thereof. This problem was brought to my attention by a blog post about the Unix 'yes' program.

In short, Rust requires a lot of code to handle a very simple task.

The simple task, in this case, is the "yes" program from Unix. This program feeds the string "y\n" ('y' with newline) to output as many times as possible.

Here's the program in C:
#include <stdio.h>

int main(int argc, char **argv)
{
  for (;;)
    printf("%s\n", argc > 1 ? argv[1] : "y");
}
And here is an attempt in Rust:
use std::env;

fn main() {
  let expletive = env::args().nth(1).unwrap_or("y".into());
  loop {
    println!("{}", expletive);
  }
}
The Rust version is quite slow compared to the C version, so the author and others made some "improvements" to Make It Go Fast:
use std::env;
use std::io::{self, Write};
use std::process;
use std::borrow::Cow;

use std::ffi::OsString;
pub const BUFFER_CAPACITY: usize = 64 * 1024;

pub fn to_bytes(os_str: OsString) -> Vec<u8> {
  use std::os::unix::ffi::OsStringExt;
  os_str.into_vec()
}

fn fill_up_buffer<'a>(buffer: &'a mut [u8], output: &'a [u8]) -> &'a [u8] {
  if output.len() > buffer.len() / 2 {
    return output;
  }

  let mut buffer_size = output.len();
  buffer[..buffer_size].clone_from_slice(output);

  while buffer_size < buffer.len() / 2 {
    let (left, right) = buffer.split_at_mut(buffer_size);
    right[..buffer_size].clone_from_slice(left);
    buffer_size *= 2;
  }

  &buffer[..buffer_size]
}

fn write(output: &[u8]) {
  let stdout = io::stdout();
  let mut locked = stdout.lock();
  let mut buffer = [0u8; BUFFER_CAPACITY];

  let filled = fill_up_buffer(&mut buffer, output);
  while locked.write_all(filled).is_ok() {}
}

fn main() {
  write(&env::args_os().nth(1).map(to_bytes).map_or(
    Cow::Borrowed(
      &b"y\n"[..],
    ),
    |mut arg| {
      arg.push(b'\n');
      Cow::Owned(arg)
    },
  ));
  process::exit(1);
}
Now, that's a lot of code. Really a lot. For a simple task.

To be fair, the author mentions that the GNU version of 'yes' weighs in at 128 lines, more than twice this monstrosity in Rust. But another blogger posted this code which improves performance:
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LEN 2
#define TOTAL 8192
int main() {
    char yes[LEN] = {'y', '\n'};
    char *buf = malloc(TOTAL);
    int bufused = 0;
    while (bufused < TOTAL) {
        memcpy(buf+bufused, yes, LEN);
        bufused += LEN;
    }
    while(write(1, buf, TOTAL));
    return 1;
}

Programming languages should be saving us work. The high-performance solution in Rust is long, way too long, for such simple operations.

We have a problem. It may be in our programming languages. It may be in run-time libraries. It may be in the operating systems and their APIs. It may be in the hardware architecture. It may be a combination of several.

But a problem we have.

Wednesday, October 4, 2017

Performance, missing and found

One of the constants in technology has been the improvement of performance. More powerful processors, faster memory, larger capacity in physically smaller disks, and faster communications have been the results of better technology.

This increase in performance is mostly mythological. We are told that our processors are more powerful, we are told that memory and network connections are faster. Yet what is our experience? What are the empirical results?

For me, word processors and spreadsheets run just as fast as they did decades ago. Operating systems load just as fast -- or just as slow.

Linux on my 2006 Apple MacBook loads slower than 1980s-vintage systems with eight-bit processors and floppy disk drives. Windows loads quickly, sort of. It displays a log-on screen and lets me enter a name and password, but then it takes at least five minutes (and sometimes an hour) updating various things.

Compilers and IDEs suffer the same fate. Each new version of Visual Studio takes longer to load. Eclipse is no escape -- it has always required a short eternity to load and edit a file. Slow performance is not limited to loading; compilation times have improved but only slightly, and not by the orders of magnitude to match the advertised improvements in hardware.

Where are the improvements? Where is the blazing speed that our hardware manufacturers promise?

I recently found that "missing" performance. It was noted in an article on the longevity of the C language, of all things. The author clearly and succinctly describes C and its place in the world. On the way, he describes the performance of one of his own C programs:
"In 1987 this code took around half an hour to run, today 0.03 seconds."
And there it is. A description of the performance improvements we should see in our systems.

The performance improvements we expect from better hardware have gone into software.

We have "invested" that performance in our operating systems, our programming languages, and user interfaces. Instead of taking all the improvements for reduced running times, we have diverted performance to new languages and to "improvements" in older languages. We invested in STL over plain old C++, Java over C++ (with or without STL), Python over Java and C#.

Why not? It's better to prevent mistakes than to have fast-running programs that crash or -- worse -- don't crash but provide incorrect results. Our processors are faster, and our programming languages do more for us. Boot times, load times, and compile times may be about the same as from decades ago, but errors are far fewer, more easily detected, and much less dangerous.

Yes, there are still defects which can be exploited to hack into systems. We have not achieved perfection.

Our systems are much better with operating systems and programming languages that do the checking that they now do, and businesses and individuals can rely on computers to get the job done.

That's worth some CPU cycles.

Sunday, August 13, 2017

Make it go faster

I've worked on a number of projects, and a (not insignificant) number of them had a requirement (or, more accurately, a request) to improve the performance of an existing system.

A computerized system consists of hardware and software. The software is a set of instructions that perform computations, and the hardware executes those instructions.

The most common method of improving performance is to use a faster computer. Leave the software unchanged, and run it on a faster processor. (Or if the system performs a lot of I/O, a faster disk, possibly an SSD instead of a spinning hard drive.) This method is simple, with no changes to the software and therefore low risk. It is the first method of reducing run time: perform computations faster.

Another traditional method is to change your algorithm. Some algorithms are faster than others, often by using more memory. This method has higher risk, as it changes the software.

Today's technology sees cloud computing as a way to reduce computing time. If your calculations are partitionable (that is, subsets can be computed independently) then you can break a large set of computations into a group of smaller computations, assign each smaller set to its own processor, and compute them in parallel. Effectively, this is computing faster, provided that the gain in parallel processing outweighs the cost of partitioning your data and combining the multiple results.

One overlooked method is using a different compiler. (I'm assuming that you're using a compiled language such as C or C++. If you are using Python or Ruby, or even Java, you may want to change languages.) Switching from one compiler to another can make a difference in performance. The code emitted by one compiler may be fine-tuned to a specific processor; the code from another compiler may be generic and intended for all processors in a family.

Switching from one processor to another may improve performance. Often, such a change requires a different compiler, so you are changing two things, not one. But a different processor may perform the computations faster.

Fred Brooks has written about essential complexity and accidental complexity. Essential complexity is necessary and unavoidable; accidental complexity can be removed from the task. There may be an equivalent in computations, with essential computations that are unavoidable and accidental computations that can be removed (or reduced). But removing accidental complexity is merely reducing the number of computations.

To improve performance you can either perform the computations faster or you can reduce the number of computations. That's the list. You can use a faster processor, you can change your algorithms, you can change from one processor and compiler to a different processor and compiler. But there is an essential number of computations, in a specific sequence. You cannot reduce the work below that limit.

Sunday, December 11, 2011

Tradeoffs

It used to be that we had to write small, fast programs. Processors were slow, storage media (punch cards, tape drives, disc drives) were even slower, and memory was limited. In such a world, programmers were rewarded for tight code, and DP managers were rewarded for maintaining systems at utilization rates of ninety to ninety-five percent of machine capacity. The reason was that a higher rate meant that you needed more equipment, and a lower rate meant that you had purchased (or more likely, leased) too much equipment.

In that world, programmers had to make tradeoffs when creating systems. Readable code might not be fast, and fast code might not be readable (and often the two were true). Fast code won out over readable (slower) code. Small code that squeezed the most out of the hardware won out over readable (less efficient) code. The tradeoffs were reasonable.

The world has changed. Computers have become more powerful. Networks are faster and more reliable. Databases are faster, and we have multiple choices of database designs -- not everything is a flat file or a set of related tables. Equipment is cheap, almost commodities.

This change means that the focus of costs now shifts. Equipment is not the big cost item. CPU time is not the big cost item. Telecommunications is not the big cost item.

The big problem of application development, the big expense that concerns managers, the thing that will get attention, will be maintenance: the time and cost to modify or enhance an existing system.

The biggest factor in maintenance costs, in my mind, is the readability of the code. Readable code is easy to change (possibly). Opaque code is impossible to change (certainly).

Some folks look to documentation, such as design or architecture documents. I put little value in documentation; I have always found the code to be the final and most accurate description of the system. Documents suffer from aging: they were correct at some point, but the system has since been modified. Documents suffer from imprecision: they specify some but not all of the details. Documents suffer from inaccuracy: they specify what the author thought the system was doing, not what the system actually does.

Sometimes documentation can be useful. The business requirements of a system can be useful. But I find "System architecture" and "Design overview" documents useless.

If the code is to be the documentation for itself, then it must be readable.
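As a hypothetical illustration (the function names are mine, invented for this sketch): two versions of the same computation, one terse and one intention-revealing. Both produce identical results; only one documents itself.

```python
# Two versions of the same computation: the average of the positive
# values in a list. Both pass the same tests; only one reads well.

def f(a):
    # opaque: terse names, everything jammed into one clever line
    return sum(x for x in a if x > 0) / max(1, len([x for x in a if x > 0]))

def average_of_positives(values):
    # readable: descriptive names, one idea per step
    positives = [v for v in values if v > 0]
    if not positives:
        return 0
    return sum(positives) / len(positives)
```

A maintainer changing the second version can see what it does; a maintainer changing the first must first reverse-engineer it.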

Readability is a slippery concept. Different programmers have different ideas about "readability". What is readable to me may not be readable to you. Over my career, my ideas of readability have changed, as I learned new programming techniques (structured programming, object-oriented programming, functional programming), and even as I learned more about a language (my current ideas of "readable" C++ code are very different from my early ideas of "readable" C++ code).

I won't define readability. I will let each project decide on a meaningful definition of readability. I will list a few ideas that will let teams improve the readability of their code (however they define it).

Version control for source code: A shop that is not using version control is not serious about software development. There are several reliable, well-documented, well-supported, and popular systems for version control. Version control lets multiple team members work together and coordinate their changes.

Automated builds: An automated build lets you build the system reliably, consistently, and with little effort. You want the product for the customer to be built with a reliable and consistent method.

Any developer can build the system: Developers need to build the system to run their tests. They need a reliable, consistent, low-effort method to do that. And it has to work with their development environment, allowing them to change code and debug the system.

Automated testing: Like version control, automated testing is necessary for a modern shop. You want to test the product before you send it to your customers, and you want the testing to be consistent and reliable. (You also want it to be easy to run.)

Any developer can test the system: Developers need to know that their changes affect only the behaviors that they intend, and no other parts of the system. They need to use the tests to ensure that their changes have no unintended side-effects. Low-effort automated tests let them run the tests often.

Acceptance of refactoring: To improve code, complicated classes and modules must be changed into sets of smaller, simpler classes and modules. Refactoring changes the code without changing its external behavior. If I start with a system that passes its tests (automated tests, right?) and I refactor it, it should pass the same tests. When I can rearrange code without changing the behavior, I can make the code more readable.

Incentives for developers to use all of the above: Any project that discourages developers from using automated builds or automated tests, either explicitly or implicitly, will see little or no improvement in readability.
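The refactoring idea above can be sketched in a few lines (a made-up example; the function names are mine): a tangled function is split into smaller, named pieces, and the same tests pass before and after the change.

```python
# Before: one function with formatting, arithmetic, and discount
# logic tangled together. After: the same behavior, split into
# smaller named pieces. The same tests pass for both versions.

def report_line_v1(name, qty, price):
    total = qty * price
    if total > 100:
        total = total * 0.9
    return name + ": " + str(qty) + " @ " + str(price) + " = " + str(round(total, 2))

def _discounted_total(qty, price):
    # extracted: the discount rule now has a name of its own
    total = qty * price
    return total * 0.9 if total > 100 else total

def report_line_v2(name, qty, price):
    # refactored: external behavior unchanged
    return f"{name}: {qty} @ {price} = {round(_discounted_total(qty, price), 2)}"

# the "automated tests" -- identical for both versions
for fn in (report_line_v1, report_line_v2):
    assert fn("widget", 2, 10) == "widget: 2 @ 10 = 20"
    assert fn("widget", 20, 10) == "widget: 20 @ 10 = 180.0"
```

Because the tests constrain the external behavior, the rearrangement is safe, and the result is easier to read and to change.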

But the biggest technique for readable code is that the organization -- its developers and managers -- must want readable code. If the organization is more concerned with "delivering a quality product" or "meeting the quarterly numbers", then they will trade off readability for those goals.


Wednesday, February 23, 2011

CPU time rides again!

A long time ago, when computers were large, hulking beasts (and I mean truly large, hulking beasts, the types that filled rooms), there was the notion of "CPU time". Not only was there "CPU time", but there was a cost associated with CPU usage. In dollars.

CPU time was expensive and computations were precious. So expensive and so precious, in fact, that early IBM programmers were taught, when performing a "multiply" operation, to load the larger number into one particular register and the smaller number into another. While the operations "3 times 5" and "5 times 3" yield the same result, the early processors did not treat them identically. The multiplication operation was a series of add operations: "3 times 5" was performed as five "add" operations, while "5 times 3" was performed as three "add" operations. The difference was two "add" operations. Not much, but the difference was larger for larger numbers, and repeated throughout a program, the total difference was significant. (That is, measurable in dollars.)
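The point can be made concrete with a toy model of that multiplication scheme (a sketch, not actual IBM microcode): multiply by repeated addition and count the adds. The operand order changes the cost, not the result.

```python
# Multiplication the way those early processors performed it:
# repeated addition. The multiplier determines how many "add"
# operations run, so operand order changes the cost, not the result.

def multiply_by_adds(multiplicand, multiplier):
    total, adds = 0, 0
    for _ in range(multiplier):
        total += multiplicand
        adds += 1
    return total, adds

print(multiply_by_adds(3, 5))  # (15, 5) -- "3 times 5": five adds
print(multiply_by_adds(5, 3))  # (15, 3) -- "5 times 3": three adds
```

Putting the larger number in the multiplicand register was a billable saving, two adds at a time.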

Advances in technology and the PC changed that mindset. Personal computers had no notion of "CPU time", in part because the hardware didn't support capturing it, but also because the user didn't care. People cared about getting the job done, not about minimizing CPU time and maximizing the number of jobs run. There was only one job the user (who was also the system administrator) cared about -- the program that they were running.

For the past thirty years, people have not known or cared about CPU usage and program efficiency. I should rephrase that to "people in the PC/DOS/Windows world". Folks in the web world have cared about performance and still care about performance. But let's focus on the PC folks.

The PC folks have had a free ride for the past three decades, not worrying about performance. Oh, a few folks have worried: developers from the "old world" who learned frugality and programmers with really large data processing needs. But the vast majority of PC users have gotten by with the attitude of "if the program is slow, buy a faster PC".

This attitude is in for a change. The cause of the change? Virtualization.

With virtualization, PCs cease to be stand-alone machines. They become an "image" running under a virtualization engine. (That engine could be Virtual PC, VMware, VirtualBox, Xen, or a few others. The engine doesn't matter; this issue applies to all of them.)

By shifting from a stand-alone machine to a job in a virtualization host, the PC becomes a job in a datacenter. It also becomes someone else's headache. The PC user is no longer the administrator. (Actually, the role of administrator in corporations shifted long ago, with Windows NT, domain controllers, centralized authentication, and group policies. Virtualization shifts the burden of CPU management to the central support team.)

The system administrators for virtualized PCs are true administrators, not PC owners who have the role thrust upon them. Real sysadmins pay attention to lots of performance indicators, including CPU usage, disk activity, and network activity. They pay attention because the operations cost money.

With virtual PCs, the processing occurs in the datacenter, and sysadmins will quickly spot the inefficient applications. The programs that consume lots of CPU and I/O will make themselves known, by standing out from the others.

Here's what I see happening:

- The shift to virtual PCs will continue, with today's PC users migrating to low-cost PCs and using Remote Desktop Connection (for Windows) and Virtual Network Computing (for Linux) to connect to virtualized hosts. Users will keep their current applications.

- Some applications will exhibit poor response through RDP and VNC. These will be the applications with poorly written GUI routines, programs that require the virtualization software to perform extra work to render them.

- Users will complain to the system administrators, who will tweak settings but in general be unable to fix the problem.

- Some applications will consume lots of CPU or I/O operations. System administrators will identify them and ask users to fix their applications. Users (for the most part) will have no clue about performance of their applications, either because they were written by someone else or because the user has no experience with performance programming.

- At this point, most folks (users and sysadmins) are frustrated with the changes enforced by management and the lack of fixes for performance issues. But folks will carry on.

- System administrators will provide reports on resource usage. Reports will be broken down by subunits within the organization, and show the cost of resources consumed by each subgroup.

- Some shops will introduce charge-back systems, to allocate usage charges to organization groups. The charged groups may ignore the charges at first, or consider them an uncontrollable cost of business. I expect pressure to reduce expenses will get managers looking at costs.

- Eventually, someone will observe that application Y performs well under virtualization (that is, more cheaply) while application X does not. Applications X and Y provide the same functions (say, word processing) and are mostly equivalent.

- Once the system administrators learn about the performance difference, they will push for the more efficient application. Armed with statistics and cost figures, they will be in a good position to advocate the adoption of application Y as an organization standard.

- User teams and managers will be willing to adopt the proposed application, to reduce their monthly charges.

And over time, the market will reward those applications that perform well under virtualization. Notice that this change occurs without marketing. It also forces the trade-off of features against performance, something that has been absent from the PC world.

Your job, if you are building applications, is to build the 'Y' version. You want an application that wins on performance. You do not want the 'X' version.

You have to measure your application and learn how to write programs that are efficient. You need the tools to measure your application's performance, environments in which to test, and the desire to run these tests and improve your application. You will have a new set of requirements for your application: performance requirements. All while meeting the same (unreduced) set of functional requirements.
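A minimal starting point for that measurement (a sketch using Python's standard library; the `measure` and `work` names are mine, and real profilers give far more detail):

```python
import time

# A minimal measurement harness: time a function over repeated runs
# and keep the best result. This is the "desire to run these tests"
# in its simplest possible form.

def measure(fn, *args, repeats=5):
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)   # best-of-N filters out scheduling noise

def work(n):
    # stand-in for the computation you actually care about
    return sum(i * i for i in range(n))

baseline = measure(work, 100_000)
print(f"best of 5: {baseline:.6f} seconds")
```

Record the baseline, change the code, and measure again; without the before-and-after numbers, "I made it faster" is just a guess.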

Remember, "3 times 5" is not the same as "5 times 3".