Wednesday, February 23, 2011

CPU time rides again!

A long time ago, when computers were large, hulking beasts (and I mean truly large, hulking beasts, the types that filled rooms), there was the notion of "CPU time". Not only was there "CPU time", but there was a cost associated with CPU usage. In dollars.

CPU time was expensive and computations were precious. So expensive and so precious, in fact, that early IBM programmers were taught that when performing a "multiply" operation, one should load the larger number into one particular register and the smaller number into another. While the operations "3 times 5" and "5 times 3" yield the same result, the early processors did not treat them identically. Multiplication was carried out as a series of add operations: "3 times 5" took five "add" operations, while "5 times 3" took only three. The difference here is just two "add" operations -- not much, but the gap grows with larger operands, and repeated throughout a program, the total difference was significant. (That is, measurable in dollars.)
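
A minimal sketch in Python of the "series of add operations" idea (the function name and the operation counting are illustrative, not the actual IBM instruction sequence):

```python
def multiply_by_repeated_addition(value, count):
    """Multiply the way an early CPU might: add `value` to an
    accumulator `count` times, and report how many adds it took."""
    accumulator = 0
    adds = 0
    for _ in range(count):
        accumulator += value
        adds += 1
    return accumulator, adds

# "3 times 5": add 3 to the accumulator five times -> five adds
print(multiply_by_repeated_addition(3, 5))   # (15, 5)

# "5 times 3": add 5 to the accumulator three times -> three adds
print(multiply_by_repeated_addition(5, 3))   # (15, 3)
```

Same answer either way; the operand order changes only the number of add operations -- and, on those machines, the bill.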

Advances in technology and the PC changed that mindset. Personal computers didn't have the notion of "CPU time". In part because the hardware didn't support the capture of CPU time, but also because the user didn't care. People cared about getting the job done, not about minimizing CPU time and maximizing the number of jobs run. There was only one job the user (who was also the system administrator) cared about -- the program that they were running.

For the past thirty years, people have not known or cared about CPU usage and program efficiency. I should rephrase that to "people in the PC/DOS/Windows world". Folks in the web world have cared about performance and still care about performance. But let's focus on the PC folks.

The PC folks have had a free ride for the past three decades, not worrying about performance. Oh, a few folks have worried: developers from the "old world" who learned frugality and programmers with really large data processing needs. But the vast majority of PC users have gotten by with the attitude of "if the program is slow, buy a faster PC".

This attitude is in for a change. The cause of the change? Virtualization.

With virtualization, PCs cease to be stand-alone machines. They become an "image" running under a virtualization engine. (That engine could be Virtual PC, VMware, VirtualBox, Xen, or a few others. The engine doesn't matter; this issue applies to all of them.)

By shifting from a stand-alone machine to a job in a virtualization host, the PC becomes a job in a datacenter. It also becomes someone else's headache. The PC user is no longer the administrator. (Actually, the role of administrator in corporations shifted long ago, with Windows NT, domain controllers, centralized authentication, and group policies. Virtualization shifts the burden of CPU management to the central support team.)

The system administrators for virtualized PCs are true administrators, not PC owners who have the role thrust upon them. Real sysadmins pay attention to lots of performance indicators, including CPU usage, disk activity, and network activity. They pay attention because the operations cost money.

With virtual PCs, the processing occurs in the datacenter, and sysadmins will quickly spot the inefficient applications. The programs that consume lots of CPU and I/O will make themselves known, by standing out from the others.
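
As a rough sketch of the kind of spot check a virtualization-host administrator might script -- this assumes the third-party psutil package, which is only one of many ways to gather these numbers -- the heaviest CPU consumers on a host can be listed like so:

```python
import time
import psutil  # third-party; any monitoring agent would serve the same purpose

# Take a one-second sample of per-process CPU usage on this host.
procs = list(psutil.process_iter(['pid', 'name']))
for proc in procs:
    try:
        proc.cpu_percent(None)          # prime the per-process counter
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        pass

time.sleep(1.0)

usage = []
for proc in procs:
    try:
        usage.append((proc.cpu_percent(None), proc.pid, proc.info['name']))
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        pass

# The inefficient applications stand out at the top of the list.
for cpu, pid, name in sorted(usage, reverse=True)[:10]:
    print(f"{cpu:6.1f}%  pid={pid}  {name}")
```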

Here's what I see happening:

- The shift to virtual PCs will continue, with today's PC users migrating to low-cost PCs and using Remote Desktop Connection (for Windows) and Virtual Network Computing (for Linux) to connect to virtualized hosts. Users will keep their current applications.

- Some applications will exhibit poor response through RDP and VNC. These will be the applications with poorly written GUI routines, programs that force the virtualization and remote-display software to do extra work to render them.

- Users will complain to the system administrators, who will tweak settings but in general be unable to fix the problem.

- Some applications will consume lots of CPU or I/O operations. System administrators will identify them and ask users to fix their applications. Users (for the most part) will have no clue about the performance of their applications, either because the applications were written by someone else or because the users have no experience with performance programming.

- At this point, most folks (users and sysadmins) are frustrated with the changes enforced by management and the lack of fixes for performance issues. But folks will carry on.

- System administrators will provide reports on resource usage. Reports will be broken down by subunits within the organization, and show the cost of resources consumed by each subgroup.

- Some shops will introduce charge-back systems, to allocate usage charges to organization groups. The charged groups may ignore the charges at first, or consider them an uncontrollable cost of business. I expect pressure to reduce expenses will get managers looking at costs.

- Eventually, someone will observe that application Y performs well under virtualization (that is, more cheaply) while application X does not. Applications X and Y provide the same functions (say, word processing) and are mostly equivalent.

- Once the system administrators learn about the performance difference, they will push for the more efficient application. Armed with statistics and cost figures, they will be in a good position to advocate the adoption of application Y as an organization standard.

- User teams and managers will be willing to adopt the proposed application, to reduce their monthly charges.

And over time, the market will reward those applications that perform well under virtualization. Notice that this change occurs without marketing. It also forces the trade-off of features against performance, something that has been absent from the PC world.

Your job, if you are building applications, is to build the 'Y' version. You want an application that wins on performance. You do not want the 'X' version.

You have to measure your application and learn how to write programs that are efficient. You need the tools to measure your application's performance, environments in which to test, and the desire to run these tests and improve your application. You will have a new set of requirements for your application: performance requirements. All while meeting the same (unreduced) set of functional requirements.
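
A minimal sketch of the most basic such measurement, in Python: separate the CPU time charged to the process from the wall-clock time the user experiences. (The do_the_work function is a placeholder for whatever you are measuring.)

```python
import time

def do_the_work():
    # Placeholder for the operation under test.
    return sum(i * i for i in range(1_000_000))

wall_start = time.perf_counter()      # wall-clock time
cpu_start = time.process_time()       # CPU time charged to this process

do_the_work()

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall clock: {wall_elapsed:.3f} s")
print(f"CPU time:   {cpu_elapsed:.3f} s")
```

Wall-clock time is what your users see; CPU time is what the datacenter bills for.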

Remember, "3 times 5" is not the same as "5 times 3".
