Showing posts with label virtualization. Show all posts

Friday, August 29, 2014

Virtual PCs are different from real PCs

Virtual PCs started as an elaborate game of "let's pretend", in which we simulated a real PC (that is, a physical-hardware PC) in software. A virtual PC doesn't exist -- at least not in any tangible form. It has a processor and memory and disk drives and all the things we normally associate with a PC, but they are all constructed in software. The processor is emulated in software. The memory is emulated in software. The disk drive... you get the idea.

Virtualization offers several advantages. We can create new virtual PCs by simply running another copy of the virtualization software. We can move virtual PCs from one host PC to another host PC. We can make back-up images of virtual PCs by simply copying the files that define the virtual PC. We can take snapshots of the virtual PC at a moment in time, and restore those snapshots at our convenience, which lets us run risky experiments that would "brick" a real PC.

We like to think that virtual PCs are the same as physical PCs, only implemented purely in software. But that is not the case. Virtual PCs are a different breed. I can see three areas in which virtual PCs will vary from real PCs.

The first is storage (disk drives) and the file system. Disk drives hold our data; file systems organize that data and let us access it. In real PCs, a disk drive is a fixed size. This makes sense, because a physical disk drive *is* a fixed size. In the virtual world, a disk drive can grow or shrink as needed. I expect that virtual PCs will soon have these flexible disk drives. File systems will have to change; they are built with the assumption of a fixed-size disk. (A reasonable assumption, given that they have been dealing with physical, fixed-size disks.) Linux will probably get a file system called "flexvfs" or something.
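Pieces of this flexibility already exist at the file level: a sparse file reports a large logical size while occupying almost no physical space until data is written. A minimal Python sketch (the filename is illustrative):

```python
import os
import tempfile

# A "flexible" disk image as a sparse file: it reports a large logical
# size, but the file system allocates blocks only when data is written.
with tempfile.TemporaryDirectory() as tmp:
    image_path = os.path.join(tmp, "disk.img")  # illustrative name
    with open(image_path, "wb") as f:
        f.truncate(1 << 30)  # logical size: 1 GiB, no data written

    logical_size = os.stat(image_path).st_size
    print(logical_size)  # 1073741824 bytes, though nothing was written
```

A flexible virtual disk is the same idea carried one level up: the guest sees a full-size drive, while the host stores only what is actually used.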

The second area that virtual PCs vary from real PCs is virtual memory. The concept of virtual memory is older than virtual PCs or even virtual machines in general (virtual machines date back to the mainframe era). Virtual memory allows a PC to use more memory than it really has, by swapping portions of memory to disk. Virtual PCs currently implement virtual memory because they are faithfully duplicating the behavior of real PCs, but they don't have to. A virtual PC can assume that it has all memory addressable by the processor and let the hypervisor handle the virtualization of memory. Delegating the virtualization of memory to the hypervisor lets the "guest" operating system become simpler, as it does not have to worry about virtual memory.
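The hypervisor's side of that delegation can be sketched in a few lines: the guest sees a full, flat address space, while the hypervisor backs pages only when they are touched. A toy Python model (the class, page size, and method names are illustrative, not any real hypervisor's API):

```python
PAGE_SIZE = 4096  # illustrative page size

class HypervisorMemory:
    """Toy model: back guest pages lazily, on first touch."""
    def __init__(self):
        self.backing = {}  # guest page number -> backing bytearray

    def write(self, guest_addr, data):
        # Assumes the write fits within one page, for simplicity.
        page, offset = divmod(guest_addr, PAGE_SIZE)
        if page not in self.backing:            # allocate on first touch
            self.backing[page] = bytearray(PAGE_SIZE)
        self.backing[page][offset:offset + len(data)] = data

    def read(self, guest_addr, length):
        page, offset = divmod(guest_addr, PAGE_SIZE)
        if page not in self.backing:            # untouched memory reads as zeros
            return bytes(length)
        return bytes(self.backing[page][offset:offset + length])

mem = HypervisorMemory()
mem.write(8 * PAGE_SIZE + 16, b"hello")  # guest touches a single page
print(len(mem.backing))  # 1: only one page is actually backed
```

The guest never sees this bookkeeping; it simply assumes all of its addressable memory is present.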

A final difference between virtual PCs and real PCs is the processor. In a physical PC, the processor is rarely upgraded. An upgrade is an expensive proposition: one must buy a compatible processor, shut down the PC, open the case, remove the old processor, carefully install the new processor, close the case, and start the PC. In a virtual PC, the processor is emulated in software, so an upgrade is nothing more than a new set of definition files. It may be possible to upgrade a processor "on the fly" as the virtual PC is running.

These three differences (flexible file systems, lack of virtual memory, and updateable processors) show that virtual PCs are not the same as the "real" physical-hardware PCs. I expect that the two will diverge over time, and that operating systems for the two will also diverge.

Friday, January 31, 2014

Faster update cycles mean PC apps become expensive

Ah, for the good old days of slow hardware upgrades. It used to be that one could buy a computer system and use it for years, possibly even a decade. The software would be upgraded, but the hardware would last. One could run a business knowing the future of its IT (hardware and software) was predictable.

Today we have faster cycles for hardware upgrades. Cell phones, tablets, and some PCs (Apple) are updated in a matter of months, not decades. The causes are multiple: competition (especially among phone vendors), changes in technology, and a form of planned obsolescence (Apple) that sees existing customers buying new versions.

I expect that these faster cycles will move to the PC realm.

The change in the life span of PC hardware will affect consumers and businesses, with the greater impact on businesses. I expect individual consumers to move away from PCs and switch to phones, tablets, game consoles, and internet TV appliances.

Businesses have a challenge ahead. Corporate users typically don't want PCs; they want computing power. Specifically, they want computing power with a user interface that is consistent over time. (When a new version of Windows is introduced to a corporate environment, one of the first actions is to configure the user interface to look like the old version. The inability of Windows 8 to emulate Windows 7 exactly is probably the cause for corporate discomfort with it.)

But the challenge to business goes beyond the user interface. Corporations want stable computing platforms to hold their applications. They want to build a system (or buy one) and use it for a long time. Switching from one vendor's system to another's is an expensive proposition, and corporations amortize the conversion cost over a long life. A new system, or even a new version of a system, can impose changes to the user interface, interfaces to other systems, and interactions with the operating system and drivers. All of these changes are part of the cost of implementation.

In the corporation's mind, the fewer conversions, the better.

That philosophy is colliding with the faster pace of hardware. Apple is not alone in its rapid release of hardware and operating systems; Microsoft is releasing new versions of Windows at a rate much faster than the ten-year gap between Windows XP and Windows 7. (I'm ignoring Windows Vista.)

To adapt to the faster change, I expect corporations to shift from the PC platform to technologies that allow them to retain longer lifespans: virtual PCs and cloud computing. Virtual PCs are the easier change, allowing applications to be shifted directly onto the new platform. With remote access, a (fast-changing) real PC can access the (slow-changing) "get the work done" virtual PC. In this case, virtualization and remote access act as a shock absorber for the change in technology.

Cloud computing offers a more efficient platform, but only after re-designing the application. The large monolithic PC applications must split into multiple services coordinated by (relatively) simple applications running on tablets and phones. In this case, the use of small, simple components on multiple platforms (server and tablet/phone) acts as the buffer to changes in technology.
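The shape of that split can be sketched in a few lines: two trivial services and a thin coordinating client. The function names and "services" here are illustrative stand-ins, not a real service framework:

```python
# Sketch of the monolith-to-services split. In a real system each
# service would sit behind a network API; here they are plain functions.

def spell_check_service(text):
    # Stand-in for a small, independent service.
    return text.replace("teh", "the")

def word_count_service(text):
    # Another small, independent service.
    return len(text.split())

def tablet_app(text):
    # The (relatively) simple coordinating application.
    cleaned = spell_check_service(text)
    return cleaned, word_count_service(cleaned)

print(tablet_app("teh quick brown fox"))  # ('the quick brown fox', 4)
```

Each piece is small enough to be rebuilt or replaced independently, which is what buffers the whole system against platform churn.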

The PC platform will see faster update cycles and shorter life spans. Applications on this platform will be subject to more changes. A company's customer base will use more platforms, driving up the cost of development, testing, and support.

Moving to virtual PCs or to the cloud is a way of avoiding that increase in costs.

Sunday, September 15, 2013

Virtualization and small processors

From the beginning of time (for electronic data processing) we have desired bigger processors. We have wanted shorter clock cycles, more bits, more addressable memory, and more powerful instruction sets, all for processing data faster and more efficiently. With time-sharing we wanted additional controls to separate programs, which led to more complex processors. With networks and malware we added additional complexity to monitor processes.

The history of processors has been a (mostly) steady upward ramp. I say "mostly" because the minicomputer revolution (ca. 1965) and microcomputer revolution (1977) saw the adoption of smaller, simpler processors. Yet these smaller processors also increased in complexity, over time. (Microprocessors started with the humble 8080 and advanced to the Z-80, the 8086, the 80286, eventually leading to today's Pentium-derived processors.)

I think that virtualization gives us an opportunity for smaller, simpler processors.

Virtualization creates a world of two levels: the physical and the virtual. The physical processor has to keep the virtual processes running, and keep them isolated. The physical processor is a traditional processor and follows traditional rules: more is better, and keep users out of each others' hair.

But the virtual processors, they can be different. Where is it written that the virtual processor must be the same as the host processor? We've built our systems that way, but is it necessary?

The virtualized machine can be smaller than the physical host, and frequently is. It has less memory, smaller disks, and in general a slower (and usually simpler) processor. Yet a virtual machine is still a full PC.

We understand the computing unit known as a "PC". We've been virtualizing machines in these PC-sized units because it has been easy.

A lot of that "standard PC" contains complexity to handle multiple users.

For cheap, easily created virtual machines, is that complexity really necessary?

It is if we use the virtual PC as we use a physical PC, with multiple users and multiple processes. If we run a web server, then we need that complexity.

But suppose we take a different approach to our use of virtual machines. Suppose that, instead of running a complex program like a web server or a database manager, we handle simple tasks. Let's go further and suppose that we create a virtual machine that is designed to handle only one specific task, and that one task is trivial in comparison to our normal workload.

Let's go even further and say that when the task is done, we destroy the virtual machine. Should we need it again, we can create another one to perform the task. Or another five. Or another five hundred. That's the beauty of virtual machines.

Such a machine would need less "baggage" in its operating system. It would need, at the very least, some code to communicate with the outside world (to get instructions and report the results), the code to perform the work, and... perhaps nothing else. All of the user permissions and memory management "stuff" becomes superfluous.

This virtual machine is something that exists between our current virtual PC and an object in a program. This new thing is an entity of the virtualization manager, yet simpler (much simpler) than a PC with an operating system and application program.

Being much simpler than a PC, this small, specialized virtual machine can use a much simpler processor design. It doesn't need virtual memory management -- we give the virtual processor enough memory. It doesn't need to worry about multiple user processes -- there is only one user process. The processor has to be capable of running the desired program, of course, but that is a lot simpler than running a whole operating system.

A regular PC is "complexity in a box". The designers of virtualization software (VMware, VirtualPC, VirtualBox, etc.) expend large efforts at duplicating PC hardware in the virtual world, and synchronizing that virtual hardware with the underlying physical hardware.

I suspect that in many cases, we don't want virtual PCs. We want virtual machines that can perform some computation and talk to other processors (database servers, web servers, queue servers, etc.).

Small, disposable, virtual machines can operate as one-time use machines. We can instantiate them, execute them, and then discard them. These small virtual machines become the Dixie cups of the processing world. And small virtual machines can use small virtual processors.
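That lifecycle (instantiate, execute, discard) can be sketched as a simple pattern. The code below merely simulates the virtualization manager with list bookkeeping; a real system would replace those lines with calls to create and destroy actual machines:

```python
from contextlib import contextmanager

created, destroyed = [], []

@contextmanager
def disposable_vm(name):
    created.append(name)        # stand-in for "instantiate the machine"
    try:
        yield name
    finally:
        destroyed.append(name)  # stand-in for "discard the machine"

results = []
for i in range(3):              # one throwaway machine per task
    with disposable_vm(f"worker-{i}") as vm:
        results.append((vm, i * i))  # the single task this machine performs

print(results)
print(created == destroyed)  # True: every machine created was discarded
```

The point of the pattern is that the machine's lifetime is scoped to the task, exactly like a Dixie cup's lifetime is scoped to one drink.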

I think we may see a renewed interest in small processor design. For virtual processors, "small" means simple: a simple instruction set, a simple memory architecture, a simple system design.

Friday, July 12, 2013

In the cloud, simple will be big

The cloud uses virtualized computers, usually virtualized PCs or PC-based servers.

The temptation is to build (well, instantiate) larger virtualized PCs. More powerful processors, more cores, more memory, more storage. It is a temptation that is based on the ideas of the pre-cloud era, when computers stood alone.

In the mainframe era, bigger was better. It was also more expensive, which in addition to creating a tension between larger and smaller computers, defined a status ranking of computer owners. Similar thinking held in the PC era: a larger, more capable PC was better than a smaller one. (I suppose that similar thinking happens with car owners.)

In the cloud, the size of individual PCs is less important. The cloud is built of many (virtualized) computers, and more importantly, able to increase the number of these computers. This ability shifts the equation. Bigger is still better, but now the measure of bigger is the cloud, not an individual computer.

The desire to improve virtual PCs has merit. Our current virtual PCs duplicate the common PC architecture of several years ago. That design includes the virtual processor type, the virtual hard disk controller, and the virtual video card. They are copies of the common devices of the time, chosen for compatibility with existing software. As copies of those devices, they replicate not only the good attributes but the foibles as well. For example, the typical virtualized environment emulates IDE and SCSI disk controllers, but allows you to boot only from the IDE controllers. (Why? Because the real-world configurations of those devices worked that way.)
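In libvirt's domain XML, for instance, the emulated controllers are just configuration. The fragment below (file paths illustrative) attaches an IDE disk to boot from and a SCSI disk for data, mirroring the real-world convention described above:

```xml
<devices>
  <disk type='file' device='disk'>
    <source file='/var/lib/libvirt/images/boot.img'/>
    <target dev='hda' bus='ide'/>
    <boot order='1'/>
  </disk>
  <disk type='file' device='disk'>
    <source file='/var/lib/libvirt/images/data.img'/>
    <target dev='sda' bus='scsi'/>
  </disk>
</devices>
```

Nothing in the virtual world forces that arrangement; it persists because it copies the physical machines the software set out to emulate.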

An improved PC for the cloud is not bigger but simpler. Cloud-based systems use multiple servers and "spin up" new instances of virtual servers when they need additional capacity. One does not need a larger server when one can create, on demand, more instances of that server and share work among them.
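The "spin up on demand" arithmetic is simple: size the fleet to the load, rather than sizing one machine to the load. A minimal sketch (the capacity figure is an assumption for illustration):

```python
import math

def instances_needed(requests_per_sec, capacity_per_instance, minimum=1):
    # Scale out: run enough fixed-size instances to cover current load,
    # never dropping below a minimum standing capacity.
    return max(minimum, math.ceil(requests_per_sec / capacity_per_instance))

print(instances_needed(0, 250))    # idle: keep one instance running
print(instances_needed(900, 250))  # spin up enough instances for the load
```

With this model, the interesting capacity number belongs to the cloud as a whole, not to any single server.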

The design of cloud-based systems is subtle. I have asserted that simpler computers are better than complex ones. This is true, but only up to a point. A cloud of servers so simple that they cannot run the network stack would be useless. Clearly, a minimum level of computation is required.

Our first generation of virtual computers was a set of clones of existing machines. Some vendors have explored the use of simpler systems running on a sophisticated virtualization environment. (VMware's ESX and ESXi offerings, for example.)

Future generations of cloud computers will blur the lines between the virtualization manager, the virtualized machine, the operating system, and what is now called the language run-time (the JVM or CLR).

The entire system will be complex, yet I believe the successful configurations will have simplicity in each of the layers.

Wednesday, February 23, 2011

CPU time rides again!

A long time ago, when computers were large, hulking beasts (and I mean truly large, hulking beasts, the types that filled rooms), there was the notion of "CPU time". Not only was there "CPU time", but there was a cost associated with CPU usage. In dollars.

CPU time was expensive and computations were precious. So expensive and so precious, in fact, that early IBM programmers were taught, when performing a "multiply" operation, to load the larger number in one particular register and the smaller number in another. While the operations "3 times 5" and "5 times 3" yield the same results, the early processors did not consider them identical. The multiplication operation was a series of add operations, and "3 times 5" was performed as five "add" operations, while "5 times 3" was performed as three "add" operations. The difference was two "add" operations. Not much, but the difference was larger for larger numbers. Repeated through the program, the total difference was significant. (That is, measurable in dollars.)
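The economics of that register choice show up in a short sketch of multiplication as repeated addition (a simplified model of those early processors, not a faithful emulation):

```python
def multiply(multiplicand, multiplier):
    # Model multiplication the old way: add the multiplicand to an
    # accumulator, once per unit of the multiplier, and count the adds.
    total, adds = 0, 0
    for _ in range(multiplier):
        total += multiplicand  # one "add" operation
        adds += 1
    return total, adds

print(multiply(3, 5))  # (15, 5): "3 times 5" costs five adds
print(multiply(5, 3))  # (15, 3): "5 times 3" costs three adds
```

Same answer, different bill: putting the smaller number in the multiplier's role saves operations, and in that era operations were money.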

Advances in technology and the PC changed that mindset. Personal computers didn't have the notion of "CPU time". In part because the hardware didn't support the capture of CPU time, but also because the user didn't care. People cared about getting the job done, not about minimizing CPU time and maximizing the number of jobs run. There was only one job the user (who was also the system administrator) cared about -- the program that they were running.

For the past thirty years, people have not known or cared about CPU usage and program efficiency. I should rephrase that to "people in the PC/DOS/Windows world". Folks in the web world have cared about performance and still care about performance. But let's focus on the PC folks.

The PC folks have had a free ride for the past three decades, not worrying about performance. Oh, a few folks have worried: developers from the "old world" who learned frugality and programmers with really large data processing needs. But the vast majority of PC users have gotten by with the attitude of "if the program is slow, buy a faster PC".

This attitude is in for a change. The cause of the change? Virtualization.

With virtualization, PCs cease to be stand-alone machines. They become an "image" running under a virtualization engine. (That engine could be Virtual PC, VMware, VirtualBox, Xen, or a few others. The engine doesn't matter; this issue applies to all of them.)

By shifting from a stand-alone machine to a job in a virtualization host, the PC becomes a job in a datacenter. It also becomes someone else's headache. The PC user is no longer the administrator. (Actually, the role of administrator in corporations shifted long ago, with Windows NT, domain controllers, centralized authentication, and group policies. Virtualization shifts the burden of CPU management to the central support team.)

The system administrators for virtualized PCs are true administrators, not PC owners who have the role thrust upon them. Real sysadmins pay attention to lots of performance indicators, including CPU usage, disk activity, and network activity. They pay attention because the operations cost money.

With virtual PCs, the processing occurs in the datacenter, and sysadmins will quickly spot the inefficient applications. The programs that consume lots of CPU and I/O will make themselves known, by standing out from the others.

Here's what I see happening:

- The shift to virtual PCs will continue, with today's PC users migrating to low-cost PCs and using Remote Desktop Connection (for Windows) and Virtual Network Computing (for Linux) to connect to virtualized hosts. Users will keep their current applications.

- Some applications will exhibit poor response through RDP and VNC. These will be the applications with poorly written GUI routines, programs that require the virtualization software to perform extra work on their behalf.

- Users will complain to the system administrators, who will tweak settings but in general be unable to fix the problem.

- Some applications will consume lots of CPU or I/O operations. System administrators will identify them and ask users to fix their applications. Users (for the most part) will have no clue about performance of their applications, either because they were written by someone else or because the user has no experience with performance programming.

- At this point, most folks (users and sysadmins) are frustrated with the changes enforced by management and the lack of fixes for performance issues. But folks will carry on.

- System administrators will provide reports on resource usage. Reports will be broken down by subunits within the organization, and show the cost of resources consumed by each subgroup.

- Some shops will introduce charge-back systems, to allocate usage charges to organization groups. The charged groups may ignore the charges at first, or consider them an uncontrollable cost of business. I expect pressure to reduce expenses will get managers looking at costs.

- Eventually, someone will observe that application Y performs well under virtualization (that is, more cheaply) while application X does not. Applications X and Y provide the same functions (say, word processing) and are mostly equivalent.

- Once the system administrators learn about the performance difference, they will push for the more efficient application. Armed with statistics and cost figures, they will be in a good position to advocate the adoption of application Y as an organization standard.

- User teams and managers will be willing to adopt the proposed application, to reduce their monthly charges.
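The charge-back reports in that sequence amount to a simple allocation. A minimal sketch (the rate and the usage figures are invented for illustration):

```python
# Allocate datacenter cost to groups by measured CPU usage.
CPU_HOUR_RATE = 0.12  # dollars per CPU-hour (assumed rate)

usage_by_group = {    # CPU-hours measured by the sysadmins (made up)
    "accounting": 410.0,
    "marketing": 95.5,
    "engineering": 1200.0,
}

charges = {group: round(hours * CPU_HOUR_RATE, 2)
           for group, hours in usage_by_group.items()}

print(charges)  # each group now sees its usage as a dollar figure
```

Once a dollar figure lands on a manager's budget, an inefficient application stops being invisible.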

And over time, the market will reward those applications that perform well under virtualization. Notice that this change occurs without marketing. It also forces the trade-off of features against performance, something that has been absent from the PC world.

Your job, if you are building applications, is to build the 'Y' version. You want an application that wins on performance. You do not want the 'X' version.

You have to measure your application and learn how to write programs that are efficient. You need the tools to measure your application's performance, environments in which to test, and the desire to run these tests and improve your application. You will have a new set of requirements for your application: performance requirements. All while meeting the same (unreduced) set of functional requirements.

Remember, "3 times 5" is not the same as "5 times 3".