Friday, August 29, 2014

Virtual PCs are different from real PCs

Virtual PCs started as an elaborate game of "let's pretend", in which we simulated a real PC (that is, a physical-hardware PC) in software. A virtual PC doesn't exist -- at least not in any tangible form. It has a processor and memory and disk drives and all the things we normally associate with a PC, but they are all constructed in software. The processor is emulated in software. The memory is emulated in software. The disk drive... you get the idea.

Virtualization offers several advantages. We can create new virtual PCs by simply running another copy of the virtualization software. We can move virtual PCs from one host PC to another host PC. We can make back-up images of virtual PCs by simply copying the files that define the virtual PC. We can take snapshots of the virtual PC at a moment in time, and restore those snapshots at our convenience, which lets us run risky experiments that would "brick" a real PC.

We like to think that virtual PCs are the same as physical PCs, only implemented purely in software. But that is not the case. Virtual PCs are a different breed. I can see three areas in which virtual PCs will vary from real PCs.

The first is storage (disk drives) and the file system. Disk drives hold our data; file systems organize that data and let us access it. In real PCs, a disk drive is a fixed size. This makes sense, because a physical disk drive *is* a fixed size. In the virtual world, a disk drive can grow or shrink as needed. I expect that virtual PCs will soon have these flexible disk drives. File systems will have to change; they are built with the assumption of a fixed-size disk. (A reasonable assumption, given that they have been dealing with physical, fixed-size disks.) Linux will probably get a file system called "flexvfs" or something.

The second area in which virtual PCs vary from real PCs is virtual memory. The concept of virtual memory is older than virtual PCs or even virtual machines in general (virtual machines date back to the mainframe era). Virtual memory allows a PC to use more memory than it really has, by swapping portions of memory to disk. Virtual PCs currently implement virtual memory because they are faithfully duplicating the behavior of real PCs, but they don't have to. A virtual PC can assume that it has all memory addressable by the processor and let the hypervisor handle the virtualization of memory. Delegating the virtualization of memory to the hypervisor lets the "guest" operating system become simpler, as it does not have to worry about virtual memory.

A final difference between virtual PCs and real PCs is the processor. In a physical PC, the processor is rarely upgraded. An upgrade is an expensive proposition: one must buy a compatible processor, shut down the PC, open the case, remove the old processor, carefully install the new processor, close the case, and start the PC. In a virtual PC, the processor is emulated in software, so an upgrade is nothing more than a new set of definition files. It may be possible to upgrade a processor "on the fly" as the virtual PC is running.

These three differences (flexible file systems, lack of virtual memory, and updateable processors) show that virtual PCs are not the same as the "real" physical-hardware PCs. I expect that the two will diverge over time, and that operating systems for the two will also diverge.

Tuesday, August 26, 2014

With no clear IT leader, expect lots of changes

The introduction of the IBM PC was market-wrenching. Overnight, the small, rough-and-tumble market of microcomputers with diverse designs from various small vendors became large and centered around the PC standard.

From 1981 to 1987, IBM was the technology leader. IBM led in sales and also defined the computing platform.

IBM's leadership fell to Compaq in 1987, when IBM introduced the PS/2 line with its new (incompatible) hardware. Compaq delivered old-style PCs with a faster bus (the EISA bus) and notably the Intel 80386 processor. (IBM stayed with the older 80286 and 8086 processors, eventually consenting to provide 80386-based PS/2 units.) Compaq even worked with Microsoft to deliver newer versions of MS-DOS that recognized larger memory capacity and optical disc readers.

But Compaq did not remain the leader. Its leadership passed gradually to the clone makers, especially Dell, HP, and Gateway.

The mantle of leadership moved from a PC manufacturer to the Microsoft-Intel duopoly. The popularity of Windows, along with marketing skill and software development prowess, led to a stable configuration for Microsoft and Intel. Together, they out-competed IBM's OS/2, Motorola's 68000 processor, DEC's Alpha processor, and Apple's Macintosh line.

That configuration held for nearly two decades, roughly from 1990 until Apple introduced the iPhone in 2007. The genius move was not the iPhone hardware, but the App Store and iTunes, which let you easily find and install apps on your phone (and pay for them).

Now Microsoft and Intel have the same problem: after years of competing in a well-defined market (the corporate PC market) they struggle to move into the world of mobile computing. Microsoft's attempts at mobile devices (Zune, Kin, Surface RT) have flopped. Intel is desperately attempting to design and build processors that are suitable for low-power devices.

I don't expect either Microsoft or Intel to disappear. (At least not for several years, possibly decades.) The PC market is strong, and Intel can sell a lot of its traditional processors (heat radiators that happen to compute data). Microsoft is a competent player in the cloud arena with its Azure services.

But I will make an observation: for the first time in the PC era, we find that there is no clear leader for technology. The last time we were leaderless was prior to the IBM PC, in the "microcomputer era" of Radio Shack TRS-80 and Apple II computers. Back then, the market was fractured and tribal. Hardware ruled, and your choice of hardware defined your tribe. Apple owners were in the Apple tribe, using Apple-specific software and exchanging data on Apple-specific floppy disks. Radio Shack owners were in the Radio Shack tribe, using software specific to the TRS-80 computers and exchanging data on TRS-80 diskettes. Exchanging data between tribes was one of the advanced arts, and changing tribes was extremely difficult.

There were some efforts to unify computing: CP/M was the most significant. Built by Digital Research (a software company with no interest in hardware), CP/M ran on many different configurations. Yet even that effort could not span the differences in processors, memory layout, and video configurations.

Today we see tribes forming around multiple architectures. For cloud computing, we have Amazon.com's AWS, Microsoft's Azure, and Google's App Engine. With virtualization we see VMware, Oracle's VirtualBox, the aforementioned cloud providers, and newcomer Docker as a rough analog of CP/M. Mobile computing sees Apple's iOS, Google's Android, and Microsoft's Windows RT as a (very) distant third.

With no clear leader and no clear standard, I expect each vendor to enhance their offerings and also attempt to lock in customers with proprietary features. In the mobile space, Apple's Swift and Microsoft's C# are both proprietary languages. Google's choice of Java puts them (possibly) at odds with Oracle -- although Oracle seems to be focussed on databases, servers, and cloud offerings, so there is no direct conflict. Things are a bit more collegial in the cloud space, with vendors supporting OpenStack and Docker. But I still expect proprietary enhancements, perhaps in the form of add-ons.

All of this means that the technology world is headed for change. Not just change from desktop PC to mobile/cloud, but changes in mobile/cloud. The competition from vendors will lead to enhancements and changes, possibly significant changes, in cloud computing and mobile platforms. The mobile/cloud platform will be a moving target, with revisions as each vendor attempts to out-do the others.

Those changes mean risk. As platforms change, applications and systems may break or fail in unexpected ways. New features may offer better ways of addressing problems and the temptation to use those new features will be great. Yet re-designing a system to take advantage of new infrastructure features may mean that other work -- such as new business features -- waits for resources.

One cannot ignore mobile/cloud computing. (Well, I suppose one can, but that is probably foolish.) But one cannot, with today's market, depend on a stable platform with slow, predictable changes like we had with Microsoft Windows.

With such an environment, what should one do?

My recommendations:

Build systems of small components: This is the Unix mindset, with small tools to perform specific tasks. Avoid large, monolithic systems.

Use standard interfaces: Use web services (either SOAP or REST) to connect components into larger systems. Use JSON and Unicode to exchange data, not proprietary formats.

Hedge your bets: Gain experience in at least two cloud platforms and two mobile platforms. Resist the temptation of "corporate standards". Standards are good with a predictable technology base. The current base is not predictable, and placing your eggs in one vendor's basket is risky.

Change your position: After a period of use, examine your systems, your tools, and your talent. Change vendors -- not for everything, but for small components. (You did build your system from small, connected components, right?) Migrate some components to another vendor; learn the process and the difficulties. You'll want to know them when you are forced to move to a different vendor.
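
To make these recommendations a little more concrete, here is a minimal sketch in Python. It assumes a tiny "order" component that talks to storage only through a narrow interface and exchanges data as UTF-8 encoded JSON. The names Storage, VendorAStorage, VendorBStorage, save_order, and load_order are all hypothetical; the two "vendors" are in-memory stand-ins, not real cloud APIs.

```python
import json
from typing import Protocol


class Storage(Protocol):
    """The only interface the business code is allowed to depend on."""

    def put(self, key: str, payload: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class VendorAStorage:
    """Hypothetical stand-in for one vendor's storage service."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, payload):
        self._blobs[key] = payload

    def get(self, key):
        return self._blobs[key]


class VendorBStorage:
    """Hypothetical second vendor with a different internal key layout."""

    def __init__(self):
        self._rows = {}

    def put(self, key, payload):
        self._rows["tenant-1/" + key] = payload

    def get(self, key):
        return self._rows["tenant-1/" + key]


def save_order(store, order_id, order):
    # JSON text, encoded as UTF-8: a standard format, not a proprietary one.
    payload = json.dumps(order, ensure_ascii=False).encode("utf-8")
    store.put(order_id, payload)


def load_order(store, order_id):
    return json.loads(store.get(order_id).decode("utf-8"))


order = {"customer": "Zo\u00eb", "items": ["widget"], "total": 19.5}

# The same component code runs unchanged against either vendor.
results = []
for store in (VendorAStorage(), VendorBStorage()):
    save_order(store, "order-1", order)
    results.append(load_order(store, "order-1"))
```

Because save_order and load_order know nothing about either vendor, migrating this one component is a matter of writing another small adapter, which is the kind of low-stakes move suggested above.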

Many folks involved in IT have been living in the "golden age" of a stable PC platform. They may have weathered the change from desktop to web -- which saw a brief period of uncertainty. More than likely, they think that the stable world is the norm. All that is fine -- except we're not in the normal world with mobile/cloud. Be prepared for change.

Sunday, August 17, 2014

Reducing the cost of programming

Different programming languages have different capabilities. And not surprisingly, different programming languages have different costs. Over the years, we have found ways of reducing those costs.

Costs include infrastructure (disk space for compiler, memory) and programmer training (how to write programs, how to compile, how to debug). Notice that the load on the programmer can be divided into three: infrastructure (editor, compiler), housekeeping (declarations, memory allocation), and business logic (the code that gets stuff done).

Symbolic assembly code was better than machine code. In machine code, every instruction and memory location must be laid out by the programmer. With a symbolic assembler, the computer did that work.

COBOL and FORTRAN reduced cost by letting the programmer not worry about the machine architecture, register assignment, and call stack management.

BASIC (and time-sharing) made editing easy, eliminated compiling, and made running a program easy. Results were available immediately.

Today we are awash in programming languages. The big ones today (C, Java, Objective-C, C++, BASIC, Python, PHP, Perl, and JavaScript -- according to Tiobe) are all good at different things. That is perhaps not a coincidence. People pick the language best suited to the task at hand.

Still, it would be nice to calculate the cost of the different languages. Or if numeric metrics are not possible, at least rank the languages. Yet even that is difficult.

One can easily state that C++ is more complex than C, and therefore conclude that programming in C++ is more expensive than C. Yet that's not quite true. Small programs in C are easier to write than equivalent programs in C++. Large programs are easier to write in C++, since the ability to encapsulate data and group functions into classes helps one organize the code. (Where 'small' and 'large' are left to the reader to define.)

Some languages are compiled and some are interpreted, and one can argue that a separate step to compile is an expense. (It certainly seems like an expense when I am waiting for the compiler to finish.) Yet languages with compilers (C, C++, Java, C#, Objective-C) all have static typing, which means that the editor built into an IDE can provide information about variables and functions. When editing a program written in one of the interpreted languages, on the other hand, one does not have that help from the editor. The interpreted languages (Perl, Python, PHP, and JavaScript) have dynamic typing, which means that the type of a variable (or function) is not constant but can change as the program runs.
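
A small Python sketch shows what "can change as the program runs" means in practice. The names describe and item are made up for illustration; the point is only that one variable name holds values of three different types in turn, so an editor cannot know ahead of time which operations are valid on it.

```python
def describe(value):
    """Return the name of the type the value has right now, at run time."""
    return type(value).__name__


item = 42                  # item refers to an int
first = describe(item)

item = "forty-two"         # the same name now refers to a str
second = describe(item)

item = [4, 2]              # and now to a list
third = describe(item)

print(first, second, third)
```

In a statically typed language such as C++ or Java, the second and third assignments would be rejected at compile time, and that same fixed type information is what an IDE uses to offer its hints.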

Switching from an "expensive" programming language (let's say C++) to a "reduced cost" programming language (perhaps Python) is not always possible. Programs written in C++ perform better. (On one project, the C++ program ran for several hours; the equivalent program in Perl ran for several days.) C and C++ let one have access to the underlying hardware, something that is not possible in Java or C# (at least not without some add-in trickery, usually involving... C++.)

The line between "cost of programming" and "best language" quickly blurs, and the difficulty of nailing down the costs for the different dimensions of programming (program design, speed of coding, speed of execution, ability to control hardware) gets in our way.

In the end, I find that it is easy to rank languages in the order of my preference rather than in an unbiased scheme. And even my preferences are subject to change, given the nature of the project. (Is there existing code? What are other team members using? What performance constraints must we meet?)

Reducing the cost of programming is really about trade-offs. What capabilities do we desire, and what capabilities are we willing to cede? To switch from C++ to C# may mean faster development but slower performance. To switch from PHP to Java may mean better organization of code through classes but slower development. What is it that we really want?

Monday, August 11, 2014

Agile is not compatible with silos

Agile development methods are very different from the traditional Waterfall methods. So different that they can affect the culture of the organization.

Agile makes a different promise than Waterfall. Waterfall promises a specific deliverable on a specific date; Agile promises that you can ship whenever you want.

Agile discourages specialization. An iteration is short yet requires analysis, development, and testing. Such a short cycle does not allow for different individuals to perform different tasks.

Yet the biggest difference between Agile and Waterfall is the partitioning of tasks and the encapsulation of information. Waterfall strives for clean, discrete changes from one phase to another, with information flowing between phases in well-defined documents. The flow between the requirements phase and the development phase is the requirements document (or documents). The test results are presented in a specific document. And so on.

Information in each phase is encapsulated in that phase, and only a small set of information is allowed to transfer (one might say 'leak') to another phase.

The partitioning of tasks and the encapsulation of information lead to silos within the organization. Once separate teams are established for requirements, development, testing, and deployment, tensions arise between teams. The testing team identifies defects that reflect on the development team. The development team blames the requirements team for incomplete or ambiguous specifications.

Agile -- at least Agile for small teams -- has none of that. The fast cycles of feature selection, design, development, and test provide immediate feedback. An ambiguous requirement is spotted early, and it is obvious to everyone. Defects are identified and fixed before implementing the next feature set.

More importantly, an Agile project has one team, and the measurement of success for that team is the delivery of software. That focus on success and the inability to shift blame to another team means that it is harder to establish silos.

Which is not to say that Agile will eliminate all silos. An organization with many Agile projects can still have silos. A large company using an "Agile for large companies" process may develop silos.

But for the most part, I believe Agile processes are incompatible with silos. The involvement of necessary stakeholders; the coordinated work of design, development, and testing; and the fast cycle times all push against silo-ization.