Wednesday, May 31, 2017

How many computers?

Part of the lore of computing discusses the mistakes people make in predictions. Thomas J. Watson (president of IBM) predicted the need for five computers -- worldwide. Ken Olsen, founder and president of DEC, thought that no one would want a computer in their home.

I suspect that we repeat these stories for the glee that they bring. What could be more fun than seeing important, well-known people make predictions that turn out to be wrong?

Microsoft's goal, in contrast to the above predictions, was a computer in every home and on every desk, and each of them running Microsoft software. A comforting goal for those who fought in the PC clone wars against the mainframe empire.

But I'm not sure that T. J. Watson was wrong.

Now, before you point out that millions (billions?) of PCs have been sold, and that millions (billions?) of smartphones have been sold, and that those smartphones are really computers, hear me out.

Computers are not quite what we think they are. We tend to think of them as small, stand-alone, general-purpose devices. PCs, laptops, smartphones, tablets... they are all computers, right?

Computers today are computing devices, but the border between one device and the rest of the network is not so clear. Computers are useful when they are part of a network, and connected to the internet. A computer that is not connected to the internet is not so useful. (Try an experiment: Take any computer, smartphone, or tablet and disconnect it from the network. Now use it. How long before you become bored?)

Without e-mail, instant messages, and web pages, computers are not that interesting -- or useful.

The boxes we think of as computers are really only parts of a larger construct. That larger construct is built from processors and network cards and communication equipment and disks and server rooms and software and protocols. That larger "thing" is the computer.

In that light, we could say that the entire world is running on one "computer" which happens to have lots of processors and multiple operating systems and many keyboards and displays. Parts of this "computer" are powered at different times, and sometimes entire segments "go dark" and then return. Sometimes individual components fail and are discarded, like dead skin cells. (New components are added, too.)

So maybe Mr. Watson was right, in the long run. Maybe we have only one computer.

Monday, May 29, 2017

Microsoft's GVFS for git makes git a different thing

Microsoft is rather proud of their GVFS filesystem for git, but I think they don't understand quite what it is that they have done.

GVFS, in short, changes git into a different thing. The plain git is a distributed version control system. When combined with GVFS, git becomes... well, let's back up a bit.

A traditional, non-distributed version control system consists of a central repository which holds files, typically source code. Users "check out" files, make changes, and "check in" the revised files. While users have copies of the files on their computers, the central repository is the only place that holds all of the files and all of the revisions to the files. It is the one place with all information, and is a single point of failure.

A distributed version control system, in contrast, stores a complete set of files and revisions on each user's computer. Each user has a complete repository. A new user clones a repository from an existing team member and has a complete set of files and revisions, ready to go. The repositories are related through parent-child links; the new user in our example has a repository that is a child of the cloned repository. Each repository is a clone, except for the very first instance, which could be considered the 'root' repository. The existence of these copies provides redundancy and guards against the single point of failure found in traditional version control systems.

Now let's look at GVFS and how it changes git.

GVFS replaces the local copy of a repository with a set of virtual files. The files in a repository are stored in a central location and downloaded only when needed. When checked in, the files are uploaded to the central location, not the local repository (which doesn't exist). From the developer's perspective, the changes made by GVFS are transparent. Git behaves just as it did before. (Although with GVFS, large repositories perform better than with regular git.)

Microsoft's GVFS changes the storage of repositories. It does not eliminate the multiple copies of the repository; each user retains their own copy. It does move those copies to the central server. (Or servers. The blog entry does not specify.)

I suppose you could achieve the same effect (almost) with regular git by changing the location of the .git directory. Instead of a local drive, you could use a directory on an off-premise server. If everyone did this, if everyone stored their git repository on the same server (say, a corporate server), you would have something similar to git with GVFS. (It is not exactly the same, as GVFS does some other things to improve performance.)
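
To make that thought experiment concrete, here is a minimal sketch in Python of pointing git at a repository directory stored on a shared server instead of the local drive. It relies only on git's standard --git-dir and --work-tree options; the mount point, paths, and user name are assumptions, not anything taken from GVFS.

    # A sketch of the "store the .git directory on a corporate server" idea.
    # Assumptions: the server is mounted at /mnt/corp-git and the developer
    # keeps only a working tree locally. These paths are hypothetical.
    import subprocess

    GIT_DIR = "/mnt/corp-git/alice/project.git"   # repository data, off-premise
    WORK_TREE = "/home/alice/project"             # local working files only

    def git(*args):
        """Run a git command against the off-premise repository."""
        result = subprocess.run(
            ["git", f"--git-dir={GIT_DIR}", f"--work-tree={WORK_TREE}", *args],
            check=True, capture_output=True, text=True,
        )
        return result.stdout

    print(git("status", "--short"))

Every git command still works as usual; only the location of the repository data has moved, which is roughly the shift in storage described above.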

Moving the git repositories off of individual, distributed computers and onto a single, central server changes the idea of a distributed version control system. The new configuration is something in between the traditional version control system and a distributed version control system.

Microsoft had good reason to make this change. The performance of standard git was not acceptable for a very large team. I don't fault them for it. And I think it can be a good change.

Yet it does make git a different creature. I think Microsoft and the rest of the industry should recognize that.

Sunday, May 21, 2017

Parallel processing on the horizon

Parallel processing has been with us for years. Or at least attempts at parallel processing.

Parallel processing has failed due to the numerous challenges it faces. It requires special (usually expensive) hardware. Parallel processing on conventional CPUs is simply processing items serially, because conventional CPUs can process only serially. (Multi-core processors address this problem to a small degree.) Parallel processing requires support in compilers and run-time libraries, and often new data structures. Most importantly, parallel processing requires tasks that are partitionable. The classic example of "nine women producing a baby in one month" highlights a task that is not partitionable, not divisible, into smaller tasks.

Cloud computing offers a new twist on parallel processing.

First, it offers multiple processors. Not just multiple cores, but true multiple processors -- as many as you would like.

Second, it offers these processors cheaply.

Cloud computing is a platform that can handle parallel processing -- in some areas. It has its problems.

First, creating new cloud processing systems is expensive in terms of time. A virtual machine must be instantiated, started, and given software to handle the task. Then, data must be shipped to the server. After processing, the result must be sent back, or forward to another processor. The time for all of these tasks is significant.

Second, we still have the problems of partitioning tasks and representing the data and operations in a program.

There is one area of development that I believe is ready to leverage parallel processing. That area is testing.

The typical testing effort for a project can have multiple levels: unit tests, component tests, system tests, end-to-end tests, you name it. But each level of testing follows the same general pattern:

  • Get a collection of tests, complete with input data and expected results
  • For each test:
    1) Set up a test environment (program and data)
    2) Run the test
    3) Compare output to expected output
    4) Record the results
  • Summarize the results and report

In this process, the sequence of steps I've labelled 1 through 4 is repeated for each test. Traditional testing puts all of these tests on one computer, performing each test in sequence. Parallel testing can put each test on its own cloud-based processor, effectively running all tests at once.
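
As a rough illustration, here is a minimal sketch in Python that runs each test on its own worker process, standing in for the cloud-based processors described above. The test cases and the system_under_test function are hypothetical; a real setup would ship each test to a remote machine rather than a local process pool.

    # Steps 1 through 4, run in parallel rather than in sequence.
    # ProcessPoolExecutor stands in for a pool of cloud-based processors.
    from concurrent.futures import ProcessPoolExecutor

    def system_under_test(x):
        return x * 2                      # hypothetical program being tested

    def run_one_test(test):
        """Set up, run, and compare one test (steps 1 through 3)."""
        name, test_input, expected = test
        actual = system_under_test(test_input)
        return name, actual == expected

    tests = [                             # each test carries input and expected output
        ("doubles 2", 2, 4),
        ("doubles 3", 3, 6),
        ("handles 0", 0, 0),
    ]

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            results = list(pool.map(run_one_test, tests))    # step 4: record results
        passed = sum(1 for _, ok in results if ok)
        print(f"{passed} of {len(results)} tests passed")    # summarize and report

The total wall-clock time approaches the duration of the slowest single test, rather than the sum of all tests, which is the payoff of making the work partitionable.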

Testing has a series of well-defined and partitionable tasks. Modern testing methods use automated tests, so a test can run locally or remotely (as long as it has access to everything it needs). Testing can be a drain on resources and time, requiring lots of requests to servers and lots of time to complete all tests.

Testing in the cloud, and in parallel, addresses these issues. It reduces the time for tests and improves the feedback to developers. Cloud processing is cheap -- at least cheaper than paying developers to wait for tests to run.

I think one of the next "process improvements" for software development will be the use of cloud processing to run tests. Look for new services and changes to testing frameworks to support this new mode of testing.

Thursday, May 18, 2017

An echo of Wordstar

In 1979, Wordstar was the popular word processor of the time. It boasted "what you see is what you get" (WYSIWYG) because it would reformat text on the screen as you typed.

Wordstar had a problem on some computers. It would, under the right conditions, miss characters as you were typing. The problem was documented in an issue of "Personal Computing", comparing Wordstar to another program called "Electric Pencil". The cause was the automatic reformatting of text. (The reformatting took time, and that's when characters were lost. Wordstar was busy redrawing text and not paying attention to the keyboard.)

At the time, computers were primitive. They ran CP/M on an 8080 or Z-80 processor with at most 64K RAM. Some systems used interrupts to detect keystrokes but others simply polled the keyboard from time to time, and it was easy to miss a typed character.

So here we are in 2017. Modern PCs are better than the early microcomputers, or so we like to think. We have more memory. We have larger disks for storage. We have faster processors. They cost less. Better in every way.

So we like to think.

From a blog in 2017:
Of course, I always need to switch over to Emacs to get real work done.  IntelliJ doesn't like it when you type fast.  Its completions can't keep up and you wind up with half-identifiers everywhere. 
I'm flabbergasted.

(I must also note that I am not a user of IntelliJ, so I have not seen this behavior myself. I trust the report from the blogger.)

But getting back to being flabbergasted...

We have, in 2017, an application that cannot keep up with human typing?

We may have made less progress than we thought.

Tuesday, May 16, 2017

Law comes to computing's Wild West

I see Windows 10 S as the future of Windows. The model of "software only through a store" works for phones and tablets, provides better security, and reduces administrative work. It is "good enough" for corporate users and consumers, and those two groups drive the market. ("Good enough" if the right applications are available in the store, that is.)

But.

The introduction of Windows 10 S is a step in the closing of the frontier we fondly think of as "personal computing".

This "closing of the frontier" has been happening for some time.

The IBM PC was open to tinkerers, in both hardware and, to some extent, software. On the hardware side, the IBM PC was designed for adapter cards, and designed to allow individuals to open the case and insert them. IBM released technical specifications which allowed other manufacturers to create their own cards. It was a smart move by IBM, and helped ensure the success of the PC.

On the software side, there were three operating systems available for the IBM PC: DOS, CP/M-86, and UCSD p-System. These were less restrictive than today's operating systems, with no notion of "user" or "administrator", no notion of "user privileges" or "user account". The operating system (such as it was) managed files on disk and loaded programs into memory when requested.

It was a time akin to the "wild west" with no controls on users. Any user could attach any device or install any program. (Getting everything to work was not always easy, and not always possible, but users could try.)

How has the PC realm become closed?

First, let me say that it is not totally closed. Users still have a great deal of freedom, especially on PCs they purchase for themselves (as opposed to corporate-issued PCs).

But the freedom to do anything meant that users could break things, easily, and lose data and disable programs. It also meant that ill-meaning individuals could write virus programs and cause problems. Over time, we (as an industry and group of users) decided that restrictions were necessary.

One of the first things corporate support groups did, when preparing a new PC, was to remove the 'FORMAT' program. (Or rename it.) It was considered too "dangerous" for a non-technical user.

The next set of restrictions came with Windows NT. It provided the notion of 'user accounts' and logins and passwords -- and enforced them. Windows NT also provided the notion of 'user privileges' which meant that some users could adjust settings for the operating system and others could not. Some users could install software, and others could not. Some users could... you get the idea.

Restrictions have not been limited to software.

UEFI replaced the BIOS, and was not "flashable" as many BIOSes had been.

Smaller computers (laptops and tablets) are generally not openable. The IBM PC provided access to memory, adapter cards, DIP switches (remember those?), and the power supply. Today, most laptops allow access to memory chips... and little else. (DIP switches have disappeared from PCs entirely, and no one misses them.)

Which brings us to Windows 10 S.

Windows 10 S is a move to close the environment a little more. It makes a PC more like a phone, with an official "store" where one must buy software. You cannot install just any software. You cannot write your own software and install it.

The trend has been one of a gradual increase in "law" in our wild west. As in history, the introduction of these "laws" has meant the curtailment of individuals' freedoms. You cannot re-format your phone, at least not accidentally, and not to a blank disk. (Yes, you can reset your phone, which isn't quite the same thing.)

Another way to look at the situation is as a change in the technology. We have shifted away from the original PCs, which required hardware and software configuration to meet the needs of the user (an individual or a larger entity). Instead of those early (incomplete) computers, we have well-defined and fully-functional computers that provide limited configuration capabilities. This is accepted because the changes that we want to make fall within the "locked down" configuration of the PC. The vast majority of users don't need to set parameters for the COM port, or add memory, or install new versions of Lotus 1-2-3. In corporate settings, users run the assigned software and choose a photo for their desktop background; at home we install Microsoft Office and let it run as it comes "out of the box".

The only folks who want to make changes are either corporate sysadmins or individual tinkerers. And there are very few tinkerers, compared to the other users.

For the tinkerers and organizations that need "plain old Windows", it is still available. Windows 10-without-S works as it has before. You can install anything. You can adjust anything. Provided you have the privileges to do so.

I see Windows 10 S as an experiment, testing the acceptance of such a change in the market. I expect a lot of noise from protesters, but the interesting aspect will be behavior. Will the price of Windows 10 S affect acceptance? Possibly. Windows 10 S is not sold separately -- only preloaded onto computers. So look for the purchasing behavior of low-cost Windows 10 S devices.

In the long term, I expect Windows 10 S or a derivative to become the popular version of Windows. Corporations and governments will install it for employees, and keep the non-S version of Windows for those applications that cannot run under Windows 10 S. Those instances of Windows (the non-S instances) will most likely be run on virtual machines in data centers, not on individuals' desks.

But those instances of "non-S Windows" will become few, and eventually fade into history, along with PC-DOS and Windows 95. And while a few die-hard enthusiasts will keep them running, the world will switch to a more protected, a more secure, and a less wild-west version of Windows.

Monday, May 8, 2017

Eventual Consistency

The NoSQL database technology has introduced a new term: eventual consistency.

Unlike traditional relational databases, which promise atomicity, consistency, isolation, and durability, NoSQL databases promise that updates will be consistent at some point in the (near) future. Just not right now.

For some folks, this is bad. "Eventual" consistency is not as good as "right now" consistency. Worse, it means that for some amount of time, the system is in an inconsistent state, and inconsistencies make managers nervous.

But we've had systems with internal inconsistencies for some time. They exist today.

One example is from a large payroll-processing company.  They have been in business for decades and have a good reputation. Surely they wouldn't risk their (lucrative) business on something like inconsistency? Yet they do.

Their system consists of two subsystems: a large, legacy mainframe application and a web front-end. The mainframe system processes transactions, which includes sending information to the ACH network. (The ACH network feeds information to individual banks, which is how your paycheck is handled with direct deposit.)

Their web system interfaces to the processing system. It allows employees to sign in and view their paychecks, present and past. It is a separate system, mostly due to the differences in technologies.

Both systems run on schedules, with certain jobs running every night and some running during the day.

Inconsistencies arise when the payroll job runs on Friday. The payroll-processing system runs and sends money to the ACH network, but the web system doesn't get the update until Monday morning. Money appears in the employee's bank account, but the web system knows nothing about the transaction. That's the inconsistency, at least over the weekend and until Monday morning. Once the web system is updated, both systems are "in sync" and consistent.
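
Here is a minimal sketch of that timeline in Python, with the two subsystems reduced to dictionaries and the batch jobs reduced to functions. The names are hypothetical; the point is only that each store is consistent with itself while the pair disagrees until the scheduled sync runs.

    # Two subsystems that agree only after a scheduled sync job runs.
    processing_system = {}   # the mainframe side: source of truth for payments
    web_system = {}          # the employee-facing side: updated on a schedule

    def run_payroll(employee, amount):
        """Friday's job: pay the employee. Only the processing system knows."""
        processing_system[employee] = amount

    def nightly_sync():
        """Monday's job: bring the web system up to date."""
        web_system.update(processing_system)

    run_payroll("alice", 1000)
    print(web_system.get("alice"))   # None -- inconsistent over the weekend
    nightly_sync()
    print(web_system.get("alice"))   # 1000 -- eventually consistent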

This example shows us some things about inconsistencies.
  • Inconsistencies occur between systems. Each subsystem is consistent to itself.
  • Inconsistencies are designed. Our example payroll system could be modified to update the web subsystem every time a payroll job is run. The system designers chose to use a different solution, and they may have had good reasons.
  • Inconsistencies exist only when one is in a position to view them. We see the inconsistency in the payroll-processing system because we are outside of it, and we get data from both the core processing system and the web subsystem.

Eventual consistency is also a design. It also exists between subsystems (or between instances of a subsystem in a cloud system).

Eventual consistency is not necessarily a bad thing. It's not necessarily a good thing. It is an aspect of a system, a design choice between trade-offs. And we've had it for quite some time.

Monday, May 1, 2017

That old clunky system -- the smart phone

Mainframes probably have first claim on the title of "that old large hard-to-use system".

Minicomputers were smaller, easier to use, less expensive, and less fussy. Instead of an entire room, they could fit in the corner of an office. Instead of special power lines, they could use standard AC power.

Of course, it was the minicomputer users who thought that mainframes were old, big, and clunky. Why would anyone want that old, large, clunky thing when they could have a new, small, cool minicomputer?

We saw the same effect with microcomputers. PCs were smaller, easier to use, less expensive, and less fussy than minicomputers.

And of course, it was the PC users who thought that minicomputers (and mainframes) were old, big, and clunky. Why would anyone want that old, large, clunky thing when they could have a new, small, cool PC?

Here's the pattern: A technology gets established and adopted by a large number of people. The people who run the hardware devote time and energy to learning how to operate it. They read the manuals (or web pages), they try things, they talk with other administrators. They become experts, or at least comfortable with it.

The second phase of the pattern is this: A new technology comes along, one that does similar (although often not identical) work as the previous technology. Many times, the new technology does a few old things and lots of new things. Minicomputers could handle data-oriented applications like accounting, but were better at data input and reporting. PCs could handle input and reporting, but were really good at word processing and spreadsheets.

The people who adopt the later technology look back, often in disdain, at the older technology that doesn't do all of the cool new things. (And too often, the new-tech folks look down on the old-tech folks.)

Let's move forward in time. From mainframes to minicomputers, from minicomputers to desktop PCs, from desktop PCs to laptop PCs, from classic laptop PCs to MacBook Air-like laptops. Each transition has the opportunity to look back and ask "why would anyone want that?", with "that" being the previous cool new thing.

Of course, such effects are not limited to computers. There were similar feelings with the automobile, typewriters (and then electric typewriters), slide rules and pocket calculators, and lots more.

We can imagine that one day our current tech will be considered "that old thing". Not just ultralight laptops, but smartphones and tablets too. But what will the cool new thing be?

I'm not sure.

I suspect that it won't be a watch. We've had smartwatches for a while now, and they remain a novelty.

Ditto for smart glasses and virtual reality displays.

Augmented reality displays, such as Microsoft's HoloLens, show promise, but also remain a diversion.

What the next big thing needs is a killer app. For desktop PCs, the killer app was the spreadsheet. For smartphones, the killer app was GPS and maps (and possibly Facebook and games). It wasn't the PC or the phone that people wanted, it was the spreadsheet and the ability to drive without a paper map.

Maybe we've been going about this search for the next big thing in the wrong way. Instead of searching for the device, we should search for the killer app. Find the popular use first, and then you will find the device.