Wednesday, April 9, 2014

Files exist on PCs but not in the cloud

We're quite familiar with the concept of data files and file systems. And why shouldn't we be? We've been using files to store data since the first PCs and PC-DOS, and even earlier than that. (CP/M and other predecessors to PC-DOS used files and file systems.)

But computer systems did not always store data in files, or provide a familiar API into the file system.

Prior to the PC revolution, there were non-PC systems used for various purposes. Mainframes, minicomputers, and a rather interesting device: the dedicated word processor. These word processing systems were small (ish) computers that were built for one purpose: the composition and editing (and eventually printing) of documents. They were larger than today's typical tower desktop PC system, and even larger than the original IBM PC -- but not much larger, and they fit comfortably in many offices.

While they were computers, they were single-purpose computers. One did not choose to run an application from a list; the only application was the word processing software. (Quite similar to today's Chromebooks which run only the Chrome browser.) They were made by different vendors (DEC, Wang, and even IBM) and each was its own separate island of computing power.

These dedicated word processors did not have a "Start" menu, they did not have Windows Explorer, they did not have a command-line interface. All of the "common file operations" that we associate with Windows Explorer were handled inside the word processing system, usually in a "utility" menu. One did not work with files but with documents; one created a new document, edited a document, or printed a document.

Documents had names -- nice, long names which allowed for spaces. Since this was a computer, and since it did have a file system, the word processing system mapped the document names to arbitrarily assigned file names. The user never saw these file names; the word processing system insulated the user from such minutiae.

These dedicated word processors stored your information and retrieved it for you, and you had no opportunity to work with the data outside of the system. Unlike personal computers (with data stored in the file system accessible to other programs), word processing systems shielded the details from the user. Users never had to worry about file formats, or parsing a file into another format. The word processor was the only program to access the data, and the user ran only the word processor. Your data was a captive of the system.

Those dedicated word processing systems with their captive data are things of the past. PC-DOS, MS-DOS, and Windows all use file systems -- even Linux uses file systems -- and these file systems allow us to always access our data with any program we choose. We can always examine the files which contain our data (provided we know the location of the files). What does this ancient history of computing have to do with today's technology?

I think that this shielding technique could make a comeback. It could be revived in the world of cloud computing, specifically in the Software-as-a-Service (SaaS) offerings.

Anyone who has used Google's "Drive" feature (or Google Apps) knows that they can create and edit documents and spreadsheets "in the cloud". Using a browser, one can create an account and easily work with one or more documents. (Or spreadsheets. Let's call all of these items "documents", instead of the verbose "documents, spreadsheets, presentations, and other things".)

While one can create and edit these documents, one cannot "see" them as one can "see" files on your local hard drive. One can manipulate them only within the Google applications. The PC files, in contrast, can be manipulated by Windows Explorer (or the Mac Finder, or the Linux Filer programs). A document created in Microsoft Word and stored on my local drive can be opened with Microsoft Word or with LibreOffice or with various other programs. That's not possible with the documents in the Google cloud. (I'm picking on Google here, but this concept holds for any of the Software-as-a-Service offerings. Microsoft's Office 365 works the same way.)

Documents stored in the cloud are documents, but not files as we know them. We cannot view or edit the file, except with the cloud provider's app. We cannot rename them, except with the cloud provider's app. We cannot e-mail them to friends, except with the cloud provider's app. We cannot see the bytes used to store the document, nor do we know the format of that stored form of the document.

In fact, we do not know that the document is stored in a file (as we know the term "file") at all!

The Software-as-a-Service systems do allow for documents to be uploaded from our PC and downloaded to our PC. On our PC, those documents are stored in files, and we can treat them like any other file. We can rename the file. We can open the file in alternate applications. We can write our own applications to "pick apart" the file. (I don't recommend that you devote much time to that endeavor.)

But those operations occur on the PC, not in the cloud. Those operations are possible because the cloud system allowed us to export our document to our PC. That export operation is not guaranteed. I could create a cloud-based word processing service that did not allow you to export documents, one that kept your documents in the cloud and did not permit you to move them to another location. (Such a service may be unwelcome today, but it is a possible offering.)

The ability to move documents from cloud-based systems is possible only when the cloud-based system permits you to export documents.

Even when SaaS systems allow you to export documents today, there is no guarantee that such capabilities will always be available. The system vendor could change their system and disable the exporting of documents.

Should that happen (and I'm not saying that it will happen, only that it could happen) then the cloud-based SaaS systems will operate very much like the dedicated word processing systems of the 1970s. They will hold your documents captive and not allow you to export them to other systems. This applies to any SaaS system: word processing, spreadsheets, presentations, emails, calendars, ... you name it. It applies to any SaaS vendor: Google, Apple, IBM, Microsoft, ... you name the vendor.

I'm going to think about that for a while.

No comments: