Wednesday, January 31, 2018

Optimizing in the wrong direction

Back in the late 2000s, I toyed with the idea of a new version control system. It wasn't git, or even git-like. In fact, it was the opposite.

At the time, version control was centralized. There was a single instance of the repository and you (the developer) had a single "snapshot" of the files. Usually, your snapshot was the "tip", the most recent version of each file.

My system, like other version control systems of the time, was centralized, with each file's versions stored as 'diff' packages. That was the traditional approach for version control, since storing a diff takes less space than storing the entire file.
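
To make the idea concrete, here is a minimal sketch of diff-based storage in Python, using the standard difflib module. The function name and sample data are mine, not from any real version control system; the point is only that recording a diff is much smaller than recording the whole file again.

    import difflib

    def store_version(previous_lines, new_lines):
        # Record the new version as a unified diff against the previous one.
        # This toy stores only the changed lines, not the whole file.
        diff = difflib.unified_diff(previous_lines, new_lines, lineterm="")
        return "\n".join(diff)

    # Only the one changed line (plus a little context) is recorded.
    v1 = ["line one", "line two", "line three"]
    v2 = ["line one", "line 2", "line three"]
    print(store_version(v1, v2))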

Git changed the approach for version control. Instead of a single central repository, git is a distributed version control system. It replicates the entire repository in every instance and uses a sophisticated protocol to synchronize changes across instances. When you clone a repo in git, you get the entire repository.

Git can do what it does because disk space is now plentiful and cheap. Earlier version control systems worked on the assumption that disk space was expensive and limited. (Which, when SCCS was created in the 1970s, was true.)

Git is also directory-oriented, not file-oriented. Git looks at the entire directory tree, which allows it to optimize operations that move files or duplicate files in different directories. File-oriented version control systems, looking only at the contents of a single file at a time, cannot make those optimizations. That difference, while important, is not relevant to this post.

I called my system "Amnesia". My "brilliant" idea was to remove diffs from the repository over time, and thereby use even less disk space. Deletion was automatic, and I let the user specify a set of rules for deletion, so important versions could be saved indefinitely.
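
A rough sketch of that pruning rule, in Python, might have looked something like the following. The Version class and the rule (keep anything tagged, drop untagged diffs older than a cutoff) are hypothetical reconstructions, not code from the original system.

    from datetime import datetime, timedelta

    class Version:
        def __init__(self, version_id, created, tagged=False):
            self.version_id = version_id
            self.created = created          # when the diff was stored
            self.tagged = tagged            # tagged versions are kept forever

    def prune(versions, max_age_days=365):
        # Keep tagged versions indefinitely; drop untagged diffs older than the cutoff.
        cutoff = datetime.now() - timedelta(days=max_age_days)
        return [v for v in versions if v.tagged or v.created >= cutoff]

A real implementation would also need to splice the surviving diffs together so that later versions could still be reconstructed -- a hint of the development cost mentioned below.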

My improvement was based on the assumption that disk space was expensive. Looking back, I should have known better. Disk space was not expensive, and it was not becoming expensive -- it was getting cheaper.

Anyone looking at this system today would be, at best, amused. Even I can only grin at my error.

I was optimizing, but for the wrong result. The "Amnesia" approach reduced disk space, at the cost of time (it takes longer to compute diffs than it does to store the entire file), information (the removal of versions also removes information about who made the change), and development cost (for the auto-delete functions).

The lesson? Improve, but think about your assumptions. When you optimize something, do it in the right direction.

Wednesday, January 24, 2018

Cloud computing is repeating history

A note to readers: This post is a bit of a rant, driven by emotion. My 'code stat' project, hosted on Microsoft Azure's web app PaaS platform, has failed and I have yet to find a resolution.

Something has changed in Azure, and I can no longer deploy a new version to the production servers. My code works; I can test it locally. Something in the deployment sequence fails. This is a test project, using the free level of Azure, which means no monthly costs but also means no support -- other than the community help pages.

There are a few glorious advances in IT, advances which stand out above the others. They include the PC revolution (which saw individuals purchasing and using computers), the GUI (which saw people untrained in computer science using computers), and the smartphone (which saw lots more people using computers for lots more sophisticated tasks).

The PC revolution was a big change. Prior to personal computers (whether they were IBM PCs, Apple IIs, or Commodore 64s), computers were large, expensive, and complicated; they were especially difficult to administer. Mainframes and even minicomputers were large and expensive; an individual could afford one only if they were enormously wealthy and had lots of time to read manuals and try different configurations to make the thing work.

Consumer PCs changed all of that. They were expensive, but within the range of the middle class. They required little or no administration effort. (The Commodore 64 was especially easy: plug it in, attach it to a television, and turn it on.)

Apple made the consumer PC easier to use with the Macintosh. The graphical user interface (lifted from Xerox PARC's Alto, and later copied by Microsoft Windows) made many operations and concepts consistent. Configuration was buried, and sometimes options were reduced to "the way Apple wants you to do it".

It strikes me that cloud computing is in a "mainframe phase". It is large and complex, and while an individual can create an account (even a free account), the complexity and time necessary to learn and use the platform are significant.

My issue with Microsoft Azure is precisely that. Something has changed, and it behaves differently than it did in the past. (It's not my code; the change is in the deployment of my app.) I don't think that I have changed something in Azure's configuration -- although I could have.

The problem is that once you go beyond the 'three easy steps to deploy a web app', Azure is a vast and intimidating beast with lots of settings, each with new terminology. I could poke at various settings, but will that fix the problem or make things worse?

From my view, cloud computing is a large, complex system that requires lots of knowledge and expertise. In other words, it is much like a mainframe. (Except, of course, you don't need a large room dedicated to the equipment.)

The "starter plans" (often free) are not the equivalent of a PC. They are merely the same, enterprise-level plans with certain features turned off.

A PC is not simply a mainframe reduced to tabletop size. Both have CPUs and memory and peripheral devices and operating systems, but they are two different creatures. PCs have fewer options, fewer settings, fewer things you (the user) can get wrong.

Cloud computing is still at the "mainframe level" of options and settings. It's big and complicated, and it requires a lot of expertise to keep it running.

If we repeat history, we can expect companies to offer smaller, simpler versions of cloud computing. The advantage will be an easier learning curve and less required expertise; the disadvantage will be lower functionality. (Just as minicomputers were easier and less capable than mainframes and PCs were easier and less capable than minicomputers.)

I'll go out on a limb and predict that the companies who offer simpler cloud platforms will not be the current big providers (Amazon.com, Microsoft, Google). Mainframes were challenged by minicomputers from new vendors, not the existing leaders. PCs were initially constructed by hobbyists from kits; soon after, companies such as Radio Shack, Commodore, and the newcomer Apple offered fully assembled, ready-to-run computers. IBM offered the PC only after the success of these upstarts.

The driver for simpler cloud platforms will be cost -- direct and indirect, but mostly indirect. The "cloud computing is a mainframe" analogy is not perfect, as the billed costs for cloud platforms can be low. The expense is not in the hardware but in the time needed to make the thing work. Current cloud platforms require expertise, and that expertise is not cheap. Companies are willing to pay for it... for now.

I expect that we will see competition to the big cloud platforms, and the marketing will focus on ease of use and low Total Cost of Ownership (TCO). The newcomers will offer simpler clouds, sacrificing performance for reduced administration cost.

My project is currently stuck. Deployments fail, so I cannot update my app. Support is not really available, so I must rely on the limited web pages and perhaps trial and error. I may have to create a new app in Azure and copy my existing code to it. I'm not happy with the experience.

I'm also looking for a simpler cloud platform.

Thursday, January 18, 2018

After Agile

The Agile project method was developed as an alternative to (one might say, a rebuttal of) Waterfall. Waterfall came first, aside from the proto-process of "do whatever we want" that preceded it. Waterfall had a revolutionary idea: Let's think about what we will do before we do it.

Waterfall can work with small and large projects, and small and large project teams. It offers fixed cost, fixed schedule, and fixed features. Once started, a project plan can be modified, but only through change control, a bureaucratic process that limits changes and broadcasts proposed changes to the entire team.

Agile, in its initial incarnation, was for small teams and small projects. The schedule may be fixed or variable; you can deliver a working product at any time. (Although you cannot know in advance which features will be in the delivered product.)

Agile has no change control process -- or rather, Agile is all about change control, allowing revisions to features at any time. Each iteration (or "sprint", or "cycle") starts with a conversation that involves the stakeholders, who decide on the next set of features. Waterfall's idea of "think, talk, and agree before we act" is part of Agile.

So we have two methods for managing development projects. But two is an unreasonable number. In the universe, there are rarely two (and only two) of things. Some things, such as electrons and stars and apples, exist in large quantities. Some things, such as the Hope Diamond and our planet's atmosphere, exist as singletons. (A few things do exist in pairs. But the vast majority of objects are either singles or multitudes.)

If software management methods exist as a multitude (for they are clearly not a singleton) then we can expect a third method after Waterfall and Agile. (And a fourth, and a fifth...)

What are the attributes of this new method? I don't know -- yet. But I have some ideas.

We need a management process for distributed teams, where the participants cannot meet in the same room. This issue is mostly about communication, and it includes differences in time zones.

We need a management process for large systems composed of multiple applications, or "systems of systems". Agile cannot handle projects of this size; Waterfall has its own flaws at that scale.

Here are some techniques that I think will be in new management methods:
  • Automated testing
  • Automated deployment with automated roll-back (a sketch follows this list)
  • Automated evaluation of source code (lint, RuboCop, etc.)
  • Automated recording (and transcribing) of meetings and conversations
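
As a rough illustration of the first two items, here is a minimal test-then-deploy-with-rollback sketch in Python. The pytest, deploy.sh, and rollback.sh commands are placeholders for whatever tooling a real pipeline would use; none of this comes from a specific product.

    import subprocess
    import sys

    def run(command):
        # Run a shell command and report whether it succeeded.
        return subprocess.run(command, shell=True).returncode == 0

    def deploy_with_rollback():
        if not run("python -m pytest"):            # automated testing
            sys.exit("Tests failed; nothing was deployed.")
        if not run("./deploy.sh production"):      # placeholder deploy step
            run("./rollback.sh production")        # placeholder roll-back step
            sys.exit("Deployment failed; rolled back to the previous release.")
        print("Deployed.")

    if __name__ == "__main__":
        deploy_with_rollback()
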
It is possible that new methods will use other terms and avoid the "Agile" term. I tend to doubt that. We humans like to name things, and we prefer familiar names. "Agile" was called "Agile" and not "Express Waterfall" because the founders wanted to emphasize the difference from the even-then reviled Waterfall method.

The Waterfall brand was tarnished -- and still is. Few folks want to admit to using Waterfall; they prefer to claim Agile methods. So I'm not expecting a "new Waterfall" method.

Agile's brand is strong; developers want to work on Agile projects and managers want to lead Agile projects. Whatever methods we devise, we will probably call them "Agile". We will use "Distributed Agile" for distributed teams, "Large Agile" for large teams, and maybe "Layered Agile" for systems of systems.

Or maybe we will use other terms. If Agile falls out of favor, then we will pick a different term, such as "Coordinated".

Regardless of the names, I'm looking forward to new project management methods.

Monday, January 1, 2018

Predictions for tech in 2018

Predictions are fun! Let's have some for the new year!

Programming Languages

Java, C, and C# will remain the most popular languages, especially in large commercial efforts. Moderately popular languages such as Python and JavaScript will remain moderately popular. (JavaScript is one of the "three legs of web pages", along with HTML and CSS, so it is very popular for web page and front-end work.)

Interest in functional programming languages (Haskell, Erlang) will remain minimal, while I expect interest in Rust (which focuses on safety, speed, and concurrency) to increase.

Cloud and Mobile

The year 2017 was the year that cloud computing became the default for new applications, especially business applications. The platforms and tools available from the big providers (Amazon.com, Microsoft, Google, and IBM) make a convincing case. Traditional web applications in in-house data centers will still be built for some specialty applications.

The front end for applications remains split between browsers and mobile devices. Mobile devices are the platform of choice for consumer applications, including banking, sales, games, and e-mail. Browsers are the platform of choice for internal commercial applications, which require larger screens.

Browsers

Chrome will remain the dominant browser, possibly gaining market share. Microsoft will continue to support its Edge browser, and it has the resources to keep it going. Other browsers such as Firefox and Opera will be hard-pressed to maintain viability.

PaaS (Platform as a Service)

The middle tier of the cloud computing stack, PaaS sits between IaaS (Infrastructure as a Service) and SaaS (Software as a Service). It offers a platform to run applications, handling the underlying operating system, database, and messaging layers and keeping them hidden from the developer.

I expect an increase in interest in these platforms, driven by the increase in cloud-based apps. PaaS removes a lot of administrative work, for development and deployment.

AI and ML (Artificial Intelligence and Machine Learning)

Most of AI is actually ML, but the differences are technical and obscure. The term "AI" has achieved critical mass, and that's what we'll use, even when we're talking about Machine Learning.

Interest in AI will remain high, and companies with large data sets will take advantage of it. Initial applications will include credit analysis and fraud analysis (such applications are already under development). The platforms offered by Google, Microsoft, and IBM (and others) will make experimentation with AI possible for many, although one needs large data sets in addition to the AI compute platform.

Containers

Interest in containers will remain strong. Containers ease deployment; if you deploy frequently (or even infrequently) you will want to at least evaluate them.

Big Data

The term "Big Data" will all but disappear in 2018. Like its predecessor "real time", it was a vague description of computing that was beyond the reach of typical (at the time) hardware and software. Hardware and software improved to the point that performance was good enough, and the term "real time" is now limited to a few very specialized situations. I expect the same for "big data".

Related terms, like "data science" and "analytics", will remain. Their continued existence will depend on their perceived value to organizations; I think the latter has secured a place, while the former is still under scrutiny.

IoT

The "Internet of Things" will see a lot of hype in 2018. I expect a lot of internet-connected devices, from drones to dolls, from cameras to cars, and from bicycles to birdcages (really!).

The technology for connected devices has gotten ahead of our understanding, much like the original microcomputers before the IBM PC.

We don't know how to use connected things -- yet. I expect that we will experiment with a lot of uses before we find the "killer app" of IoT. Once we do, I expect that we will see a standardization of protocols for IoT devices, making the early devices obsolete.

Apple

I expect Apple to have a successful and profitable 2018. They remain, in my opinion, at risk of becoming the "iPhone company", with more than 80% of the income coming from phones. The other risk is from their aversion to cloud computing -- Apple puts compute power in its devices (laptops, tablets, phones, and watches) and does not leverage or offer cloud services.

The latter omission (the lack of cloud services) will be a serious problem in the future. The other providers (Microsoft, Google, IBM, etc.) offer cloud services and development platforms. Apple stands alone, keeping developers on the local device and reserving cloud computing for its own internal use.


These are my predictions for 2018. In short, I expect a rather dull year, focused more on exploring our current technology than creating new tech. We've got a lot of relatively new tech toys to play with, and they should keep us occupied for a while.

Of course, I could be wrong!