Wednesday, October 19, 2016

We prefer horizontal layers, not vertical stacks

Looking back at the 60-plus years of computer systems, we can see a pattern of design preferences. That pattern is an initial preference for vertical design (that is, a complete system built from top to bottom) followed by a shift to a horizontal divide between a platform and the applications on that platform.

A few examples include mainframe computers, word processors, and smart phones.

Mainframe computers, in the early part of the mainframe age, were special-purpose machines. IBM changed the game with its System/360, a general-purpose computer. The S/360 could be used by commercial, scientific, or government organizations. It provided a common platform on which application programs ran. The design was revolutionary, and it has stayed with us. Minicomputers followed the "platform and applications" pattern, as did microcomputers and later IBM's own Personal Computer.

When we think of the phrase "word processor", we think of software, most often Microsoft's "Word" application (which runs on the Windows platform). But word processors were not always purely software. The original word processors were smart typewriters, machines with enhanced capabilities. In the mid-1970s, a word processor was a small computer with a keyboard, display, processing unit, floppy disks for storage, a printer, and software to make it all go.

But word processors as hardware did not last long. We moved away from the all-in-one design. In its place we used the "application on platform" approach, using PCs as the hardware and a word processing application program.

More recently, smart phones have become the platform of choice for photography, music, and navigation. We have moved away from cameras (a complete set of hardware and software for taking pictures), moved away from MP3 players (a complete set of hardware and software for playing music), and moved away from navigation units (a complete set of hardware and software for providing directions). In their place we use smart phones.

(Yes, I know that some people still prefer discrete cameras, and some people still use discrete navigation systems. I myself still use an MP3 player. But the number of people who use discrete devices for these tasks is small.)

I tried thinking of single-use devices that are still popular, and none came to mind. (I also tried thinking of applications that ran on platforms that moved to single-use devices, and also failed.)

It seems we have a definite preference for the "application on platform" design.

What does this mean for the future? For smart phones, possibly not much -- other than that they will remain popular until a new platform arrives. For the "internet of things", it means that we will see a number of task-specific devices such as thermostats and door locks until an "internet of things" platform comes along, and then all of those task-specific devices will become obsolete (like the task-specific mainframes or the word processor hardware).

For cloud systems, perhaps the cloud is the platform and the virtual servers are the applications. Rather than discrete web servers and database servers, the cloud is the platform for web server and database server "applications" -- containerized versions of the software. The "application on platform" pattern suggests that cloud and containers will endure for some time, and that they are a good architectural choice.

Sunday, August 21, 2016


Software development requires an awareness of scale. That is, knowledge of the size of the software, and selection of the right tools and skills to manage it.

Scale is present in just about every aspect of human activity. We humans have different forms of transportation. We can walk, ride bicycles, and drive automobiles. (We can also run, swim, ride busses, and fly hang gliders, but I will stick to the three most common forms.)

Walking is a simple activity. It requires little in the way of planning, little in the way of equipment, and little in the way of skill.

Riding a bicycle is somewhat higher in complexity. It requires equipment (the bicycle) and some skill (knowing how to ride the bicycle). It requires planning: when we arrive at our destination, what do we do with the bicycle? We may need a lock, to secure the bicycle. We may need a safety helmet. The clothes we wear must be tight-fitting, or at least designed to not interfere with the bicycle mechanism. At this level of complexity we must be aware of the constraints on our equipment.

Driving an automobile is more complex than riding a bicycle. It requires more equipment (the automobile) and more skills (driving). It requires more planning. In addition to the car we will need gasoline. And insurance. And registration. At this level of complexity we must be aware of the constraints on our equipment and also the requirements from external entities.

Translating this to software, we can see that some programs are simple and some are complex.

Tiny programs (perhaps a one-liner in Perl) are simple enough that we can write them, use them, and then discard them, with no other thoughts for maintenance or upkeep. Let's consider them the equivalent of walking.

Small programs (perhaps a page-long script in Ruby) require more thought to prepare (and test) and may need comments to describe some of the inner workings. They can be the equivalent of riding a bicycle.
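
To make the comparison concrete, here is a rough sketch in Ruby (with hypothetical file names): a throwaway one-liner at the "walking" scale, and the same idea as a short "bicycle"-scale script that gets a named method, a guard for a missing file, and a comment describing its intent.

    # "Walking": a throwaway one-liner to count the lines in a file,
    # run once from the shell and then discarded.
    #
    #   ruby -e 'puts File.readlines(ARGV[0]).size' data.txt
    #
    # "Riding a bicycle": the same idea as a small script that will be
    # kept and re-run, so it deserves a little more care.
    def line_count(path)
      return 0 unless File.exist?(path)   # tolerate a missing file
      File.foreach(path).count
    end

    puts line_count(ARGV[0] || "data.txt")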

Large programs (let's jump to sophisticated packages with hundreds of thousands of lines of code) require a language that helps us organize our code; comprehensive sets of unit, component, and system tests; and documentation to record the rationale behind design decisions. These are the analogue of driving an automobile.

But here is where software differs from transportation: software changes. The three modes of transportation (walking, bicycle, automobile) are static and distinct. Walking is walking. Driving is driving. But software is dynamic -- frequently, over time, it grows. Many large programs start out as small programs.

A small program can grow into a larger program. When it does, the competent developer changes the tools and practices used to maintain the code. Which means that a competent programmer must be aware of the scale of the project, and of changes in that scale. As the code grows, a programmer must change his (or her) tools and practices.

There's more, of course.

The growth effect of software extends to the management of project teams. A project may start with a small number of people. Over time, the project grows, and the number of people on the team increases.

The techniques and practices of a small team don't work for larger teams. Small teams can operate informally, with everyone in one room and everyone talking to everyone. Larger teams are usually divided into sub-teams, and the coordination of effort is harder. Informal methods work poorly, and the practices must be more structured, more disciplined.

Enterprise-class projects are even more complex. They require more discipline and more structure than the merely large projects. The structure and discipline are often expressed in bureaucracy, frustrating the whims of "lone cowboy" programmers.

Just as a competent developer changes tools and adjusts practices to properly manage a growing code base, the competent manager must also change tools and adjust practices to properly manage the team. Which means that a competent manager must be aware of the scale of the project, and the changes in that scale. As a project grows, a manager must lead his (or her) people through the changes.

Sunday, August 14, 2016

PC-DOS killed the variants of programming languages

BASIC was the last language with variants. Not "variant" in the sense of the flexible-value type known as "Variant", but variants in implementation. Different dialects.

Many languages have versions. C# has had different releases, as has Java. Perl is transitioning from version 5 (which had multiple sub-versions) to version 6 (which will most likely have multiple sub-versions).  But that's not what I'm talking about.

Some years ago, languages had different dialects. There were multiple implementations with different features. COBOL and FORTRAN both had machine-specific versions. But BASIC had the most variants. For example:

- Most BASICs used the "OPEN" statement to open files, but HP BASIC and GE BASIC used the "FILES" statement which listed the names of all files used in the program. (An OPEN statement lists only one file, and a program may use multiple OPEN statements.)

- Most BASICs used parentheses to enclose variable subscripts, but some used square brackets.

- Some BASICs had "ON n GOTO" statements, but some used "GOTO OF n" statements.

- Some BASICs allowed the apostrophe as a comment indicator; others did not.

- Some BASICs allowed statement modifiers, such as "FOR" or "WHILE" at the end of a statement; others did not.

These are just some of the differences in the dialects of BASIC. There were others.

What interests me is not that BASIC had so many variants, but that languages since then have not. The last attempt at a dialect of a language was Microsoft's Visual J++, a variant of Java. Microsoft was challenged in court by Sun, and no one has attempted a special version of a language since. Because of this, I place the demise of variants in the year 2000.

There are two factors that come to mind. One is standards, the other is open source.

BASIC was introduced to the industry in the 1960s. There was no standard for BASIC, except perhaps the original Dartmouth implementation. The expectation of standards has risen since then, with standards for C, C++, Java, C#, JavaScript, and many others. With clear standards, different implementations of languages would be fairly close.

The argument that open source prevented the creation of language variants makes some sense. After all, one does not need to create a new, special version of a language when the "real" language is available for free. Why invest effort in a custom implementation? And the timing fits, with open source rising just as language variants disappeared.

But the explanation is different, I think. It was not standards (or standards committees) and it was not open source that killed variants of languages. It was the PC and Windows.

The IBM PC and PC-DOS saw the standardization and commoditization of hardware, and the separation of software from hardware.

In the 1960s and 1970s, mainframe vendors and minicomputer vendors competed for customer business. They sold hardware, operating systems, and software. They needed ways to distinguish their offerings, and BASIC was one way that they could do that.

Why BASIC? There were several reasons. It was a popular language. It was easily implemented. It had no official standard, so implementors could add whatever features they wanted. A hardware manufacturer could offer their own, special version of BASIC as a productivity tool. IBM continued this "tradition" with BASIC in the ROM of the IBM PC and an advanced BASIC with PC-DOS.

But PC compatibles did not offer BASIC, and didn't need to. When manufacturers figured out how to build compatible computers, the factors for selecting a PC compatible were compatibility and price, not a special version of BASIC. Software would be acquired separately from the hardware.

Mainframes and minicomputers were expensive systems, sold with operating systems and software. PCs were different creatures, sold with an operating system but not software.

It's an idea that holds today.

With software being sold (or distributed as open source) separately from the hardware, there is no need to build variants. Commercial languages (C#, Java, Swift) are managed by a single company, which has an incentive to keep the language consistent. Open source languages (Perl, Python, Ruby) can be had "for free", so why build a special version -- especially when that special version will need constant effort to match changes in the "original"? Standards-based languages (C, C++) offer certainty to customers, and variants of them offer little advantage.

The only language that has variants today seems to be SQL. That makes sense, as the SQL interpreter is bundled with the database, and creating a variant is a way of distinguishing a product from the competition. (Limiting a result set, for example, is "LIMIT" in MySQL and PostgreSQL, "TOP" in SQL Server, and traditionally "ROWNUM" in Oracle.)

I expect that the commercial languages will continue to evolve along consistent lines. Microsoft will enhance C#, but there will be only the Microsoft implementation (or at least, the only implementation of significance). Oracle will maintain Java. Apple will maintain Swift.

The open source languages will evolve too. But Perl, Python, and Ruby will continue to see single implementations.

SQL will continue to be the outlier. It will continue to see variants, as different database vendors supply them. It will be interesting to see what happens with the various NoSQL databases.

Monday, August 8, 2016

Agile is all about code quality

Agile promises clean code. That's the purpose of the 'refactor' phase. After creating a test and modifying the code, the developer refactors the code to eliminate compromises made during the changes.

But how much refactoring is enough? One might flippantly say "as much as it takes" but that's not an answer.

For many shops, the answer seems to be "as much as the developer thinks is needed". Other shops allow refactoring until the end of the development cycle. The first is subjective and opens the development team to the risk of spending too much time on refactoring and not enough on adding features. The second is arbitrary and risks short-changing the refactoring phase and allowing messy code to remain in the system.

Agile removes risk by creating automated tests, creating them before modifying the code, and having developers run those automated tests after all changes. Developers must ensure that all tests pass; they cannot move on to other changes while tests are failing.

This process removes judgement from the developer. A developer cannot say that the code is "good enough" without the tests confirming it. The tests are the deciders of completeness.
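
As a minimal sketch of that discipline (using Ruby's bundled Minitest and hypothetical class and method names), the test below is written first to capture the requirement, and the code is changed until the tests pass:

    # The test is written first; it captures the requirement that bulk
    # orders (100 units or more) receive a ten-percent discount.
    require "minitest/autorun"

    class InvoiceTest < Minitest::Test
      def test_bulk_orders_get_a_discount
        assert_equal 900.0, Invoice.new(quantity: 100, unit_price: 10.0).total
      end

      def test_small_orders_pay_full_price
        assert_equal 50.0, Invoice.new(quantity: 5, unit_price: 10.0).total
      end
    end

    # The code is then modified until the tests pass; the tests, not the
    # developer's judgement, decide when the change is complete.
    class Invoice
      def initialize(quantity:, unit_price:)
        @quantity = quantity
        @unit_price = unit_price
      end

      def total
        base = @quantity * @unit_price
        @quantity >= 100 ? base * 0.9 : base
      end
    end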

I believe that we want the same philosophy for code quality. Instead of allowing a developer to decide when refactoring has reached "good enough", we will instead use an automated process to make that decision.

We already have code quality tools. C and C++ have had lint for decades. Other languages have tools as well. (Wikipedia has a page for static analysis tools.) Some are commercial, others open source. Most can be tailored to meet the needs of the team, placing more weight on some issues and ignoring others. My favorite at the moment is 'Rubocop', a style-checking tool for Ruby.

I expect that Agile processes will adopt a measured approach to refactoring. By using one (or several) code assessors, a team can ensure quality of the code.

Such a change is not without ramifications. This change, like the use of automated tests, takes judgement away from the programmer. Code assessment tools can consider many things, some of which are style. They can examine indentation, names of variables or functions, the length or complexity of a function, or the length of a line of code. They can check the number of layers of 'if' statements or 'while' loops.
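
As a hedged illustration (the exact rule names and thresholds vary by tool and configuration), here is the kind of Ruby code a checker such as RuboCop would likely flag, followed by a version it would be more likely to accept:

    # Likely to be flagged: an uncommunicative method name, three layers
    # of nested 'if' statements, and an overly long line.
    def chk(o)
      if o
        if o[:items]
          if o[:items].any?
            o[:items].select { |i| i[:active] && i[:price].to_f > 0 }.map { |i| i[:price] * (i[:qty] || 1) }.sum
          end
        end
      end
    end

    # More likely to pass: a descriptive name, a guard expression instead
    # of nested conditionals, and shorter lines.
    def order_total(order)
      items = (order && order[:items]) || []
      items.select { |item| item[:active] && item[:price].to_f > 0 }
           .sum { |item| item[:price] * (item[:qty] || 1) }
    end

Both versions compute the same total; what the checker provides is a consistent, automated definition of "acceptable" that does not depend on any one developer's taste.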

Deferring judgement to the style checkers will affect managers as well as programmers. If a developer must refactor code until it passes the style checker, then a manager cannot cut short the refactoring phase. Managers will probably not like this change -- it takes away some control. Yet it is necessary to maintain code quality. By ending refactoring before the code is at an acceptable quality, managers allow poor code to remain in the system, which will affect future development.

Agile is all about code quality.

Sunday, July 31, 2016

Agile pushes ugliness out of the system

Agile differs from Waterfall in many ways. One significant way is that Agile handles ugliness, and Waterfall doesn't.

Agile starts by defining "ugliness" as an unmet requirement. It could be a new feature or a change to an existing one. The Agile process sees the ugliness move through the system, from requirements to test to code to deployment. (Waterfall, in contrast, has the notion of requirements but not the concept of ugliness.)

Let's walk through the stages to see how Agile treats ugliness as something larger than just unmet requirements.

The first stage is an unmet requirement. With the Agile process, development occurs in a series of short iterations (sometimes called "sprints"), each taking on a small set of new requirements. Stakeholders may have a long list of unmet requirements, but a single sprint handles a small, manageable set of them. The "ugliness" is the fact that the system (as it stands at the beginning of the sprint) does not perform them.

The second stage transforms the unmet requirements into tests. By creating a test -- an automated test -- the unmet requirement is documented and captured in a specific form. The "ugliness" has been captured and specified.

After capture, changes to code move the "ugliness" from a test to code. A developer changes the system to perform the necessary function, and in doing so changes the code. But the resulting code may be "ugly" -- it may duplicate other code, or it may be difficult to read.

The fourth stage (after unmet requirements, capture, and coding) is to remove the "ugliness" of the code. This is the "refactoring" stage, when code is improved without changing the functions it performs. After refactoring, the "ugliness" is gone.
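
A small, hypothetical Ruby example of that stage: the behavior is unchanged (so the tests still pass), but the duplication left behind while making them pass is removed.

    # Before refactoring: two methods repeat the same formatting logic
    # (assuming the :date value is a Date or Time).
    def format_invoice_date(invoice)
      invoice[:date].strftime("%Y-%m-%d")
    end

    def format_shipment_date(shipment)
      shipment[:date].strftime("%Y-%m-%d")
    end

    # After refactoring: one method expresses the shared intent, and the
    # behavior (and therefore the test results) is unchanged.
    def format_date(record)
      record[:date].strftime("%Y-%m-%d")
    end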

The ability to handle "ugliness" is the unique capability of Agile methods. Waterfall has no concept of code quality. It can measure the number of defects, the number of requirements implemented, and even the number of lines of code, but it doesn't recognize the quality of the code. To Waterfall, the quality of the code is simply its ability to deliver functionality. This means that ugly code can collect, and collect, and collect. There is nothing in Waterfall to address it.

Agile is different. Agile recognizes that code quality is important. That's the reason for the "refactor" phase. Agile transforms requirements into tests, then into ugly code, and finally into beautiful (or at least non-ugly) code. The result is requirements that are transformed into maintainable code.

Tuesday, July 26, 2016

The changing role of IT

The original focus of IT was efficiency and accuracy. Today, the expectation still includes efficiency and accuracy, but it now extends to increased revenue and expanded capabilities for customers.

IT has been with us for more than half a century, if you count IT as not only PCs and servers but also minicomputers, mainframes, and batch processing systems for accounting and finance.

Computers were originally large, expensive, and fussy beasts. They required a whole room to themselves. Computers cost a lot of money. Mainframes cost hundreds of thousands of dollars (if not millions). They needed a coterie of attendants: operators, programmers, service technicians, and managers.

Even the early personal computers were expensive. A PC in the early 1980s cost three to five thousand dollars. They didn't need a separate room, but they were a significant investment.

The focus was on efficiency. Computers were meant to make companies more efficient, processing transactions and generating reports faster and more accurately than humans.

Because of their cost, we wanted computers to operate as efficiently as possible. Companies who purchased mainframes would monitor CPU and disk usage to ensure that they were operating in the ninety-percent range. If usage was higher than that, they knew they needed to expand their system; if less, they had spent too much on hardware.

Today, we focus less on efficiency and more on growing the business. We view automation and big data as mechanisms for new services and ways to acquire new customers.

That's quite a shift from the "spend just enough to print accounting reports" mindset. What changed?

I can think of two underlying changes.

First, the size and cost of computers have dropped. A cell phone fits in your pocket and costs less than a thousand dollars. Laptop PCs can be acquired for similar prices; Chromebooks for significantly less. Phones, tablets, Chromebooks, and even laptops can be operated by a single person.

The drop in cost means that we can worry less about internal efficiency. Buying a mainframe computer that was too large was an expensive mistake. Buying an extra laptop is almost unnoticed. Investing in IT is like any other investment, with a potential return of new business.

Yet there is another effect.

In the early days of IT (from the 1950s to the 1980s), computers were mysterious and almost magical devices. Business managers were unfamiliar with computers. Many people weren't sure that computers would remain tame, and some feared that they would take over (the company, the country, the world). Managers didn't know how to leverage computers to their full extent. Investors were wary of the cost. Customers resisted the use of computer-generated cards that read "do not fold, spindle, or mutilate".

Today, computers are not mysterious, and certainly not magical. They are routine. They are mundane. And business managers don't fear them. Instead, managers see computers as a tool. Investors see them as equipment. Customers willingly install apps on their phones.

I'm not surprised. The business managers of the 1950s grew up with manual processes. Senior managers might have remembered an age without electricity.

Today's managers are comfortable with computers. They used them as children, playing video games and writing programs in BASIC. The thought that computers can assist the business in various tasks is a natural extension of that experience.

Our view of computers has shifted. The large, expensive, magical computation boxes have shrunk into small, cheap, flexible, and powerful ones. Simply owning (or leasing) a mainframe once provided strategic advantage through intimidation; now everyone can leverage server farms, networks, cloud computing, and real-time updates. But owning (or leasing) a server farm or a cloud network isn't enough to impress -- managers, customers, and investors look for business results.

With a new view of computers as mundane, it's no surprise that businesses look at them as a way to grow.

Thursday, July 21, 2016

Spaghetti in the Cloud

Will cloud computing eliminate spaghetti code? The question is a good one, and the answer is unclear.

First, let's understand the term "spaghetti code". According to Wikipedia, the term dates back to the 1970s and was probably coined as an argument for structured programming techniques. Unstructured programming was harder to read and understand, and the term offered a vivid analogy for messy code.

Spaghetti code was bad. It was hard to understand. It was fragile, and small changes led to unexpected failures. Structured programming was, well, structured, and therefore (theoretically) spaghetti code could not occur under its discipline.

But theory didn't work quite right, and even with the benefits of structured programming, we found that we had code that was difficult to maintain. (In other words, spaghetti code.)

After structured programming, object-oriented programming was the solution. Object-oriented programming, with its ability to group data and functions into classes, was going to solve the problems of spaghetti code.

Like structured programming before it, object-oriented programming didn't make all code easy to read and modify.

Which brings us to cloud computing. Will cloud computing suffer from "spaghetti code"? Will we have difficult to read and difficult to maintain systems in the cloud?

The obvious answer is "yes". Companies and individuals who transfer existing (difficult to read) systems into the cloud will have ... difficult-to-understand code in the cloud.

The more subtle answer is... "yes".

The problem of difficult-to-read code is not the programming style (unstructured, structured, or object-oriented) but mutable state. "State" is the combination of values for all variables and changeable entities in a program. In a program with mutable state, these variables change over time. To read and understand the code, one must understand the current state, that is, the current value of all of those variables. But to know those current values, one must understand all of the operations that led to the current state, and that list can be daunting.

Advocates of functional programming (another programming technique) address this by not allowing mutable variables. Variables are fixed and unchanging. Once created, they exist and retain their value forever.
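
A small illustration of the difference, with hypothetical names: the mutable version can only be understood by knowing every call that came before it, while the immutable version depends only on its arguments.

    # Mutable state: the running total is hidden history.
    $total = 0
    def add_to_total(amount)
      $total += amount
    end

    add_to_total(5)
    add_to_total(3)
    puts $total            # 8 -- but only if you know every earlier call

    # Immutable style: no hidden history; the answer follows from the inputs.
    def total_of(amounts)
      amounts.reduce(0, :+)
    end

    puts total_of([5, 3])  # 8, regardless of what happened before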

With cloud computing, programs (and variables) do not hold state. Instead, state is stored in databases, and programs run "stateless". Programs are simpler too, with a cloud system using smaller programs linked together with databases and message queues.
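
A minimal sketch of that shape, with hypothetical names: the handler itself holds nothing between requests, and all state lives in an external store (which, in a real cloud system, would be a database or cache service rather than the in-memory stand-in used here).

    require "json"

    # Stand-in for the external store; in the cloud this would be a
    # database or cache, not an object inside the program.
    class CounterStore
      def initialize
        @values = Hash.new(0)
      end

      def fetch(key)
        @values[key]
      end

      def save(key, value)
        @values[key] = value
      end
    end

    # The handler is stateless: it reads the current value, computes a
    # new one, writes it back, and returns. Any instance of the program
    # could serve any request.
    def handle_request(store, payload)
      data = JSON.parse(payload)
      visits = store.fetch(data["user"]) + 1
      store.save(data["user"], visits)
      { "user" => data["user"], "visits" => visits }.to_json
    end

    store = CounterStore.new
    puts handle_request(store, '{"user":"alice"}')   # {"user":"alice","visits":1}
    puts handle_request(store, '{"user":"alice"}')   # {"user":"alice","visits":2}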

But that doesn't prevent people from moving large, complicated programs into the cloud. It doesn't prevent people from writing large, complicated programs in the cloud. Some programs in the cloud will be small and easy to read. Others will be large and hard to understand.

So, will spaghetti code exist in the cloud? Yes. But perhaps not as much as in previous technologies.