Monday, January 16, 2017

Discipline in programming

Programming has changed over the years. We've created new languages and added features to existing languages. Old languages that many consider obsolete are still in use, and still changing. (COBOL and C++ are two examples.)

Looking at individual changes, it is difficult to see a general pattern. But stepping back and getting a broader view, we can see that the major changes have increased discipline and rigor.

The first major change was the use of high-level languages in place of assembly language. Using high-level languages provided some degree of portability across different hardware (one could, theoretically, run the same FORTRAN program on IBM, Honeywell, and Burroughs mainframes). It also meant a more distant relationship with the hardware and a reliance on the compiler writers.

The next change was structured programming. It changed our notions of flow control, giving us "while", "if/then/else", and "for" structures and discouraging the use of "goto".
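
As a small illustration (in Java, chosen here only for familiarity), structured flow control expresses the logic entirely with loops and branches, with no jumps:

    // A sketch of structured flow control: the loop and the branching are
    // expressed with "while" and "if/else" rather than "goto".
    public class StructuredExample {
        public static void main(String[] args) {
            int[] values = { 3, -1, 4, -1, 5 };
            int sumOfPositives = 0;
            int i = 0;
            while (i < values.length) {
                if (values[i] > 0) {
                    sumOfPositives += values[i];
                } else {
                    System.out.println("skipping non-positive value: " + values[i]);
                }
                i++;
            }
            System.out.println("sum of positives: " + sumOfPositives);
        }
    }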

Then we adopted relational databases, separate from the application program. They required using an API (later standardized as SQL) rather than accessing data directly, and they required thought and planning for the database.
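
As a rough sketch of what "going through an API" means in practice, here is a query issued through JDBC, Java's standard database interface. The connection string, credentials, and the "customers" table are all hypothetical, and a suitable database driver is assumed to be on the classpath:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    // A sketch of database access through SQL rather than direct file access.
    // The connection URL and the "customers" table are invented for illustration.
    public class CustomerLookup {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:postgresql://localhost/sales", "app", "secret");
                 PreparedStatement stmt = conn.prepareStatement(
                         "SELECT name, balance FROM customers WHERE balance > ?")) {
                stmt.setInt(1, 1000);
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("name") + ": " + rs.getInt("balance"));
                    }
                }
            }
        }
    }

The program never touches the files on disk; the database decides how the data is stored, and the application only describes what it wants.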

Relational databases forced us to organize data stored on disk. Object-oriented programming forced us to organize data in memory. We needed object models and, for very large projects, separate teams to manage those models.

Each of these changes added discipline to programming. The shift to compilers required reliable compilers and reliable vendors to support them. Structured programming applied rigor to the sequence of computation. Relational databases applied rigor to the organization of data stored outside of memory, that is, on disk. Object-oriented programming applied rigor to the organization of data stored in memory.

I should note that each of these changes was opposed. Each had naysayers, usually basing their arguments on performance. And to be fair, the initial implementation of each change did have lower performance than the old way. Yet each change had a group of advocates (I call them "the Pascal crowd", after the early devotees of that language) who pushed for the change. Eventually, the new methods were improved and accepted.

The overall trend is towards rigor and discipline. In other words, the Pascal crowd has consistently won the debates.

Which is why, when looking ahead, I think future changes will keep moving in the direction of rigor and discipline. There may be minor deviations from this path, with new languages introducing undisciplined concepts, but I suspect that they will languish. The successful languages will require more thought and more planning, and will prevent more "dangerous" operations.

Functional programming is promising. It applies rigor to the state of our program. Functional programming languages use immutable objects, which, once created, cannot be changed. As the state of the program is the sum of the state of all variables, functional programming demands that more thought be given to the state of our system. That fits in with the overall trend.
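
A minimal sketch of that idea, using a Java record (Java 16 or later) as a stand-in for an immutable object; the Account type and its fields are invented for illustration. To "change" the state, you construct a new value, and the old one is untouched:

    // A sketch of the functional style: the account is immutable, so a deposit
    // produces a new Account value instead of modifying the existing one.
    public class ImmutableStateExample {
        record Account(String owner, long balanceCents) {
            Account deposit(long cents) {
                return new Account(owner, balanceCents + cents);
            }
        }

        public static void main(String[] args) {
            Account before = new Account("Alice", 10_000);
            Account after = before.deposit(2_500);
            System.out.println(before);  // unchanged: Account[owner=Alice, balanceCents=10000]
            System.out.println(after);   // new state: Account[owner=Alice, balanceCents=12500]
        }
    }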

So I expect that functional languages, like structured languages and object-oriented languages, will be gradually adopted and their style will be accepted as normal. And I expect more changes, all in the direction of improved rigor and discipline.

Wednesday, January 11, 2017

Microsoft's last "we do it all" project

Today's Microsoft is different from the "evil empire" of yesteryear. Today, Microsoft embraces open source and supports non-Microsoft operating systems, languages, databases, and tools.

But it wasn't always that way.

In an earlier age, Microsoft was the empire. They were big, but the reason people considered them an empire was their attitude. Microsoft had answers for all of your computing needs. Operating system. Utilities. Office suite, including e-mail. Database. Development tools. Accounting packages. Project Management. Browser. The goal was for Microsoft to be the sole source for your computing needs.

Microsoft ensured this by making its tools more capable, more performant (is that a word?), more reliable, and more integrated with other Microsoft technologies than the competition's offerings.

One weakness was the command-line shell, CMD.exe, or as it was known early on, the "DOS box". CMD.exe was a direct clone of the command-line interface from MS-DOS, which was initially a clone of the CP/M command line interface (itself a copy of DEC's command line interfaces). Microsoft extended the MS-DOS interface over the years, and even added features in the Windows version.

But Microsoft had to stay compatible with earlier versions, and features had to be inserted into the shell "language", often resulting in a clunky syntax. The decision in MS-DOS to allow the slash character as an option specifier meant that directories had to be separated by backslashes. That meant that backslashes could not be used as escape characters (until they could, but only for the double-quote character). Variable names had to be signified with a percent sign, as a dollar sign was allowed as part of a file name (that, too, dated back to CP/M). The compromises cascaded over the years, and the result was a lot of complaints to Microsoft, mostly from developers. (Microsoft gave weight to the opinions of developers, as it knew they were important for future applications.)

Microsoft needed an answer to the complaints. As an empire, Microsoft needed to provide a better shell. They had to provide the best, a product better than the competition. To meet that need, they invented PowerShell.

PowerShell was Microsoft's bigger, better, comprehensive shell. It would fix the problems of CMD and it would offer all needed capabilities. You would not need a competing shell. It had everything, and it was better than all other shells. Its commands were descriptive, not cryptic. Options to commands were consistent. It could run scripts. It had variables (with the 'proper' syntax of dollar signs). It had multiple scopes for variables (something lacking in other shells). It allowed for "pipelining" of commands, and it could pass not just text streams but full .NET objects in the pipeline. It allowed for hooks into the .NET framework.

PowerShell was a shell "done right", with everything you could possibly need.

And it was the last product of Microsoft's "we do it all" strategy.

The problem for Microsoft (and any empire) is that no matter how large you get, the world is always bigger. And since the world is bigger, you cannot provide everything for everyone. No matter how fast or powerful you make your products, someone will want something else, perhaps something small and light. All-encompassing empires are expensive to build and expensive to maintain. Microsoft has come to terms with that concept, and changed its product offerings. Microsoft Azure allows for non-Microsoft technologies such as Linux and Python and PHP. Windows now includes a WSL (Windows Subsystem for Linux) component that runs bash, a popular shell for Linux.

I think this change is good for Microsoft, good for its customers, and good for the industry. Microsoft no longer has to build (and maintain and support) products for everything -- it can focus on its strengths and deliver well-designed and well-supported products without being distracted. Microsoft's customers have a little more work to do, analyzing non-Microsoft products as part of their technology stack. (They can choose to remain with all Microsoft products, but they may miss out on some opportunities.)

The industry, too, benefits. For too long, Microsoft's strategy of supplying everything discouraged others from entering the market. Why invest time and money in a new product for Windows, only to see modest success met with tough competition from Microsoft? I believe that many folks left the Microsoft ecosystem for that reason.

Of course, now Microsoft can concentrate its efforts on its key products and services -- which may change over time. Microsoft may move into new markets; don't think that they will ignore opportunities. But they will enter as a competitor, not as "the evil empire".

Tuesday, January 3, 2017

Predictions for 2017

What will happen in the new year? Let's make some predictions!

Cloud computing and containers will remain popular.

Ransomware will become more prevalent, with a few big name companies (and a number of smaller companies) suffering infections. Individuals will be affected as well. Companies may be spurred to improve their security; "traditional" malware was annoying but ransomware stops operations and costs actual money. Earlier virus programs would require effort from the support team to resolve, and that expense could be conveniently ignored by managers. But this new breed of malware requires an actual payment, and that is harder to ignore. I expect a louder cry for secure operating systems and applications, but effective changes will take time (years).

Artificial Intelligence and Machine Learning will be discussed. A few big players will advertise projects. They will have little effect on "the little guy", small companies, and slow-moving organizations.

Apple will continue to lead the design of laptops and phones. Laptop computers from other manufacturers will lose DVD readers and switch to USB-C (following Apple's design for the MacBook). Apple itself will look for ways to distinguish its MacBooks from other laptops.

Tablet sales will remain weak. We don't know what to do with tablets, at home or in the office. They fill a niche between phones and laptops, but if you have those two you don't need a tablet. If you have a phone and are considering an additional device, the laptop is the better choice. If you have a laptop and are considering an additional device, the phone is the better choice. Tablets offer no unique abilities.

Laptop sales will remain strong. Desktop sales will decline. There is little need for a tower PC, and the prices for laptops are in line with prices for desktops. Laptops offer portability, which is good for telework or group meetings. Tower PCs offer expansion slots, which are good for... um, very little in today's offices.

Tower PCs won't die. They will remain the PC of choice for games, and for specific applications that need the processing power of GPUs. Some manufacturers may drop the desktop configurations, and the remaining manufacturers will be able to raise prices. I won't guess at who will stay in the desktop market.

Amazon.com will grow cloud services but lose market share to Microsoft and Google, who will grow at faster rates. Several small cloud providers will cease operations. If you're using a small provider of cloud services, be prepared to move.

Programming languages will continue to fracture. (Witness the declining ratings of the top languages on http://www.tiobe.com/tiobe-index/.) The long trend has been to move away from a few dominant languages and towards a collection of mildly popular languages. This change makes life uncomfortable for managers, because there is no one "safe" language that is "the best" for corporate development. But fear not, because...

Vendor relationships will continue to define the best programming languages for your projects: Java with Oracle, C# with Microsoft, Swift with Apple. If you are a Microsoft shop, your best language is C#. (You may consider F# for special projects.) If you are developing iOS applications, your best language is Swift. For Android apps, you want Java. Managers need not worry too much about difficult decisions for programming languages.

Those are my ideas for the new year. Let's see what really happens!

Wednesday, December 28, 2016

Moving to the cloud requires a lot. Don't be surprised.

Moving applications to the cloud is not easy. Existing applications cannot simply be dropped onto cloud servers and be expected to leverage the benefits of cloud computing. And this should not surprise people.

The cloud is a different environment than a web server. (Or a Windows desktop.) Moving to the cloud is a change in platform.

The history of IT has several examples of such changes. Each transition from one platform to another required changes to the code, and often changes to how we *think* about programs.

The operating system

The first changes occurred in the mainframe age. The very first was probably the shift from a raw hardware platform to hardware with an operating system. With raw hardware, the programmer has access to the entire computing system, including memory and devices. With an operating system, the program must request such access through the operating system. It was no longer possible to write directly to the printer; one had to request the use of each device. This change also saw the separation of tasks between programmers and system operators, the latter handling the scheduling and execution of programs. One could not use the older programs; they had to be rewritten to call the operating system rather than communicating with devices directly.

Timesharing and interactive systems

Timesharing was another change in the mainframe era. In contrast to batch processing (running one program at a time, each program reading and writing data as needed but with no direct interaction with the user), timeshare systems interacted with users. Timeshare systems saw the use of on-line terminals, something not available for batch systems. The BASIC language was developed to take advantage of these terminals. Programs had to wait for user input and verify that the input was correct and meaningful. While batch systems could merely write erroneous input to a 'reject' file, timeshare systems could prompt the user for a correction. (If they were written to detect errors.) One could not use a batch program in an interactive environment; programs had to be rewritten.

Minicomputers

The transition from mainframes to minicomputers was, interestingly, one of the simpler conversions in IT history. In many respects, minicomputers were smaller versions of mainframes. IBM minicomputers used the batch processing model that matched its mainframes. Minicomputers from manufacturers like DEC and Data General used interactive systems, following the lead of timeshare systems. In this case, it *was* possible to move programs from mainframes to minicomputers.

Microcomputers

If minicomputers allowed for an easy transition, microcomputers were the opposite. They were small and followed the path of interactive systems. Most ran BASIC in ROM with no other possible languages. The operating systems available (CP/M, MS-DOS, and a host of others) were limited and weak compared to today's, providing no protection for hardware and no multitasking. Every program for microcomputers had to be written from scratch.

Graphical operating systems

Windows (and OS/2 and other systems, for those who remember them) introduced a number of changes to programming. The obvious difference between Windows programs and the older DOS programs was, of course, the graphical user interface. From the programmer's perspective, Windows required event-driven programming, something not available in DOS. A Windows program had to respond to mouse clicks and keyboard entries anywhere on the program's window, which was very different from the DOS text-based input methods. Old DOS programs could not be simply dropped into Windows and run; they had to be rewritten. (Yes, technically one could run the older programs in the "DOS box", but that was not really "moving to Windows".)
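
Event-driven programming is easier to see than to describe. Here is a minimal sketch using Java's Swing toolkit rather than the Windows API (purely as an illustration of the style): the program does not read input in sequence; it registers a handler and the toolkit delivers events to it.

    import javax.swing.JButton;
    import javax.swing.JFrame;
    import javax.swing.SwingUtilities;

    // A minimal sketch of event-driven programming: instead of reading input
    // line by line, the program registers a handler and reacts to events
    // (here, a button click) whenever they arrive.
    public class EventDrivenExample {
        public static void main(String[] args) {
            SwingUtilities.invokeLater(() -> {
                JFrame frame = new JFrame("Event-driven sketch");
                JButton button = new JButton("Click me");
                button.addActionListener(event -> System.out.println("Button clicked"));
                frame.add(button);
                frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
                frame.pack();
                frame.setVisible(true);
            });
        }
    }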

Web applications

Web applications, with browsers and servers, HTML and "submit" requests, with CGI scripts and JavaScript and CSS and AJAX, were completely different from Windows "desktop" applications. The intense interaction of a window with fine-grained controls and events was replaced with large-scale request-and-response exchanges, which eventually became finer-grained with AJAX and AJAX-like web services. The separation of user interface (HTML, CSS, JavaScript, and browser) from "back end" (the server) required a complete rewrite of applications.

Mobile apps

Small screen. Touch-based. Storage on servers, not so much on the device. Device processor for handling input; main processing on servers.

One could not drop a web application (or an old Windows desktop application) onto a mobile device. (Yes, you can run Windows applications on Microsoft's Surface tablets. But the Surface tablets are really PCs in the shape of tablets, and they do not use the model used by iOS or Android.)

You had to write new apps for mobile devices. You had to build a collection of web services to be run on the back end. (Not too different from the web application back end, but not exactly the same.)

Which brings us to cloud applications

Cloud applications use multiple instances of servers (web servers, database servers, and others), each hosting services (called "microservices" because each service is less than a full application) communicating through message queues.

One cannot simply move a web application into the cloud. You have to rewrite it to split computation and coordination, the latter handled by queues. Computation must be split into small, discrete services. You must write controller services that make requests to multiple microservices. You must design your front-end apps (which run on mobile devices and web browsers) and establish an efficient API to bridge the front-end apps with the back-end services.
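
A very rough sketch of the coordination pattern, in Java, with an in-process BlockingQueue standing in for a real message queue and a hypothetical "PriceService" as the microservice: the controller only coordinates, and the small service does one discrete piece of computation.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // A sketch of splitting computation from coordination. The BlockingQueue
    // stands in for a real message queue; "PriceService" is a hypothetical
    // microservice that performs one small, discrete computation.
    public class MicroserviceSketch {
        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> requests = new ArrayBlockingQueue<>(100);

            // Worker: a stand-in for a microservice consuming from the queue.
            Thread priceService = new Thread(() -> {
                try {
                    while (true) {
                        String itemId = requests.take();
                        System.out.println("PriceService computed price for " + itemId);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            priceService.setDaemon(true);
            priceService.start();

            // Controller: coordinates by sending requests; it does no computation.
            requests.put("item-42");
            requests.put("item-99");
            Thread.sleep(500);  // give the worker time to process before exiting
        }
    }

In a real cloud deployment the queue would be a managed message service and the worker would run as its own server instance, but the division of labor is the same.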

In other words, you have to rewrite your applications. (Again.)

A different platform requires a different design. This should not be a surprise.


Wednesday, December 14, 2016

Steps to AI

The phrase "Artificial Intelligence" (AI) has been used to describe computer programs that can perform sophisticated, autonomous operations, and it has been used for decades. (One wag puts it as "artificial intelligence is twenty years away... always".)

Along with AI we have the term "Machine Learning" (ML). Are they different? Yes, but the popular usages make no distinction. And for this post, I will consider them the same.

Use of the term waxes and wanes. The AI term was popular in the 1980s and it is popular now. One difference between the 1980s and now: we may have enough computing power to actually pull it off.

Should just anyone jump into AI? My guess is no. AI has preconditions, things you should be doing before you start with a serious commitment to AI.

First, you need a significant amount of computing power. Second, you need a significant amount of human intelligence. With AI and ML, you are teaching the computer to make decisions. Anyone who has programmed a computer can tell you that this is not trivial.

It strikes me that the necessary elements for AI are very similar to the necessary elements for analytics. Analytics is almost the same as AI - analyzing large quantities of data - except it uses humans to interpret the data, not computers. Analytics is the predecessor to AI. If you're successful at analytics, then you are ready to move on to AI. If you haven't succeeded at (or even attempted) analytics, you're not ready for AI.

Of course, one cannot simply jump into analytics and expect to be successful. Analytics has its own prerequisites. Analytics needs data, the tools to analyze the data and render it for humans, and smart humans to interpret the data. If you don't have the data, the tools, and the clever humans, you're not ready for analytics.

But we're not done with levels of prerequisites! The data for analytics (and eventually AI) has its own set of preconditions. You have to collect the data, store the data, and be able to retrieve the data. You have to understand the data, know its origin (including the origin date and time), and know its expiration date (if it has one). You have to understand the quality of your data.

The steps to artificial intelligence are through data collection, metadata, and analytics. Each step has to be completed before you can advance to the next level. (Much like the Capability Maturity Model.) Don't make the mistake of starting a project without the proper experience in place.

Sunday, November 20, 2016

Matters of state

One difference between functional programming and "regular" programming is the use of mutable state. In traditional programming, objects or programs hold state, and that state can change over time. In functional programming, objects are immutable and do not change their state over time.

One traditional beginner's exercise for object-oriented programming is to simulate an automated teller machine (ATM). It is often used because it maps an object of the physical world onto an object in the program world, and the operations are nontrivial yet well-understood.

It also defines an object (the ATM) which exists over time and has different states. As people deposit and withdraw money, the state of the ATM changes. (With enough withdrawals the state becomes "out of cash" and therefore "out of service".)

The ATM model is also a good example of how we in the programming industry have been focussed on changing state. For more than half a century, our computational models have used state -- mutable state -- often to the detriment of maintenance and clarity.

Our fixation on mutable state is clear to those who use functional programming languages. In those languages, state is not mutable. Programs may have objects, but objects are fixed and unchanging. Once created, an object may contain state but cannot change. (If you want an object to contain a different state, then you must create a new object with that different state.)

Programmers in the traditional languages of Java and C# got some exposure to this notion through the immutable strings in those languages. A string in Java is immutable; you cannot change its contents. If you want a string with different content, such as all lower-case letters, you have to create a new object.
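
A minimal example of that behavior:

    // Strings in Java are immutable: toLowerCase() does not alter the original
    // string; it returns a new String object with the new content.
    public class StringImmutability {
        public static void main(String[] args) {
            String original = "Hello, World";
            String lowered = original.toLowerCase();
            System.out.println(original);  // still "Hello, World"
            System.out.println(lowered);   // "hello, world"
            System.out.println(original == lowered);  // false: a different object
        }
    }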

Programming languages such as Haskell and Erlang make that notion the norm. Every object is immutable; an object may contain state but cannot be changed.

Why has it taken us more than fifty years to arrive at this, um, well, state?

I have a few ideas. As usual with my explanations, we have to understand our history.

One reason has to do with efficiency. The other reason has to do with mindset.

Reason one: Objects with mutable state were more efficient.

Early computers were less powerful than those of today. With today's computers, we can devote some percentage of processing to memory management and garbage collection. We can afford the automatic memory management. Earlier computers were less powerful, and creating and destroying objects were operations that took significant amounts of time. It was more efficient to re-use the same object and simply change its state rather than create a new object with the new state, point to that new object, and destroy the old object and return its memory to the free pool.

Reason two: Objects with mutable state match the physical world.

Objects in the real world hold physical state. Whether it is an ATM or an automobile or an employee's file, the physical version of the object is one that changes over time. Books at a library, in the past, contained a special pocket glued to the back cover used to hold a card which indicated the borrower and the date due back at the library. That card held different state over time; each lending would be recorded -- until the card was filled.

The physical world has few immutable objects. (Technically all objects are mutable, as they wear and fade over time. But I'm not talking about those kinds of changes.) Most objects, especially objects for computation, change and hold state. Cash registers, ATMs, dresser drawers that hold t-shirts, cameras (with film that could be exposed or unexposed), ... just about everything holds state. (Some things do not change, such as bricks and stones used for houses and walkways, but those are not used for computation.)

We humans have been computing for thousands of years, and we've been doing it with mutable objects for all of that time. From tally sticks with marks cut by a knife to mechanical adding machines, we've used objects with changing states. It's only in the past half-century that it has been possible to compute with immutable objects.

That's about one percent of the time, which considering everything we're doing, isn't bad. We humans advance our calculation methods slowly. (Consider how long it took to change from Roman numerals to Arabic, and how long it took to accept zero as a number.)

I think the lesson of functional programming (with its immutable objects) is this: We are still in the early days of human computing. We are still figuring out how to calculate, and how to represent those calculations for others to understand. We should not assume that we are "finished" and that programming is "done". We have a long journey ahead of us, one that will bring more changes. We learn as we travel on this journey, and the end -- or even the intermediate points -- is not clear. It is an adventure.

Saturday, October 29, 2016

For compatibility, look across and not down

The PC industry has always had an obsession with compatibility. Indeed, the first question many people asked about computers was "Is it PC compatible?". A fair question at the time, as most software was written for the IBM PC and would not run on other systems.

Over time, our notion of "PC compatible" has changed. Most people today think of a Windows PC as "an IBM PC compatible PC" when in fact the hardware has changed so much that no current PC is "PC compatible". (You cannot attach any device from an original IBM PC, including the keyboard, display, or adapter cards.)

Compatibility is important -- not for everything but for the right things.

The original IBM PCs were, of course, all "PC compatible" (by definition) and the popular software packages (Lotus 1-2-3, Wordstar, WordPerfect, dBase III) were all "PC compatible" too. Yet one could not move data from one program to another. Text in Wordstar was in Wordstar format, numbers and formulas in Lotus 1-2-3 were in Lotus format, and data in dBase was in dBase format.

Application programs were compatible "downwards" but not "across". That is, they were compatible with the underlying layers (DOS, BIOS, and the PC hardware) but not with each other. To move data from one program to another, it was necessary to "print" to a file and read the file into the destination program. (This assumes that both programs had the ability to export and import text data.)

Windows addressed that problem, with its notion of the clipboard and the ability to copy and paste text. The clipboard was not a complete solution, and Microsoft worked on other technologies to make programs more compatible (DDE, COM, DCOM, and OLE). This was the beginning of compatibility between programs.

Networked applications and the web gave us more insight into compatibility. The first networked applications for PCs were the client/server applications built with tools such as PowerBuilder. One PC hosted the database and other PCs sent requests to store, retrieve, and update data. At the time, all of the PCs were running Windows.

The web allowed for variation between client and server. With web servers and capable network software, it was no longer necessary for all computers to use the same hardware and operating systems. A Windows PC could request a web page from a server running Unix. A Macintosh PC could request a web page from a Linux server.

Web services use the same mechanisms as web pages, and allow for the same variation between client and server.

We no longer need "downwards" compatibility -- but we do need compatibility "across". A server must understand the incoming request. The client must understand the response. In today's world we ensure compatibility through the character set (UNICODE), and the data format (commonly HTML, JSON, or XML).
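
A small sketch of compatibility "across", using the JDK's built-in HttpClient (Java 11 or later) and a hypothetical endpoint: the client neither knows nor cares what hardware or operating system serves the response; what matters is agreement on HTTP, the character encoding, and the data format.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    // A sketch of compatibility "across": the client and server agree only on
    // the protocol (HTTP), the character encoding (UTF-8), and the data format
    // (JSON). The endpoint URL is hypothetical.
    public class CompatibilityAcross {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com/api/orders"))
                    .header("Accept", "application/json; charset=utf-8")
                    .GET()
                    .build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
            System.out.println("Body:   " + response.body());
        }
    }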

This means that our computing infrastructure can vary. It's no longer necessary to ensure that all of our computers are "PC compatible". I expect variation in computer hardware, as different architectures are used for different applications. Large-scale databases may use processors and memory designs that can handle the quantities of data. Small processors will be used for "internet of things" appliances. Nothing requires them to all use a single processor design.