Tuesday, July 26, 2016

The changing role of IT

The original focus of IT was efficiency and accuracy. Today, the expectation still includes efficiency and accuracy, yet adds increased revenue and expanded capabilities for customers.

IT has been with us for more than half a century, if you count IT as not only PCs and servers but also minicomputers, mainframes, and batch processing systems for accounting and finance.

Computers were originally large, expensive, and fussy beasts. They required a whole room to themselves. They cost a lot of money: mainframes ran to hundreds of thousands of dollars, if not millions. And they needed a coterie of attendants: operators, programmers, service technicians, and managers.

Even the early personal computers were expensive. A PC in the early 1980s cost three to five thousand dollars. It didn't need a separate room, but it was a significant investment.

The focus was on efficiency. Computers were meant to make companies more efficient, processing transactions and generating reports faster and more accurately than humans.

Because of their cost, we wanted computers to operate as efficiently as possible. Companies that purchased mainframes would monitor CPU and disk usage to ensure that the machines were operating in the ninety-percent range. If usage was higher than that, they knew they needed to expand the system; if lower, they had spent too much on hardware.

Today, we focus less on efficiency and more on growing the business. We view automation and big data as mechanisms for new services and ways to acquire new customers.

That's quite a shift from the "spend just enough to print accounting reports" mindset. What changed?

I can think of two underlying changes.

First, the size and cost of computers have dropped. A cell phone fits in your pocket and costs less than a thousand dollars. Laptop PCs can be acquired for similar prices; Chromebooks for significantly less. Phones, tablets, Chromebooks, and even laptops can be operated by a single person.

The drop in cost means that we can worry less about internal efficiency. Buying a mainframe computer that was too large was an expensive mistake. Buying an extra laptop is almost unnoticed. Investing in IT is like any other investment, with a potential return of new business.

Yet there is another effect.

In the early days of IT (from the 1950s to the 1980s), computers were mysterious and almost magical devices. Business managers were unfamiliar with computers. Many people weren't sure that computers would remain tame, and some feared that they would take over (the company, the country, the world). Managers didn't know how to leverage computers to their full extent. Investors were wary of the cost. Customers resisted the use of computer-generated cards that read "do not fold, spindle, or mutilate".

Today, computers are not mysterious, and certainly not magical. They are routine. They are mundane. And business managers don't fear them. Instead, managers see computers as a tool. Investors see them as equipment. Customers willingly install apps on their phones.

I'm not surprised. The business managers of the 1950s grew up with manual processes. Senior managers might have remembered an age without electricity.

Today's managers are comfortable with computers. They used them as children, playing video games and writing programs in BASIC. The thought that computers can assist the business in various tasks is a natural extension of that experience.

Our view of computers has shifted. The large, expensive, magical computation boxes have shrunk and become cheaper; they are now small, flexible, and powerful computation boxes. Simply owning (or leasing) a mainframe once provided strategic advantage through intimidation; now everyone can leverage server farms, networks, cloud computing, and real-time updates. Owning (or leasing) a server farm or a cloud network isn't enough to impress -- managers, customers, and investors look for business results.

With a new view of computers as mundane, it's no surprise that businesses look at them as a way to grow.

Thursday, July 21, 2016

Spaghetti in the Cloud

Will cloud computing eliminate spaghetti code? The question is a good one, and the answer is unclear.

First, let's understand the term "spaghetti code". According to Wikipedia, the term dates back to the 1970s, and it was probably coined as an argument for structured programming techniques. Unstructured code was harder to read and understand, and the term supplied an apt analogy for messy, tangled code.

Spaghetti code was bad. It was hard to understand. It was fragile, and small changes led to unexpected failures. Structured programming was, well, structured, and therefore (in theory) spaghetti code could not occur under its discipline.

But theory didn't work quite right, and even with the benefits of structured programming, we found that we had code that was difficult to maintain. (In other words, spaghetti code.)

After structured programming, object-oriented programming was the solution. Object-oriented programming, with its ability to group data and functions into classes, was going to solve the problems of spaghetti code.

Like structured programming before it, object-oriented programming didn't make all code easy to read and modify.

Which brings us to cloud computing. Will cloud computing suffer from "spaghetti code"? Will we have difficult-to-read and difficult-to-maintain systems in the cloud?

The obvious answer is "yes". Companies and individuals who transfer existing (difficult to read) systems into the cloud will have ... difficult-to-understand code in the cloud.

The more subtle answer is... "yes".

The problem of difficult-to-read code lies not in the programming style (unstructured, structured, or object-oriented) but in mutable state. "State" is the combination of values of all variables and changeable entities in a program. In a program with mutable state, those variables change over time. To read and understand the code, one must understand the current state, that is, the current value of all of those variables. But to know the current value of those variables, one must understand all of the operations that led to the current state, and that list can be daunting.

Functional programming (another programming technique) doesn't allow mutable variables. Variables are fixed and unchanging: once created, they exist and retain their value forever.

With cloud computing, programs (and variables) do not hold state. Instead, state is stored in databases, and programs run "stateless". Programs are simpler too, with a cloud system using smaller programs linked together with databases and message queues.
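
As a rough sketch of that style (in Python, with a plain dictionary standing in for the database -- everything here is invented for illustration), a "stateless" handler keeps no variables of its own between requests; whatever state it needs, it reads from the store and writes back:

    # A stateless handler: the program holds no state of its own between
    # requests. The dictionary below stands in for a database or key-value
    # store; in a real cloud system it would live outside the program.
    store = {"visits": 0}

    def handle_request(user):
        visits = store["visits"] + 1     # read current state from the store
        store["visits"] = visits         # write the new state back
        return "Hello %s, you are visit number %d" % (user, visits)

    print(handle_request("alice"))   # Hello alice, you are visit number 1
    print(handle_request("bob"))     # Hello bob, you are visit number 2

Because the handler itself remembers nothing, any copy of it (on any server) can process the next request -- which is exactly what makes small, stateless cloud programs easier to reason about.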

But that doesn't prevent people from moving large, complicated programs into the cloud. It doesn't prevent people from writing large, complicated programs in the cloud. Some programs in the cloud will be small and easy to read. Others will be large and hard to understand.

So, will spaghetti code exist in the cloud? Yes. But perhaps not as much as in previous technologies.

Tuesday, July 19, 2016

How programming languages change

Programming languages change. That's not news. Yet programming languages cannot change arbitrarily; the changes are constrained. We should be aware of this, and pick our technology with this in mind.

If we think of a programming language as a set of features, then programming languages can change in three ways:

Add a feature
Modify a feature
Remove a feature

The easiest change (that is, the type with the least resistance from users) is adding a feature. That's no surprise; it allows all of the old programs to continue working.

Modifying an existing feature or removing a feature is a difficult business. It means that some programs will no longer work. (If you're lucky, they won't compile, or the interpreter will reject them. If you're not lucky, the compiler or interpreter will accept them but process them differently.)

So as a programming language changes, the old features remain. Look inside a modern Fortran compiler and you will find FORMAT statements and arithmetic IF constructs, elements of Fortran's early days.

When a programming language changes enough, we change its name. We (the tech industry) modified the C language to mandate prototypes, and in doing so we called the revised language "ANSI C". When Stroustrup enhanced C to handle object-oriented concepts, he called it "C with Classes". (We've since named it "C++".)

Sometimes we change not the name but the version number. Visual Basic 4 was quite different from Visual Basic 3, and Visual Basic 5 was quite different from Visual Basic 4 (two of the few examples of non-compatible upgrades). Yet the later versions retained the flavor of Visual Basic, so keeping the name made sense.

Perl 6 is different from Perl 5, yet it still runs old code with a compatibility layer.

Fortran can add features but must remain "Fortranish", otherwise we call it "BASIC" or "FOCAL" or something else. Algol must remain Algol or we call it "C". An enhanced Pascal is called "Object Pascal" or "Delphi".

A language's name bounds its set of features. Change the feature set beyond that boundary, and you also change the name of the language. Which means that a language can change only so much, in only certain dimensions, while remaining the same language.

When we start a project and select a programming language, we're selecting a set of features for development. We're locking ourselves into a future, one that may expand over time -- or may not -- but will remain centered over its current point. COBOL will always be COBOL, C++ will always be C++, and Ruby will always be Ruby. A COBOL program will always be a COBOL program, a C++ program will always be a C++ program, and a Ruby program will always be a Ruby program.

A lot of this is psychology. We certainly could make radical changes to a programming language (any language) and keep the name. But while we *could* do this, we don't. We make small, gradual changes. The changes to programming languages (I hesitate to use the words "improvements" or "progress") are glacial in nature.

I think that tells us something about ourselves, not the technology.

Sunday, July 10, 2016

Oracle's Java Headaches

Oracle, after its purchase of Sun Microsystems, has found that it owns a number of things including MySQL and Java. MySQL presents obvious problems, as it competes with Oracle's big, expensive database. But Java also presents problems.

Oracle has two challenges with Java. The first (and obvious) challenge is money. Specifically, how does Oracle "monetize" Java? Extracting money from programming languages is possible; Microsoft succeeded with BASIC, Visual Basic, and C#. (Although the last was really profitable through Visual Studio, not the language itself.)

Extracting money from Java remains elusive. Oracle's latest attempt to sue Google has failed. It was a "whale" strategy, designed to obtain a large amount from a single entity. With the loss in court, will Oracle look for a different strategy, perhaps one that looks for fees from more (and smaller) entities?

Revenue is one headache for Oracle. A second headache exists, one that is less obvious, and it may show up in the expense column, not the revenue column.

Oracle's history has been with SQL, a language designed in the 1970s for accessing data. As a programming language, it has been remarkably stable, with only a few changes since its inception. In contrast, programming languages like Visual Basic, C, C++, and C# have seen frequent and sometimes significant changes. Java, too, has seen changes, and it has the "JCP", the Java Community Process, which allows just about anyone to recommend changes to the Java language.

This second challenge is more subtle, and possibly larger, for Oracle. After decades of a stable, unchanging language, is it capable of managing a fast-moving programming language? After decades of maintaining the Oracle database (which I'm sure had lots of internal changes and lots of changes requested by Oracle's relatively few high-paying customers) is Oracle ready to maintain a product used by "the rest of us"?

This is the bigger issue for Oracle. Maintaining the Java code base, adapting it to new platforms, adding features to the language, and putting up with all of the pesky requests from pipsqueaks (highly opinionated pipsqueaks, some of us are) is going to be expensive.

So Oracle is in a squeeze. On one side, Java has no significant revenue. On the other, it has expenses (possibly higher than the expenses for the Oracle database). How will Oracle navigate these straits?

I see a few possible ways forward:

Find a funding mechanism - Perhaps licenses, perhaps advertising. Perhaps a version of Java for the Internet of Things.

Tie Java to the Oracle database - Make Oracle the easiest database to use from Java.

Keep Java but stop development - A fast way to reduce costs, but also a fast way to anger users. (On the other hand, Oracle seems to care little about user opinion.)

Spin off Java - If Java doesn't fit into Oracle's strategy, why bother to keep it?

The last is an interesting idea. IBM might be interested in Java. Google almost certainly would. Microsoft probably not so much -- except perhaps to prevent Google from acquiring it.

Thursday, July 7, 2016

DOS, Windows, sharing, and mobile

Windows, when it arrived on the scene, changed the world of computing. Prior to Windows, DOS ruled the computing world, and it was a limited world. Windows expanded that world with new capabilities. In some ways, mobile operating systems (Android and iOS) move us back towards the ways of DOS.

IBM PCs (or compatibles, or near-compatibles) running DOS were simple devices. They could run one program at a time; running multiple programs at once was not possible. (Technically, it was possible with a "terminate and stay resident" function call, but such programs were few. For this essay, I'll stick to the "regular" programs.)

Windows brought us an expanded view of computing. Instead of running a single program at a time, Windows allowed for multiple. Windows provided a common way to present text and graphics on the screen, to print to printers, and to share data. Windows was a large step upward from the world of DOS.

Mobile operating systems -- Android and iOS -- provide a different experience. Instead of multiple applications and multiple windows, these operating systems present one application (or "app") at a time. Multiple apps may run, but only one has the screen. Thus, you can listen to music, check e-mail, and get a text message all at the same time. Mobile apps keep the multitasking aspect, but reduce the interaction to one app at a time.

Reducing the number of interactive apps to one is a reduction in capabilities, although it is a simpler experience, and one that makes sense for a phone. (I think it also makes sense for a tablet.)

What I don't see in the mobile operating systems (and what I don't see in Windows, either) is improvements in the ability to share data across applications. DOS had files and pipes (concepts lifted from Unix). Windows added the clipboard, and then Dynamic Data Exchange (DDE) and later, drag-and-drop. The clipboard was popular and is still used today. DDE never got traction, and drag-and-drop is limited to files.

Sharing data across applications is difficult. Each application has its own ideas about data. Word processors hold characters, words, sentences, paragraphs, and documents. Spreadsheets hold numeric values, formulas, cells, rows, columns, and sheets. Databases hold rows and columns -- or "documents" (different from word processor documents) for NoSQL databases. The transfer of data from one application to another is not obvious, and therefore the programming of such transfers is not obvious.

But Windows has had the clipboard for thirty years, and DDE and drag-and-drop for almost as long. Have we had no ideas in that time?

Perhaps our current mobile operating systems are the DOSes of today, waiting for a new, bold operating system to provide new capabilities.

Tuesday, June 21, 2016

Compilers and interpreters

Programming languages (with a few exceptions) fall into one of two categories: compiled or interpreted.

Compilers are the natural descendants of assemblers. Assemblers convert text representations of processor-specific operation codes into machine-readable form; compilers convert high-level programs into machine-readable form. Interpreters, on the other hand, read high-level programs and process them, without producing an "executable".
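
To make the distinction concrete, here is a minimal interpreter sketch in Python (purely illustrative, not any particular product): it reads an arithmetic expression, walks the syntax tree, and computes a result on the spot, never producing an executable. A compiler would instead translate that same tree into machine code and stop there.

    # A tiny interpreter: parse an expression, then evaluate the tree directly.
    import ast
    import operator

    OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}

    def evaluate(node):
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant):      # a literal number
            return node.value
        if isinstance(node, ast.BinOp):         # e.g. 2 + 3 * 4
            return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported construct")

    print(evaluate(ast.parse("2 + 3 * 4", mode="eval")))   # prints 14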

Both forms have advantages. Compiled programs execute faster, and the source code can remain hidden from users, who need only the executable form. Interpreted programs may be slower, but the process of writing (and debugging) tends to be faster and interpreted languages have flexibilities not available in compiled languages.

Programming languages are sometimes created by individuals working without specific sponsorship and direction from a corporation (I call them "enthusiasts"). Other languages are created by corporations, in large, well-planned and well-justified projects.

But is one technique more popular than another? Let's look at the list of popular (according to tiobe.com) languages. Here are the top languages, who created them, whether they are compiled or interpreted, and when they were created:

Java: corporation (Sun); compiled; 1990s
C: enthusiasts (Kernighan and Ritchie); compiled; 1970s
C++: enthusiast (Stroustrup); compiled, derived from C; 1980s
Python: enthusiast (van Rossum); interpreted; 1990s
C#: corporation (Microsoft); compiled; 2000s
PHP: enthusiast (Lerdorf); interpreted; 1990s
JavaScript: individual (Eich); interpreted; 1990s
Perl: enthusiast (Larry Wall); interpreted; 1980s
VB.NET: corporation (Microsoft); compiled; 2000s
Ruby: enthusiast (Matsumoto); interpreted; 1990s
Delphi: corporation (Borland); compiled, derived from Pascal; 1990s
Swift: corporation (Apple); compiled; 2010s
Objective-C: enthusiasts (Cox and Love); compiled, derived from C; 1980s
R: enthusiasts (Ihaka and Gentleman); interpreted, derived from S; 1990s
Matlab: enthusiast (Moler); interpreted; 1970s
SQL: enthusiast (Codd); interpreted; 1970s
D: corporation (Digital Mars); compiled; 2000s
COBOL: government consortium; compiled; 1950s

From this list, a few things are obvious. First, we've invented both compiled and interpreted languages. Second, we've invented both types throughout the age of computers, and we continue to do so. It's not that a particular type of language was a fad or has fallen out of favor.

Look at the relationship between the type of creator and the language. Enthusiasts create interpreted languages, and corporations create compiled languages. The list above would match this rule perfectly, except for C. (C++ and Objective-C, derived from C, would naturally be compiled.)

But this is a short list, and small sample sizes may be deceptive. Let's look at some more:

APL: enthusiast (Iverson); interpreted; 1950s
BASIC: enthusiasts (Kemeny and Kurtz); interpreted; 1960s
S: enthusiasts (Becker, Wilks, Chambers); interpreted; 1970s
Fortran: corporation (IBM); compiled, derived from assembly language; 1950s
Pascal: enthusiast (Wirth); compiled; 1960s
Eiffel: enthusiast (Meyer); compiled; 1990s
Forth: enthusiast (Moore); interpreted; 1960s
dBase: enthusiast (Ratliff); interpreted; 1970s
Ada: government agency; compiled; 1970s
PL/I: corporation (IBM); compiled; 1960s
Prolog: enthusiasts (Colmerauer, et al.); interpreted; 1970s
AWK: enthusiasts (Aho, Weinberger, and Kernighan); interpreted; 1970s
DIBOL: corporation (DEC); compiled; 1970s
FOCAL: enthusiast (Merrill); interpreted; 1960s

This expanded list shows that enthusiasts *tend* to create interpreted languages, but not always. Corporations create compiled languages, though. The only interpreted language created by a corporation might be SQL, created at IBM, but I've assigned it to E.F. Codd as an enthusiast.
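
For the curious, a quick tally of the two lists bears this out. (The Python below simply transcribes my assignments from above; JavaScript's "individual" is counted as an enthusiast.)

    # Tally creator type against implementation type for the languages listed above.
    from collections import Counter

    languages = {
        "Java": ("corporation", "compiled"),       "C": ("enthusiast", "compiled"),
        "C++": ("enthusiast", "compiled"),         "Python": ("enthusiast", "interpreted"),
        "C#": ("corporation", "compiled"),         "PHP": ("enthusiast", "interpreted"),
        "JavaScript": ("enthusiast", "interpreted"), "Perl": ("enthusiast", "interpreted"),
        "VB.NET": ("corporation", "compiled"),     "Ruby": ("enthusiast", "interpreted"),
        "Delphi": ("corporation", "compiled"),     "Swift": ("corporation", "compiled"),
        "Objective-C": ("enthusiast", "compiled"), "R": ("enthusiast", "interpreted"),
        "Matlab": ("enthusiast", "interpreted"),   "SQL": ("enthusiast", "interpreted"),
        "D": ("corporation", "compiled"),          "COBOL": ("government", "compiled"),
        "APL": ("enthusiast", "interpreted"),      "BASIC": ("enthusiast", "interpreted"),
        "S": ("enthusiast", "interpreted"),        "Fortran": ("corporation", "compiled"),
        "Pascal": ("enthusiast", "compiled"),      "Eiffel": ("enthusiast", "compiled"),
        "Forth": ("enthusiast", "interpreted"),    "dBase": ("enthusiast", "interpreted"),
        "Ada": ("government", "compiled"),         "PL/I": ("corporation", "compiled"),
        "Prolog": ("enthusiast", "interpreted"),   "AWK": ("enthusiast", "interpreted"),
        "DIBOL": ("corporation", "compiled"),      "FOCAL": ("enthusiast", "interpreted"),
    }

    for (creator, kind), count in sorted(Counter(languages.values()).items()):
        print(creator, kind, count)

    # Output:
    #   corporation compiled 9
    #   enthusiast compiled 5
    #   enthusiast interpreted 16
    #   government compiled 2

Not a single interpreted language from a corporation, and the compiled languages from enthusiasts are mostly C and its descendants.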

I'm not sure why enthusiasts would create interpreted languages. Perhaps it's more fun that way. Perhaps it's easier. Interpreted languages let you stop a running program, examine the innards of your interpreter, adjust things, and continue running -- all useful when debugging the interpreter.

Astute readers will note that my assignment of "enthusiast" or "corporation" to languages may be a bit loose. The designation is sometimes difficult. Kernighan and Ritchie, when creating C, were working for AT&T's Bell Labs. Are they corporation employees or enthusiasts? E.F. Codd worked for IBM when publishing his thoughts on relational databases. Is he an employee or an enthusiast? Wayne Ratliff was working for NASA's JPL when he wrote the first version of dBase and was part of Ashton-Tate when he wrote dBase II. Does that make him an employee? In all of these cases, I feel the individuals involved were doing what they did more as enthusiasts than employees.

On the flip side, I've placed Java and C# on the "corporation" side. Neither of these languages has an individual strongly associated with its origins. Java was a thing presented to us by Sun; C# was presented by Microsoft. Did the creation of these languages involve passionate individuals? Certainly. Were those individuals working on these projects independent of the corporation's needs? I see no evidence of that. (Yet I can easily see Kernighan and Ritchie working late at night to add features to their C compiler.)

I don't know if the assignment of "corporation" or "enthusiast" to a language's origin is important -- but I don't know that it isn't. It may be that enthusiasts will continue to create interpreted languages, and corporations will continue to create compiled languages.

I do think it significant that Java and C# live in between, Java with its JVM and C# with its CLR. Perl and Python have moved in that direction, too. They gain some benefits of interpreted languages and retain some benefits of compiled languages. I expect we will see more languages that use these techniques.

One more thing. Two other recently developed languages:

Go: corporation (Google); compiled; 2010s
Checked C: corporation (Microsoft); compiled, derived from C; 2010s

So maybe everyone isn't jumping on the "semi-interpreted" wagon.

Thursday, June 2, 2016

The big improvement in programming forty years ago

Programming has been around since the beginning of computers, and it has seen lots of improvements: symbolic assembly, high-level compilers (COBOL and FORTRAN), structured programming (Pascal), object-oriented programming (Smalltalk, C++), virtual machines (Java, C#), scripting languages (Perl, Python, Ruby)... the list goes on.

Yet a significant improvement in programming occurred forty years ago. It made programming simple -- so simple that a non-programmer could do it. And it was ignored by the programming community.

That improvement was... the electronic spreadsheet.

Programming, at its core, is the organization of data and the processing of that data with a sequence of instructions. The niceties of data structures, objects, and just-in-time compilation are just that: niceties. They are there for the convenience of the programmer.

So how do spreadsheets come into it? Spreadsheets, at their core, organize data and process that data with a series of instructions. (Sound familiar?)

Spreadsheets -- the basic grid of numbers and formulas, without the charts, pivot tables, and VBA code -- are programs. Any spreadsheet can be converted into just about any language, from Fortran or BASIC to Java or Python. (The reverse is not true; only a few simple programs in BASIC or Python can be converted into spreadsheets.)
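
For example, a toy spreadsheet (the cells and formulas here are invented purely for illustration) converts into Python almost line for line -- each cell becomes a variable, each formula becomes an expression:

    # The spreadsheet (cells and formulas):
    #   A1: 100       (unit price)
    #   A2: 3         (quantity)
    #   A3: =A1*A2    (subtotal)
    #   A4: =A3*0.05  (tax)
    #   A5: =A3+A4    (total)
    # ... and its Python equivalent:

    a1 = 100
    a2 = 3
    a3 = a1 * a2
    a4 = a3 * 0.05
    a5 = a3 + a4

    print(a5)   # 315.0

The conversion is mechanical, which is rather the point: the spreadsheet was already a program.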

The improvement that spreadsheets made to programming was immediacy. The "programmer" could see the results of a change right after making a change. That immediate feedback was not available in compiled languages, which require the programmer to save the file, compile the program, and then run it. (IDEs like Turbo Pascal and Visual Studio make those steps easy, but there is still a delay.) Even interpreted languages like BASIC or Ruby require the steps of saving and running.

This improvement in programming, the immediate results of a change in the program, went unnoticed by the programming community. Visicalc was created in 1979, almost forty years ago. At the time, popular programming languages were BASIC, COBOL, Fortran, and Pascal.

Instead of building on the innovation of the spreadsheet, programmers have gone in other directions. Programmers focused on maintainability (structured programming), larger programs (object-oriented programming), version control, automated testing, and response to changing requirements (agile methods).

There has been no (or very little) effort to provide the immediate feedback that we get with spreadsheets.

For forty years.

At some point, we are going to invent a new programming language, one that provides immediate feedback. (Perhaps a language, editor, and run-time environment, which is what a spreadsheet is.) The advantages are great, as anyone who works with a spreadsheet can attest.