Fitzpatrick's Fabulous Future

Sunday, July 10, 2016

Oracle's Java Headaches

Oracle, after its purchase of Sun Microsystems, has found that it owns a number of things including MySQL and Java. MySQL presents obvious problems, as it competes with Oracle's big, expensive database. But Java also presents problems.

Oracle has two challenges with Java. The first (and obvious) challenge is money. Specifically, how does Oracle "monetize" Java? Extracting money from programming languages is possible; Microsoft succeeded with BASIC, Visual Basic, and C#. (Although the last was really profitable through Visual Studio, not the language itself.)

Extracting money from Java remains elusive. Oracle's latest attempt to sue Google has failed. It was a "whale" strategy, designed to obtain a large amount from a single entity. With the loss in court, will Oracle look for a different strategy, perhaps one that looks for fees from more (and smaller) entities?

Revenue is one headache for Oracle. A second headache exists, one that is less obvious, and may show up in the expense column, not the revenue column.

Oracle's history has been with SQL, a language designed in the 1970s for accessing data. As a programming language, it has been remarkably stable, with only a few changes since its inception. In contrast, programming languages like Visual Basic, C, C++, and C# have seen frequent and sometimes significant changes. Java, too, has seen changes, and it has the "JCP", the Java Community Process, which allows just about anyone to recommend changes to the Java language.

This second challenge is more subtle, and possibly larger, for Oracle. After decades of a stable, unchanging language, is it capable of managing a fast-moving programming language? After decades of maintaining the Oracle database (which I'm sure had lots of internal changes and lots of changes requested by Oracle's relatively few high-paying customers) is Oracle ready to maintain a product used by "the rest of us"?

This is the bigger issue for Oracle. Maintaining the Java code base, adapting it to new platforms, adding features to the language, and putting up with all of the pesky requests from pipsqueaks (highly opinionated pipsqueaks, some of us are) is going to be expensive.

So Oracle is in a squeeze. On one side, Java has no significant revenue. On the other, it has expenses (possibly higher than the expenses for the Oracle database). How will Oracle navigate these straits?

I see a few possible ways forward:

Find a funding mechanism: Perhaps licenses, perhaps advertising. Perhaps a version of Java for the Internet of Things.

Tie Java to the Oracle database: Make Oracle the easiest database to use in Java.

Keep Java but stop development: A fast way to reduce costs, but also a fast way to anger users. (On the other hand, Oracle seems to care little about user opinion.)

Spin off Java: If Java doesn't fit into Oracle's strategy, why bother to keep it?

The last is an interesting idea. IBM might be interested in Java. Google almost certainly would. Microsoft probably not so much -- except perhaps to prevent Google from acquiring it.

Thursday, July 7, 2016

DOS, Windows, sharing, and mobile

Windows, when it arrived on the scene, changed the world of computing. Prior to Windows, DOS ruled the computing world, and it was a limited world. Windows expanded that world with new capabilities. In some ways, mobile operating systems (Android and iOS) move us back towards the ways of DOS.

IBM PCs (or compatibles, or near-compatibles) running DOS were simple devices. They could run one program at a time; running multiple programs at once was not possible. (Technically, it was possible with a "terminate and stay resident" function call, but such programs were few. For this essay, I'll stick to the "regular" programs.)

Windows brought us an expanded view of computing. Instead of running a single program at a time, Windows allowed for multiple. Windows provided a common way to present text and graphics on the screen, to print to printers, and to share data. Windows was a large step upward from the world of DOS.

Mobile operating systems -- Android and iOS -- provide a different experience. Instead of multiple applications and multiple windows, these operating systems present one application (or "app") at a time. Multiple apps may run, but only one has the screen. Thus, you can listen to music, check e-mail, and get a text message all at the same time. Mobile apps keep the multitasking aspect, but reduce the interaction to one app at a time.

Reducing the number of interactive apps to one is a reduction in capabilities, although it is a simpler experience, and one that makes sense for a phone. (I think it also makes sense for a tablet.)

What I don't see in the mobile operating systems (and what I don't see in Windows, either) are improvements in the ability to share data across applications. DOS had files and pipes (concepts lifted from Unix). Windows added the clipboard, and then Dynamic Data Exchange (DDE) and later, drag-and-drop. The clipboard was popular and is still used today. DDE never got traction, and drag-and-drop is limited to files.

Sharing data across applications is difficult. Each application has its own ideas about data. Word processors hold characters, words, sentences, paragraphs, and documents. Spreadsheets hold numeric values, formulas, cells, rows, columns, and sheets. Databases hold rows and columns -- or "documents" (different from word processor documents) for NoSQL databases. The transfer of data from one application to another is not obvious, and therefore the programming of such transfers is not obvious.

But Windows has had the clipboard for thirty years, and DDE and drag-and-drop for almost as long. Have we had no ideas in that time?

Perhaps our current mobile operating systems are the DOSes of today, waiting for a new, bold operating system to provide new capabilities.

Tuesday, June 21, 2016

Compilers and interpreters

Programming languages (with a few exceptions) fall into one of two categories: compiled or interpreted.

Compilers are the natural descendants of assemblers. Assemblers convert text representations of processor-specific operation codes into machine-readable form; compilers convert high-level programs into machine-readable form. Interpreters, on the other hand, read high-level programs and process them, without producing an "executable".

Both forms have advantages. Compiled programs execute faster, and the source code can remain hidden from users, who need only the executable form. Interpreted programs may be slower, but the process of writing (and debugging) tends to be faster and interpreted languages have flexibilities not available in compiled languages.

Programming languages are sometimes created by individuals working without specific sponsorship and direction from a corporation (I call them "enthusiasts"). Other languages are created by corporations, in large, well-planned and well-justified projects.

But is one technique more popular than another? Let's look at the list of popular (according to tiobe.com) languages. Here are the top languages, who created them, whether they are compiled or interpreted, and when they were created:

Java: corporation (Sun); compiled; 1990s
C: enthusiasts (Kernighan and Ritchie); compiled; 1970s
C++: enthusiast (Stroustrup); compiled, derived from C; 1980s
Python: enthusiast (van Rossum); interpreted; 1990s
C#: corporation (Microsoft); compiled; 2000s
PHP: enthusiast (Lerdorf); interpreted; 1990s
JavaScript: individual (Eich); interpreted; 1990s
Perl: enthusiast (Larry Wall); interpreted; 1980s
VB.NET: corporation (Microsoft); compiled; 2000s
Ruby: enthusiast (Matsumoto); interpreted; 1990s
Delphi: corporation (Borland); compiled, derived from Pascal; 1990s
Swift: corporation (Apple); compiled; 2010s
Objective-C: enthusiasts (Cox and Love); compiled, derived from C; 1980s
R: enthusiasts (Ihaka and Gentleman); interpreted, derived from S; 1990s
Matlab: enthusiast (Moler); interpreted; 1970s
SQL: enthusiast (Codd); interpreted; 1970s
D: corporation (Digital Mars); compiled; 2000s
COBOL: government consortium; compiled; 1950s

From this list, a few things are obvious. First, we've invented both compiled and interpreted languages. Second, we've invented both over the age of computers, and continue to do so. It's not that a particular type of language was a fad or has fallen out of favor.

Look at the relationship between the type of creator and the language. Enthusiasts tend to create interpreted languages and corporations tend to create compiled languages. The list above would match this rule perfectly, except for C. (C++ and Objective-C, derived from C, would naturally be compiled.)

But this is a short list, and small sample sizes may be deceptive. Let's look at some more:

APL: enthusiast (Iverson); interpreted; 1950s
BASIC: enthusiasts (Kemeny and Kurtz); interpreted; 1960s
S: enthusiasts (Becker, Wilks, Chambers); interpreted; 1970s
Fortran: corporation (IBM); compiled, derived from assembly language; 1950s
Pascal: enthusiast (Wirth); compiled; 1960s
Eiffel: enthusiast (Meyer); compiled; 1980s
Forth: enthusiast (Moore); interpreted; 1960s
dBase: enthusiast (Ratliff); interpreted; 1970s
Ada: government agency; compiled; 1970s
PL/I: corporation (IBM); compiled; 1960s
Prolog: enthusiasts (Colmerauer, et al.); interpreted; 1970s
AWK: enthusiasts (Aho, Weinberger, and Kernighan); interpreted; 1970s
DIBOL: corporation (DEC); compiled; 1970s
FOCAL: enthusiast (Merrill); interpreted; 1960s

This expanded list shows that enthusiasts *tend* to create interpreted languages, but not always. Corporations create compiled languages, though. The only interpreted language created by a corporation might be SQL, created at IBM, but I've assigned it to E.F. Codd as an enthusiast.

I'm not sure why enthusiasts would create interpreted languages. Perhaps it's more fun that way. Perhaps it's easier. Interpreted languages let you stop a running program, examine the innards of your interpreter, adjust things, and continue running, all useful when debugging the interpreter.

Astute readers will note that my assignment of "enthusiast" or "corporation" to languages may be a bit loose. The designation is sometimes difficult. Kernighan and Ritchie, when creating C, were working for AT&T's Bell Labs. Are they corporation employees or enthusiasts? E.F. Codd worked for IBM when publishing his thoughts on relational databases. Is he an employee or an enthusiast? Wayne Ratliff was working for NASA's JPL when he wrote the first version of dBase and was part of Ashton-Tate when he wrote dBase II. Does that make him an employee? In all of these cases, I feel the individuals involved were doing what they did more as enthusiasts than employees.

On the flip side, I've placed Java and C# in the "corporation" side. Neither of these languages has individuals strongly associated with its origins. Java was a thing presented to us by Sun; C# was presented by Microsoft. Did the creation of these languages involve passionate individuals? Certainly. Were those individuals working on these projects independent of the corporation's needs? I see no evidence of that. (Yet I can easily see Kernighan and Ritchie working late at night to add features to their C compiler.)

I don't know if the assignment of "corporation" or "enthusiast" to a language's origin is important -- but I don't know that it isn't. It may be that enthusiasts will continue to create interpreted languages, and corporations will continue to create compiled languages.

I do think it significant that Java and C# live in between, Java with its JVM and C# with its CLR. Perl and Python have moved in that direction, too. They gain some benefits of interpreted languages and retain some benefits of compiled languages. I expect we will see more languages that use these techniques.
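The "in between" approach can be seen directly in CPython, the reference implementation of Python. The sketch below is only an illustration, not a description of how every Python implementation works: the source string, the "<example>" file name, and the variable names are made up for the example. It compiles a line of source into a code object (bytecode) and then hands that object to the virtual machine to run.

```python
import dis

# A one-line "program" (hypothetical names, for illustration only).
source = "total = price * quantity"

# The compiler-like step: source text becomes a code object holding bytecode.
code = compile(source, "<example>", "exec")

# The interpreter-like step: the virtual machine executes the bytecode.
namespace = {"price": 4, "quantity": 5}
exec(code, namespace)
print(namespace["total"])   # prints 20

# Show the bytecode instructions (the exact listing varies by Python version).
dis.dis(code)
```

The programmer never sees a separate executable, which keeps the feel of an interpreter, yet the source is parsed and translated only once, which is one of the benefits of a compiler.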

One more thing. Two other recently developed languages:

Go: corporation (Google); compiled; 2010s
Checked C: corporation (Microsoft); compiled, derived from C; 2010s

So maybe everyone isn't jumping on the "semi-interpreted" wagon.

Thursday, June 2, 2016

The big improvement in programming forty years ago

Programming has been around since the beginning of computers, and has seen lots of improvements: symbolic assembly, high-level compilers (COBOL and FORTRAN), structured programming (Pascal), object-oriented programming (Smalltalk, C++), virtual machines (Java, C#), scripting languages (Perl, Python, Ruby)... the list goes on.

Yet a significant improvement in programming occurred forty years ago. It made programming simple -- so simple that a non-programmer could do it. And it was ignored by the programming community.

That improvement was... the electronic spreadsheet.

Programming, at its core, is the organization of data and the processing of that data with a sequence of instructions. The niceties of data structures, objects, and just-in-time compilation are just that: niceties. They are there for the convenience of the programmer.

So how do spreadsheets come into it? Spreadsheets, at their core, organize data and process that data with a series of instructions. (Sound familiar?)

Spreadsheets -- the basic grid of numbers and formulas, without the charts, pivot tables, and VBA code -- are programs. Any spreadsheet can be converted into just about any language, from Fortran or BASIC to Java or Python. (The reverse is not true; only a few simple programs in BASIC or Python can be converted into spreadsheets.)
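As a rough illustration of that conversion, here is a tiny sheet written out in Python. Everything in it is hypothetical: the cell names, the values, and the `value` helper are invented for this sketch, and real spreadsheets do far more (dependency tracking, caching, error values). Cells A1 and A2 hold numbers, A3 holds the formula =A1+A2, and B1 holds =A3*3.

```python
# Each cell is either a plain number or a formula.
# A formula is a function that asks for the values of other cells.
cells = {
    "A1": 120,
    "A2": 80,
    "A3": lambda get: get("A1") + get("A2"),   # =A1+A2
    "B1": lambda get: get("A3") * 3,           # =A3*3
}

def value(name):
    """Return the current value of a cell, evaluating formulas on demand."""
    cell = cells[name]
    return cell(value) if callable(cell) else cell

print(value("B1"))   # prints 600

# Change one input; every dependent cell reflects it on the next read.
cells["A1"] = 20
print(value("B1"))   # prints 300
```

The second half of the example is the part a spreadsheet does for you automatically: after A1 changes, the new value of B1 is simply there, with no save, compile, or run step.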

The improvement that spreadsheets made to programming was immediacy. The "programmer" could see the results of a change right after making a change. That immediate feedback was not available in compiled languages, which require the programmer to save the file, compile the program, and then run it. (IDEs like Turbo Pascal and Visual Studio make those steps easy, but there is still a delay.) Even interpreted languages like BASIC or Ruby require the steps of saving and running.

This improvement in programming, the immediate results of a change in the program, went unnoticed by the programming community. VisiCalc was created in 1979, almost forty years ago. At the time, popular programming languages were BASIC, COBOL, Fortran, and Pascal.

Instead of building on the innovation of the spreadsheet, programmers have gone in other directions. Programmers focused on maintainability (structured programming), larger programs (object-oriented programming), version control, automated testing, and response to changing requirements (agile methods).

There has been no (or very little) effort toward the immediate feedback that we get with spreadsheets.

For forty years.

At some point, we are going to invent a new programming language, one that provides immediate feedback. (Perhaps a language, editor, and run-time environment, which is what a spreadsheet is.) The advantages are great, as anyone who works with a spreadsheet can attest.

Sunday, May 22, 2016

Small check-ins saved me

With all of the new technology, from cloud computing to tablets to big data, we can forget the old techniques that help us.

This week, I was helped by one of those simple techniques. The technique that helped was frequent, small check-ins to version control systems. I was using Microsoft's TFS, but this technique works with any system: TFS, Subversion, git, CVS, ... even SourceSafe!

Small, frequent changes are easier to review and easier to revert than large changes. Any version control system accepts small changes; the decision to make large or small changes is up to the developer.

After a number of changes, the team with whom I work discovered a defect, one that had escaped our tests. We knew that it was caused by a recent change -- we tested releases and found that it occurred only in the most recent release. That information limited the introduction of the defect to the most recent forty check-ins.

Forty check-ins may seem like a long list, but we quickly identified the specific check-in by using a binary search technique: get the source from the middle revision and test it. If the error occurs, the culprit is in the earlier half; if not, it is in the later half. Then repeat, starting from the middle of that half.
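The search can be sketched in a few lines of Python. This is a minimal sketch, not the procedure we ran against TFS: the check-in names and the `has_defect` function are hypothetical stand-ins for "get that revision, build it, and run the failing test". It assumes the check-ins are ordered oldest to newest, that the newest one shows the defect, and that once the defect appears it stays in every later check-in.

```python
from typing import Callable, Sequence

def find_first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is true."""
    lo, hi = 0, len(revisions) - 1    # the first bad revision lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid                  # defect already present: search the earlier half
        else:
            lo = mid + 1              # defect not yet present: search the later half
    return revisions[lo]

# Hypothetical data: forty check-ins, with the defect introduced at check-in 27.
checkins = [f"C{n}" for n in range(1, 41)]
tested = []

def has_defect(rev: str) -> bool:
    tested.append(rev)                # record each revision we had to build and test
    return int(rev[1:]) >= 27

culprit = find_first_bad(checkins, has_defect)
print(culprit, "found after testing", len(tested), "revisions")
```

With forty check-ins, the search never needs more than six builds, which is why the long list was not a problem. Some version control systems automate this loop; git, for example, has a `git bisect` command.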

The real benefit occurred when we found the specific check-in. Since all check-ins were small, this check-in was too. (It was a change of five different lines.) It was easy to review the five individual lines and find the error.

Once we found the error, it was easy to make the correction to the latest version of the code, run our tests (which now included an additional test for the specific problem we found), verify that the fix was correct, and continue our development.

A large check-in would have required much more examination, and more time.

Small check-ins cost little and provide easy verification. Why not use them?

Sunday, May 15, 2016

Agile values clean code; waterfall may but doesn't have to

Agile and Waterfall are different in a number of ways.

Agile promises that your code is always ready to ship. Waterfall promises that the code will be ready on a specific date in the future.

Agile promises that your system passes the tests (at least the tests for code that has been implemented). Waterfall promises that every requested feature will be implemented.

There is another difference between Agile and Waterfall. Agile values clean code; Waterfall values code that performs as intended but has no notion of code quality. The Agile cycle includes a step for refactoring, a time for developers to modify the code and improve its design. The Waterfall method has no corresponding step or phase.

Which is not to say that Waterfall projects always result in poorly designed code. It is possible to build well-designed code with Waterfall. Agile explicitly recognizes the value of clean code and allocates time for correcting design errors. Waterfall, in contrast, has its multiple phases (analysis, design, coding, testing, and deployment) with the assumption that working code is clean code -- or code of acceptable quality.

I have seen (and participated in) a number of Waterfall projects, and the prevailing attitude is that code improvements can always be made later, "as time allows". The problem is that time never allows.

Many project managers have the mindset that developers should be working on features with "business value". Typically these changes fall into one of three categories: features to increase revenue, features to reduce costs, and defect corrections. The mindset also considers any effort outside of those areas to be not adding value to the business and therefore not worthy of attention.

Improving code quality is an investment in the future. It is positioning the code to handle changes -- in requirements or staff or technology -- and reducing the effort and cost of those changes. In this light, Agile is looking to the future, and Waterfall is looking to the past (or perhaps only the current release).

Thursday, May 5, 2016

Where have all the operating systems gone?

We used to have lots of operating systems. Every hardware manufacturer built their own operating systems. Large manufacturers like IBM and DEC had multiple operating systems, introducing new ones with new hardware.

(It's been said that DEC became a computer company by accident. They really wanted to write operating systems, but they needed processors to run them and compilers and editors to give them something to do, so they ended up building everything. It's a reasonable theory, given the number of operating systems they produced.)

In the 1970s CP/M was an attempt at an operating system for different hardware platforms. It wasn't the first; Unix was designed for multiple platforms prior. It wasn't the only one; the UCSD p-System used a virtual processor quite like the Java virtual machine (JVM) and ran on various hardware.

Today we also see lots of operating systems. Commonly used ones include Windows, Linux, Mac OS, iOS, Android, Chrome OS, and even watchOS. But are they really different?

Android and Chrome OS are really variants on Linux. Linux itself is a clone of Unix. Mac OS is derived from NeXTSTEP and FreeBSD, which in turn derive from the Berkeley Software Distribution of Unix. iOS and watchOS are, according to Wikipedia, "Unix-like", and I assume that they are slim versions of the same BSD-derived base with added components.

Which means that our list of commonly-used operating systems becomes:

Windows
Unix

That's a rather small list. (I'm excluding the operating systems used for special purposes, such as embedded systems in automobiles or machinery or network routers.)

I'm not sure that this reduction in operating systems, this approach to a monoculture, is a good thing. Nor am I convinced that it is a bad thing. After all, a common operating system (or two commonly-used operating systems) means that lots of people know how they work. It means that software written for one variant can be easily ported to another variant.

I do feel some sadness at the loss of the variety of earlier years. The early days of microcomputers saw wide variations of operating systems, a kind of Cambrian explosion of ideas and implementations. Different vendors offered different ideas, in hardware and software. The industry had a different feel from today's world of uniform PCs and standard Windows installations. (The variances between versions of Windows, or even between the distros of Linux, are much smaller than the differences between a Data General minicomputer and a DEC minicomputer.)

Settling on a single operating system is a way of settling on a solution. We have a problem, and *this* operating system, *this* solution, is how we address it. We've settled on other standards: character sets, languages (C# and Java are not that different), storage devices, and keyboards. Once we pick a solution and make it a standard, we tend to not think about it. (Is anyone thinking of new keyboard layouts? New character sets?) Operating systems seem to be settling.