Sunday, November 22, 2009

Open Source Microsoft

A lot has been written about Microsoft's latest moves to open source.

I don't expect Microsoft to turn itself into Google. Or Apache. Or even Sun or Novell. I expect Microsoft to remain Microsoft. I expect them to remain a for-profit business. I expect them to keep some amount of software as closed source.

Here's what happens if Microsoft opens its source code in a significant manner:

First, the notion of open source software becomes legitimate. People who avoided open source software because it was "not what Microsoft does" will have no reason to avoid it. They may start to laud the principles of open source. Many companies, large and small, will look at the non-Microsoft offerings and consider them. (I expect a number of shops to remain dedicated to Microsoft solutions, open or closed.)

Second, the open source community takes a hit. Not the entire community, but a major portion of it. The blow is psychological, not technical. The openness of open source defines the "open source community" and separates it from the large commercial shops like Microsoft. If Microsoft adopts open source (even in part), then the traditional open source community (many of whom are Microsoft-bashers) suffer an identity crisis.

Third, the open source folks who depended on the notion of "we're not Microsoft" will substitute some other mechanism for differentiating themselves from Microsoft. Look for renewed language wars (tricky with Microsoft funding things like IronPython and IronRuby) and possibly the notion of "pure" open source. The latter may catch companies that use a dual approach to software, such as Novell and MySQL.

Microsoft will stay focussed on its goals. The open source community may become splintered, with some folks searching for ways to bash Microsoft, some folks trying to blend Microsoft into their current solutions, and others remaining on their current path.

Could it be that Microsoft has found a way to neutralize the threat of open source software?

Sunday, November 15, 2009

With more than toothpicks

On one rather large, multi-decade project, the developers proclaimed that their program was object-oriented. Yet when I asked to see a class hierarchy chart, they could not provide one. I found this odd, since a hierarchy chart is useful, especially for new members of the team. The developers claimed that they didn't need one, and that new team members picked up the code without it. (The statement was true, although the system was so large and so complex that new members needed about six months to become productive.)

I was suspicious of the code's object-oriented-ness. I suspected them of not using object-oriented techniques.

It turns out that their code was object-oriented, but only to a small degree. They had lots of classes, all derived from framework classes. Their code was a thin layer built atop the framework. Their 'hierarchy' was exactly one layer deep. (Or tall, depending on how you look at it.)
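
As a (made-up) illustration, the shape of that design looked something like this. The framework and class names here are invented for the sketch, not taken from the project's actual code:

```cpp
#include <string>

// Invented stand-in for a vendor framework base class.
class FrameworkWindow {
public:
    virtual ~FrameworkWindow() {}
    virtual std::string Title() const { return "window"; }
};

// Application classes: every one derives directly from the framework
// base, and nothing derives from an application class. There is no
// shared, application-level base (say, a ReportWindow common to
// invoices and statements), so the 'hierarchy' is one layer deep.
class InvoiceWindow : public FrameworkWindow {
public:
    std::string Title() const { return "Invoice"; }
};

class StatementWindow : public FrameworkWindow {
public:
    std::string Title() const { return "Statement"; }
};

class CustomerWindow : public FrameworkWindow {
public:
    std::string Title() const { return "Customer"; }
};
```

Dozens of such sibling classes, all one step above the foundation, and nothing built on top of them.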

This kind of design is akin to building a house (the application) on a good foundation (the framework) but then building everything out of toothpicks. Well, maybe not toothpicks, but small stones and pieces of wood. Rather than using studs and pre-assembled windows, this team built everything above the foundation, and built it with only what was in the foundation. They created no classes to help them -- nothing that was the equivalent of pre-made cabinets or carpeting.

The code was difficult to follow, for many reasons. One of the reasons was the constant shifting of context. Some functions were performed in classes, others were performed in code. Different levels of "height" were mixed in the same code. Here's a (small, made-up) example:

    int print_invoice(Items *items, Customer customer, Terms *terms)
    {
        // make sure customer is valid
        if (!customer.valid()) return ERR_CUST_NOT_VALID;

        // set up printer
        PrinterDialog pdlg;
        if (pdlg.DoModal() == S_OK)
        {
            Printer printer(pdlg.GetName());

            char *buffer = NULL;

            buffer = customer.GetName();
            buffer[30] = '\0';
            printer.Print(buffer);
            delete [] buffer;
            if (customer.IsBusiness())
            {
                 buffer = customer.GetCompany();
                 buffer[35] = '\0';
                 printer.Print(buffer);
            }
            // more lines to print customer info

            for (int i = 0; i < items->Count(); i++)
            {
                 int item_size = (*items)[i].GetSize();
                 char *buffer2 = new char[item_size + 1];
                 buffer2[item_size] = '\0';
                 // more code to fill buffer2 with the item's text

                 printer.Print(buffer2);

                 delete [] buffer2;
            }

            // more printing stuff for terms and totals

        }

        return S_OK;
    }

This fictitious code captures the spirit of the problem: A relatively high-level function (printing an invoice) has to deal with very low-level operations (memory allocation). This was not an isolated example -- the entire system was coded in this manner.

The problem with this style of code is the load that it places on the programmer. The poor sap who has to maintain this code (or worse, enhance it) has to mentally bounce up and down between high-level business functions and low-level technical functions. Each bounce is a context switch, in which the programmer must stop thinking about one set of things and start thinking about another. Context switches are expensive. You want to minimize them. If you force programmers through them, they will forget things. (For example, in the above code the programmer did not delete the memory allocated for printing the company name. You probably didn't notice it either -- you were too busy shifting from detail mode to general mode.)

Object-oriented programming lets us organize our code, and lets us organize it on our terms -- we get to define the classes and objects. But so few people use it to their advantage.
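
To make that concrete, here is a sketch of how the invoice routine above might read if the low-level buffer handling were pushed down into a helper class. All of the names here (LinePrinter, and the pared-down Customer and Item) are invented for illustration, not taken from any actual system:

```cpp
#include <string>
#include <vector>

// Invented helper class: truncation and buffer management live here,
// in one place, instead of inside every business function.
class LinePrinter {
public:
    void Print(const std::string &text, size_t width) {
        lines_.push_back(text.substr(0, width)); // truncate to field width
    }
    const std::vector<std::string> &Lines() const { return lines_; }
private:
    std::vector<std::string> lines_;
};

// Pared-down domain classes, reduced to what the sketch needs.
struct Customer {
    std::string name;
    std::string company;
    bool IsBusiness() const { return !company.empty(); }
};

struct Item {
    std::string description;
};

// The high-level function now reads as business logic only: no raw
// buffers, no manual deletes, no field-width bookkeeping in line.
void print_invoice(const std::vector<Item> &items,
                   const Customer &customer,
                   LinePrinter &printer)
{
    printer.Print(customer.name, 30);
    if (customer.IsBusiness())
        printer.Print(customer.company, 35);
    for (size_t i = 0; i < items.size(); i++)
        printer.Print(items[i].description, 60);
}
```

A reader of print_invoice stays in "business mode" the whole time; a reader of LinePrinter stays in "technical mode". Neither has to switch.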

To be fair, in all of the programming courses and books I have seen, there is very little advocacy for programmers. It's not a new concept. Gerry Weinberg wrote about the readability of programs in his The Psychology of Computer Programming back in 1971. And Perl offers many ways to do the same thing, with the guiding principle of "use the one that makes sense". But beyond that, I have seen nothing in courses that strives to make a programmer's job easier. Nor have I seen any management tracts on measuring the complexity of code and designing systems to reduce long-term maintenance costs.

Consequently, new programmers start writing code and group everything into the obvious classes, but stop there. They don't (most of the time) create hierarchies of classes. And why should they? None of their courses covered such a concept. Examples in courses have the same mix of high-level and low-level functions, so programmers have been trained to mix them. The systems they build work -- that is, they produce the desired output -- with mixed contexts, so it can't be that big of a problem.

In one sense, they are right. Programs with mixed contexts can produce the desired output. Of course, so can non-OO programs using structured programming. And so can spaghetti code, using neither OO nor structured programming.

Producing the right output is necessary but not sufficient. The design of the program affects future enhancements and defect corrections. I believe -- but have no evidence -- that mixed-context programs have more defects than well-organized programs. I believe this because a well-organized program should be easier to read, and defects should be easier to spot. High-level functions can contain just business logic and low-level functions can contain just technical details, and a reader of either can focus on the task at hand and not switch between the two.

I think that it is time we focus on the readability of the code, and the stress load that bad code puts on programmers. We have the techniques (object-oriented programming) to organize code into readable form. We have the motive (readable code is easier to maintain). We have the computing power to "afford" what some might consider to be "inefficient" code designs.

All we need now is the will.


Wednesday, November 11, 2009

Oh say can you C?

Programmers have two favorite pastimes: arguing about languages and inventing new languages. (Arguing about editors is probably a close third.) When we're not doing one, we're probably doing the other.

I've written about the demise of C++. Yet its predecessor, C, is doing well. So well that people have re-invented C to look more like modern object-oriented languages. Two new languages are "Brace" and "OOC". Brace recasts C syntax to match that of Python, removing braces and using indentation for blocking. OOC is an object-oriented language that is compiled to C.

Improvements to the C language are not new. Objective-C was developed in the early 1980s, and C++ itself is a "better" version of C. The early implementations of C++ were translated to C by a source-to-source compiler called 'cfront'.

Improvements of this nature happen a lot. Borland improved Pascal, first extending standard Pascal with useful I/O functions and later morphing it into the Delphi product. Microsoft made numerous changes to BASIC, adding features, converting to Visual Basic, and continuing to add (and often change) features. Even FORTRAN was remade into RATFOR, a name derived from 'rational Fortran'. ('Rational Fortran' meant 'looks like C'.)

I'm not sure that Brace will have much in the way of success. Recasting C into Python gets you ... well, something very close to Python. Why exert the effort? If you wanted Python, you should have started with it. Brace does include support for coroutines, something that may appeal to a very narrow audience, and has support for graphics which may appeal to a broader group. But I don't see a compelling reason to move to it. OOC is in a similar situation. My initial take is that OOC is Ruby but with static typing. And if you wanted Ruby... well, you know the rest.

Improvements to C are nice, but I think the improvers miss an important point: C is small enough to fit inside our heads. The C language is simple and can be understood with four concepts: variables, structs, functions, and pointers. Everything in C is built from these four elements, and can be understood in these terms. You can look at C code and compile it with your "cortex compiler". (I'm ignoring atrocities committed by the preprocessor.) The improved versions of C are more complex and understanding a code fragment requires broader knowledge of the program. Every feature of C++ hid something of the code, and made the person reading the code go off and look at other sections.

The most important aspect of a programming language is readability. Programmers read code more often than you think, and they need to understand it. C had this quality. Its derivatives do not. Therefore, there is a cost to using the derivatives. There are also benefits, such as better program organization with object-oriented techniques. The transition from C to C++, or Objective-C, or Brace, or OOC is a set of trade-offs, and should be made with care.


Sunday, November 8, 2009

Microsoft Shares its Point but Google Waves

Microsoft and Google are the same, yet different. For example, they both offer collaboration tools. Microsoft offers Sharepoint and Google has announced 'Waves'.

Microsoft Sharepoint is a web-based repository for documents (and anything that passes as a document in the Microsoft universe, such as spreadsheets and presentations). Sharepoint also has a built-in list that has no counterpart in the desktop world. And Sharepoint can be extended with programs written on the .NET platform.

Google Waves is a web based repository for conversations -- e-mail threads -- with the addition of anything that passes for a document in the Google universe.

Sharepoint and Waves are similar in that they are built for collaboration. They are also similar in that they use version control to keep previous revisions of documents.

Sharepoint and Waves are different, and their differences say a lot about their respective companies.

Sharepoint is an extension of the desktop. It provides a means for sharing documents, yet it ties in to Microsoft Office neatly. It is a way for Microsoft to step closer to the web and help their customers move.

Waves is an extension of the web forum thread model, tying in to Google documents. It is a way for Google to step closer to the desktop (or functions that are performed on the desktop) and help their customers.

I've used Microsoft Sharepoint and seen demonstrations of Waves. I generally discount demonstrations -- anyone can have a nice demo -- but Google's impressed me.

The big difference is in the approach. Microsoft has introduced Sharepoint as a way for people who use desktops and the desktop metaphor to keep using them. Google, on the other hand, has positioned Waves as a replacement for e-mail.

Why do I mention e-mail? Because e-mail is a big problem for most organizations. E-mail is a model of the paper-based mail system, and not effective in the computer world. We know the problems with e-mail and e-mail threads (reading messages from the bottom up, losing attachments, getting dropped from lists) and the problems are not small. Yet we have believed that the problem is in ourselves, not in the e-mail concept.

Google has a better way. They move away from e-mail and use a different model, a model of a conversation. People can join and leave as they wish. New joiners can review older messages quickly. Everyone has the latest versions of documents.

And here is the difference between Microsoft and Google. Microsoft created a tool -- Sharepoint -- to address a problem. Sharepoint is nice but frustrating to use; it is an extension of the desktop operating system and expensive to administrate. It offers little for the user and has no concept of e-mail or conversations. Google has taken the bold step of moving to a new concept, thinking (rightfully so in my opinion) that the problems of collaboration cannot be solved with the old metaphors. Google has started with the notion of conversation and built from there.

Just as ENIAC was an electronic version of a mechanical adding machine and EDVAC was a true stored-program computer, e-mail is an electronic version of paper mail and Waves is a conversation system. Microsoft is apparently content with e-mail; Google is willing to innovate.


Friday, October 16, 2009

The end of the C++ party

In the future, historians of programming languages will draw a line and say: "this is the point where C++ began its decline". And that point will be prior to today. The party is over for C++, although many of the partygoers are still drinking punch and throwing streamers in the air.

Peter Seibel's blog excerpts comments from the just-released Coders at Work. He lists multiple comments about the C++ language, all of them from detractors.

C++ has had a history of negative comments. Its early history, as a quiet project before the internet and related twitterness, saw comments about C++ through e-mails and usenet. As people became interested in C++, there were more comments (some positive and some negative), but there was the feeling that C++ was the future and the place to go. Negative comments, when made, were directed at either the difficulty of learning a new paradigm (object-oriented programming), the implementation (the compiler and libraries), or the support tools (the IDE and debugger). C++ was the shiny new thing.

The arrival of IBM OS/2 and Microsoft Windows also made C++ attractive. OS/2 and Windows use an event-driven model, and object-oriented programs fare better than procedural programs. Microsoft's support for C++ (among other languages) also made it a "safe" choice.

The novelty of a new programming language is a powerful drug, and C++ was a new language. Managers may have been reluctant to move to it (the risks of unknown territory and longer ramp-up for developers) and some programmers too (charges of larger executables and "inefficient generated code") but eventually we (as an industry) adopted it. The euphoria of the new was replaced with the optimism of the next release: "Yes," we told ourselves, "we're having difficulties, but the problem is in our compiler, or our own expertise. Next year will be better!"

And for a while, the next year was better. And the year after that one was better too, because we were becoming better object-oriented programmers and the compilers were getting better.

But there were those who complained. And those who doubted. And there were those who took action.

Sun introduced Java, another object-oriented programming language. For a while, it held the allure of "the new thing". It had its rough spots (performance, IDE) but we overcame them and newer versions were better. And C++ was no longer the one and only choice for object-oriented programming. (I'm ignoring the earlier languages such as LISP and Scheme. They never entered the mainstream.)

Once we had Java, we could look at C++ in a different light. C++ was not the shining superhero that we desired. He was just another shlub that happened to do some things well. C++ was demoted from "all-wonderful" to "just another tool", much to the delight of the early complainers.

Other languages emerged. Python. Ruby. Objective-C. Haskell. Most were object-oriented, but none powerful enough to dislodge C++. The killer (for C++) was Microsoft's C# language. The introduction of C# (and .NET) struck two blows against C++.

First, C# was viewed as a Java clone. Microsoft failed at embracing and extending Java, so they created a direct competitor. By doing so, they gave Java (and its JVM) the stamp of legitimacy.

Second, Microsoft made C# their premier language, demoting C++ below Visual Basic. (Count the number of sample code fragments on the Microsoft web site.) Now Microsoft was saying that C++ wasn't the shiny new thing.

We (in the programming industry) examined our problems with C++, discussed them, debated them, and arrived at a conclusion: most problems have been solved, but the one that remains is that C++ is a difficult language. The next version of the compiler will not fix that problem. Nor will more design patterns. Nor will user groups.

The C++ party is over. People are leaving. Not just the folks in Coders at Work, but regular programmers. Companies are finding it hard to hire C++ programmers. Recruiters tell me that C++ programmers want to move on to other things. We as a profession have decided, if not to abandon C++, at least to give it a smaller role.

Which presents a problem for the owners of C++ systems.

The decision to leave C++ has been made at the programmer level. Programmers want out. Very few college graduates learn C++ (or want to learn it).

But the owners of systems (businessmen and managers) have not made the decision to leave C++. For the most part, they want to keep their (now legacy) applications running. They see nothing wrong with C++, just as they saw nothing wrong with C and FORTRAN and COBOL and dBase V. C++ works for them.

In a bizarre, almost Marxist twist, the workers are leaving owners with the means of production (the compilers and IDEs of C++) and moving on to other tools.

C++ has been elevated to the rank of "elder language", joining COBOL and possibly FORTRAN. From this point on, I expect that the majority of comments on C++ will be negative. We have decided to put it out to pasture, to retire it.

There is too much code written in C++ to simply abandon it. Businesses have to maintain their code. Some open source projects will continue to use it. But it will be used grudgingly, as a concession to practicalities. Linux won't be converted to a new language... but the successor to Linux will use something other than C++.


Friday, October 9, 2009

Glass houses

I just went through the experience of renewing my IEEE (and IEEE Computer Society) membership with the IEEE web pages. The transaction was, in a word, embarrassing.

Here is my experience:

- After I logged in, the web site complained that I was attempting to start a second session and left me with an empty window. I had to re-load the renewal page to continue. (Not simply pressing the "reload" button, but re-selecting the IEEE URL.)

- The few pages to process the renewal were straightforward, until I reached the "checkout" page. This page had a collection of errors.

- After entering my credit card number, the site informed me that I had too many characters in the number. I had entered the number with spaces, just as it appears on my credit card and my statements. The site also erased my entry, forcing me to re-enter the entire number.

- I used the "auto-fill" button to retrieve the stored address. The auto-fill did not enter a value for the country, however, nor could I, as the field was disabled. Only after adjusting the street address could I select a country.

- After clicking the "process" button, the web site informed me that I had an invalid value in the "state/province" field. I dutifully reviewed the value supplied by the auto-fill routine, changed it from "MD" to "MD".

- That action fixed the problem with the state/province field, but the web site then erased my credit card number. After entering the credit card number again (the third time), I was able to renew my membership.

If the IEEE (and by association the IEEE Computer Society) cannot create and maintain a check-out web site, a function that has been with us for the past ten years and is considered elementary, then they have little credibility for advice on software design and construction. More than that, if the IEEE cannot get "the basics" right, how can anyone trust them for the advanced concepts?


Thursday, October 8, 2009

A cell phone is not a land-line phone

When you call a land line, you call a place. When you call a cell phone, you call a person.

I heard this idea at a recent O'Reilly conference. (It was either E-Tech or OSCON, but I don't remember. Nor do I remember the speaker.)

In the good ole days, calling a place was the same as calling a person. Mostly. A typical (working-class) person could have two locations: home and office. To discuss business, you called them at their office. To discuss other matters, you called them at their home.

A funny thing happened on the way to the Twenty-first Century: people became mobile, and technology became mobile too.

Mobility is not a new idea. Indeed, one can look at the technological and social changes of the Twentieth Century to see the trend of increasing mobility. Trains, airplanes, hotels, reservation systems... the arrow points from "stay in one place" to "move among locations". Modern-day cell phones and portable internet tablets are logical steps in a long chain.

People have become mobile and businesses will become mobile too.

Yet many people (and many organizations) cling to the old notion of "a person has a place and only one place". Even stronger is the idea "a business has a place and only one place (except for branch offices and subsidiaries)". Our state and federal governments have coded these notions into laws, with concepts of "state of residence" and "permanent address". Many businesses tie their customers to locations, and then build an internal organization based on that assumption (regional sales reps, for example). For customers that have large physical assets such as factories and warehouses, this makes some sense. But for the lightweight customer, one without the anchoring assets, it does not. (Yet businesses -- and governments -- will insist on a declared permanent address because their systems need it.)

Newer businesses are not encumbered with this idea. Twitter and LiveJournal, for example, don't care about your location. They don't have to assess your property, send tax bills, or deliver physical goods. Facebook does allow you to specify a location, but as a convenience for finding other people in your social network. (Facebook does limit you to one physical location, though, so I cannot add my summer home.)

Some businesses go so far as to tie an account to a physical location. Land-line phones are one example, a holdover from the old billing practice of charging based on distance called. At least one large shipping company uses the "you are always in this place" concept, since it also uses a "charge based on distance" model.

For moving physical boxes in the real world, this may make some sense, but telephone service has all but completely moved to the "pure minutes" model, with no notion of distance. (Calling across country borders is more expensive, but this is a function of politics and rate tariffs and not technology.)

We have separated a person from a single location. Soon we will detach businesses from single locations.