Sunday, December 27, 2009

The "nook" should be called the "no"

The Kindle and the Nook are intriguing devices. Marketed as e-book readers, the intent is to replace paper books with electronic ones. They both look interesting. I've seen people using the Kindle, and this week-end I stopped by the local Barnes and Noble to check out the Nook.

I find it nice, but not compelling.

My biggest complaint is probably the name. Barnes and Noble gets the "dumbest name of a consumer product" award. (The previous holder of said award was Amazon.com, for "Kindle". So the unappealing names of the two devices cancel.)

Despite the name, I find I prefer the Nook over the Kindle. Partly because of the virtual keyboard, partly because of the Android operating system, and partly because of the feel of the device. The virtual keyboard is more flexible, and the Nook uses the space for other menus and commands. I like the idea of an open-source operating system. And the Nook feels nice in my hands -- it's a comfortable fit.

But the Nook web site has too many "no" answers. It plays MP3 audio files, but not Ogg Vorbis audio files. (And on an Android O/S!) It doesn't let me share notes with friends. (I can lend books to friends, but not my annotations.) It doesn't let me update my Facebook or LiveJournal pages. It doesn't let me surf the web. It's a device with narrow functionality.

I'm not sure that the Kindle is any better. (I've looked at the Nook more than the Kindle, so I can't really speak about the Kindle.)

I understand the reasoning for limiting the devices. They (Amazon.com and B&N) want to use the devices to drive book sales (e-book sales directly and paper book sales indirectly) and also want to minimize network traffic. Users don't pay for airtime and connections; B&N and Amazon.com pay for them. Also, a complex device is hard for a lot of non-techies to use. Most people don't care about the format of their music, or Facebook updates, or sharing notes. At least, not yet.

Barnes and Noble and Amazon.com are missing the point. They have designed their e-readers for the baby-boomer generation, people focussed on themselves. The Kindle and the Nook are excellent devices for sitting in a corner, reading to yourself, and not interacting with others. But that's not the device for me. I want to share (think of  LiveJournal, DOPPLR, Facebook, and Twitter) and have little use for a "me only" device.


Thursday, December 17, 2009

e-Books are not books

The statement "e-Books are not books" is, on its face, a tautology. Of course they're not books. The exist as bits and can be viewed only by a "reader", a device or program that renders pixels to a person.

My point is beyond the immediate observation. e-Books are not books, and will have capabilities we do not associate with books today. e-Books are a new form, just as the automobile was not a horseless carriage and a word processor is not a typewriter.

We humans need time to understand a new thing. We didn't "get" electronic computing right away. ENIAC was an electronic version of a mechanical adding machine; a few years later EDVAC was a computer.

Shifts in technology can be big or small. Music was distributed on paper; the transition to 78s and LPs was a major shift. It took us some time to fully appreciate the possibilities of recorded music. The shift to compact discs (small shiny plastic instead of large, warping vinyl) was a small one; little changed in our consumption or in the economic model. The shift to digital music on forms other than shiny plastic discs is a big one, and the usage and economic model will change.

An on-line newspaper is not a newspaper. It will become a form of journalism -- but not called a newspaper, nor will it have the same capabilities or limitations as ink slapped onto dead trees.

e-Books are not books. The stories and information presented to us on Kindle and Nook readers are the same as in printed books, but that will change. For example, I expect that annotations will become the norm for e-books. Multiple readers will provide annotations, with comments for themselves and for others. (Think of it as today's e-book with Twitter and Google.) One person has blogged about their method for reading books (http://www.freestylemind.com/how-to-read-a-book) and how they keep notes and re-read portions of books for better understanding. Their system uses post-it notes. I predict that future e-Book readers will allow for the creation and storage of personal notes, and the sharing of notes with friends and the world.

Or perhaps e-Books will let us revise books and make corrections. (Think "e-Books today combined with Wikipedia".)

And that is why an e-book is not a book.


Sunday, December 13, 2009

Code first, then design, and then architecture

Actually, the sequence is: Tests first, then code, then design, and finally architecture.

On a recent project, I worked the traditional project sequence backwards. Rather than starting with an architecture and developing a design and then code, we built the code and then formed a design, and later evolved an architecture. We used an agile process, so we had tests supporting us as we worked, and those tests were valuable.

Working backwards seems wrong to many project managers. It breaks with the metaphor of programming as building a house. It goes against the training in project management classes. It goes against the big SDLC processes.

Yet it worked for us. We started with a rudimentary set of requirements. Very rudimentary. Something along the lines of "the program must read these files and produce this output", and nothing more formal or detailed. Rather than put the details in a document, we left the details in the sample files.

Our first task was to create a set of tests. Given the nature of the program, we could use simple scripts to run the program and compare output against a set of expected files. The 'diff' program was our test engine.
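
To give a flavor of that harness, here is a minimal sketch in Ruby. Our actual scripts are not reproduced here; the program name, directory layout, and file extensions are invented for illustration. The idea is simply: run the program on each sample input, then let 'diff' decide whether the output matches the expected file.

    # run_tests.rb -- a sketch of a diff-based test harness (illustrative only)
    passed = 0
    failed = 0

    Dir.glob('tests/*.input').sort.each do |input|
      expected = input.sub('.input', '.expected')
      actual   = input.sub('.input', '.actual')

      # run the program under test, capturing its output
      system("./myprog < #{input} > #{actual}")

      # 'diff' is the test engine: exit status 0 means the files match
      if system("diff -q #{expected} #{actual} > /dev/null")
        passed += 1
      else
        failed += 1
        puts "FAIL: #{input}"
      end
    end

    puts "#{passed} passed, #{failed} failed"
    exit(failed == 0 ? 0 : 1)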

After we had some tests, we wrote some code and ran the tests. Some passed, but most failed. We weren't discouraged; we expected most of them to fail. We slowly added features to the code and got our tests to pass. As we coded, we thought of more tests and added them to our scripts.

Eventually, we had a program that worked. The code wasn't pretty -- we had made several design compromises as we coded -- but it did provide the desired output. The "standard" process would advance at this point, to formal testing and then deployment. But we had other plans.

We wanted to improve the code. There were several classes that were big and hard to maintain. We knew this by looking at the code. (Even during our coding sessions, we told ourselves "this is ugly".) So we set out to improve the code.

Managers of typical software development efforts might cringe at such an effort. They've probably seen efforts to improve code, many of which fail without delivering any improvement. Or perhaps the programmers say that the code is better, but the manager has no evidence of improvement.

We had two things that helped us. First was our tests. We were re-factoring the code, so we knew that the behavior would not change. (If you're re-factoring code and you want the behavior to change, then you are not re-factoring the code -- you're changing the program.) Our tests kept us honest, by finding changes to behavior. When we were done, we had new code that passed all of the old tests.

The second thing we had was class reference diagrams. Not class hierarchy diagrams, but reference diagrams. Class hierarchy diagrams show you the inheritance and container relationships of classes. Reference diagrams give you a different picture, showing you which classes are used by other classes. (The difference is subtle but important.) The reference diagrams gave us a view of the design. They showed all of our classes, with arrows diagramming the connections between classes. We had several knots of code -- sets of classes with tightly-coupled relationships -- and we wanted a cleaner design.

We got our cleaner design, and we kept the "before" and "after" diagrams. The non-technical managers could see the difference, and commented that the "after" design was a better one.

We repeated this cycle of code-some, refactor-some, and an architecture evolved. We're pretty happy with it. It's easy to understand, allows for changes, and gives us the performance that we need.

Funny thing, though. At the start, a few of us had definite ideas about the "right" architecture for the programs. (Full disclosure: I was one such individual.) Our final architecture, the one that evolved to meet the specific needs of the program as we went along and learned about the task, looked quite different from the initial ideas. If we had picked the initial architecture and stayed with it, our resulting program would be complicated and hard to maintain. Instead, by working backwards, we ended with a better design and better code.

Sometimes, the way forward is to go in reverse.


Saturday, December 5, 2009

Limits to App Growth

Long ago (when dinosaurs roamed the Earth), applications were limited in size. Today, applications are still limited in size, but for different reasons.

Old constraints were hardware and software: the physical size of the computer (memory and disk), the capabilities of the operating system, and the capacities of the compiler. For example, some compilers had a fixed-size symbol table.

Over the decades, physical machines became more capable, and the limits from operating systems and compilers have become less constraining. So much so that they no longer limit the size of applications. Instead, a different factor is the limiting one. That factor is upgrades to tools.

How can upgrades limit the size of an application? After all, new versions of compilers are always "better" than the old. New operating systems give us more features, not fewer.

The problem comes not from the release of new tools, but from the deprecation of the old ones.

New versions of tools often break compatibility with the old version. Anyone who programmed in Microsoft's Visual Basic saw this as Microsoft rolled out version 4, which broke a lot of code. And then again as version 5 broke a lot of code. And then again as VB.NET broke a ... well, you get the idea.

Some shops avoid compiler upgrades. But you can't avoid the future, and at some point you must upgrade. Possibly because you cannot buy new copies of the old compiler. Possibly because another tool (like the operating system) forces you to the new compiler. Sometimes a new operating system requires the use of new features (Windows NT, for example, with its "Ready for Windows NT" logo requirements).

Such upgrades are problematic for project managers. They divert development resources from other initiatives with no increase in business capabilities. They're also hard to predict, since they occur infrequently. One can see that the effort is related to the size of the code, but little beyond that. Will all modules have to change, or only a few? Does the code use a language feature or library call that has changed? Are all of the third-party libraries compatible with the new compiler?

The project team is especially challenged when there is a hard deadline. This can come from the release of a new platform ("we want to be ready for Windows 7 on its release date!") or the expiration of an old component ("Visual Studio 6 won't be supported on Windows Vista"). In these situations, you *have* to convert your system to the new component/compiler/platform by a specific date.

This is the factor that limits your system size. Small systems can be adapted to a new compiler or platform with some effort. Larger systems require more effort. Systems of a certain size will require so much effort that they cannot be converted in time. What's the crossover point? That depends on your code, your tools, and your team's talent. I think that every shop has its own factors. But make no mistake, in every shop there is a maximum size to a system, a size that, once crossed, leaves the system too large to upgrade before the deadline.

What are the deadlines? That's the evil part of this situation. You're not in control of these deadlines; your vendors create them. For most shops, that's Microsoft, or Sun, or IBM.

Here's the problem for Microsoft shops: MFC libraries.

Lots of applications use MFC. Big systems and small. Commonly used systems and rarely-used ones. All of them dependent on the MFC libraries.

At some point, Microsoft will drop support for MFC. After they drop support, their new tools will not support MFC, and using MFC will become harder. Shops will try to keep the old tools, or try to drag the libraries into new platforms, but the effort won't be small and won't be pretty.

The sunset of MFC won't be a surprise. I'm sure that Microsoft will announce it well in advance. (They've made similar announcements for other tools.) The announcement will give people notice and let them plan for a transition.

But here's the thing: Some shops won't make the deadline. Some applications are so big that their maintainers will be unable to convert them in time. Even if they start on the day Microsoft announces their intent to "sunset" MFC. Their system is too large to meet the deadline.

That's the limit to systems. Not the size of the physical machine, not the size of the compiler's symbol table, but the effort to "stay on the treadmill" of new versions. Or rather, the ability of the development team to keep from falling off the end of the treadmill.

I've picked MFC as the bogeyman in this essay, but there are other dependencies. Compilers, operating systems, third-party libraries, IPv4, Unicode, mobile-ness in apps, the iPhone, Microsoft Office file formats, Windows as a dominant platform, ... the list goes on.

All projects are built on foundations. These foundations can change. You must be prepared to adapt to changes. Are you and your team ready?


Sunday, November 22, 2009

Open Source Microsoft

A lot has been written about Microsoft's latest moves to open source.

I don't expect Microsoft to turn itself into Google. Or Apache. Or even Sun or Novell. I expect Microsoft to remain Microsoft. I expect them to remain a for-profit business. I expect them to keep some amount of software as closed source.

Here's what happens if Microsoft opens its source code in a significant manner:

First, the notion of open source software becomes legitimate. People who avoided open source software because it was "not what Microsoft does" will have no reason to avoid it. They may start to laud the principles of open source. Many companies, large and small, will look at the non-Microsoft offerings and consider them. (I expect a number of shops to remain dedicated to Microsoft solutions, open or closed.)

Second, the open source community takes a hit. Not the entire community, but a major portion of it. The blow is psychological, not technical. The openness of open source defines the "open source community" and separates it from the large commercial shops like Microsoft. If Microsoft adopts open source (even in part), then the traditional open source community (many of whom are Microsoft-bashers) suffer an identity crisis.

Third, the open source folks who depended on the notion of "we're not Microsoft" will substitute some other mechanism for differentiating themselves from Microsoft. Look for renewed language wars (tricky with Microsoft funding things like IronPython and IronRuby) and possibly the notion of "pure" open source. The latter may catch companies that use a dual approach to software, such as Novell and MySQL.

Microsoft will stay focussed on its goals. The open source community may become splintered, with some folks searching for ways to bash Microsoft, some folks trying to blend Microsoft into their current solutions, and others remaining on their current path.

Could it be that Microsoft has found a way to neutralize the threat of open source software?

Sunday, November 15, 2009

With more than toothpicks

On one rather large, multi-decade project, the developers proclaimed that their program was object-oriented. Yet when I asked to see a class hierarchy chart, they could not provide one. I found this odd, since a hierarchy chart is useful, especially for new members of the team. The developers claimed that they didn't need one, and that new team members picked up the code without it. (The statement was true, although the system was so large and so complex that new members needed about six months to become productive.)

I was suspicious of the code's object-oriented-ness. I suspected them of not using object-oriented techniques.

It turns out that their code was object-oriented, but only to a small degree. They had lots of classes, all derived from framework classes. Their code was a thin layer built atop the framework. Their 'hierarchy' was exactly one layer deep. (Or tall, depending on how you look at it.)

This kind of design is akin to building a house (the application) on a good foundation (the framework) but then building everything out of toothpicks. Well, maybe not toothpicks, but small stones and pieces of wood. Rather than using studs and pre-assembled windows, this team built everything above the foundation, and built it with only what was in the foundation. They created no classes to help them -- nothing that was the equivalent of pre-made cabinets or carpeting.

The code was difficult to follow, for many reasons. One of the reasons was the constant shifting of context. Some functions were performed in classes, others were performed in code. Different levels of "height" were mixed in the same code. Here's a (small, made-up) example:

    int print_invoice(Items *items, Customer customer, Terms *terms)
    {
        // make sure customer is valid
        if (!customer.valid()) return ERR_CUST_NOT_VALID;

        // set up printer
        PrinterDialog pdlg;
        if (pdlg.DoModal() == S_OK)
        {
            Printer printer(pdlg.GetName());

            char *buffer = NULL;

            buffer = customer.GetName();
            buffer[30] = '\0';
            printer.Print(buffer);
            delete [] buffer;
            if (customer.IsBusiness())
            {
                 buffer = customer.GetCompany();
                 buffer[35] = '\0';
                 printer.Print(buffer);
            }
            // more lines to print customer info

            for (int i = 0; i < items->Count(); i++)
            {
                 int item_size = (*items)[i].GetSize();
                 char *buffer2 = new char[item_size + 1];
                 buffer2[item_size] = '\0';

                 printer.Print(buffer2);

                 delete [] buffer2;
            }

            // more printing stuff for terms and totals

        }

        return 0;
    }

This fictitious code captures the spirit of the problem: A relatively high-level function (printing an invoice) has to deal with very low-level operations (memory allocation). This was not an isolated example -- the entire system was coded in this manner.

The problem with this style of code is the load that it places on the programmer. The poor sap who has to maintain this code (or worse, enhance it) has to mentally bounce up and down thinking in high-level business functions and low-level technical functions. Each of these is a context switch, in which the programmer must stop thinking about one set of things and start thinking about another set of things. Context switches are expensive. You want to minimize them. If you force programmers to go through them, they will forget things. (For example, in the above code the programmer did not delete the memory allocated for printing the company name. You probably didn't notice it either -- you were too busy shifting from detail-to-general mode.)

Object-oriented programming lets us organize our code, and lets us organize it on our terms -- we get to define the classes and objects. But so few people use it to their advantage.
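
For contrast, here is a sketch of how the same work might be layered so that the top-level routine stays at the business level. It is written in Ruby rather than in the C++ of the example above, only to keep the fragment short; every class and method name is made up, and the point is the organization, not the language.

    # Illustrative only: low-level details live in small, purpose-built classes,
    # so the top-level method reads as a list of business steps.
    class InvoicePrinter
      def initialize(printer)
        @printer = printer    # an object that knows how to print one line
      end

      def print_invoice(customer, items, terms)
        print_customer(customer)
        items.each { |item| print_line_item(item) }
        print_terms(terms)
      end

      private

      def print_customer(customer)
        @printer.print_line(customer.name)
        @printer.print_line(customer.company) if customer.business?
      end

      def print_line_item(item)
        @printer.print_line(item.description)
      end

      def print_terms(terms)
        @printer.print_line(terms.summary)
      end
    end

The reader of print_invoice never drops down to buffers and printer dialogs, and the reader of print_customer never has to think about the invoice as a whole.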

To be fair, in all of the programming courses and books, I have seen very little advocacy for programmers. It's not a new concept. Gerry Weinberg wrote about "the readability of programs" in his The Psychology of Computer Programming in the mid 1970s. And Perl offers many ways to do the same thing, with the guiding principle of "using the one that makes sense". But beyond that, I have seen nothing in courses that strive for making a programmer's job easier. Nor have I seen any management tracts on measuring the complexity of code and designing systems to reduce long-term maintenance costs.

Consequently, new programmers start writing code and group everything into the obvious classes, but stop there. They don't (most of the time) create hierarchies of classes. And why should they? None of their courses covered such a concept. Examples in courses have the same mix of high-level and low-level functions, so programmers have been trained to mix them. The systems they build work -- that is they produce the desired output -- with mixed contexts, so it can't be that big of a problem.

In one sense, they are right. Programs with mixed contexts can produce the desired output. Of course so can non-OO programs using structured programming. And so can spaghetti code, using neither OO nor structured programming.

Producing the right output is necessary but not sufficient. The design of the program affects future enhancements and defect corrections. I believe -- but have no evidence -- that mixed-context programs have more defects than well-organized programs. I believe this because a well-organized program should be easier to read, and defects should be easier to spot. High-level functions can contain just business logic and low-level functions can contain just technical details, and a reader of either can focus on the task at hand and not switch between the two.

I think that it is time we focus on the readability of the code, and the stress load that bad code puts on programmers. We have the techniques (object-oriented programming) to organize code into readable form. We have the motive (readable code is easier to maintain). We have the computing power to "afford" what some might consider to be "inefficient" code designs.

All we need now is the will.


Wednesday, November 11, 2009

Oh say can you C?

Programmers have two favorite pastimes: arguing about languages and inventing new languages. (Arguing about editors is probably a close third.) When we're not doing one, we're probably doing the other.

I've written about the demise of C++. Yet its predecessor, C, is doing well. So well that people have re-invented C to look more like modern object-oriented languages. Two new languages are "Brace" and "OOC". Brace recasts C syntax to match that of Python, removing braces and using indentation for blocking. OOC is an object-oriented language that is compiled to C.

Improvements to the C language are not new. Objective C was developed in the early 1980s, and C++ itself is a "better" version of C. The early implementations of C++ were source-to-source compiled with a program called 'cfront'.

Improvements of this nature happen a lot. Borland improved Pascal, first extending standard Pascal with useful I/O functions and later morphing it into the Delphi product. Microsoft made numerous changes to BASIC, adding features, converting to Visual Basic, and continuing to add (and often change) features. Even FORTRAN was remade into RATFOR, a name derived from 'rational Fortran'. ('Rational Fortran' meant 'looks like C'.)

I'm not sure that Brace will have much in the way of success. Recasting C into Python gets you ... well, something very close to Python. Why exert the effort? If you wanted Python, you should have started with it. Brace does include support for coroutines, something that may appeal to a very narrow audience, and has support for graphics which may appeal to a broader group. But I don't see a compelling reason to move to it. OOC is in a similar situation. My initial take is that OOC is Ruby but with static typing. And if you wanted Ruby... well, you know the rest.

Improvements to C are nice, but I think the improvers miss an important point: C is small enough to fit inside our heads. The C language is simple and can be understood with four concepts: variables, structs, functions, and pointers. Everything in C is built from these four elements, and can be understood in these terms. You can look at C code and compile it with your "cortex compiler". (I'm ignoring atrocities committed by the preprocessor.) The improved versions of C are more complex and understanding a code fragment requires broader knowledge of the program. Every feature of C++ hid something of the code, and made the person reading the code go off and look at other sections.

The most important aspect of a programming language is readability. Programmers read code more often than you think, and they need to understand it. C had this quality. Its derivatives do not. Therefore, there is a cost to using the derivatives. There are also benefits, such as better program organization with object-oriented techniques. The transition from C to C++, or Objective C, or Brace, or OOC is a set of trade-offs, and should be made with care.


Sunday, November 8, 2009

Microsoft Shares its Point but Google Waves

Microsoft and Google are the same, yet different. For example, they both offer collaboration tools. Microsoft offers Sharepoint and Google has announced 'Waves'.

Microsoft Sharepoint is a web-based repository for documents (and anything that passes as a document in the Microsoft universe, such as spreadsheets and presentations). Sharepoint also has a built-in list that has no counterpart in the desktop world. And Sharepoint can be extended with programs written on the .NET platform.

Google Waves is a web-based repository for conversations -- e-mail threads -- with the addition of anything that passes for a document in the Google universe.

Sharepoint and Waves are similar in that they are built for collaboration. They are also similar in that they use version control to keep previous revisions of documents.

Sharepoint and Waves are different, and their differences say a lot about their respective companies.

Sharepoint is an extension of the desktop. It provides a means for sharing documents, yet it ties in to Microsoft Office neatly. It is a way for Microsoft to step closer to the web and help their customers move.

Waves is an extension of the web forum thread model, tying in to Google documents. It is a way for Google to step closer to the desktop (or functions that are performed on the desktop) and help their customers.

I've used Microsoft Sharepoint and seen a demonstration of Waves. I generally discount demonstrations -- anyone can have a nice demo -- but Google's impressed me.

The big difference is in the approach. Microsoft has introduced Sharepoint as a way for people who use desktops and the desktop metaphor to keep using them. Google, on the other hand, has positioned Waves as a replacement for e-mail.

Why should I mention e-mail? Because e-mail is a big problem for most organizations. E-mail is a model of the paper-based mail system, and not effective in the computer world. We know the problems with e-mail and e-mail threads  (reading messages from the bottom up, losing attachments, getting dropped from lists) and the problems are not small. Yet we believed that the problem was in ourselves, not the e-mail concept.

Google has a better way. They move away from e-mail and use a different model, a model of a conversation. People can join and leave as they wish. New joiners can review older messages quickly. Everyone has the latest versions of documents.

And here is the difference between Microsoft and Google. Microsoft created a tool -- Sharepoint -- to address a problem. Sharepoint is nice but frustrating to use; it is an extension of the desktop operating system and expensive to administrate. It offers little for the user and has no concept of e-mail or conversations. Google has taken the bold step of moving to a new concept, thinking (rightfully so in my opinion) that the problems of collaboration cannot be solved with the old metaphors. Google has started with the notion of conversation and built from there.

Just as ENIAC was an electronic version of a mechanical adding machine and EDVAC was a true stored-program computer, e-mail is an electronic version of paper mail and Waves is a conversation system. Microsoft is apparently content with e-mail; Google is willing to innovate.


Friday, October 16, 2009

The end of the C++ party

In the future, historians of programming languages will draw a line and say: "this is the point that C++ began its decline". And that point will be prior to today. The party is over for C++, although many of the partygoers are still drinking punch and throwing streamers in the air.

Peter Seibel's blog excerpts comments from the just-released Coders at Work. He lists multiple comments about the C++ language, all of them from detractors.

C++ has had a history of negative comments. Its early history, as a quiet project and before the internet and related twitterness, saw comments about C++ through e-mails and usenet. As people became interested in C++, there were more comments (some positive and some negative) but there was the feeling that C++ was the future and it was the place to go. Negative comments, when made, were directed at either the difficulty of learning a new paradigm (object-oriented programming), the implementation (the compiler and libraries), or the support tools (the IDE and debugger). C++ was the shiny new thing.

The arrival of IBM OS/2 and Microsoft Windows also made C++ attractive. OS/2 and Windows use an event-driven model, and object-oriented programs fare better than procedural programs. Microsoft's support for C++ (among other languages) also made it a "safe" choice.

The novelty of a new programming language is a powerful drug, and C++ was a new language. Managers may have been reluctant to move to it (the risks of unknown territory and longer ramp-up for developers) and some programmers too (charges of larger executables and "inefficient generated code") but eventually we (as an industry) adopted it. The euphoria of the new was replaced with the optimism of the next release: "Yes," we told ourselves, "we're having difficulties, but the problem is in our compiler, or our own expertise. Next year will be better!"

And for a while, the next year was better. And the year after that one was better too, because we were becoming better object-oriented programmers and the compilers were getting better.

But there were those who complained. And those who doubted. And there were those who took action.

Sun introduced Java, another object-oriented programming language. For a while, it held the allure of "the new thing". It had its rough spots (performance, IDE) but we overcame them and newer versions were better. And C++ was no longer the one and only choice for object-oriented programming. (I'm ignoring the earlier languages such as LISP and Scheme. They never entered the mainstream.)

Once we had Java, we could look at C++ in a different light. C++ was not the shining superhero that we desired. He was just another shlub that happened to do some things well. C++ was demoted from "all-wonderful" to "just another tool", much to the delight of the early complainers.

Other languages emerged. Python. Ruby. Objective-C. Haskell. All were object-oriented, but none powerful enough to dislodge C++. The killer (for C++) was Microsoft's C# language. The introduction of C# (and .NET) struck two blows against C++.

First, C# was viewed as a Java clone. Microsoft failed at embracing and extending Java, so they created a direct competitor. By doing so, they gave Java (and its JVM) the stamp of legitimacy.

Second, Microsoft made C# their premier language, demoting C++ below Visual Basic. (Count the number of sample code fragments on the Microsoft web site.) Now Microsoft was saying that C++ wasn't the shiny new thing.

We (in the programming industry) examined our problems with C++, discussed them, debated them, and arrived at a conclusion: most problems have been solved, but the one problem remaining is that C++ is a difficult language. The next version of the compiler will not fix that problem. Nor will more design patterns. Nor will user groups.

The C++ party is over. People are leaving. Not just the folks in Coders at Work, but regular programmers. Companies are finding it hard to hire C++ programmers. Recruiters tell me that C++ programmers want to move on to other things. We as a profession have decided to, if not abandon C++, then at least give it a smaller role.

Which presents a problem for the owners of C++ systems.

The decision to leave C++ has been made at the programmer level. Programmers want out. Very few college graduates learn C++ (or want to learn it).

But the owners of systems (businessmen and managers) have not made the decision to leave C++. For the most part, they want to keep their (now legacy) applications running. They see nothing wrong with C++, just as they saw nothing wrong with C and FORTRAN and COBOL and dBase V. C++ works for them.

In a bizarre, almost Marxist twist, the workers are leaving owners with the means of production (the compilers and IDEs of C++) and moving on to other tools.

C++ has been elevated to the rank of "elder language", joining COBOL and possibly FORTRAN. From this point on, I expect that the majority of comments on C++ will be negative. We have decided to put it out to pasture, to retire it.

There is too much code written in C++ to simply abandon it. Businesses have to maintain their code. Some open source projects will continue to use it. But it will be used grudgingly, as a concession to practicalities. Linux won't be converted to a new language... but the successor to Linux will use something other than C++.


Friday, October 9, 2009

Glass houses

I just went through the experience of renewing my IEEE (and IEEE Computer Society) membership with the IEEE web pages. The transaction was, in a word, embarrassing.

Here is my experience:

- After I logged in, the web site complained that I was attempting to start a second session and left me with an empty window. I had to re-load the renewal page to continue. (Not simply pressing the "reload" button, but re-selecting the IEEE URL.)

- The few pages to process the renewal were straightforward, until I reached the "checkout" page. This page had a collection of errors.

- After entering my credit card number, the site informed me that I had too many characters in the number. I had entered the number with spaces, just as it appears on my credit card and my statements. The site also erased my entry, forcing me to re-enter the entire number.

- I used the "auto-fill" button to retrieve the stored address. The auto-fill did not enter a value for the country, however, and nor could I, as the field was disabled. Only after adjusting the street address could I select a country.

- After clicking the "process" button, the web site informed me that I had an invalid value in the "state/province" field. I dutifully reviewed the value supplied by the auto-fill routine, changed it from "MD" to "MD".

- That action fixed the problem with the state/province field, but the web site then erased my credit card number. After entering the credit card number again (the third time), I was able to renew my membership.

If the IEEE (and by association the IEEE Computer Society) cannot create and maintain a check-out web site, a function that has been with us for the past ten years and is considered elementary, then they have little credibility for advice on software design and construction. More than that, if the IEEE cannot get "the basics" right, how can anyone trust them for the advanced concepts?


Thursday, October 8, 2009

A cell phone is not a land-line phone

When you call a land line, you call a place. When you call a cell phone, you call a person.

I heard this idea at a recent O'Reilly conference. (It was either E-Tech or OSCON, but I don't remember. Nor do I remember the speaker.)

In the good ole days, calling a place was the same as calling a person. Mostly. A typical (working-class) person could have two locations: home and office. To discuss business, you called them at their office. To discuss other matters, you called them at their home.

A funny thing happened on the way to the Twenty-first Century: people became mobile, and technology became mobile too.

Mobility is not a new idea. Indeed, one can look at the technological and social changes of the Twentieth Century to see the trend of increasing mobility. Trains, airplanes, hotels, reservation systems... the arrow points from "stay in one place" to "move among locations". Modern-day cell phones and portable internet tablets are logical steps in a long chain.

People have become mobile and businesses will become mobile too.

Yet many people (and many organizations) cling to the old notion of "a person has a place and only one place". Even stronger is the idea "a business has a place and only one place (except for branch offices and subsidiaries)". Our state and federal governments have coded these notions into laws, with concepts of "state of residence" and "permanent address". Many businesses tie their customers to locations, and then build an internal organization based on that assumption (regional sales reps, for example). For customers that have large physical assets such as factories and warehouses, this makes some sense. But for the lightweight customer, one without the anchoring assets, it does not. (Yet businesses -- and governments -- will insist on a declared permanent address because their systems need it.)

Newer businesses are not encumbered with this idea. Twitter and LiveJournal, for example, don't care about your location. They don't have to assess your property, send tax bills, or deliver physical goods. Facebook does allow you to specify a location, but as a convenience for finding other people in your social network. (Facebook does limit you to one physical location, though, so I cannot add my summer home.)

Some businesses go so far as to tie an account to a physical location. Land-line phones, for one, a holdover from the old billing practice of charging based on distance called. At least one large shipping company uses the "you are always in this place" concept, since it also uses a "charge based on distance" model.

For moving physical boxes in the real world, this may make some sense, but telephone service has all but completely moved to the "pure minutes" model, with no notion of distance. (Calling across country borders is more expensive, but this is a function of politics and rate tariffs and not technology.)

We have separated a person from a single location. Soon we will detach businesses from single locations.


Sunday, October 4, 2009

Limits to Growth

Did you know that Isaac Newton, esteemed scientist and notable man of the Church, once estimated the theoretical maximum height for trees? I didn't, until I read a recent magazine article. It claimed that he calculated the maximum height as 300 feet, using strength and weight formulas.

I have found no other reference to confirm this event, but perhaps the truth of the event is less important than the idea that one can calculate a theoretical maximum.

For trees, the calculation is straightforward. Weight is a function of volume, and can be charted as a line graph. Strength is a function of the cross-section of the tree, and can also be charted as a line graph. The two lines are not parallel, however, and the point at which they cross is the theoretical maximum. (There are a few other factors, such as the density of the wood, and they can be included in the calculation.) The intersection point is the limit, beyond which no tree can grow.

Let's move from trees to software. Are there limits to software? Can we calculate the maximum size of a program or system? Here the computations are more complex. I'm not referring to arbitrary limits such as the maximum number of modules a compiler can handle (although those limits seem to be relegated to our past) but to the size of a program, or of a system of programs.

It's hard to say that there are limits to the size of programs. Our industry, over the past sixty years, has seen programs and systems grow in size and complexity. In the early days, a program of a few hundred lines of code was considered large. Today we have systems with hundreds of millions of lines of code. There seems to be no upper limit.

If we cannot identify absolute limits for programs or systems, can we identify limits to programming teams? It's quite easy to see that a programming team of one person would be limited to the output of a single individual. That individual might be extremely talented and extremely hard-working, or might be an average performer. A team of programmers, in theory, can perform more work than a single programmer. Using simple logic, we could simply add programmers until we achieve the needed capacity.

Readers of The Mythical Man-Month by Fred Brooks will recognize the fallacy of that logic. Adding programmers to a team increases capacity, but also increases the communication load. More programmers need more coordination. Their contributions increase linearly, but coordination effort increases faster than linearly. (Metcalfe's law, which indicates that communication channels increase as the square of the participants, works against you here.) You have a graph with two lines, and at some point they cross. Beyond that point, your project spends more time on communication than on coding, and each additional programmer costs more than they produce.

I don't have numbers. Brooks indicated that a good team size was about seven people. That's probably a shock to the managers of large, multi-million LOC projects and their teams of dozens (hundreds?) of programmers. Perhaps Brooks is wrong, and the number is higher.
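
To make the shape of the curve concrete, here is a toy model in Ruby. The coefficients are invented, not measured -- the point is only that a linear contribution term and a roughly quadratic coordination term must cross somewhere.

    # Toy model, not a measurement: assume each programmer contributes 40
    # units of work per week, and each pair of programmers costs 1 unit of
    # coordination overhead.  The number of pairs grows as n * (n - 1) / 2.
    WORK_PER_PERSON = 40.0
    COST_PER_PAIR   = 1.0

    def net_output(n)
      pairs = n * (n - 1) / 2.0
      n * WORK_PER_PERSON - pairs * COST_PER_PAIR
    end

    (1..101).step(10) { |n| puts "#{n} programmers: net output #{net_output(n).round}" }

    # With these invented coefficients, net output peaks around 40 programmers
    # and goes negative past 80.  The crossover point depends entirely on the
    # coefficients, and those vary from team to team and project to project.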

The important thing is to monitor the complexity. Knowing the trend helps one plan for resources and measure efficiency. Here's my list of important factors. These are the things I would measure:

- The complexity of the data
- The complexity of the operations on the data
- The power of the programming language
- The power of the development tools (debuggers, automated tests)
- The talent of people on the team (programmers, testers, and managers)
- The communication mechanisms used by the team (e-mail, phone, video conference)
- The coordination mechanisms used by the team (meetings, code reviews, documents)
- The rate at which changes are made to the code
- The quality of the code
- The rate at which code is refactored

The last two factors are often overlooked. Changes made to the code can be of high or low quality. High-quality changes are elegant and easy to maintain. Low-quality changes get the work done, but leave the code difficult to maintain. Refactoring improves the code quality while keeping the feature set constant. Hastily-made changes often leave you in a technical hole. These two factors measure the rate at which you are climbing out of the hole. If you aren't measuring these two factors, then your team is probably digging the hole deeper.

So, as a manager, are you measuring these factors?

Or are you digging the hole deeper?


Wednesday, September 23, 2009

Why IT has difficulty with estimates

Estimating has always been a difficult task in IT. Especially for development efforts. How long will it take to write the program? How much will it cost? How many people do we need? For decades, we have struggled with estimates and project plans. Development projects run over allotted time (and over allotted budget). Why?

I observe that the problem with estimates is on the development side of IT. The other major parts of IT, support and operations, have loads that can be reliably estimated. For support, we have experience with the number of customers who call and the complexity of their issues. For operations, we have the experience of nightly jobs and the time it takes to run them. It's only on the development side, where we gather requirements, prepare designs, and do the programming that we have the problem with estimates. (I'm including testing as part of the development effort.)

The process of estimation works for repeated tasks. That is, you can form a reasonable estimate for a task that you have performed before. The more often you have performed the task, the better your estimate.

For example, most people have very good estimates for the amount of time they need for their morning commute. We know when to leave to arrive at the office on time. Every once in a while our estimate is incorrect, due to an unforeseen event such as traffic delays or water main breaks, but on average we do a pretty good job.

We're not perfect at estimates. We cannot make them out of nothing. We need some initial values, some basis for the estimate. When we are just hired and are making our first trips to the new office, we allow extra time. We leave early and probably arrive early -- or perhaps we leave at what we think is a good time and arrive late. We try different departure times and eventually find one that works for us. Once we have a repeating process, we can estimate the duration.

Hold that thought while I shift to a different topic. I'll come back to estimates, I promise.

The fundamental job of IT is to automate tasks. The task could be anything, from updating patient records to processing the day's sales transactions. It could be monitoring a set of servers and restarting jobs when necessary. It could be serving custom web pages. It is not a specific kind of task that we automate, it is the repetition of *any* task.

Once we identify a repeating task, we automate it. That's what we do. We develop programs, scripts, and sometimes even new hardware to automate well-defined, repeating tasks.

Once a task has been automated, it becomes part of operations. As an operations task, it is run on a regular schedule with an expected duration. We can plan for the CPU load, network load, and other resources. And it is no longer part of the development task set.

The repeating tasks, the well-defined tasks, the tasks that can be predicted, move to operations. The tasks remaining for development -- the ones that need estimates -- are the ones that are not repeating. They are new. They are not well-defined. They cover unexplored territory.

And here's where estimates come back into the discussion. Since we are constantly identifying processes that can be automated, automating them, and moving them from development to operations, the well-defined, repeatable tasks fall out of the development basket, leaving the ill-defined and non-repeating tasks. These are the tasks that cannot be estimated, since they are not well-defined and repeating.

Well, you *can* write down some numbers and call them estimates. But without experience to validate your numbers, I'm not sure how you can call them anything but guesses.


Monday, September 21, 2009

Your data are not nails

Data comes in different shapes and sizes, and with different levels of structure. The containers we select for data should respect those shapes and sizes, not force the data into a different form. But all too often, we pick one form for data and force-fit all types of data into that form. The result is data that is hard to understand, because the natural form has been replaced with the imposed form.

This post, for example, is small and has little structure (beyond paragraphs, sentences, and words). The "natural" form is the one you're reading now. Forcing the text into another form, such as XML, would reduce our comprehension of the data. (Unless we converted the text back into "plain" format.)

One poor choice that I saw (and later changed) was the selection of XML for build scripts. It was a system that I inherited, one that was used by a development team to perform the compile and packaging steps for a large C++/MFC application. 

The thinking behind the choice of XML was twofold: XML allowed for some structure (it was thought there would be some) and XML was the shiny new thing. (There were some other shiny new things in the system, including Java, a web server, RMI, EJB, and reflection. It turns out that I got rid of all of the shiny things and the build system still worked.)

I can't blame the designers for succumbing to XML. Even Microsoft has gone a bit XML-happy with their configuration files for projects in Visual Studio.

It's easy to pick a single form and force all data into that form. It's also comfortable. You know that a single tool (or application) will serve your needs. But anyone who has used word processors and spreadsheets knows that the form of data lets us understand it.

Some data is structured, some is free-flowing. Some data is large, some is small. Some data consists of repeated structures; other data has multiple structured items, each with its own structure.

For build scripts, we found that text files were the most understandable, most flexible, and most useful form. Scripts are (typically) of moderate size. Converting the XML scripts to text saw the size of scripts shrink, from 20,000 lines to about 2200 lines. The smaller scripts were much easier to maintain, and the time for simple changes dropped from weeks to hours. (Mostly for testing. The time for script changes dropped to minutes.)

Small data sets with little or no structure fit well in text files. Possibly INI files, which have a little more structure to them.

Small to medium data sets with heavy structure fit into XML files.

Large data sets with homogeneous items fit well in relational databases.

Large data sets with heterogeneous items fit better into network databases or graph databases. (The "No SQL" movement can give you information about these databases.)

Don't think of all data as a set of nails, with your One True Format as the hammer. Use forms that make your team effective. Respect the data, and it will respect you.


Wednesday, September 16, 2009

My brain is now small

My brain is now smaller than the programming languages I use.

I started programming in 1976, on a PDP-8/e computer with timeshare BASIC. (Real BASIC, not this Visual Basic thing for wimps. BASIC has line numbers and variable names limited to a letter and an optional digit.)

Since then I have used various languages, operating systems, and environments. I've used HDOS, CP/M, and UCSD p-System on microcomputers (with BASIC, 8080 assembly language, and C), DECsystem-10s (with TOPS-10 and FORTRAN and Pascal), MS-DOS (with C and C++), and Windows (with C++, Java, Perl, and a few other languages).

In each case, I have struggled with the language (and run-time, and environment). The implementation has been the limiting factor. My efforts have been (mostly) beating the language/environment/implementation into submission to get the job done.

That has now changed.

I've been using the Ruby language for a few small projects. Not Ruby on Rails, which is the web framework that uses Ruby as the underlying language, but the Ruby language itself. It is a simple scripting language, like Perl or Python. It has a clean syntax and object-oriented concepts are baked in, not stuck on like in Perl. But that's not the most important point.

Ruby lets me get work done.

The language and run-time library are simple, elegant, and most importantly, capable. It lets me do work and stays out of my way. This is a pleasant change.

And somewhat frightening.

With Ruby, the limiting factor is not the language. The limiting factor is my programming skills. And while my programming skills are good, they are the result of working with stunted languages for the past thirty-odd years. The really neat stuff, the higher-order programming concepts, I have not learned, because the languages that I used could not support them.
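
A small, made-up example of the kind of thing I mean -- behavior passed around as blocks, and methods that accept behavior as an argument:

    # higher-order style: blocks passed to methods instead of hand-written loops
    words = %w[alpha beta gamma delta epsilon]

    long_words = words.select { |w| w.length > 4 }            # keep some
    shouted    = long_words.map { |w| w.upcase }              # transform
    total      = words.inject(0) { |sum, w| sum + w.length }  # accumulate

    puts shouted.inspect   # ["ALPHA", "GAMMA", "DELTA", "EPSILON"]
    puts total             # 26

    # and a method of my own that takes behavior as an argument
    def repeatedly(count)
      count.times.map { |i| yield(i) }
    end

    puts repeatedly(3) { |i| "line #{i}" }.inspect   # ["line 0", "line 1", "line 2"]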

Now I can.

I'm frightened, and excited.


Monday, September 14, 2009

Destination Moon

I recently watched the classic movie "Destination Moon". In this movie, the government convinces a set of corporations to design, engineer, and build a rocket that can fly to the moon. It's an interesting movie, albeit a period piece with its male-dominated cast, suits and ties, and special effects. It is technically accurate (for 1950) and tells an enchanting tale.

What is most interesting (and possibly scary) is the approach to project management. The design team, with experience in airplanes and some with stratospheric rockets, builds a single moon rocket in a single project. That's a pretty big leap.

The actual moon missions, run by NASA in the 1960s, took smaller steps. NASA ran three different programs (Mercury, Gemini, and Apollo) each with specific goals. Each program consisted of a number of missions, and each mission had a lot of work, including tests.

Yet for the movies, the better tale is of some bright industrial engineers who build a rocket and fly it to the moon. The movie heightens the sense of suspense by avoiding the tests. It's much more exciting to launch an *untested* rocket and fly it to the moon!

For big projects, NASA has the better approach: small steps, test as you go, and learn as you go. That goes for software projects too.

Anyone involved in software development for any reasonable length of time has been involved in (or near to) a "big-bang" project. These projects are large, complex, and accompanied by enthusiasm and aggressive schedules. And often, they are run like the project in "Destination Moon": build the system all at once. The result is usually disappointing, since large complex systems are, well, large and complex. And we cannot think of everything. Some things (often many things) are left out, and the designed system is slow, incomplete, and sometimes incorrect.

Agile enthusiasts call this the "Big Design Up Front" method, and consider it evil. They prefer to build small portions, test the portions, and learn from them. They would be comfortable on the NASA project: fly the Mercury missions first and learn about orbiting the earth, then fly the Gemini missions and learn about docking and working outside the ship, and finally build the Apollo missions and go to the moon. On each mission, the folks at NASA learned more about the problem and found solutions to the problems. The Apollo missions had problems, but with the knowledge from earlier missions, NASA was able to succeed.

The "Destination Moon" style of project management is nice for movies, but for real-life projects, NASA has the better approach.


Friday, September 11, 2009

Opening the gates for Windows

The SystemMax PC from TigerDirect arrived today. This is a 2.4 GHz dual core, 2 GB RAM, 160 GB DASD unit. I ordered it for Windows 7, and set the PC up this afternoon.

Installing Windows 7RC was fairly easy. Microsoft has made the install just as good as the installs for most Linux distros. The PC started, booted off the DVD, started the install program, detected the video card and internet interface, asked a few questions, and ran for about ten minutes. The process did require two restarts. When it was finished, I had Windows 7 running.

I downloaded and installed the Microsoft Word Viewer and Microsoft Excel Viewer. I also downloaded and installed Visual C# Express and Web Developer Express. They included SQL Server Express and the Silverlight runtime libraries.

This is a big event in my shop. I banished Windows back in 2005 and have used nothing but Linux since then. Now Windows is back on my network.

Why allow Windows? To get experience with Microsoft SQL Server. Lots of job posts list it as a required skill. I will do what it takes to get a good job.


Monday, September 7, 2009

Application Whitelist Blues

The latest security/compliance tool for desktop PCs is the application whitelist. This is an administration utility that monitors every application that runs (or attempts to run) and allows only those applications that are on the approved list.

Companies like Bit9 sell application whitelist tools. They have slick advertising and web pages. Here's a quote from Bit9's web page: "Imagine issuing a computer to a new employee and getting it back two years later - with nothing else installed on it but the applications and patches you had centrally rolled out." Appealing, no?

Whitelists will certainly prevent the use of unapproved programs. CIOs will have no worries about their team using pirated software. CIOs will know that their team is not using software that has been downloaded from the internet or smuggled in from home. They will have no worries about their workers bringing in virus programs on USB drives.

Of course, application whitelist techniques are incomplete. They claim to govern the use of programs. But what is a program?

Certainly .EXE files are programs. And it's easy to convince people that .DLL files are executables. (Microsoft considers them as such.)

Java .class files are programs. They must be "run" by the JVM and not directly by the processor, but they are programs. So are .NET executables (which usually sport the extension .EXE); they are "executed" by Microsoft's CLR, the counterpart to the Java JVM. Application whitelists will probably govern individual .NET programs (since they are launched with the .EXE extension) but not individual Java applications. They will probably stop at the JVM.

Perl, Python, and Ruby programs are text, run by their interpreters. Like Java and .NET programs they are compiled into bytecode, but at the moment they are read rather than in advance. They can be considered programs. I'm fairly sure that application whitelists won't govern individual Perl scripts; they will allow or disallow the Perl interpreter. If you can run Perl, then you can run any Perl script.
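
To make that concrete, here is a minimal sketch (in Python rather than Perl) of why approving an interpreter effectively approves everything it can run. The "whitelist" here is just a set of paths standing in for a real tool; none of the names come from any actual product.

    import subprocess
    import sys

    # The "whitelist": a plain set of approved binaries. The interpreter
    # itself is on the list, because some approved application needs it.
    approved = {sys.executable}

    def launch(program, *args):
        if program not in approved:
            raise PermissionError(program + " is not on the whitelist")
        return subprocess.run([program, *args], check=True)

    # The interpreter is approved, so any script -- or code typed on the
    # command line -- sails right through the check.
    launch(sys.executable, "-c", "print('arbitrary, unreviewed code just ran')")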

And what about spreadsheets? Spreadsheets are not too far from Perl scripts, in terms of their program-ness. Spreadsheets are loaded and "executed" by an "interpreter" just like a Perl script is loaded and executed. I doubt that application whitelists will govern individual spreadsheets.

From what I can see, programs like Bit9's "Parity" application whitelist monitor .EXE files and nothing more.

OK, enough with the philosophical discussion of applications.

Here's the real danger of application whitelists: if they govern the use of programs (however we define programs) and the lists are created by governance committees (or software standards committees, or what have you), then they limit the creativity of the entire organization. Everyone will be constrained to the approved list of software. No one in the organization will be able to think outside the approved-software box.

The entire organization will be as creative as the governance committee lets them be.

And not one bit more.


Wednesday, September 2, 2009

Microsoft thinks inside the document

The Microsoft world revolves around the word processor. Its collective mindset is one of individuals working on documents, with little communication between people. And why not? Microsoft Word (and Excel, and Powerpoint) made Microsoft successful.

The earliest text processors were written in the 1950s, but word processors as we know them today were created in the late 1970s. Products such as Electric Pencil and Wordstar defined the notion of interactive word processing: a person typing on a keyboard and looking at a document on the screen. The term "WYSIWYG" came into usage, as batch text processors grew into interactive word processors.

The mental model is one of the struggling author, or the hard-bitten newspaper reporter, slogging away on a manual typewriter. It is the individual preparing a Great Work.

And this is the model that Microsoft follows. Most of its products follow the idea of an individual working on a document. (Whether they are Great Works or not remains to be seen.) Microsoft Word follows it. Microsoft Excel follows it, with the twist that the document is not lines of text but cells that can perform math. Microsoft Project and Microsoft Powerpoint follow the individual model, with their slight twists. Even Visual Studio, Microsoft's IDE for programmers, uses this mental model. (Not every product: SQL Server and Microsoft's on-line games, for example, are made for sharing data.) Even Sharepoint, the corporate intranet web/share/library system, is geared for storing documents created or edited by individuals.

The idea that Microsoft misses is collaboration. The interactive word processor is thirty years old, yet Microsoft still follows the "user as individual" concept. In the late 1970s, sharing meant exchanging floppy disks or posting files on a bulletin board. In 2009, sharing to Microsoft means sending e-mails or posting files on a Sharepoint site.

The latest product that demonstrates this is Sketchflow, a tool for analysts and user experience experts to quickly create mock-ups of applications. It's a nice idea: create a tool for non-programmers to build specifications for the development team.

Microsoft uses a Silverlight application to let a person build and edit a description. They can specify windows (or pages, if you want to think of them that way); place buttons, text boxes, and other controls on the window; and link actions to buttons and connect pages into a sequence. (It sounds a lot like Visual Studio, doesn't it? But it isn't.)

I can see business analysts and web designers using Sketchflow. It makes sense to have a tool to quickly build a wireframe and let users try it out.

Microsoft misses completely on the collaboration aspect. Each Sketchflow project is a separate thing, owned by a person, much like a document in MS Word. Sharing means sending the project (probably by e-mail) to the reviewers, who use it and then send it back with notes. That works for one or maybe two users, but once you have more reviewers the coordination work becomes untenable. (Consider sending a document to ten reviewers, receiving ten responses, and then combining all of their comments. Even with change-tracking enabled.)

There is no attempt at on-line collaboration. There is no attempt at multi-review comment reconciliation. The thinking is all "my document and your comments".

The word "collaboration" seems to be absent from Microsoft's vocabulary: a review of the Microsoft-hosted web sites that describe Sketchflow omit the word "collaboration" or any of its variants.

It's time that we take the "personal" out of "personal computer" and start thinking of collaboration. People work together, not in complete isolation. Google and Apple have taken small steps towards collaborative tools. Will Microsoft lead, follow, or at least get out of the way?

Thursday, August 27, 2009

If at first...

The movie "The Maltese Falcon" (with Humphery Bogart, Mary Astor, Peter Lorre, and Sydney Greenstreet) is widely recognized as a classic.

Yet the 1941 movie did not simply spring into existence. There were two predecessors: a 1931 version also called "The Maltese Falcon" and a 1936 remake called "Satan Met a Lady". (Both of which were based on the novel by Dashiell Hammett.)

All three of the movies were made by Warner Brothers. The first two have been cast into oblivion; the third remains with us.

The third movie was a success for a number of reasons:

- Warner Brothers tried different approaches with each movie. The first movie was a serious drama. The second movie was light-hearted, almost to the point of comedy. The third movie, like Goldilocks' porridge, was "just right".

- They had the right technologies. The first version used modern (for the time) equipment, but Hollywood was still feeling its way with the new-fangled "sound" pictures. The first version relied on dialog alone; the classic version used sound and music to its advantage.

- They took the best dialog from the first two versions. The third movie was dramatic and witty, combining the straight drama of the first and the comic aspects of the second.

- They had better talent. The acting, screenwriting, and camerawork of the third movie is significantly better than the first two efforts.

In the end, Warner Brothers was successful with the movie, but only after trying and learning from earlier efforts.

Perhaps there is a lesson here for software development.

Monday, August 24, 2009

Dependencies

One of the differences between Windows and Linux is how they handle dependencies in software.

Linux distros have a long history of managing dependencies between packages. Distros include package managers such as 'aptitude' or 'YAST' which, when you select products for installation, identify the necessary components and install them too.
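
A rough sketch of the core idea, in Python, with an invented dependency table (real package managers do much more, of course: versions, conflicts, downloads):

    # Toy dependency resolution: walk the graph so that every dependency is
    # installed before the package that needs it. No cycle detection, no
    # version handling -- just the central idea.
    DEPENDS = {
        "myapp":     ["webserver", "database"],
        "webserver": ["ssl"],
        "database":  ["ssl"],
        "ssl":       [],
    }

    def install_order(package, order=None):
        if order is None:
            order = []
        for dep in DEPENDS.get(package, []):
            install_order(dep, order)
        if package not in order:
            order.append(package)
        return order

    print(install_order("myapp"))   # ['ssl', 'webserver', 'database', 'myapp']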

Windows, on the other hand, has practically no mechanism for handling dependencies. In Windows, every install package is a self-contained thing, or at least views itself that way.

This difference is possibly due to the history of product development. Windows (up until the .NET age) has had a fairly flat stack for dependencies. To install a product, you had to have Windows, MFC, and... that was about it. Everything was in either the original Windows box or the new product box.

Linux has a larger stack. In Linux, there is the kernel, the C run-time libraries, the X windowing system, Qt, the desktop manager (usually KDE or Gnome), and possibly other packages such as Perl, Python, Apache, and MySQL. It's not uncommon for a single package to require a half-dozen other packages.

The difference in dependency models may be due to the cost model. In Linux, and open source in particular, there is no licensing cost to use a package in your product. I can build a system on top of Linux, Apache, MySQL, and Perl (the "LAMP stack") and distribute all components. (Or assume that the user can get the components.) Building a similar system in Microsoft technologies would mean that the customer must have (or acquire) Windows, IIS, SQL Server, and, umm... there is no direct Microsoft replacement for Perl or another scripting language. (But that's not the point.) The customer would have to have all of those components in place, or buy them. It's a lot easier to leverage sub-packages when they are freely available.

Differences in dependency management affect more than just package installation.

Open source developers have a better handle on dependencies than developers of proprietary software. They have to -- the cost of not using sub-packages is too high, and they have to deal with new versions of those packages. In the proprietary world, the typical approach I have seen is to select a base platform and then freeze the specification of it.

Some groups carry the "freeze the platform" method too far. They freeze everything and allow no changes (except possibly for security updates). They stick to the originally selected configuration and prevent updates to their compilers, IDEs, database managers, ... anything they use.

The problem with this "freeze the platform" method is that it doesn't work forever. At some point, you have to upgrade. A lot of shops are buying new PCs and downgrading Windows Vista to Windows XP. (Not just development shops, but let's focus on them.) That's a strategy that buys a little time, but eventually Microsoft will pull the plug on Windows XP. When the time comes, the effort to update is large -- usually a big, "get everyone on the new version" project that delays coding. (If you're in such a shop, ask folks about their strategy for migrating to Windows Vista or Windows 7. If the answer is "we'll stay on Windows XP until we have to change", you may want to think about your options.)

Open source, with its distributed development model and loose specification for platforms, allows developers to move from one version of a sub-package to another. They follow the "little earthquakes" model, absorbing changes in smaller doses. (I'm thinking that the use of automated tests can ease the adoption of new versions of sub-packages, but have no experience there.)
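
One way that might look: a small test that records what the project relies on from a sub-package, so that upgrading the package and re-running the suite reveals breakage immediately. This is only an illustration; the standard 'json' module stands in for whatever dependency is being upgraded.

    import json

    # Capture the behavior we depend on. If a new version of the sub-package
    # changes this behavior, the test fails during the upgrade, not in
    # production.
    def test_roundtrip():
        data = {"id": 42, "tags": ["a", "b"]}
        assert json.loads(json.dumps(data)) == data

    if __name__ == "__main__":
        test_roundtrip()
        print("dependency still behaves as expected")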

A process that develops software on a fixed platform will yield fragile software -- any change could break it. A process to handle dependencies will yield a more robust product.

Which would you want?

Thursday, August 20, 2009

Systems are interfaces, not implementations

When building a program, the implementation is important. It must perform a specific task. Otherwise, the program has little value.

When building a system, it is the interfaces that are important. Interfaces exist between the components (the implementations) that perform specific tasks.

Interfaces define the system's architecture. Good interfaces will make the system; poor interfaces will break it.

It is much easier to fix a poorly-designed component than a poorly-designed interface. A component hides behind an interface; its implementation is not visible to the other components. (By definition, anything that is visible to other components is part of the interface.) Since the other components have no knowledge of the innards, changing the innards will not affect them.

On the other hand, an interface is visible. Changing an interface requires changes to the one component *and* (potentially) to any module that uses it. (Some components have large, complex interfaces; a minor change may affect many consuming components, or only a few.) Changes to interfaces are more expensive (and riskier) than changes to implementations.
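
A small sketch of that asymmetry, in Python, with invented names: the caller depends only on the interface (put and get), so the implementation behind it can be rewritten freely, while renaming put or get would break every caller.

    class MemoryStorage:
        """One implementation of the put/get interface."""
        def __init__(self):
            self._data = {}
        def put(self, key, value):
            self._data[key] = value
        def get(self, key):
            return self._data.get(key)

    class FileStorage:
        """Completely different innards, same interface."""
        def __init__(self, path):
            self._path = path
        def put(self, key, value):
            with open(self._path, "a") as f:
                f.write(key + "=" + value + "\n")
        def get(self, key):
            value = None
            with open(self._path) as f:
                for line in f:
                    k, _, v = line.rstrip("\n").partition("=")
                    if k == key:
                        value = v   # last write wins
            return value

    def caller(storage):
        # This code knows only the interface; swap the implementation
        # and nothing here changes.
        storage.put("greeting", "hello")
        return storage.get("greeting")

    print(caller(MemoryStorage()))
    print(caller(FileStorage("kv.txt")))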

Which is why you should pay attention to interfaces. As you build your system, you will get some right and some wrong. The wrong ones will cost you time and quality. You need a way to fix them.

Which doesn't mean that you can ignore implementations. Implementations are important; they do the actual work. Often they have requirements for functionality, accuracy, precision, or performance. You have to get them right.

Too many times we focus on the requirements of the implementations and ignore the interfaces. When a development process is driven by "business requirements" or "functional requirements" then the focus is on implementations. Interfaces become a residual artifact, something that "goes along for the ride" but isn't really important.

If you spend all of your time implementing business requirements and give no thought to interfaces, you will build a system that is hard to maintain, difficult to expand, and poor in its user experience.

Monday, August 17, 2009

Teaching does not cause learning

Our society has done a pretty good job at Taylorizing the learning experience. It's structured, it's ordered, it's routine, and it's efficient. And it's quite ineffective. (For proof, compare the results of US children against those of other nations.)

I learn best when I can explore and make mistakes. The mistakes are the important part. I learn only when I make a mistake. When I do something and get the wrong results, I try different methods until one works. That's how I learn. Apparently I'm not alone in this.

In the book Bringing Design to Software, Schön and Bennet's essay describes the results of students with two computer programs. One (named "McCavity") was designed to be a tutoring program and the other ("GrowlTiger") was created as a simple design tool. Both were for engineering students.

The results were surprising. The students found McCavity (the tutoring program) boring and were more interested in GrowlTiger.

Maybe the results were not so surprising. The tutoring program provided students with information but they had little control over the delivery. The design program, on the other hand, let students explore. They could examine the problems foremost in their minds.

Exploring, trying things, and making mistakes. That's how I learn. I don't learn on a schedule.

You can lead a horse to water, but you can't make him drink. And you can lead students to knowledge, but you can't stuff it down their throats.

Sunday, August 16, 2009

Why leave C++?

A recruiter asked me why I wanted to move away from C++. The short answer is that it is no longer the shiny new thing. The longer answer is more complex.

Why move away from C++?

For starters, let's decide on where I'm moving *to*. It's easy to leave C++ but harder to pick the destination. In my mind, the brighter future lies with C#/.NET, Java, Python, and Ruby.

First reason: C++ is tied to hardware. That is, C++ is the last of the big languages to compile to the processor level. C#, Java, Perl, Python, and Ruby all compile to a pseudo-machine and run on interpreters. C# runs in the Microsoft CLR, Java runs on the JVM, and so on. By itself, this is not a problem for C++ but an advantage: C++ programs run more efficiently than the later languages. Unfortunately for C++, the run-time efficiency is not enough to make up for development costs.

Second reason: Easier languages. C++ is a hard language to learn. It has lots of rules, and you as a developer must know them all. The later languages have backed off and use fewer rules. (Note for C#: you're getting complicated, and while not at C++'s level of complexity, you do have a lot to remember.)

Third reason: Garbage collection. The later languages all have it; C++ does not (unless you use an add-in library). In C++ you must delete() everything that you new(); in the later languages you can new() and never worry about delete(). Not worrying about deleting objects lets me focus on the business problem.

Fourth reason: Better tools. Debugging and testing tools can take advantage of the interpreter layer. Similar tools are available in C++ but their developers have to work harder.

Fifth reason: Platform independence isn't that important. The big advantage of C++ is platform independence; all of the major platforms (Windows, Mac, Linux, Solaris) have ANSI-compliant compilers. And the platform independence works, at the command-line level. It doesn't extend to the GUI level. Microsoft has its API for Windows, Mac has its API, Solaris usually uses X, and Linux uses X but often has Gnome or KDE on top of X.

Sixth reason: Developer efficiency. I'm much more effective with Perl than with C#, and more effective with C# than C++. C++ is at the bottom of the pile, the programming language that takes me the longest time to implement a solution. It's usually better for me (and my clients) to get a program done quickly. I can complete the assignment in Perl in half a day, in C#.NET in a day, and in C++ in two or more days. (This does depend on the specifics of the task.)

Seventh reason: Fit with web technologies. C++ fits poorly with the web frameworks that are emerging, especially for cloud computing. Yes, you can make it work with enough effort. But the later languages make it work with less effort.

Eighth reason: Applications in later languages have less cruft. This is probably a function of time and not language design. Cruft accumulates over time, and the applications written in later languages have had less time to accumulate cruft. I'm sure that they will. But by then, the older C++ applications will have accumulated even more cruft. And cruft makes maintenance harder.

Ninth reason: Management support. I've observed that managers support projects with the newer languages better than projects in C++. This is possibly because the applications in newer languages are newer, and the management team supports the newer applications. By 'support', I mean 'provide resources'. New applications are given people, money, and technology; older applications are put into 'maintenance mode' with limited resources.

So there are my reasons for leaving C++. None of these reasons are tied directly to C++; in fact, I expect to see many of the same problems with newer applications in the next few years. Look for another article, a few years hence, on why I want to leave C#.

Monday, August 10, 2009

Consumers drive tech

I'm not sure when it happened, but some time in the past few years consumers have become the drivers for technology.

In the "good old days", technology such as recording equipment, communication gear, and computing machinery followed a specific path. First, government adopted equipment (and possibly funded the research), then corporations adopted it, and finally consumers used watered-down versions of the equipment. Computers certainly followed this path. (All computers, not just microcomputers or PC variants.)

The result was that government and large corporations had a fairly big say in the design and cost of equipment. When IBM was selling mainframes to big companies (and before they sold PCs), they would have to respond to the needs of the market. (Yes, IBM was a bit of a monopoly and had market power, so they could decide some things.) But the end result was that equipment was designed for large organizations, with diminutive PCs being introduced after the "big" equipment. Since the PCs came later, they had to play with the standards set by the big equipment: PC screens followed the 3270 convention of 25 lines and 80 characters, the original discs for CP/M were IBM 3740 compatible, and the original PC keyboard was left-over parts from the IBM System/23. CP/M took its design from DEC's RT-11 and RSX-11 operating systems, and PC-DOS was a clone of CP/M.

But the world has changed. In the twenty-first century, consumers decide the equipment design. Cell phones, internet tablets, and iPhones are designed and marketed to individuals, not companies. (The one exception is possibly the Blackberry devices, which are designed to integrate into the corporate environment.)

The typical PC purchased for home use is more powerful than the typical corporate PC. I myself saw this effect when I purchased a laptop PC. It had a faster processor, more memory, and a bigger screen than my corporate-issued desktop PC. And it stayed in front for several years. Eventually an upgrade at the office surpassed my home PC... but it took a while.

Corporations are buying the bargain equipment, and consumers are buying the premium stuff. But it's more than hardware.

Individuals are adopting software, specifically social networking and web applications, much faster than companies and government agencies. If you consider Facebook, Twitter, Dopplr, and LiveJournal, it is clear that the major design efforts are for the consumer market. People use these applications and companies do not. The common office story is often about the new hire just out of college, who looks around and declares the office to be medieval, since corporate policies prevent him from checking his personal e-mail or using Twitter.

With consumers in the driver's seat, corporations now have to use equipment that is first designed for people and somehow tame it for corporate use. They tamed PCs, but that was an easy task since PCs were derived from the bigger equipment. New items like iPhones have always been designed for consumers; integrating them will be much harder.

And there's more. With consumers getting the best and corporations using the bargain equipment, individuals will have an edge. Smaller companies (say, two guys in a garage) will have the better equipment. They've always been able to move faster; now they will have two advantages. I predict that smaller, nimbler companies will arise and challenge the existing companies.

OK, that's always happening. No surprise there. I think there will be more challengers than before.

Friday, July 31, 2009

RIP Software Development Conference

I am behind the times. Not in the loop. Uninformed.

Techweb killed the SD conferences. These were the "Software Development" conferences that I liked. (So much that I would pay my own way to attend them.)

Techweb killed them back in March, shortly after the "SD West 2009" con.

Here's an excerpt of the announcement that Techweb sent to exhibitors.

Due to the current economic situation, TechWeb has made the difficult decision to discontinue the Software Development events, including SD West, SD Best Practices and Architecture & Design World. We are grateful for your support during SD's twenty-four year history and are disappointed to see the events end.

Developers remain important to TechWeb, and we encourage you to participate in other TechWeb brands, online and face-to-face, which include vibrant developer communities:
...
Again, please accept our sincerest gratitude for your time, effort and contributions over the years. It is much appreciated.


The full text is posted on Alan Zeichick's blog.

The SD shows were inspirational. They brought together people with the one common interest of writing software. The shows were not sponsored by a single company, nor did they focus on one technology. People came from different industries to discuss and learn about all aspects of software. As one fellow-attendee said: the conferences were "ecumenical".

While I'm saddened at the loss (and bemused at my ignorance of their demise), I'm disappointed with Techweb's approach. Their announcement is bland and uninspiring. The brutal utility of the message tells us of Techweb's view of its mission: running conferences efficiently (and profitably). Their announcement can be paraphrased: "SD was not profitable, so we killed it. We've got these other shows; please spend money on them."

In contrast, the O'Reilly folks have a very different mission: building a community. They run conferences for the community, not as their means of existence. (They also publish books, host web sites, and run other events.) If a conference should become unprofitable, then it becomes a drag on their mission of building community and I would expect them to cancel it. But here's the difference: I would expect O'Reilly to provide another means for people to meet and discuss and learn, and I would expect O'Reilly to phrase their announcement in a more positive and inspirational light. Something along the lines of:

We've been running the (fill in the name) conference for (number) years, bringing people together and building the community. In recent years, the technology and the environment have changed, and the conference is no longer the best way to meet the needs of the practitioners, presenters, and exhibitors. We're changing our approach, and creating a new (whatever the new thing is) to exchange experiences and learn from each other. We invite you to participate in this new aspect of our community.

OK, that's not the perfect announcement, but it's much closer to what I want from conference organizers.

I'm not a marketing expert; I'm a programmer. But I know what I want: Someone who listens. O'Reilly does that. I'm not sure that Techweb does.

Tuesday, July 28, 2009

Last century's model?

When it comes to processes, commercial software development is living in the industrial age. And it doesn't have to be that way.

Lots of development shops use the standard "integration" approach to building software. Small teams build components, which are then integrated into assemblies, which are then integrated into subsystems, which then become systems, which are then assembled into the final deliverable. At each step, the component/assembly/subsystem/system is tested and then "released" to the next higher team. Nothing is released until it passes the group's quality assurance process.

This model resembles (one might say "duplicates") the process used for physical entities by defense contractors, automobile assembly plants, and other manufacturers. It leads to a long "step" schedule, with each level waiting on the release of the pieces below it.

But does it really have to be that way?

For the folks building a new fighter plane, the model makes sense. If I'm working on a new jet engine, you can't have it, because it can be in only one physical place at any given time. I need it until I'm done. There's no way I can work on it and let you have a copy. (Oh, we could build two of them, but that would add a great deal of expense and the synchronization problems are significant.) We are limited to the "build the pieces and then bring them together" process.

Once the complete prototype is proven, we can change to an asynchronous assembly process, one that creates a large number of the components and assembles them as needed. But for software, that amounts to duplicating the "golden CD" or posting the files on the internet.

Software is different from physical assemblies. You *can* have a copy while I work on it. Software is bits, and easily copied. Rather than hold a component in hiding, you can make the current version available to everyone else on the project. Other teams can integrate the latest version into their component, test it, and give you feedback. You would get feedback faster, since you don't have to wait for the "integration phase" (when you've committed to the design of your component).

And this is in fact what many open source projects do. They make their latest code available... to anyone. The software projects that I have seen have two sets of available code: the latest build (usually marked as "development" or "unstable") and the most recent "good" release (marked as "stable").

If you're on a project (or running a project) that uses the old "build pieces independently and then bring them together" process, you may want to think about it.

Sunday, July 26, 2009

Just what is a cloud, anyway?

The dominant theme at last week's OSCON conference was cloud computing. So what can I say about cloud computing?

As I see it, "cloud computing" is a step towards a the commoditization of computing power. The cloud model moves server hardware and base software out of corporate data centers and into provider data centers. Clients (mostly corporations) can use cloud computing services "on demand", paying more as they use more and less as they use less.

Cloud computing services fill the second tier, between the front-end browser and the legacy back-end processing systems. This is where web processing is today.

The big players have signed on to this new model. Microsoft has its "Azure" offering, Amazon.com has EC2 and S3, and Google has its App Engine.

Cloud providers are similar to the early electric companies. They build and operate the generators and transmission lines, and insist on meters and bills.

Moving into the cloud requires change. Your applications must be ready to work in the cloud. They have to talk to the cloud APIs and be designed to run in multiple instances. Indeed, one of the features of the cloud is that new instances of your application can come on-line as you request more computing power.
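
A sketch of what "designed to run in multiple instances" usually means: keep no state inside the process, so the provider can start (or stop) copies of the application at will. The shared store below is a stand-in for whatever the platform actually offers -- a managed database, a distributed cache -- not any particular vendor's API.

    class SharedStore:
        """Pretend this talks to an external service, not local memory."""
        def __init__(self):
            self._data = {}
        def incr(self, key):
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]

    STORE = SharedStore()

    def handle_request(user_id):
        # No instance-local counters or sessions: all state goes through
        # the shared store, so any instance can answer any request.
        visits = STORE.incr("visits:" + user_id)
        return "hello " + user_id + ", visit number " + str(visits)

    print(handle_request("alice"))
    print(handle_request("alice"))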

Like the early electricity companies, each provider has its own API. Apps for the Amazon.com platform cannot be (easily) transferred to Google. And Microsoft not only has its own API but uses the .NET platform with its development languages. I suspect that common APIs will emerge, but only after time and possibly with government assistance. (Just as the government set standards for control pedals in automobiles.)

I suspect that apps on the cloud will be different from today's apps. Mainframes had their standard apps: accounting and finance, mainly. When minicomputers arrived, people ported the accounting apps to minis with some success but also created new applications such as word processing. Later, PCs arrived and absorbed the word processing market but also saw new apps such as spreadsheets. Networked PCs created e-mail but left the old apps in stand-alone mode. The web saw new applications like LiveJournal and Facebook. (OK, yes, I know that e-mail existed prior to networked PCs. But networked PCs made e-mail possible for most people. It was the killer app for networks.)

Each new platform sees new applications. Maybe the new platform is created to serve the new apps; maybe the new apps are created as a response to the new environment. I don't know the direction of causality, but I'm leaning towards the former. With cloud computing, expect to see new applications, things that don't work on PCs or on the current web. The apps will meet new needs and have advantages (and risks) beyond today's apps.

Sunday, July 19, 2009

Skating to the puck

I recently reviewed an internal study of competing technologies for a new, large-scale, Windows application. It was the typical survey of the possible platforms for the application, summarizing the strengths and weaknesses of each. This kind of analysis has been done hundreds (or thousands, or tens of thousands) of times by companies and organizations around the world and back to the dawn of computing. The thinking is: Before we set out on this project, let's review the possible technologies and pick the best one.

As I was reading the study, I realized that the study was wrong.

Not wrong in the sense of improperly evaluating technologies, or wrong in the sense that the authors ignored a possible platform. They had included the major platforms and listed the strengths and weaknesses in an unbiased presentation.

It was wrong in the sense of time. The report had the wrong tense. It looked at the present capabilities of the platforms. It should be looking at the future.

The project is a long term project. The development is planned for five years, with a lifetime of ten years after that development. (That's a simplified version of the plan. There will be releases in the five year development phase, and enhancements and maintenance during the ten-year follow-on phase.)

For such a project, one needs a view of the future, not a view of the present. Or, as Wayne Gretzky learned from his father: "skate to where the puck is going to be, not to where it has been."

The study looked at the major technologies (.NET, Java, Silverlight, Air, and Flash) and reviewed their current capabilities. The authors made no projections of the possible futures for these platforms.

I understand that predictions are hard. (Especially predictions about the future.) But the majority of the development effort will be made in the future, from two to three years out, and continuing for a decade. Looking at the current state of technologies and deciding the next fifteen years on them is insufficient. You have to look at where each technology is going.

This study was part of a new development effort, to replace an existing product that was difficult to maintain. The existing product was about ten years old, with parts going back fifteen years. The difficulties were due to the technology and design decisions that had been made at the inception of the project and during its life, some only a couple of years ago.

This team is embarking on a repetition of their previous development effort. Instead of creating a long-lasting design on robust technology, they are building a system that, in a few years, will be difficult to maintain.

Pucks move and technology changes. If you always skate to where the puck currently is, you will always be behind.

Tuesday, July 14, 2009

Not with a bang

Has the Age of Windows passed? I think it has. Not only that, I think the age of the web application is passing. We're now entering the age of the smartphone app.

In the past, transitions from one technology to another have been sharp and well-defined. When IBM released the first PC (the model 5150) companies jumped onto the PC-DOS bandwagon. Products were either ported from CP/M to PC-DOS or made for PC-DOS with no attempt at compatibility with the older systems. (A few perhaps, but the vast majority of applications were made for PC-DOS and the IBM PC.)

When Microsoft introduced Windows 3.1, manufacturers climbed onto the bandwagon and created Windows applications and abandoned MS-DOS. Windows 3.1 was a "windows thing" on top of the "DOS thing", so the old MS-DOS applications still ran, but all new applications were for Windows.

(I'm ignoring certain technologies, such as OS/2, CP/M-86, and the UCSD p-System. I'm also ignoring Macintosh, although I suspect that the Apple II/Macintosh transition was also fairly swift.)

Back to Windows. Since the rise of Windows, we've had one major transition and we're in the midst of another major transition. The first was the shift from Windows (or client/server) applications to web applications. The second, occurring now, is from Windows and desktop web applications to mobile web applications.

The shift from client/server to web app occurred slowly. There was no "killer app" for the web, no counterpart to Lotus 1-2-3 that pulled people to PCs or network support that pulled people to Windows 3.1. The transition was much slower. (One could argue that the killer app for the web was Google, or YouTube, but it is a difficult case.)

Back to Windows. I think that the Age of Windows is over. Think about it: all new applications are designed for either the web or a mobile phone (usually the iPhone).

Don't believe me? Try this test: Name a major commercial application designed for Windows that has been released in the past year. Not applications that run in a browser, but on Windows (and only on Windows). I was going to rule out applications from Microsoft, but I cannot think of new applications for Windows, even from them. (New versions of products don't count.)

I cannot think of any new applications. I can think of new applications for the web (Facebook, Twitter, DOPPLR, etc.) but these live in the browser, and none of them are tied to Internet Explorer. The iPhone has oodles of new applications, including games such as Labyrinth and the Ocarina player. I don't know of anything significant in the commercial space that is specific to the iPhone, but I suspect that it is coming.

But it's more than just a move away from Windows. The market has moved, quietly, from Windows to web apps. Windows applications join their older cousins written in COBOL for "big iron" in the maintenance yard. That's old news.

The market is moving again.

As I see it, Facebook is the last major desktop web application. By "desktop web application" I mean an application that was designed for a web browser on a desktop PC. Facebook was certainly designed that way, with the iPhone version as an afterthought.

Twitter, on the other hand, was designed out of the gate as an iPhone application, with the Windows client as a concession to the technical laggards.

The creativity has shifted to the smart phone. New applications will be made for the iPhone and other smart phones. The talented developers are thinking about smartphones, not desktop web, and certainly not Windows. Older platforms such as the desktop web and Windows are now "mature" platforms. Their applications will be maintained, but the platforms get nothing new. The new applications will be designed for smartphones. A few apps may carry over from the smartphone, but it will be difficult: smartphones have mobility, position awareness, and a degree of intimacy not available to desktop users.

Not with a bang, but a whimper, does the curtain fall on Windows. And desktop web applications.

Sunday, July 12, 2009

Upper bounds

Sometimes, our environment limits us. Sometimes our physical capabilities limit us. It's good to know which, because a limit in one can reduce the usefulness of plentitude in the other.

For example, our visual system limits our ability to channel surf on a cable network. These limits create an upper bound on the number of channels that we can use on a cable network.

Why? Because we can surf only so fast, and therefore surf a finite number of stations in a given period of time. Let's assume that I can surf from channel to channel, spending one half second on each channel. If I start at channel 2 and work my way upwards, it will take me some amount of time to reach the end and "wrap" back to channel 2. With twenty channels, it takes ten seconds. With two hundred channels, almost two minutes. With a thousand channels, more than eight minutes.

If we had ten thousand channels, it would take well over an hour to surf them. By the time we decided on a channel, the program would be long over.
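
The arithmetic, for anyone who wants to check it (a half second per channel, as assumed above):

    SECONDS_PER_CHANNEL = 0.5

    for channels in (20, 200, 1000, 10000):
        seconds = channels * SECONDS_PER_CHANNEL
        print("%6d channels: %6.0f seconds (%5.1f minutes)" % (channels, seconds, seconds / 60))

    # 20 channels -> 10 seconds; 200 -> 100 seconds;
    # 1,000 -> about 8 minutes; 10,000 -> more than 80 minutes.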

Here's the interesting observation: With a thousand channels, by the time one surfs the entire collection, more than a quarter of the original program (if a thirty-minute show) has already passed. We have spent too much time surfing and not enough time watching. Our physical capabilities (the ability to process a video signal and decide to stay or go) create an upper limit to the number of channels.

That limit holds for the strategy of surfing. If we use a different strategy (perhaps looking for specific programs or types of programs, or using an index, or relying on a TIVO-like prediction system) then we can use a larger cable network.

This effect comes into play with a lot of things, not just cable television. The web is a large collection of channels. The applications on the Apple iTunes store are a large collection. Books in a bookstore. Videos on Youtube (or Hulu).

Books and web videos don't have the time-limit of television programs, yet we all have finite time, finite resources to spend on viewing, listening, processing, or reading. We are all constrained to make choices in finite time.

The trick is knowing our limits.

Sunday, July 5, 2009

Revisiting Babel

Is it possible to move faster by going slower? The concept defies our intuition, yet such a thing may be possible.

For example, on one recent project a team made significant progress. The details were related to me at the Open Source Bridge 2009 conference. I won't go into them here, as they are not important. The important point is that the team completed its task, faster than expected, and under budget.

This progress was surprising as the different team members came from different countries and spoke different languages. They all spoke English, and used it as the common language for the project, but none were fluent in it.

How could such a team make any progress, never mind rapid progress?

The speculation is that since English was a second language, team members spent more than the usual amount of time listening to other members. Processing a second language is often harder; one must pay more attention and often ask for clarification.

The simple acts of listening and asking for clarification may have made the difference.

Tuesday, June 30, 2009

Old tech fails in interesting ways

During a recent visit to a local (and well-known) hospital, I happened to spy an old LA-75 printer. These were made and sold in the mid 1980s. They were considered the low-end printer, selling for $700 or so. (The letter-quality printers went for quite a bit more.)

Hospitals are health care providers, and therefore fall under HIPAA rules. HIPAA is very specific about security of patient records.

I wonder if hospitals know that dot-matrix printers are not secure? That is, there is an attack against dot-matrix printers.

By carefully recording and analyzing the sounds made by the printer, one can reproduce the text printed. The attack is called the "Acoustic Side Channel Attack". I suspect that it can be done by simply placing an Apple iPod into "record" mode and sitting it in the same room as the printer. Later, by analyzing the sound (even with background noise), one can identify the text printed during the day.

You can see more details here.