Wednesday, November 30, 2011

Is "cheap and easy" a good thing?

In the IT industry, we are all about developing (and adopting) new techniques. These techniques often start as manual processes: slow, expensive, and unreliable. We automate them, and eventually the processes become cheap and easy. One would think that this path is a good thing.

But there is a dark spot.

Consider two aspects of software development: backups and version control.

More often than I like, I encounter projects that do not use a version control system. And many times, I encounter shops that have no process for creating backup copies of data.

In the early days of PCs, backups were expensive and consumed time and resources. The history of version control systems is similar. The earliest (primitive) systems were followed by (expensive) commercial solutions (that also consumed time and resources).

But the early objections to backups and version control no longer hold. There are solutions that are freely available, easy to use, easy to administer, and mostly automatic. Disk space and network connections are plentiful.

These solutions do require some effort and some administration. Nothing is completely free, or completely automatic. But the costs are significantly less than they were.
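To give a sense of how small that cost has become, here is a minimal sketch of an automated backup in Python, assuming hypothetical source and destination paths (a real shop would point it at its own data and schedule it with whatever job scheduler it already runs):

```python
import shutil
from datetime import datetime
from pathlib import Path

# Hypothetical paths for illustration; point these at real locations.
SOURCE = Path("projects/important-data")
BACKUP_DIR = Path("backups")

def nightly_backup() -> Path:
    """Create a timestamped zip archive of SOURCE inside BACKUP_DIR."""
    BACKUP_DIR.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = shutil.make_archive(
        str(BACKUP_DIR / f"important-data-{stamp}"),  # archive name (without extension)
        "zip",                                        # archive format
        SOURCE,                                       # directory to archive
    )
    return Path(archive)

if __name__ == "__main__":
    print(f"Backup written to {nightly_backup()}")
```

Scheduled to run nightly, a script like this is about as close to "free and automatic" as a backup gets.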

The resistance to version control is, then, only in the mindset of the project manager (or chief programmer, or architect, or whoever is running the show). If a project is not using version control, it's because the project manager thinks that not using version control will be faster (or cheaper, or better) than using it. If a shop is not making backup copies of important data, it's because the manager thinks that not making backups is cheaper than making them.

It is not enough for a solution to be cheap and easy. A solution has to be recognized as cheap and easy, and recognized as the right thing to do. The problem facing "infrastructure" items like backups and version control is that as they become cheap and easy, they also fade into the background. Solutions that "run themselves" require little in the way of attention from managers, who rightfully focus their efforts on running the business.

When solutions become cheap and easy (and reliable), they fall off of managers' radar. I suspect that few magazine articles talk about backup systems. (The ones that do probably discuss compliance with regulations for specific environments.) Today's articles on version control talk about the benefits of the new technologies (distributed version control systems), not the necessity of version control.

So here is the fading pain effect: We start with a need. We develop solutions, and make those tasks easier and more reliable, and we reduce the pain. As the pain is reduced, the visibility of the tasks drops. As the visibility drops, the importance assigned by managers drops. As the importance drops, fewer resources are assigned to the task. Resources are allocated to other, bigger pains. ("The squeaky wheel gets the grease.")

Beyond that, there seems to be a "window of awareness" for technical infrastructure solutions. When we invent techniques (version control, for example), there is a certain level of discussion and awareness of the techniques. As we improve the tools, the discussions become fewer, and at some point they live only in obscure corners of web forums. Shops that have adopted the techniques continue to use them, but shops that did not adopt the techniques have little impetus to adopt them, since they (the solutions) are no longer discussed.

So if you're a shop and you're "muddling through" with a manual solution (or no solution), you eventually stop getting the message that there are automated solutions. At this point, it is likely that you will never adopt the technology.

And this is why I think that "cheap and easy" may not always be a good thing.

Saturday, November 19, 2011

Programming and the fullness of time

Sometimes when writing code, the right thing to do is to wait for the technology to improve.

On a previous project, we had the challenge of making the programs run faster (after the system had been constructed). Once the users saw the performance of the system, they wanted a faster version. It was a reasonable request; the performance was sluggish, though not horrible. The system was usable, it just wasn't "snappy".

So we set about devising a solution. We knew the code, and we knew that making the system faster would not be easy. Improved performance would require changing much of the code. Complicating the issue was the tool that we had used: a code-generation package that created a lot of the code for us. Once we started modifying the generated code, we could no longer use the generator. Or we could track all changes and apply them to later generated versions of the system. Neither path was appealing.

We debated various approaches, and the project management bureaucracy was such that we *could* debate various approaches without showing progress in code changes. That is, we could stall or "run the clock out".

It turned out that doing nothing was the right thing to do. We made no changes to the code; we simply waited for PCs to become faster, and the faster hardware solved the problem.

So now we come to Google's Chromebook, the portable PC with only a browser.

One of the criticisms against the Chromebook is the lack of off-line capabilities for Google Docs. This is a fair criticism; the Chromebook is useful only when connected to the internet, and internet connections are not available everywhere.

Yet an off-line mode for Google Docs may be the wrong solution to the problem. The cost of developing such a solution is not trivial. Google might invest several months (with multiple people) developing and testing the off-line mode.

But what if networking becomes ubiquitous? Or at least more available? If that were to happen, then the need for off-line processing is reduced (if not eliminated). The solution to "how do I process documents when I am not connected" is solved not by creating a new solution, but by waiting for the surrounding technology to improve.

Google has an interesting decision ahead of them. They can build the off-line capabilities into their Docs applications. (I suspect it would require a fair amount of Javascript and hacks for storing large sets of data.) Or they can do nothing and hope that network coverage improves. (By "do nothing", I mean work on other projects.)

These decisions are easy to review in hindsight; they are cloudy on the front end. If I were Google, I would be looking at the effort for off-line processing, the possible side benefits from that solution, and the rate at which network coverage improves. Right now, I see no clear "winning" choice, no obvious solution that is significantly better than the others. Which doesn't mean that Google should simply wait for network coverage to get better -- but neither should Google count that idea out.

Sunday, November 13, 2011

Programming languages exist to make programming easy

We create programming languages to make programming easy.

After the invention of the electronic computer, we invented FORTRAN and COBOL. Both languages made the act of programming easy. (Easier than assembly language, the only other game in town.) FORTRAN made it easy to perform numeric computations, and despite the horror of its input/output methods, it also made it easier to read and write numerical values. COBOL also made it easy to perform computations and input/output operations; it was slanted towards structured data (records containing fields) and readability (longer variable names, and verbose language keywords).

After the invention of time-sharing (and a shortage of skilled programmers), we invented BASIC, a teaching language that linguistically sat between FORTRAN and COBOL.

After the invention of minicomputers (and the ability for schools and research groups to purchase them), we invented the C language, which combined structured programming concepts from Algol and Pascal with the low-level access of assembly language. The combination allowed researchers to connect computers to laboratory equipment and write efficient programs for processing data.

After the invention of graphics terminals and PCs, we invented Microsoft Windows and the Visual Basic language to program applications in Windows. The earlier languages of C and C++ made programming in Windows possible, but Visual Basic was the language that made it easy.

After PCs became powerful enough, we invented Java, which leveraged that power to run interpreted byte-code programs and also (more significantly) to handle threaded applications. Support for threading was built into the Java language.

With the invention of networking, we created HTML and web browsers and Javascript.

I have left out equipment (microcomputers with CP/M, the Xerox Alto, the Apple Macintosh) and languages (LISP, RPG, C#, and others). I'm looking at the large trend using a few data points. If your favorite computer or language is missing, please forgive my arbitrary selections.

We create languages to make tasks easier for ourselves. As we develop new hardware, larger data sets, and new ways of connecting data and devices, we need new languages to handle the capabilities of these new inventions.

Looking forward, what can we see? What new hardware will stimulate the creation of new languages?

Cloud computing is big, and will lead to creative solutions. We're already seeing new languages that have increased rigor in the form of functional programming. We moved from non-structured programming to structured programming to object-oriented programming; in the future I expect us to move to functional programming. Functional programming is a good fit for cloud computing, with its immutable objects and no-side-effect functions.
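Here is a minimal sketch of why that fit is a good one, in Python rather than a functional language: because a pure function touches no shared state, its calls can be farmed out across processes (or machines) without coordination. The word_count function and sample documents are illustrative, not taken from any particular cloud framework.

```python
from concurrent.futures import ProcessPoolExecutor

def word_count(document: str) -> int:
    # Pure function: the result depends only on the input; no shared state is touched.
    return len(document.split())

if __name__ == "__main__":
    documents = ["the quick brown fox", "jumps over", "the lazy dog"]

    # Because word_count has no side effects, the runtime is free to run the
    # calls in separate processes, in any order, without locks or coordination.
    with ProcessPoolExecutor() as pool:
        counts = list(pool.map(word_count, documents))

    print(counts)  # [4, 2, 3]
```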

Mobile programming is popular, but I don't expect a new language aimed at mobile apps as such. Instead, I expect new languages shaped by the mobile devices themselves. The Java, C#, and Objective-C languages (from Google, Microsoft, and Apple, respectively) will mutate into languages better suited to small, mobile devices that must run applications in a secure manner. I expect that security, not performance, will be the driver for change.

Big data is on the rise. We'll see new languages to handle the collection, synchronization, querying, and analysis of large data sets. The language 'Processing' is a start in that direction, letting us render data in a visual form. The invention of NoSQL databases is also a start; look for a 'NoSQL standard' language (or possibly several).

The new languages will allow us to handle new challenges. But that doesn't mean that the old languages will go away. Those languages were designed to handle specific challenges, and they handle them well. So well that new languages have not displaced them. (Billing systems are still in COBOL, scientists still use Fortran, and lots of Microsoft Windows applications are still running in Visual Basic.) New languages are optimized for different criteria and cannot always handle the older tasks; I would not want to write a billing system in C, for example.

As the 'space' of our challenges expands, we invent languages to fill that space. Let's invent some languages and meet some new challenges!

Thursday, November 10, 2011

The insignificance of significant figures in programming languages

If a city with a population figure of 500,000 gains three more residents, the population figure is... 500,000, not 500,003. The reasoning is this: the original figure was accurate only to the first digit (the hundred-thousands digit). It has a finite precision, and adding a number smaller than that precision has no effect on the original number.

The concept of significant figures is not the same as "number of decimal places", although many people confuse the two.

Significant figures are needed for calculations with measured quantities. Measurements will have some degree of imprecision, and the rigor of significant figures keeps our calculations honest. The rules for significant figures are more complex (and subtle) than a simple "use 3 decimal places". The number of decimal places will vary, and some calculations may affect positions to the left of the decimal point. (As in our "city with 500,000 residents" example.)

For a better description of significant figures, see the Wikipedia page.
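To make the rule concrete, here is a hypothetical sketch (not the Math::SigFigs API mentioned below) of rounding a result to a given number of significant figures. Note that the rounding position can fall to the left of the decimal point, as in the city example.

```python
import math

def round_to_sig_figs(value, sig_figs):
    """Round value to the given number of significant figures."""
    if value == 0:
        return value
    # Position of the most significant digit (e.g. 5 for 500,003).
    exponent = math.floor(math.log10(abs(value)))
    # Decimal places to keep; negative means rounding left of the decimal point.
    decimals = sig_figs - 1 - exponent
    return round(value, decimals)

# The city example: 500,000 is accurate to one significant figure,
# so adding three residents does not change the reported figure.
print(round_to_sig_figs(500000 + 3, 1))   # 500000
print(round_to_sig_figs(0.012345, 3))     # 0.0123
```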

Applications such as Microsoft Excel (or LibreOffice Calc) have no built-in support for significant figures. Nor, to my knowledge, are there plug-ins or extensions to support calculations with significant figures.

Perhaps the lack of support for significant figures is caused by a lack of demand. Most spreadsheets are built to handle money, which is counted (not measured) and therefore does not fall under the domain of significant figures. (Monetary figures are considered to be exact, in most applications.)

Perhaps the lack of support is driven by the confusion between "decimal places" and "significant figures".

But perhaps the biggest reason for a lack of support for significant figures in applications is this: There is no support for significant figures in popular languages.

A quick search for C++, Java, Python, and Ruby yields no corresponding packages. Interestingly, the only language with a package for significant figures was Perl: CPAN has the Math::SigFigs package.

So the better question is: Why do programming languages have no support for significant figures?

Tuesday, November 8, 2011

Advertisements in the brave new world of Swipeville

We've seen the different types of ads in Webland: banner ads, side-bar ads, in-text ads, pop-up ads, pop-under ads... all of the types.

My favorites are the "your page is loading" ads, which block the (completely loaded, who do you think you are kidding) content page. I like them because, with a multi-tab browser, I can view a different page while the ad-covered page is "loading" and the advertisement times out. With multiple tabs, I can avoid the delay and essentially skip the advertisement.

This all changes in the world of phones and tablets. (I call this new world "Swipeville".) Classic desktop browsers gave us tabbed windows; the new platforms do not. The phone and tablet browsers have one window and one "tab", much like the early desktop browsers.

In this world, we cannot escape advertisements by using multiple tabs. Nor can we look at another window (such as another application) while our page "loads". Since apps own the entire screen, they are either running or not -- switching to another app means that the browser stops, and switching back re-runs the page load/draw operation.

Which means that advertisements will be less avoidable, and therefore (possibly) more effective.

Or they may be less effective; the psychology of cell phone ads is, I think, poorly understood. Regardless of effectiveness, we will be seeing more of them.

Sunday, November 6, 2011

Picking a programming language

In the great, ongoing debate of language superiority, many factors are considered ... and brandished. The discussions of languages are sometimes heated. My purpose here is to provide some musings in a cool light.

The popular languages of the day are (in order provided by Tiobe Software): Java, C, C++, PHP, C#, Objective C, Visual Basic, Python, Perl, and Javascript.

But instead of arguing about the sequence of these languages (or even other candidates for inclusion), let's look at the attributes that make languages popular. Here's a list of some considerations:

  • Platform: which platforms (Windows, OSX, iOS, Android, Linux) support the language
  • Performance: how well the programs perform at run-time (whether compiled or interpreted)
  • Readability: how well programs written by programmers can be read by other programmers
  • Reliability: how consistently the written programs perform
  • Cost: here I mean direct costs: the cost of the compiler and tools (and ongoing costs for support and licenses)
  • Market support: how much support is available from vendors, groups, and developers

How well do languages match these criteria? Let's try some free association.

For performance, the language of choice is C++. Some might argue that Objective-C provides better performance, but I think that argument would come only from developers on the OSX and iOS platforms.

Readability is a difficult notion, and subject to a lot of, well, subjectivity. My impression is that most programmers will claim that their favorite language is eminently readable, if only one takes the time to learn it. To get around this bias, I propose that most programmers would pick Python as second-best in readability (after their own favorite), and so I choose it as the most readable language.

I submit that reliability among languages is a neutral item. Compilers and interpreters for all of these languages are quite good, and programs perform -- for the most part -- consistently.

For cost, all of these languages are available in no-cost options. There are commercial versions for C# (Microsoft's Visual Studio) and Objective-C (Apple's developer kit), and one would think that such costs would give a boost to the other languages. They do, but cost alone is not enough to "make" or "break" a language. Which brings us to market support.

The support of Microsoft and Apple for C# and Objective-C makes those languages appealing. The Microsoft tools have a lot of followers: companies that specify them as standards and developers who know and keep active in the C# language.

Peering into the future, what can we see?

I think that the Perl/Python tussle will end up going to Python. Right now, Perl has better market support: the CPAN libraries and lots of developers. These factors can change, and are changing. O'Reilly has been printing (and selling) lots of books on Python. People have been starting projects in Python. In contrast, Perl loses on readability, something that is hard to change.

The Java/C# tussle is more about market support and less about readability and performance. These languages are about the same in readability, performance, and reliability. Microsoft has made C# the prince of languages for development in Windows; we need to see what Oracle will do with Java.

Apple had designated Objective-C, C, and C++ as the only languages suitable for iOS, but is relaxing its rules. I expect some change in the popularity of iOS programming languages.

But what about those other popular languages, the ones I have not mentioned? What about C, Visual Basic, PHP, and Javascript? Each has its fanbase (companies and developers), and each has a fair rating in performance, reliability, and market support.

I expect that Javascript will become more popular, continuing the current trend. The others I think will fade gradually. Expect to see less market support (fewer books, fewer updates to tools) and more conversion projects (from Visual Basic to C#, for example). But also expect a long life from these languages. The old languages of Fortran and COBOL are still with us.

Which language you pick for your project is a choice that you should make consciously. You must weigh many factors -- more than are listed here -- and live with the consequences of that decision. I encourage you to think of these factors, think of other factors, and discuss them with your colleagues.

Tuesday, November 1, 2011

Mobile first, desktop second (maybe)

Mobile computing has arrived, and is no longer a second-class citizen. In fact, it is the desktop that may be the second-class application.

A long time ago, desktop applications were the only game in town. Then mobile arrived, and it was granted a small presence: usually m.whatever.com, with some custom scripts to generate a limited set of web pages.

Now, the mobile app is the leader. If you are starting a project, start with mobile, and if you have time, build the "plain" version later. Focus on your customers; for new apps, customers are on mobile devices: iPhones, iPads, Android phones, and tablets. You can add the desktop browser version later, after you get the core running.