Showing posts with label complexity. Show all posts

Friday, June 21, 2019

The complexity of programming languages

A recent project saw me examining and tokenizing code for different programming languages. The languages ranged from old languages (COBOL and FORTRAN, among others) to modern languages (Python and Go, among others). It was an interesting project, and I learned quite a bit about many different languages. (By 'tokenize', I mean to identify the type of each item in a program: Variables, identifiers, function names, operators, etc. I was not parsing the code, or building an abstract syntax tree, or compiling the code into op-codes. Tokenizing is the first step of compiling, but a far cry from actually compiling the code.)
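As a rough illustration (a hypothetical sketch, not the project's actual code), a tokenizer classifies each item in a line by trying a set of patterns in order:

```python
import re

# Illustrative token categories and patterns -- far from exhaustive.
TOKEN_PATTERNS = [
    ("number",     r"\d+(\.\d+)?"),
    ("identifier", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("operator",   r"[+\-*/=<>]+"),
    ("separator",  r"[(),;]"),
]

def tokenize(line):
    """Return a list of (token_type, text) pairs for one line of code."""
    tokens = []
    pos = 0
    while pos < len(line):
        if line[pos].isspace():
            pos += 1
            continue
        for name, pattern in TOKEN_PATTERNS:
            match = re.match(pattern, line[pos:])
            if match:
                tokens.append((name, match.group(0)))
                pos += len(match.group(0))
                break
        else:
            pos += 1  # skip any character no pattern matches
    return tokens
```

For example, `tokenize("x = y + 42")` classifies each item without building any syntax tree -- which is exactly the distinction between tokenizing and parsing.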

One surprising result: newer languages are easier to tokenize than older languages. Python is easier to tokenize than COBOL, and Go is easier to tokenize than FORTRAN.

This is counterintuitive. One would think that older languages would be primitive (and therefore easy to tokenize) and modern languages sophisticated (and therefore difficult to tokenize). Yet my experience shows the opposite.

Why would this be? I can think of two -- no, three -- reasons.

First, the old languages (COBOL, FORTRAN, and PL/I) were designed in the age of punch cards, and punch cards impose limits on source code. COBOL, FORTRAN, and PL/I have few things in common, but one thing they do share is line layout and the 'identification' field in columns 73 through 80.

When your program is stored on punch cards, there is a risk that someone will drop the deck and the cards will fall out of order. That cannot happen with programs stored in disk files, but with punch cards it was a real risk. To recover from such an event, the right-most columns were reserved for identification: a code, unique to each line, that would let a card sorter machine (there were such things) put the cards back into their proper order.

The need for an identification column is tied to the punch card medium, yet it became part of each language standard. COBOL, FORTRAN, and PL/I standards all refer to columns 73 through 80 as reserved for identification; they could not be used for "real" source code. Programs transferred from punch cards to disk files (when disks became available to programmers) kept the rule for the identification field -- probably to make conversion easy. Later versions of the languages did drop the rule, but the damage had been done. The identification field was part of the language specification.

As part of the language specification, I had to tokenize the identification numbers. Mostly they were not a problem -- just another "thing" to tokenize -- but sometimes they occurred in the middle of a string literal or a comment, which made for awkward situations.
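One practical approach (a sketch, assuming fixed-format source with the identification field in columns 73 through 80) is to strip those columns from each line before tokenizing, so that sequence numbers never collide with string literals or comments:

```python
def strip_identification(line):
    """Remove the identification field (columns 73-80) from a
    fixed-format source line, as used by COBOL and FORTRAN.
    Columns are 1-based in the standards, so the field is the
    characters past index 72 in 0-based Python terms."""
    return line[:72].rstrip()

# A FORTRAN-style line padded to 80 columns with a sequence number:
card = "      X = X + 1".ljust(72) + "SEQ00010"
print(strip_identification(card))  # → "      X = X + 1"
```

With the identification field removed up front, the tokenizer itself never has to know the punch-card rules.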

Anyway, the tokenization of old languages has its challenges.

New languages don't suffer from such problems. Their source code was never stored on punch cards, and they never had identification fields -- within string literals or anywhere else.

Tokenizing modern languages is also easier because they have fewer token types. Each language has a set of token types, but older languages have a larger, more varied set. Most languages have identifiers, numeric literals, and operators; COBOL also has picture values and level indicators, and PL/I has attributes and conditions (among other token types).

Which brings me to the second reason for modern languages to have simpler tokenizing requirements: The languages are designed to be easy to tokenize.

It seems to me that, intentionally or not, the designers of modern languages have made design choices that reduce the work for tokenizers. They have built languages that are easy to tokenize, and therefore have simple logic for tokenizers. (All compilers and interpreters have tokenizers; it is a step in converting the source to executable bytes.)

So maybe the simplicity of language tokenization is the result of the "laziness" of language designers.

But I have a third reason, one that I believe is the true reason for the simplicity of modern language tokenizers.

Modern languages are easy to tokenize because they are easy to read (by humans).

A language that is easy to read (for a human) is also easy to tokenize. Language designers have been consciously designing languages to be easy to read. (Python is the leading example, but all designers claim their language is "easy to read".)

Languages that are easy to read are easy to tokenize. It's that simple. We've been designing languages for humans, and as a side effect we have made them easy for computers.

I, for one, welcome the change. Not only does it make my job easier (tokenizing all of those languages) but it makes every developer's job easier (reading code from other developers and writing new code).

So I say three cheers for simple* programming languages!

* Simple does not imply weak. A simple programming language may be easy to understand, yet it may also be powerful. The combination of the two is the real benefit here.

Wednesday, April 10, 2019

Program language and program size

Can programs be "too big"? Does it depend on the language?

In the 1990s, the two popular programming languages from Microsoft were Visual Basic and Visual C++. (Microsoft also offered Fortran and an assembler, and I think COBOL, but they were used rarely.)

I used both Visual Basic and Visual C++. With Visual Basic it was easy to create a Windows application, but the applications in Visual Basic were limited. You could not, for example, launch a modal dialog from within a modal dialog. Visual C++ was much more capable; you had the entire Windows API available to you. But the construction of Visual C++ applications took more time and effort. A simple Visual Basic application could be "up and running" in a minute. The simplest Visual C++ application took at least twenty minutes. Applications with dialogs took quite a bit of time in Visual C++.

Visual Basic was better for small applications. They could be written quickly, and changed quickly. Visual C++ was better for large applications. Larger applications required more design and coding (and more testing) but could handle more complex tasks. Also, the performance benefits of C++ were only obtained for large applications.

(I will note that Microsoft has improved the experience since those early days of Windows programming. The .NET framework has made a large difference. Microsoft has also improved the dialog editors and other tools in what is now called Visual Studio.)

That early Windows experience got me thinking: are some languages better at small programs, and other languages better at large programs? Small programs written in languages that require a lot of code (verbose languages) have a disadvantage because of the extra work. Visual C++ was a verbose language; Visual Basic was not -- or was less verbose. Other languages weigh in at different points on the scale of verbosity.

Consider a "word count" program. (That is, a program to count the words in a file.) Different languages require different amounts of code. At the small-program end of the scale we have languages such as AWK and Perl. At the large end of the scale we have COBOL.

(I am considering lines of code here, and not executable size or the size of libraries. I don't count run-time environments or byte-code engines.)

I would much rather write (and maintain) the word-count program in AWK or Perl (or Ruby or Python). Not because these languages are modern, but because the program itself is small. (Trivial, actually.) The program in COBOL is large; COBOL has some string-handling functions (but not many) and it requires a fair amount of overhead to define the program. A COBOL program is long, by design. The COBOL language is a verbose language.
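For comparison, the core of the word-count program in a concise language is essentially one line (Python here, in the same spirit as the AWK or Perl versions):

```python
from collections import Counter

def count_words(text):
    """Count occurrences of each whitespace-separated word."""
    return Counter(text.split())

# The entire "word count" logic is the one line above.
print(count_words("the cat and the hat").most_common(1))  # → [('the', 2)]
```

The equivalent COBOL program needs its IDENTIFICATION, ENVIRONMENT, DATA, and PROCEDURE divisions before a single word is counted -- which is the verbosity being described.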

Thus, there is an incentive to build small programs in certain languages. (I should probably say that there is an incentive to build certain programs in certain languages.)

But that is on the small end of the scale of programs. What about the other end? Is there an incentive to build large programs in certain languages?

I believe that the answer is yes. Just as some languages are good for small programs, other languages are good for large programs. The languages that are good for large programs have structures and constructs which help us humans manage and understand the code in large scale.

Over the years, we have developed several techniques we use to manage source code. They include:

  • Multiple source files (#include files, copybooks, separate compiled files in a project, etc.)
  • A library of subroutines and functions (the "standard library")
  • A repository of libraries (CPAN, CRAN, gems, etc.)
  • The ability to define subroutines
  • The ability to define functions
  • Object-oriented programming (the ability to define types)
  • The ability to define interfaces
  • Mix-in fragments of classes
  • Lambdas and closures

These techniques help us by partitioning the code. We can "lump" and "split" the code into different subroutines, functions, modules, classes, and contexts. We can define rules to limit the information that is allowed to flow between the multiple "lumps" of a system. Limiting the flow of information simplifies the task of programming (or debugging, or documenting) a system.
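The value of these techniques shows up even in a tiny example (a hypothetical sketch): a class that exposes a narrow public interface and hides its internal state, limiting the information that can flow in and out of one "lump" of the system:

```python
class Account:
    """A 'lump' of code with a narrow interface: callers can deposit
    and read the balance, but cannot touch the internal ledger."""

    def __init__(self):
        self._ledger = []  # internal detail, hidden from callers

    def deposit(self, amount):
        """The only way information flows in -- and it is validated."""
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._ledger.append(amount)

    @property
    def balance(self):
        """The only way information flows out."""
        return sum(self._ledger)
```

A caller can use `deposit()` and `balance` and nothing else; when debugging a balance problem, only this small class needs to be examined.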

Is there a point when a program is simply "too big" for a language?

I think there are two concepts lurking in that question. The first is a relative answer, and the second is an absolute answer.

Let's start with a hypothetical example. A mind experiment, if you will.

Let's imagine a program. It can be any program, but it is small and simple. (Perhaps it is "Hello, world!") Let's pick a language for our program. As the program is small, let's pick a language that is good for small programs. (It could be Visual Basic or AWK.)

Let's continue our experiment by increasing the size of our program. As this was a hypothetical program, we can easily expand it. (We don't have to write the actual code -- we simply expand the code in our mind.)

Now, keeping our program in mind, and remembering our initial choice of a programming language, let us consider other languages. Is there a point when we would like to switch from our chosen programming language to another language?

The relative answer applies to a language when compared to a different language. In my earlier example, I compared Visual Basic with Visual C++. Visual Basic was better for small programs, Visual C++ for large programs.

The exact point of change is not clear. It wasn't clear in the early days of Windows programming, either. But there must be a crossover point, where the situation changes from "better in Visual Basic" to "better in Visual C++".

The two languages don't have to be Visual Basic and Visual C++. They could be any pair. One could compare COBOL and assembler, or Java and Perl, or Go and Ruby. Each pair has its own crossover point, but the crossover point is there. Each pair of languages has a point in which it is better to select the more verbose language, because of its capabilities at managing large code.

That's the relative case, which considers two languages and picks the better of the two. Then there is the absolute case, which considers only one language.

For the absolute case, the question is not "Which is the better language for a given program?", but "Should we write a program in a given language?". That is, there may be some programs which are too large, too complex, too difficult to write in a specific programming language.

Well-informed readers will be aware that a program written in a language that is "Turing complete" can be translated into any other programming language that is also "Turing complete". That is not the point. The question is not "Can this program be written in a given language?" but "Should this program be written in a given language?".

That is a much subtler question, and much more subjective. I may consider a program "too big" for language X while another might consider it within bounds. I don't have metrics for such a decision -- and even if I did, one could argue that my cutoff point (a complexity value of 2000, say) is arbitrary and the better value is somewhat higher (perhaps 2750). One might argue that a more talented team can handle programs that are larger and more complex.

Someday we may have agreed-upon metrics, and someday we may have agreed-upon cutoff values. Someday we may be able to run our program through a tool for analysis, one that computes the complexity and compares the result to our cut-off values. Such a tool would be an impartial judge for the suitability of the programming language for our task. (Assuming that we write programs that are efficient and correct in the given programming language.)

Someday we may have all of that, and the discipline to discard (or re-design) programs that exceed the boundaries.

But we don't have that today.

Tuesday, March 19, 2019

C++ gets serious

I'm worried that C++ is getting too ... complicated.

I am not worried that C++ is a dead language. It is not. The C++ standards committee has adopted several changes over the years, releasing new C++ standards: C++11, C++14, and most recently C++17, with C++20 in progress. Compiler vendors are implementing the new standards. (Microsoft has done an admirable job in the latest versions of its C++ compiler.)

But the changes are impressive -- and intimidating. Even the names of the changes are daunting:
  • contracts, with preconditions and postconditions
  • concepts
  • transactional memory
  • ranges
  • networking
  • modules
  • concurrency
  • coroutines
  • reflection
  • spaceship operator
Most of these do not mean what you think they mean (unless you have been reading the proposed standards). The spaceship operator will be familiar to anyone who has worked in Perl or Ruby. The rest may sound familiar, but each is quite specific in its design and use, and probably very different from your first guess.

Here is an example of the range-based 'for' loop, which simplifies the common "iterate over a collection" loop:

int array[5] = { 1, 2, 3, 4, 5 };
for (int& x : array)
    x *= 2;

This is a nice improvement. Notice that it does not require writing explicit STL iterators; this is plain C++ code.

Somewhat more complex is an implementation of the spaceship operator:

template <typename T, typename U>
struct pair {
  T t;
  U u;

  auto operator<=> (pair const& rhs) const
    -> std::common_comparison_category_t<
         decltype(std::compare_3way(t, rhs.t)),
         decltype(std::compare_3way(u, rhs.u))>
  {
    if (auto cmp = std::compare_3way(t, rhs.t); cmp != 0)
        return cmp;

    return std::compare_3way(u, rhs.u);
  }
};

That code seems... not so obvious.

The non-obviousness of code doesn't end there.

Consider passing a function 'foo' to a standard algorithm, first for simple value types and then in a fully generic form (value and reference types).

For simple value types, we can write the following code:

std::for_each(vi.begin(), vi.end(), [](auto x) { return foo(x); });

The most generic form:

#define LIFT(foo) \
  [](auto&&... x) \
    noexcept(noexcept(foo(std::forward<decltype(x)>(x)...))) \
    -> decltype(foo(std::forward<decltype(x)>(x)...)) \
  { return foo(std::forward<decltype(x)>(x)...); }

I will let you ponder that bit of "trivial" code.

Notice that the last example uses the #define macro to do its work, with '\' characters to continue the macro across multiple lines.

* * *

I have been pondering that code (and more) for some time.

- C++ is becoming more capable, but also more complex. It is now far from the "C with Classes" that was the start of C++.

- C++ is not obsolete, but it is now a language for applications with specific needs. C++ does offer fine control over memory management and can provide predictable run-time performance, which are advantages for embedded applications. But if you don't need the specific advantages of C++, I see little reason to invest the extra effort to learn and maintain it.

- Development work will favor other languages, mostly Java, C#, Python, JavaScript, and Go. Java and C# have become the "first choice" languages for business applications; Python has become the "first choice" for one's first language. The new features of C++, while useful for specific applications, will probably discourage the average programmer. I'm not expecting schools to teach C++ as a first language again -- ever.

- There will remain a lot of C++ code, but C++'s share of "the world of code" will become smaller. Some of this is due to systems being written in other languages. But I'm willing to bet that the total lines of code for C++ (if we could measure it) is shrinking in absolute numbers.

All of this means that C++ development will become more expensive.

There will be fewer C++ programmers. C++ is not the language taught in schools (usually) and it is not the language taught in the "intro to programming" courses. People will not learn C++ as a matter of course; only those who really want to learn it will make the effort.

C++ will be limited to the projects that need the features of C++, projects which are larger and more complex. Projects that are "simple" and "average" will use other languages. It will be the complicated projects, the projects that need high performance, the projects that need well-defined (and predictable) memory management which will use C++.

C++ will continue as a language. It will be used on the high end projects, with specific requirements and high performance. The programmers who know C++ will have to know how to work on those projects -- amateurs and dabblers will not be welcome. If you are managing projects, and you want to stay with C++, be prepared to hunt for talent and be prepared to pay.

Wednesday, January 24, 2018

Cloud computing is repeating history

A note to readers: This post is a bit of a rant, driven by emotion. My 'code stat' project, hosted on Microsoft Azure's web app PaaS platform, has failed and I have yet to find a resolution.

Something has changed in Azure, and I can no longer deploy a new version to the production servers. My code works; I can test it locally. Something in the deployment sequence fails. This is a test project, using the free level of Azure, which means no monthly costs but also means no support -- other than the community help pages.

There are a few glorious advances in IT, advances which stand out above the others. They include the PC revolution (which saw individuals purchasing and using computers), the GUI (which saw people untrained in computer science using computers), and the smartphone (which saw lots more people using computers for lots more sophisticated tasks).

The PC revolution was a big change. Prior to personal computers (whether they were IBM PCs, Apple IIs, or Commodore 64s), computers were large, expensive, and complicated; they were especially difficult to administer. Mainframes and even minicomputers were so costly that an individual could afford one only if he were enormously wealthy -- and had lots of time to read manuals and try different configurations to make the thing work.

The consumer PCs changed all of that. They were expensive, but within the range of the middle class. They required little or no administration effort. (The Commodore 64 was especially easy: plug it in, attach to a television, and turn it on.)

Apple made the consumer PC easier to use with the Macintosh. The graphical user interface (lifted from Xerox PARC's Alto, and later copied by Microsoft Windows) made many operations and concepts consistent. Configuration was buried, and sometimes options were reduced to "the way Apple wants you to do it".

It strikes me that cloud computing is in a "mainframe phase". It is large and complex, and while an individual can create an account (even a free account), the complexity and time necessary to learn and use the platform is significant.

My issue with Microsoft Azure is precisely that. Something has changed and it behaves differently than it did in the past. (It's not my code, the change is in the deployment of my app.) I don't think that I have changed something in Azure's configuration -- although I could have.

The problem is that once you go beyond the 'three easy steps to deploy a web app', Azure is a vast and intimidating beast with lots of settings, each with new terminology. I could poke at various settings, but will that fix the problem or make things worse?

From my view, cloud computing is a large, complex system that requires lots of knowledge and expertise. In other words, it is much like a mainframe. (Except, of course, you don't need a large room dedicated to the equipment.)

The "starter plans" (often free) are not the equivalent of a PC. They are merely the same, enterprise-level plans with certain features turned off.

A PC is not simply a mainframe reduced to tabletop size. Both have CPUs and memory and peripheral devices and operating systems, but they are two different creatures. PCs have fewer options, fewer settings, fewer things you (the user) can get wrong.

Cloud computing is still at the "mainframe level" of options and settings. It's big and complicated, and it requires a lot of expertise to keep it running.

If we repeat history, we can expect companies to offer smaller, simpler versions of cloud computing. The advantage will be an easier learning curve and less required expertise; the disadvantage will be lower functionality. (Just as minicomputers were easier and less capable than mainframes and PCs were easier and less capable than minicomputers.)

I'll go out on a limb and predict that the companies who offer simpler cloud platforms will not be the current big providers (Amazon.com, Microsoft, Google). Mainframes were challenged by minicomputers from new vendors, not the existing leaders. PCs were initially constructed by hobbyists from kits. Soon after, companies such as Radio Shack, Commodore, and the newcomer Apple offered fully assembled, ready-to-run computers. IBM offered its PC only after the success of these upstarts.

The driver for simpler cloud platforms will be cost -- direct and indirect, mostly indirect. The "cloud computing is a mainframe" analogy is not perfect, as the billed costs for cloud platforms can be inexpensive. The expense is not in the hardware, but the time to make the thing work. Current cloud platforms require expertise, and expertise that is not cheap. Companies are willing to pay for that expertise... for now.

I expect that we will see competition to the big cloud platforms, and the marketing will focus on ease of use and low Total Cost of Ownership (TCO). The newcomers will offer simpler clouds, sacrificing performance for reduced administration cost.

My project is currently stuck. Deployments fail, so I cannot update my app. Support is not really available, so I must rely on the limited web pages and perhaps trial and error. I may have to create a new app in Azure and copy my existing code to it. I'm not happy with the experience.

I'm also looking for a simpler cloud platform.

Friday, July 31, 2015

Locking mistakes on web pages

As a professional in the IT industry, I occasionally visit web sites. I do so to get information about new technologies and practices, new devices and software, and current events. There are a number of web sites that provide a magazine format of news stories. I visit more than IT web sites; I also read about social, political, and economic events.

I find many stories informative, some pertinent, and a few silly. And I find that a number of them contain errors. Not factual errors, but simple typographic errors -- errors that should be obvious to anyone reading the story. For example, a misspelling of the name 'Boston' as 'Botson', or the word 'compromise' appearing as 'compromse'.

Spelling errors are bad enough. What makes it worse is that they remain. The error may be on the web site at 10:00 in the morning. It is still there at 4:00 in the afternoon.

A web page is run by a computer (a web server, to be precise). The computer waits for a request, and when it gets one, it builds an HTML page and sends back the response. The HTML page can be static (a simple file read from disk) or dynamic (a collection of files and content merged into a single HTML file). But static or dynamic, the source is the same: files on a computer. And files can be changed.

The whole point of the web was to allow for content to be shared, content that could be updated.

Yet here are these web sites with obviously incorrect content. And they (the people running the web sites) do nothing about it.

I have a few theories behind this effect:

  • The people running the web site don't care
  • The errors are intentional
  • The people running the site don't have time
  • The people running the site don't know how

It's possible that the people running the web site do not care about these errors. They may have a cavalier attitude towards their readers. Perhaps they focus only on their advertisers. It is a short-sighted strategy and I tend to doubt that it would be in effect at so many web sites.

It's also possible that the errors are intentional. They may be made specifically to "tag" content, so that if another web site copies the content then it can be identified as coming from the first web site. Perhaps there is an automated system that makes these mistakes. I suspect that there are better ways to identify copied content.

More likely is that the people running the web site either don't have time to make corrections or don't know how to make corrections. (They are almost the same thing.)

I blame our Content Management Systems. These systems (CMSs) manage the raw content and assemble it into HTML form. (Remember that dynamic web pages must combine information from multiple sources. A CMS does that work, combining content into a structured document.)

I suspect (and it is only a suspicion, as I have not used any of them) that the procedures to administer a CMS are complicated. I suspect that CMSs, like other automated systems, have grown in complexity over the years, and now require deep technical knowledge.

I also suspect that these web sites with frequent typographical errors are run with a minimal crew of moderately skilled people. The staff has enough knowledge (and time) to perform the "normal" tasks of publishing stories and updating advertisements. It does not have the knowledge (and the time) to do "extraordinary" tasks like update a story.

I suspect the "simple" process for a CMS would be to re-issue a fixed version of the story -- but that process would add the fixed version as a new story, not replace the original. The web site might then display two versions of the story, possibly confusing readers. The more involved process of updating the original story in the CMS is probably so labor-intensive and risk-prone that people judge it "not worth the effort".

That's a pretty damning statement about the CMS: The system is too complicated to use to correct content.

It's also a bit of speculation on my part. I haven't worked with CMSs. Yet I have worked with automated systems, and observed them over time. The path of simple to complex is all too easy to follow.

Sunday, November 17, 2013

The end of complex PC apps

Businesses are facing a problem with technology: PCs (and tablets, and smart phones) are changing. Specifically, they are changing faster than businesses would like.

Corporations have many programs that they use internally. Some corporations build their own software, others buy software "off the shelf". Many companies use a combination of both.

All of the companies with whom I have worked wanted stable platforms on which to build their systems and processes. Whether it was a complex program built in C++, a comprehensive model built in a spreadsheet, or an office suite (word processor, spreadsheet, and e-mail), companies want to invest their effort in their custom solutions. They did not want to spend money or time on upgrades and changes to the operating system or commercially available applications.

While they dislike change, corporations are willing to upgrade systems. Corporations want long upgrade cycles. They want gentle upgrade paths, with easy transitions from one version to the next. They were happy with the old Microsoft world: Windows NT, Windows 2000, and Windows XP were excellent examples of the long, gentle upgrades desired by corporations.

That is no longer the world of PCs. The new world sees fast update cycles for operating systems, major updates that require changes to applications. For companies with custom-made applications, they have to invest time and effort in updating their applications to match the new operating systems. (Consider Windows Vista and Windows 8.) For companies with off-the-shelf applications, they have to purchase new versions that run on the new operating systems.

What is a corporation to do?

My guess is that corporations will seek out other platforms and move their apps to those platforms. My guess is that corporations will recognize the cost of frequent change in the PC and mobile platforms, and look for other solutions with lower cost.

If they do, then PCs will lose their title to the development world. The PC platform will not be the primary target for applications.

What are the new platforms? I suspect the two "winning" platforms will be web apps (browsers and servers), and mobile/cloud (tablets and phones with virtualized servers). While the front ends for these systems undergo frequent changes, the back ends are relatively stable. The browsers for web apps are mostly stable and they buffer the app from changes to the operating system. Tablets and smart phones undergo frequent updates; this cost can be minimized with simple apps that can be updated easily.

The big trend is away from complex PC applications. These are too expensive to maintain in the new world of frequent updates to operating systems.

Thursday, November 14, 2013

Instead of simplicity, measure complexity

The IEEE Computer Society devoted their November magazine issue to "Simplicity in IT". Simplicity is a desirable trait, but I have found that one cannot measure it. Instead, one must measure its opposite: complexity.

Some qualities cannot be measured. I learned this lesson as a sysadmin, managing disk space for multiple users and groups. We had large but finite disk resources (resources are always finite), shared by different teams. Despite the large disk resources, the combined usage of the teams exceeded our resources -- in other words, we "ran out of free space". My job was to figure out "where the space had gone".

I quickly learned that "where the space had gone" was the wrong question. It is impossible to answer, because space doesn't "go" anywhere. I substituted new metrics: who is using space, how much, and how does that compare to their usage last week? These were possible to measure, and more useful. A developer who uses more than four times the space of the next-heaviest developer, and more than ten times that of the average developer, is (probably) working inefficiently.

The metric "disk space used by developer" is measurable. The metric "change in usage from last week" is also measurable. In contrast, the metric "where did the unallocated space go" is not.
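Those measurable metrics are easy to act on. Here is a sketch (hypothetical, with usage figures in megabytes per user) of the week-over-week comparison and a check for users far above the average of their peers:

```python
def usage_changes(last_week, this_week):
    """Compare per-user disk usage (in MB) between two weekly snapshots.
    Returns a dict of user -> change since last week."""
    return {user: this_week.get(user, 0) - last_week.get(user, 0)
            for user in set(last_week) | set(this_week)}

def heavy_users(usage, threshold_ratio=4):
    """Flag users consuming more than `threshold_ratio` times the
    average usage of everyone else."""
    flagged = []
    for user, mb in usage.items():
        others = [v for u, v in usage.items() if u != user]
        if others and mb > threshold_ratio * (sum(others) / len(others)):
            flagged.append(user)
    return flagged
```

Both functions answer questions that can be measured ("who, how much, compared to when?") rather than the unmeasurable "where did the space go?".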

The measure of simplicity is similar. Instead of measuring simplicity, measure the opposite: complexity. Instead of asking "why is our application (or code, or UI, or database schema) not simple?", ask instead "where is the complexity?"

Complexity in source code can be easily measured. There are a number of commercial tools, a number of open source tools, and I have written a few tools for my own use. Anyone who wants to measure the complexity of their system has tools available to them.

Measuring the change in complexity (such as the change from one week to the next) involves taking measurements at one time and storing them, then taking measurements at a later time and comparing them against the earlier ones. That is a little more complicated than merely taking measurements, but not much.

Identifying the complex areas of your system gives you an indicator. It shows you the sections of your system that you must change to achieve simplicity. That work may be easy or difficult; a measure of complexity merely points to the problem areas.

* * * *

When I measure code, I measure the following:

  • Lines of code
  • Source lines of code (non-comments)
  • Cyclomatic complexity
  • Boolean constants
  • Number of directly referenced classes
  • Number of indirectly referenced classes
  • Number of directly dependent classes
  • Number of indirectly dependent classes
  • Class interface complexity (a count of member variables and public functions)

I find that these metrics let me quickly identify the "problem classes" -- the classes that cause the most defects. I can work on those classes and simplify the system.
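As a rough illustration, the first three of those metrics can be computed for C-like source in a few lines. The cyclomatic figure here is approximated by counting branch points, a common shortcut rather than a true control-flow analysis:

```python
# A rough sketch of the first three metrics for C-like source.
# Cyclomatic complexity is approximated by counting decision points.
import re

BRANCH_PATTERN = re.compile(r"\b(if|for|while|case)\b|&&|\|\|")

def measure(source):
    lines = source.splitlines()
    loc = len(lines)
    # SLOC: non-blank lines that are not line comments
    sloc = sum(
        1 for line in lines
        if line.strip() and not line.strip().startswith("//")
    )
    # Cyclomatic complexity: number of decision points plus one
    cyclomatic = 1 + len(BRANCH_PATTERN.findall(source))
    return {"loc": loc, "sloc": sloc, "cyclomatic": cyclomatic}
```

A class or file whose numbers stand well above its peers is a candidate "problem class".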

Wednesday, November 6, 2013

More was more, but now less is more

IBM and Microsoft built their empires with the strategy "bigger and more features". IBM mainframes, over time, became larger (in terms of processor speed and memory capacity) and included more features. Microsoft software, over time, became larger (in terms of capacity) and included more features.

It was a successful strategy. IBM and Microsoft could win any "checklist battle" which listed the features of products. For many managers, the product with the largest list of features is the safest choice. (Microsoft and IBM's reputations also helped.)

One downside of large, complicated hardware and large, complicated software is that it leads to large, complicated procedures and data sets. Many businesses have developed their operating procedures first around IBM equipment and later around Microsoft software. When developing those procedures, it was natural to, over time, increase the complexity. New business cases, new exceptions, and special circumstances all add to complexity.

Businesses are trying to leverage mobile devices (tablets and phones) and finding that their long-established applications don't "port" easily to the new devices. They are focusing on the software, but the real issue is their processes. The complex procedures behind the software are making it hard to move business to mobile devices.

The user interfaces on mobile devices limit applications to much simpler operations. Perhaps our desire for simplicity comes from the size of the screen, or the change from mouse to touch, or from the fact that we hold the devices in our hands. Regardless of the reason, we want mobile devices to have simple apps.

Complicated desktop applications, with drop-down menus, multiple dialogs, and oodles of options, simply do not "work" on a mobile device. We saw this with early hand-held devices such as the popular Palm Pilot and the not-so-popular Microsoft PocketPC. Palm's simple operation won over the more complex Windows CE.

Simplicity is a state of mind, one that is hard to obtain. Complicated software tempts one into complicated processes (so many fonts, so many spreadsheet formula operations, ...). Mobile devices demand simplicity. With mobile, "more" may be more, but it is not better. The successful businesses will simplify their procedures and their underlying business rules (perhaps the MBA crowd will prefer the words "streamline" or "optimize") to leverage mobile devices.


Friday, July 12, 2013

In the cloud, simple will be big

The cloud uses virtualized computers, usually virtualized PCs or PC-based servers.

The temptation is to build (well, instantiate) larger virtualized PCs. More powerful processors, more cores, more memory, more storage. It is a temptation that is based on the ideas of the pre-cloud era, when computers stood alone.

In the mainframe era, bigger was better. It was also more expensive, which in addition to creating a tension between larger and smaller computers, defined a status ranking of computer owners. Similar thinking held in the PC era: a larger, more capable PC was better than a smaller one. (I suppose that similar thinking happens with car owners.)

In the cloud, the size of individual PCs is less important. The cloud is built of many (virtualized) computers and, more importantly, can increase the number of those computers on demand. This ability shifts the equation. Bigger is still better, but now the measure of bigger is the cloud, not an individual computer.

The desire to improve virtual PCs has merit. Our current virtual PCs duplicate the common PC architecture of several years ago. That design includes the virtual processor type, the virtual hard disk controller, and the virtual video card. They are copies of the common devices of the time, chosen for compatibility with existing software. As copies of those devices, they replicate not only the good attributes but the foibles as well. For example, the typical virtualized environment emulates IDE and SCSI disk controllers, but allows you to boot only from the IDE controllers. (Why? Because the real-world configurations of those devices worked that way.)

An improved PC for the cloud is not bigger but simpler. Cloud-based systems use multiple servers and "spin up" new instances of virtual servers when they need additional capacity. One does not need a larger server when one can create, on demand, more instances of that server and share work among them.
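The arithmetic behind "more instances, not bigger instances" is simple. This toy sketch uses made-up request rates and per-server capacities:

```python
# A toy model of scaling out: capacity grows by adding identical
# fixed-size instances, never by enlarging any one instance.

def scale(request_rate, per_server_capacity):
    """Return the instance count for the next interval: enough
    fixed-size servers to cover the load, but never fewer than one."""
    needed = -(-request_rate // per_server_capacity)  # ceiling division
    return max(1, needed)
```

The per-server capacity never changes; only the count does.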

The design of cloud-based systems is subtle. I have asserted that simpler computers are better than complex ones. This is true, but only up to a point. A cloud of servers so simple that they cannot run the network stack would be useless. Clearly, a minimum level of computation is required.

Our first generation of virtual computers consisted of clones of existing machines. Some vendors have explored the use of simpler systems running on a sophisticated virtualization environment (VMware's ESX and ESXi offerings, for example).

Future generations of cloud computers will blur the lines between the virtualization manager, the virtualized machine, the operating system, and what is now called the language run-time (the JVM or CLR).

The entire system will be complex, yet I believe the successful configurations will have simplicity in each of the layers.

Wednesday, February 13, 2013

No help for mobile apps

A big change from PC application to mobile app is the 'help' feature. Not the addition of the feature, but the removal of it.

"Help" is a distinguishing characteristic of PC and Mac applications. All modern applications have it, and most have the dual-mode 'general' help and 'context' help. The Microsoft guidelines for well-behaved Windows applications specify the "Help" menu and the "About" information. (Possibly because Apple's specifications for well-behaved Macintosh applications specify them.)

The concept of on-line help precedes Windows and Mac software. The venerable WordStar program offered on-line help in its menus. Unix has provided the 'man' utility for decades.

Yet look at any mobile app and you will see no help feature. It is not a menu option. It is not a button or a hidden screen.

Apps for smart phones and tablets do not have help. And they don't need it.

This is a big change.

I can think of a few reasons that mobile apps have dropped the 'help' feature:

Single-screen focus: PC help is designed for multi-window displays. Mobile apps are designed to fill the whole screen. iOS and Android enforce this; Windows RT allows two apps to display, but not an app and its help screen. When the app takes the screen, there is no room for help messages, and switching from the main app to a help app might be too much of an inconvenience.

No example: There is no reference app that uses help. All existing apps run without help, and new apps copy the designs of the existing apps.

Simplicity of operation: Mobile apps are designed to be simple -- so simple that help is not needed. The user can operate the software without the help of help.

Of these reasons, I prefer the last. Toggling between an app and a help screen is awkward but possible. Allocating a portion of the screen to help is also possible; many apps allocate space to advertisements.

I like to think that mobile apps need no help because they are easy to use. This ease comes in two forms: ease of operation and ease of understanding the concepts. Services such as Twitter and Facebook provide easy-to-use apps that manipulate a relatively simple set of data. The concepts they use are easy to grasp.

If this is the reason, then we have an interesting aspect of mobile apps: they must be simple enough that they can be used by the average person with no help. That is, there is an upper bound on the complexity of the app. Photoshop and Visual Studio will never be mobile apps, at least not in their current forms.

As applications migrate from desktop to web to mobile, they must become simpler. (Web apps, like mobile apps, have dropped the 'help' feature.) If my theory is correct, then we should see the mobile versions of apps like Microsoft Word and Excel existing as simpler versions of their desktop PC counterparts.

I keep saying 'simpler', but that should not mean 'less powerful'. We may see some reduction in capabilities, but I suspect the big changes will be to the user interface and the techniques we use to manipulate data.

Sunday, December 12, 2010

Simple or complex

Computers have been complex since the beginning. Computer users have a love/hate relationship with complexity.

We can add new features by adding a new layer onto an existing system, or expanding an existing layer within a system. Modifying an existing system can be difficult; adding a new layer can be fast and cheap. For example, the original Microsoft Windows was a DOS program that ran on PCs. Morphing DOS into Windows would have been a large effort (not just for development but also for sales and support to users who at the time were not convinced of the whole GUI idea) and a separate layer was the more effective path for Microsoft.

But adding layers is not without cost. The layers may not always mesh, with portions of lower layers bleeding through the upper layers. Each layer adds complexity to the system. Add enough layers, and the complexity starts to affect performance and the ability to make other changes.

The opposite of "complex" is "simple"; the opposite of "complexify" (if I may coin the word) is "simplify". But the two actions do not require equivalent effort. Where adding complexity is fast and cheap, simplifying a system is hard. One can add new features to a system; if users don't want them, they can ignore them. One has a harder time removing features; if users want them, they cannot ignore their absence.

Complexity is not limited to PCs. Consumer goods, such as radios and televisions, were at one point complex devices. Radios had tubes that had to be replaced. TVs had tubes also, and lots of knobs for things like "horizontal control", "color", "hue", and "fine tuning". But those exposed elements of radio and TV internals were not benefits and not part of the desired product; they were artifacts of utility. We needed them to make the device work and give us our programming. They disappeared as soon as technically possible.

Automobiles had their share of complexity, with things like a "choke" and a pedal for the transmission. Some features have been removed, and others have been added. Cars are gaining complexity in the form of bluetooth interfaces to cell phones and built-in GPS systems.

Software is not immune to the effect of layers and complexity. Microsoft Windows was one example, but most systems expand through added layers. The trick to managing software is to manage not just the growth of features but to manage the reduction in complexity. Microsoft eventually merged Windows and DOS and Windows became an operating system in its own right. Microsoft continues to revise Windows, but they do it in both directions: they add features and expand capabilities with new layers, but they also remove complexity and the equivalent of the "fine tuning" knob.

Google's Cr-48 laptop is a step in simplifying the PC. It removes lots of knobs (no local programs, and even no Caps Lock key) and pushes work onto the internet. I predict that this will be a big seller, with simplicity being the sales driver.