Friday, July 31, 2015

Locking mistakes on web pages

As a professional in the IT industry, I occasionally visit web sites. I do so to get information about new technologies and practices, new devices and software, and current events. There are a number of web sites that provide a magazine format of news stories. I visit more than IT web sites; I also read about social, political, and economic events.

I find many stories informative, some pertinent, and a few silly. And I find a number of them to contain errors. Not factual errors, but simple typographic errors. Simple errors, yet these errors should be obvious to anyone reading the story. For example, a misspelling of the name 'Boston' as 'Botson'. Or the word 'compromise' appearing as 'compromse'.

Spelling errors are bad enough. What makes it worse is that they remain. The error may be on the web site at 10:00 in the morning. It is still there at 4:00 in the afternoon.

A web page is run by a computer (a web server, to be precise). The computer waits for a request, and when it gets one, it builds an HTML page and sends back the response. The HTML page can be static (a simple file read from disk) or dynamic (a collection of files and content merged into a single HTML file). But static or dynamic, the source is the same: files on a computer. And files can be changed.
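A rough sketch of that idea, in Python. Everything here is made up for illustration: the FILES dictionary stands in for files on disk, and the function names are hypothetical, not taken from any real web server. It shows a static page, a dynamic page merged from a template and a content file, and how changing the underlying file changes what the next reader sees.

```python
from string import Template

# Hypothetical stand-in for files on the server's disk.
FILES = {
    "about.html": "<html><body><p>About us</p></body></html>",
    "story.tmpl": "<html><body><h1>$title</h1><p>$body</p></body></html>",
    "story.txt": "A compromse was reached in Botson.",
}

def static_page(name):
    """Static page: the file is returned exactly as stored."""
    return FILES[name]

def dynamic_page(template_name, title, body_name):
    """Dynamic page: a template and a content file merged into one HTML page."""
    template = Template(FILES[template_name])
    return template.substitute(title=title, body=FILES[body_name])

def fix_typo(name, wrong, right):
    """Correct the stored content; pages built afterwards pick up the fix."""
    FILES[name] = FILES[name].replace(wrong, right)

if __name__ == "__main__":
    print(dynamic_page("story.tmpl", "Talks conclude", "story.txt"))
    fix_typo("story.txt", "Botson", "Boston")
    fix_typo("story.txt", "compromse", "compromise")
    print(dynamic_page("story.tmpl", "Talks conclude", "story.txt"))
```

The point of the sketch is only that the page is rebuilt from its source on each request, so a correction to the source is all a fix requires.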

The whole point of the web was to allow for content to be shared, content that could be updated.

Yet here are these web sites with obviously incorrect content. And they (the people running the web sites) do nothing about it.

I have a few theories about this effect:

  • The people running the web site don't care
  • The errors are intentional
  • The people running the site don't have time
  • The people running the site don't know how

It's possible that the people running the web site do not care about these errors. They may have a cavalier attitude towards their readers. Perhaps they focus only on their advertisers. It is a short-sighted strategy and I tend to doubt that it would be in effect at so many web sites.

It's also possible that the errors are intentional. They may be made specifically to "tag" content, so that if another web site copies the content then it can be identified as coming from the first web site. Perhaps there is an automated system that makes these mistakes. I suspect that there are better ways to identify copied content.

More likely is that the people running the web site either don't have time to make corrections or don't know how to make corrections. (They are almost the same thing.)

I blame our Content Management Systems. These systems (CMSs) manage the raw content and assemble it into HTML form. (Remember that dynamic web pages must combine information from multiple sources. A CMS does that work, combining content into a structured document.)

I suspect (and it is only a suspicion, as I have not used any CMS) that the procedures to administer a CMS are complicated. I suspect that CMSs, like other automated systems, have grown in complexity over the years, and now require deep technical knowledge.

I also suspect that these web sites with frequent typographical errors are run with a minimal crew of moderately skilled people. The staff has enough knowledge (and time) to perform the "normal" tasks of publishing stories and updating advertisements. It does not have the knowledge (or the time) to do "extraordinary" tasks like updating a story.

I suspect the "simple" process for a CMS would be to re-issue a fixed version of the story, but the "simple" process would add the fixed version as a new story and not replace the original. A web site might display the two versions of the story, possibly confusing readers. The more complex process of updating the original story and fixing it in the CMS is probably so labor-intensive and risk-prone that people judge it as "not worth the effort".
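To make the difference concrete, here is a toy model in Python. It is pure speculation in code form: ToyCMS, publish, update, and front_page are names I invented, and no real CMS is claimed to work this way. It only contrasts re-issuing a story (which leaves two versions on the site) with updating the original in place.

```python
from dataclasses import dataclass, field

@dataclass
class Story:
    story_id: int
    headline: str

@dataclass
class ToyCMS:
    """Hypothetical content store; not modeled on any real product."""
    stories: list = field(default_factory=list)
    next_id: int = 1

    def publish(self, headline):
        """The 'simple' path: every call adds a new story."""
        story = Story(self.next_id, headline)
        self.stories.append(story)
        self.next_id += 1
        return story.story_id

    def update(self, story_id, headline):
        """The 'complex' path: find the original story and replace its text."""
        for story in self.stories:
            if story.story_id == story_id:
                story.headline = headline
                return
        raise KeyError(story_id)

    def front_page(self):
        return [story.headline for story in self.stories]

if __name__ == "__main__":
    reissued = ToyCMS()
    reissued.publish("Talks conclude in Botson")
    reissued.publish("Talks conclude in Boston")  # fixed copy, original remains
    print(reissued.front_page())

    updated = ToyCMS()
    original = updated.publish("Talks conclude in Botson")
    updated.update(original, "Talks conclude in Boston")  # original replaced
    print(updated.front_page())
```

In the toy model the update is one short method. My suspicion is that in a real system the equivalent step is buried under enough procedure that it no longer feels that small.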

That's a pretty damning statement about the CMS: The system is too complicated to use to correct content.

It's also a bit of speculation on my part. I haven't worked with CMSs. Yet I have worked with automated systems, and observed them over time. The path from simple to complex is all too easy to follow.
