Showing posts with label SQL.

Thursday, April 19, 2018

Why no language to replace SQL?

The history of programming is littered with programming languages. Some endure for ages (COBOL, C, Java) and some live briefly (Visual J++). We often develop new languages to replace existing ones (Perl, Python).

Yet one language has endured and has seen no replacements: SQL.

SQL, invented in the 1970s and popularized in the 1980s, has lived a good life with no apparent challengers.

It is an anomaly. Every language I can think of has a "challenger" language. FORTRAN was challenged by BASIC. BASIC was challenged by Pascal. C++ was challenged by Java; Java was challenged by C#. Unix shell programming was challenged by AWK, which in turn was challenged by Perl, which in turn has been challenged by Python.

Yet there have been no (serious) challengers to SQL. Why not?

I can think of several reasons:
  • Everyone loves SQL and no one wants to change it.
  • Programmers think of SQL as a protocol (specialized for databases) and not a programming language. Therefore, they don't invent a new language to replace it.
  • Programmers want to work on other things.
  • The task is bigger than a programming language. Replacing SQL means designing the language, creating an interpreter (or compiler?), command-line tools (these are programmers, after all), bindings to other languages (Python, Ruby, and Perl at minimum), and data access routines, all while supporting SQL's features: triggers, access controls, transactions, and audit logs.
  • SQL gets a lot of things right, and works.
I'm betting on the last. SQL, for all of its warts, is effective, efficient, and correct.

But perhaps there is a challenger to SQL: NoSQL.

In one sense, NoSQL is a replacement for SQL. But it is a replacement of more than the language -- it is a replacement of the notion of data structure. NoSQL "databases" store documents and photographs and other things, but they are rarely used to process transactions. NoSQL databases don't replace SQL databases, they complement them. (Some companies move existing data from SQL databases to NoSQL databases, but this is data that fits poorly in the relational structure. They move some of their data but not all of their data out of the SQL database. These companies are fixing a problem, not replacing the SQL language.)

NoSQL is a complement of SQL, not a replacement (and therefore not a true challenger). SQL handles part of our data storage and NoSQL handles a different part.
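That division of labor can be sketched with a toy example: homogeneous, transactional data lives comfortably in relational rows, while heterogeneous blobs fit a schemaless document store. This is only an illustration; the table, account names, and document fields are hypothetical, and SQLite plus JSON stand in for a real SQL database and a real document store.

```python
import json
import sqlite3

# Relational side: homogeneous rows, a fixed schema, transactional updates.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance INTEGER)")
db.execute("INSERT INTO accounts VALUES (1, 'Alice', 100), (2, 'Bob', 50)")

# A transfer is naturally a SQL transaction: both updates commit or neither does.
with db:
    db.execute("UPDATE accounts SET balance = balance - 25 WHERE id = 1")
    db.execute("UPDATE accounts SET balance = balance + 25 WHERE id = 2")

balances = dict(db.execute("SELECT owner, balance FROM accounts"))
# balances is now {"Alice": 75, "Bob": 75}

# Document side: heterogeneous records with no fixed schema -- a poor fit
# for a relational table, a natural fit for a document store.
documents = [
    json.dumps({"type": "photo", "owner": "Alice", "tags": ["vacation"]}),
    json.dumps({"type": "post", "owner": "Bob", "text": "hello", "likes": 3}),
]
```

Neither half wants to do the other's job: forcing the documents into a fixed table wastes columns, and forcing the transfer into a document store gives up the transaction.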

It seems that SQL will be with us for some time. It is tied to the notion of relational organization, which is a useful mechanism for storing and processing homogeneous data.

Sunday, September 20, 2015

Derivatives of programming languages

Programmers are, apparently, unhappy with their tools. Specifically, their programming languages.

Why do I say that? Because programmers frequently revise or reinvent programming languages.

If we start with symbolic assembly language as the first programming language, we can trace the development of other languages. FORTRAN, in its beginning, was a very powerful macro assembler (or something like it). Algol was a new vision of a programming language, in part influenced by FORTRAN. Pascal was developed as a simpler language, as compared to Algol.

Changes to languages come in two forms. One is an enhancement, a new version of the same language. For example, Visual Basic had multiple versions, yet it was essentially the same language. FORTRAN changed, from FORTRAN IV to FORTRAN 66 to Fortran 77 (and later versions).

The other form is a new, separate language. C# was based on Java, yet was clearly not Java. Modula and Ada were based on Pascal, yet quite different. C++ extended C with object-oriented programming.

Programmers are just not happy with their languages. Over the half-century of programming, we have had hundreds of languages. Only a small fraction have gained popularity, yet we keep tuning them, adjusting them, and deriving from them. And programmers are quite willing to revamp an existing language to meet the needs of the day.

There are two languages that are significant exceptions: COBOL and SQL. Neither of these has been used (to my knowledge) to develop other languages. At least not popular ones. Each has had new versions (COBOL-61, COBOL-74, COBOL-85, SQL-86, SQL-89, SQL-92, and so on) but none have spawned new, different languages.

There have been many languages that have had a small following and never been used to create a new language. It's one thing for a small language to live and die in obscurity. But COBOL and SQL are anything but obscure. They drive most business transactions in the world. They are used in all organizations of any size. One cannot work in the IT world without being aware of them.

So why is it that they have not been used to create new languages?

I have a few ideas.

First, COBOL and SQL are popular, capable, and well-understood. Both have been developed for decades, they work, and they can handle the work. There is little need for a "better COBOL" or a "better SQL".

Second, COBOL and SQL have active development communities. When a new version of COBOL is needed, the COBOL community responds. When a new version of SQL is needed, the SQL community responds.

Third, the primary users of COBOL and SQL (businesses and governments) tend to be large and conservative. They want to avoid risk. They don't need to take a chance on a new idea for database access. They know that new versions of COBOL and SQL will be available, and they can wait for a new version.

Fourth, COBOL and SQL are domain-specific languages, not general-purpose. They are suited to financial transactions. Hobbyists and tinkerers have little need for COBOL or a COBOL-like language. When they experiment, they use languages like C or Python ... or maybe Forth.

The desire to create a new language (whether brand new or based on an existing language) is a creative one. Each person is driven by his own needs, and each new language probably has different root causes. Early languages like COBOL and FORTRAN were created to let people be more productive. The urge to help people be more productive may still be there, but I think there is a bit of fun involved. People create languages because it is fun.

Wednesday, March 19, 2014

The fecundity of programming languages

Some programming languages are more rigorous than others. Some programming languages are said to be more beautiful than others. Some programming languages are more popular than others.

And some programming languages are more prolific than others, in the sense that they are the basis for new programming languages.

Algol, for example, influenced the development of Pascal and C, which in turn influenced Java, C# and many others.

FORTRAN influenced BASIC, which in turn gave us CBASIC, Visual Basic, and True Basic.

The Unix shell led to Awk and Perl, which influenced Python and Ruby.

But COBOL has had little influence on languages. Yes, it has been revised, including an object-oriented version. Yes, it guided the PL/I and ABAP languages. But outside of those business-specific languages, COBOL has had almost no effect on programming languages.

Why?

I'm not certain, but I have two ideas: COBOL was an early language, and it was designed for commercial uses.

COBOL is one of the earliest languages, dating back to the 1950s. Other languages of the time include FORTRAN and LISP (and oodles of forgotten languages like A-0 and FLOW-MATIC). We had no experience with programming languages. We didn't know what worked and what didn't. We didn't know which language features were useful to programmers. Since we didn't know, we had to guess.

For a near-blind guess, COBOL was pretty good. It has been useful in close to its original form for decades, a shark in the evolution of programming languages.

The other reason we didn't use COBOL to create other languages is that it was commercial. It was designed for business transactions. While it ran on general-purpose computers, COBOL was specific to financial applications, and the people who would tinker and build new languages were working in other fields, with computers other than business mainframes.

The tinkerers were using minicomputers (and later, microcomputers). These were not in the financial setting but in universities, where people were more willing to explore new languages. Minicomputers from DEC were often equipped with FORTRAN and BASIC. Unix computers were equipped with C. Microcomputers often came with BASIC baked in, because it was easier for individuals to use.

COBOL's success in the financial sector may have doomed it to stagnancy. Corporations (especially banks and insurance companies) lean conservative with technology and programming; they prefer to focus on profits and not research.

I see a similar future for SQL. As a data description and access language, it does an excellent job. But it is very specific and cannot be used outside of that domain. The up-and-coming NoSQL databases avoid SQL in part, I think, because the SQL language is tied to relational algebra and structured data. I see no languages (well, no popular languages) derived from SQL.

I think the languages that will influence or generate new languages will be those which are currently popular, easily learned, and easily used. They must be available to the tinkerers of today; those tinkerers will be writing the languages of the future. Tinkerers have limited resources, so less expensive languages have an advantage. Tinkerers are also a finicky bunch, with only a few willing to work with ornery products (or languages).

Considering those factors, I think that future languages will come from a set of languages in use today. That set includes C, C#, Java, Python, and JavaScript. I omit a number of candidates, including Perl, C++, and possibly your favorite language. (I consider Perl and C++ difficult languages; tinkerers will move to easier languages. I would like to include FORTH in the list, but it too is a difficult language.)

Sunday, February 17, 2013

Losing data in the cloud of big data

NoSQL databases have several advantages over traditional SQL databases -- in certain situations. I think most folks agree that NoSQL databases are better for some tasks, and SQL databases are better in others. And most discussions about Big Data agree that NoSQL is the tool for Big Data databases.

One aspect that I have not seen discussed is auditing. That is, knowing that we have all of the data we expect to have. Traditional data processing systems (accounting, insurance, banking, etc.) have lots of checks in place to ensure that all transactions are processed and none are lost.

These checks and audits were put in place over a long time. I suspect that each error, when detected, was reviewed and a check was added to prevent such errors, or at least detect them early.

Do we have these checks in our Big Data databases? Is it even possible to build the checks for accountability? Big Data is, by definition, big. Bigger than normal, and bigger than one can conveniently inventory. Big Data can also contain things that are not always auditable. We have the techniques to check bank accounts, but how can we check something non-numeric such as photographs, tweets, and Facebook posts?

On the other hand, there may be risks from losing data, or subsets of data. Incomplete datasets may contain bias, a problem for sampling and projections. How can you trust your data if you don't have the checks in place?
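One way to approximate those old batch controls for non-numeric records is a record count plus a content checksum, computed at the source and re-checked at the destination. This is only a sketch of the idea, not a production audit scheme; the record strings and the function name are invented for illustration.

```python
import hashlib

def batch_controls(records):
    """Compute a control pair for a batch: the record count and an
    order-independent checksum over the record contents."""
    count = len(records)
    # XOR the per-record digests so the check doesn't depend on arrival order.
    combined = 0
    for rec in records:
        digest = hashlib.sha256(rec.encode("utf-8")).digest()
        combined ^= int.from_bytes(digest, "big")
    return count, combined

sent = ["tweet:1 hello", "photo:2 cat.jpg", "post:3 lorem"]
received = ["photo:2 cat.jpg", "post:3 lorem", "tweet:1 hello"]  # same batch, reordered
dropped = ["photo:2 cat.jpg", "tweet:1 hello"]                   # one record lost

intact = batch_controls(sent) == batch_controls(received)   # True: nothing lost
loss_detected = batch_controls(sent) != batch_controls(dropped)  # True: loss caught
```

The point is not the hashing details; it is that the sender and receiver agree on a small, cheap-to-compute summary, the same role the control totals played in the old batch systems.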

Sunday, February 20, 2011

Whining for SQL

Several folks have been whining (and I do mean whining) about the lack of SQL in cloud computing. The common arguments are that SQL is the standard for database access, that it is familiar (meaning that a lot of people have learned it), and that it is needed to build applications effectively.

To these arguments, I say "bunk".

SQL has a limited history in the computing age. It became popular in the 1980s; prior to then, we got along quite well without it. SQL is not needed to access data or to build applications effectively.

I will pause here to disclose my bias: I dislike SQL. I think that the language is ugly, and I don't like ugly languages.

As I see it, SQL was the result of two forces. One was the popularity of relational databases, which in turn were driven by a desire to reduce redundant data. The second force was the desire to divide the work of application development cleanly between database design and application design. I'm not sure that either of these forces applies in today's world of technology.

Cloud applications may not need SQL at all. We may be able to create new methods of accessing data for cloud applications. (And we seem to be well on our way to doing so.) Insisting that cloud apps use SQL is a misguided attempt at keeping an old (and ugly ... did I mention ugly?) data access mechanism. Similar thinking was common in the early days of microcomputers (the pre-IBM PC days) when people strove to implement FORTRAN and COBOL compilers on microcomputers and build systems for general ledger and inventory.

Google has no incentive to bring SQL to the cloud. Nor do Amazon.com and Salesforce.com. The players who do have incentive for SQL in the cloud are the vendors selling SQL databases: Microsoft and Oracle. I expect that they will find a way -- some way, any way -- to use SQL in cloud apps. And I expect those ways to be expensive.

But despite the efforts of Microsoft and Oracle, I expect cloud apps to thrive without SQL.