Sunday, August 15, 2021

COBOL and Elixir

Someone has created a project to transpile (their word) COBOL to Elixir. I have some thoughts on this idea. But first, let's look at the example they provide.

A sample COBOL program:

      >>SOURCE FORMAT FREE
IDENTIFICATION DIVISION.
PROGRAM-ID. Test1.
AUTHOR. Mike Binns.
DATE-WRITTEN. July 25th 2021
DATA DIVISION.
WORKING-STORAGE SECTION.
01 Name     PIC X(4) VALUE "Mike".
PROCEDURE DIVISION.

DISPLAY "Hello " Name

STOP RUN.

This is "hello, world" in COBOL. Note that it is considerably longer than equivalent programs in most languages. Also note that, while long, it is still readable. Even a person who does not know COBOL can make some sense of it.

Now let's look at the same program, transpiled to Elixir:

defmodule ElixirFromCobol.Test1 do
  @moduledoc """
  author: Mike Binns
  date written: July 25th 2021
  """

  def main do
    try do
      do_main()
    catch
      :stop_run -> :stop_run
    end
  end

  def do_main do
    # pic: XXXX
    var_Name = "Mike"
    pics = %{"Name" => {:str, "XXXX", 4}}
    IO.puts "Hello " <> var_Name
    throw :stop_run
  end
end

That is ... a lot of code. More than the code for the COBOL version! Some of that is due to the handling of "stop run", which seems excessive in this small example. Why wrap the core function inside a main() that exists simply to trap the thrown value? (There is a reason. More on that later.)

I'm unsure of the reason for this project. If it is a side project made on a whim, and used for the entertainment (or education) of the author, then it makes sense.

But I cannot see this as a serious project, for a couple of reasons.

First, the generated Elixir code is longer, and in my opinion less readable, than the original COBOL code. I may be biased here, as I am somewhat familiar with COBOL and not at all familiar with Elixir. I can look at COBOL code and say "of course it does that", but when I look at Elixir code I can only guess and think "well, maybe it does that". Elixir seems to follow the syntax of modern scripting languages such as Python and Ruby, with some unusual operators.

Second, the generated Elixir code includes some constructs that are not used. This is, perhaps, an artifact of generated code. Code generators are good, up to a point. They tend to be non-thinking; they read input, apply some rules, and produce output. They don't see the bigger picture. In the example, the transpiler has produced code that contains the variable "pics", which holds information about the COBOL program's PICTURE clauses, but this "pics" variable is never used.
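For comparison, here is a rough sketch of what a hand-written equivalent might look like, with the unused "pics" map and the "stop run" wrapper removed. This is my own illustration, not output from the transpiler; the module name HandWritten.Test1 and the greeting function are made up for the example.

```elixir
# Hand-written sketch, not transpiler output; the module name is hypothetical.
defmodule HandWritten.Test1 do
  @moduledoc """
  author: Mike Binns
  date written: July 25th 2021
  """

  # Build the greeting as a plain function so it can be checked without printing.
  def greeting(name \\ "Mike"), do: "Hello " <> name

  # Print the greeting, as the COBOL DISPLAY statement does.
  def main, do: IO.puts(greeting())
end
```

A person can leave things out because a person knows this particular program never needs them. A generator has to emit code that works for every program it might be given.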

The "pics" variable hints at a larger problem, which is that the transpiled code is not running the equivalent program but is instead interpreting data to achieve the same output. The Elixir program is, in effect, a tuned interpreter for a specific COBOL program. As an interpreter, it will likely run slower than a compiled program. Where a COBOL compiler can generate code to handle each PICTURE clause at compile time, the Elixir code (in a program that actually uses the map) must look up the PICTURE clause at runtime, decode it, and then take action.
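To make the runtime cost concrete, here is a minimal sketch of what such a lookup could look like. I am assuming that the map is consulted whenever a value is stored in a field; the PicSketch module and its store function are hypothetical and are not taken from the transpiler project. The sketch follows the usual COBOL rule for alphanumeric fields: truncate on the right, pad with spaces on the right.

```elixir
# Hypothetical sketch: interpret a PICTURE entry at runtime.
defmodule PicSketch do
  # Look up the field's entry in the pics map, then coerce the value
  # to the declared width (truncate, then pad with trailing spaces).
  def store(pics, name, value) do
    case Map.fetch!(pics, name) do
      {:str, _picture, len} ->
        value
        |> String.slice(0, len)
        |> String.pad_trailing(len)
    end
  end
end

# Same shape as the map in the generated code above.
pics = %{"Name" => {:str, "XXXX", 4}}

IO.inspect(PicSketch.store(pics, "Name", "Michael"))  # "Mich"
IO.inspect(PicSketch.store(pics, "Name", "Al"))       # "Al  "
```

Every store pays for a map lookup and a pattern match before any real work happens. A COBOL compiler resolves the same question once, when the program is compiled.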

My final concern is the most significant. The Elixir programming language is not a good match for the COBOL language. Theoretically, any program written in a Turing-complete language can be re-written in a different Turing-complete language. That's a nice theory, but in practice converting from one language to another can be easy or can be difficult. Elixir is a modern functional language, with immutable data and structured programming constructs. COBOL predates those constructs and has procedural code and global variables.

We can see the impedance mismatch in the Elixir code to catch the "stop run" exception. A COBOL program may contain "STOP RUN" anywhere in the code. The Elixir transpiler project has to build extra code to duplicate this capability. I'm not sure how the transpiler will handle global variables, but it will probably be a method that is equally tortured. Converting code from a non-structured language to a structured programming language is difficult at best and results in odd-looking code.
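Here is a small sketch of why the wrapper is needed once "STOP RUN" moves out of the top-level code. The check_balance function stands in for a COBOL paragraph that may end the whole program; the module, the function names, and the balance rule are all invented for illustration.

```elixir
# Hypothetical sketch: "STOP RUN" issued from a nested function.
defmodule StopRunSketch do
  # Entry point: run the program and trap the :stop_run signal,
  # no matter how deep in the call chain it was thrown.
  def main(balance) do
    try do
      do_main(balance)
    catch
      :stop_run -> :stop_run
    end
  end

  defp do_main(balance) do
    check_balance(balance)
    # Reached only when check_balance/1 did not stop the run.
    :finished
  end

  # Stands in for a COBOL paragraph that may end the whole program.
  defp check_balance(balance) when balance < 0, do: throw(:stop_run)
  defp check_balance(_balance), do: :ok
end
```

With these definitions, StopRunSketch.main(-1) returns :stop_run and StopRunSketch.main(10) returns :finished. The throw unwinds through do_main, skipping the rest of it, which is the closest Elixir analog I can see to COBOL's "stop from anywhere".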

My point here is not to insult or to shout down the transpiler project. It will probably be an educational experience, teaching the author about Elixir and probably more about COBOL.

My first point is that programs are designed to match the programming language. Programs written in object-oriented languages have object-oriented designs. Programs written in functional languages have functional designs. Programs written in non-structured languages have... non-structured designs. The designs from one type of programming language do not translate readily to a programming language of a different type.

My second point is that we assume that modern languages are better than older languages. We assume that object-oriented languages like C++, C#, and Java are better than (non-OO) structured languages like Pascal and Fortran-77. Some of us assume that functional languages are better than object-oriented languages.

I'm not so sure about those assumptions. I think that object-oriented languages are better at some tasks than mere structured languages, and older structured-only languages are better at other tasks. Object-oriented languages are useful for large systems; they let us organize code into classes and functions, and even larger constructs through inheritance and templates. Dynamic languages like Python and Ruby are good for some tasks but not others.

And I must conclude that even older, non-functional, non-dynamic, non-object-oriented, non-structured programming languages are good for some tasks.

One analogy for programming languages is that of a carpenter's toolbox: full of various tools for different purposes. COBOL, one of the oldest languages, might be considered the hammer, one of the oldest tools. Hammers do not have the abilities of saws, drills, tape measures, or levels, but carpenters still use them when the task is appropriate for a hammer.

Perhaps we can learn a thing or two from carpenters.
