Sunday, May 18, 2014

How to untangle code: Use variables for only one purpose

Early languages (COBOL, FORTRAN, BASIC, Pascal, C, and others) forced variable declarations into a single section of code. COBOL was the strictest taskmaster of the group, with data declarations in a division separate from the procedural code.

With limited memory, it was often necessary to re-use variables. FORTRAN helped conserve memory with the EQUIVALENCE statement, which let the programmer specify variables that shared the same memory locations.

Today, the situation has changed. Memory is cheap and plentiful. It is no longer necessary to use variables for more than one purpose. Our languages no longer have EQUIVALENCE statements -- something for which I am very grateful. Modern languages (including C++, C#, Java, Perl, Python, Ruby, and even the later versions of C) allow us to declare variables when we need them; we are not limited to declaring them in a specific location.

Using a variable for more than one purpose is still tempting, but it is no longer necessary. We can declare each variable where we need it and give each purpose its own variable.

Suppose we have code that calculates the total expenses and total revenue in a system.

Instead of this code:

void calc_total_expense_and_revenue()
{
    int i;
    double amount;

    amount = 0;
    for (i = 0; i < 10; i++)
    {
        amount += calc_expense(i);
    }

    store_expense(amount);

    amount = 0;

    for (i = 0; i < 10; i++)
    {
        amount += calc_revenue(i);
    }

    store_revenue(amount);
}

we can use this code:

void calc_total_expense_and_revenue()
{
    double expense_amount = 0;
    for (unsigned int i = 0; i < 10; i++)
    {
        expense_amount += calc_expense(i);
    }

    store_expense(expense_amount);

    double revenue_amount = 0;
    for (unsigned int i = 0; i < 10; i++)
    {
        revenue_amount += calc_revenue(i);
    }

    store_revenue(revenue_amount);
}

I much prefer the second version. Why? Because the second version cleanly separates the calculation of expenses from the calculation of revenue. In fact, the separation is so good we can break the function into two smaller functions:

void calc_total_expense()
{
    double expense_amount = 0;
    for (unsigned int i = 0; i < 10; i++)
    {
        expense_amount += calc_expense(i);
    }

    store_expense(expense_amount);
}

void calc_total_revenue()
{
    double revenue_amount = 0;
    for (unsigned int i = 0; i < 10; i++)
    {
        revenue_amount += calc_revenue(i);
    }

    store_revenue(revenue_amount);
}

Two small functions are better than one large function. Small functions are easier to read and easier to maintain. Using a variable for more than one purpose ties the pieces of a function together. Using a separate variable for each purpose lets us split the function apart.
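
To see how a shared variable ties the two calculations together, here is a small sketch in C. The bodies of calc_expense and calc_revenue are stand-ins made up for illustration, and the functions return their totals (rather than calling store_expense and store_revenue) so the results can be checked. In the first function the "amount = 0;" line between the loops has been lost, a mistake that is easy to make when editing. The second function has no such line to lose.

```c
#include <assert.h>

/* Hypothetical stand-ins for the post's calc_expense and calc_revenue. */
static double calc_expense(unsigned int i) { return 10.0 * (i + 1); }
static double calc_revenue(unsigned int i) { return 20.0 * (i + 1); }

/* One variable, two purposes: the reset between the loops is missing. */
static double total_revenue_shared(void)
{
    double amount = 0;
    for (unsigned int i = 0; i < 10; i++)
    {
        amount += calc_expense(i);
    }

    /* amount = 0;   <- the line that was lost */

    for (unsigned int i = 0; i < 10; i++)
    {
        amount += calc_revenue(i);
    }

    return amount;   /* expenses have leaked into the revenue total */
}

/* One variable, one purpose: nothing carries over from other code. */
static double total_revenue_separate(void)
{
    double revenue_amount = 0;
    for (unsigned int i = 0; i < 10; i++)
    {
        revenue_amount += calc_revenue(i);
    }

    return revenue_amount;
}

/* With the stand-in values, expenses sum to 550 and revenue to 1100. */
static void check_totals(void)
{
    assert(total_revenue_separate() == 1100.0);
    assert(total_revenue_shared() == 1650.0);   /* 550 too high */
}
```

The shared-variable version compiles without complaint and produces a plausible-looking number, which is what makes this kind of coupling hard to spot.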
