Sunday, May 21, 2017

Parallel processing on the horizon

Parallel processing has been with us for years. Or at least attempts at parallel processing.

Parallel processing has failed due to the numerous challenges it faces. It requires special (usually expensive) hardware. Parallel processing on conventional CPUs is simply processing items serially, because conventional CPUs can process only serially. (Multi-core processors address this problem to a small degree.) Parallel processing requires support in compilers and run-time libraries, and often new data structures. Most importantly, parallel processing requires tasks that are partitionable. The classic example of "nine women producing a baby in one month" highlights a task that is not partitionable, not divisible, into smaller tasks.

Cloud computing offers a new twist on parallel processing.

First, it offers multiple processors. Not just multiple cores, but true multiple processors -- as many as you would like.

Second, it offers these processors cheaply.

Cloud computing is a platform that can handle parallel processing -- in some areas. It has its problems.

First, creating new cloud processing systems is expensive in terms of time. A virtual machine must be instantiated, started, and given software to handle the task. Then, data must be shipped to the server. After processing, the result must be sent back, or forward to another processor. The time for all of these tasks is significant.
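To get a feel for when that startup cost matters, here is a rough back-of-the-envelope model. It is only a sketch under simplifying assumptions: every task takes the same time, every worker pays the same one-time overhead (instantiate, start, load software, ship data), and all workers start at once. The function names and the numbers in the usage note are hypothetical, chosen for illustration, not measurements.

```python
import math


def serial_time(n_tasks, task_seconds):
    """Total time when one machine runs every task in sequence."""
    return n_tasks * task_seconds


def parallel_time(n_tasks, task_seconds, n_workers, overhead_seconds):
    """Total time when tasks are spread over n_workers cloud machines.

    Assumption: each worker pays overhead_seconds once, and all workers
    pay it at the same time, so it is counted once in wall-clock time.
    """
    # Each worker handles its share of tasks one after another.
    tasks_per_worker = math.ceil(n_tasks / n_workers)
    return overhead_seconds + tasks_per_worker * task_seconds


def parallel_is_faster(n_tasks, task_seconds, n_workers, overhead_seconds):
    """True when the parallel run finishes sooner despite the overhead."""
    return (parallel_time(n_tasks, task_seconds, n_workers, overhead_seconds)
            < serial_time(n_tasks, task_seconds))
```

With made-up numbers: 100 tasks of 2 seconds each take 200 seconds serially. Spread over 100 workers with 60 seconds of overhead, the model gives 62 seconds. But with only 10 tasks, the serial run takes 20 seconds and the parallel run 62, so the overhead swamps the gain.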

Second, we still have the problems of partitioning tasks and representing the data and operations in a program.

There is one area of development that I believe is ready to leverage parallel processing. That area is testing.

The typical testing effort for a project can have multiple levels: unit tests, component tests, system tests, end-to-end tests, you name it. But each level of testing follows the same general pattern:

  • Get a collection of tests, complete with input data and expected results
  • For each test:
      1) Set up a test environment (program and data)
      2) Run the test
      3) Compare output to expected output
      4) Record the results
  • Summarize the results and report

In this process, the sequence of steps I've labelled 1 through 4 is repeated for each test. Traditional testing puts all of these tests on one computer, performing each test in sequence. Parallel testing can put each test on its own cloud-based processor, effectively running all tests at once.
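The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not a real testing framework: a thread pool stands in for the cloud-based processors, and the test tuple layout, the function names (run_one, run_serial, run_parallel, summarize), and the sample tests are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_one(test):
    """Perform steps 1 through 4 for a single test."""
    name, func, input_data, expected = test
    environment = {"data": input_data}       # 1) set up a test environment
    actual = func(environment["data"])       # 2) run the test
    passed = (actual == expected)            # 3) compare output to expected output
    return {"name": name, "passed": passed}  # 4) record the result


def run_serial(tests):
    """Traditional testing: one computer, each test in sequence."""
    return [run_one(test) for test in tests]


def run_parallel(tests, max_workers=4):
    """Parallel testing: hand each test to its own worker.

    Threads are a local stand-in here; a real setup would dispatch
    each test to a separate cloud machine.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input tests.
        return list(pool.map(run_one, tests))


def summarize(results):
    """Summarize the results for the final report."""
    passed = sum(1 for result in results if result["passed"])
    return {"total": len(results), "passed": passed,
            "failed": len(results) - passed}


# Hypothetical sample tests: (name, function under test, input, expected output)
EXAMPLE_TESTS = [
    ("double", lambda x: x * 2, 21, 42),
    ("upper", str.upper, "ok", "OK"),
    ("broken", lambda x: x + 1, 1, 3),  # deliberately failing test
]

if __name__ == "__main__":
    print(summarize(run_parallel(EXAMPLE_TESTS)))
```

The point of the sketch is that run_one does not change at all between the two runners. Because each test carries its own input and expected result, and shares nothing with the others, only the dispatching code needs to know whether the tests run in sequence or all at once.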

Testing has a series of well-defined and partitionable tasks. Modern testing methods use automated tests, so a test can run locally or remotely (as long as it has access to everything it needs). Testing can be a drain on resources and time, requiring lots of requests to servers and lots of time to complete all tests.

Testing in the cloud, and in parallel, addresses these issues. It reduces the time for tests and improves the feedback to developers. Cloud processing is cheap -- at least cheaper than paying developers to wait for tests to run.

I think one of the next "process improvements" for software development will be the use of cloud processing to run tests. Look for new services and changes to testing frameworks to support this new mode of testing.
