Thursday, June 7, 2007

Process or Threads? A performance comparison

Multi-processing or Multi-threading?

Much has been said about whether multi-threading is more efficient than multi-processing.
Apache 2.0 introduced the worker MPM, which supports multi-threading in addition to the earlier multi-process (prefork) model.
But just how efficient is multi-threading vis-à-vis multi-processing in the context of enterprise software?
I recently consulted for a high-volume web service that was facing performance issues. The service exposed a legacy system and ran on Apache, with the mod_gsoap library handling SOAP.
During the architecture review, I discovered that the system used a multi-process (process-per-request) model, where each request runs for a few seconds. The system receives ~1000 concurrent requests at peak hours and ~600,000 requests per day.
I was considering moving to a process-pool + thread-pool model, and decided to quantify the expected benefits through a small app-specific prototype as follows:
  • Process Model Scenario:
    N child processes are spawned in a test run, and each process handles 1000 requests. Total number of requests = N * 1000.
  • Process-Pool/Thread-Pool Scenario:
    M child processes are spawned in a test run, and each process spawns P worker threads. Each thread handles 1000 requests. Total number of requests = M * P * 1000.

In both cases, the total number of requests is the same for a given test run (N * 1000 = M * P * 1000, i.e. N = M * P), so the two models are compared on an equal workload.
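The two scenarios above can be sketched roughly as follows. This is not the original prototype, which is not shown here and was presumably written against the real application code. It is a minimal Python sketch for a POSIX system, and the names `handle_request`, `run_scenario` and `total_requests` are hypothetical. The dummy request body is also an assumption. Note that CPython threads share a global interpreter lock, so this sketch only illustrates the structure of the test and should not be expected to reproduce the timings.

```python
import os
import threading
import time

REQUESTS_PER_WORKER = 1000  # from the text: each process or thread handles 1000 requests


def total_requests(processes, threads_per_process=1, requests=REQUESTS_PER_WORKER):
    """Total requests in a run: N * 1000 or M * P * 1000."""
    return processes * threads_per_process * requests


def handle_request():
    """Hypothetical stand-in for one request (the real service did SOAP work)."""
    sum(range(100))


def worker(requests):
    """Handle a fixed number of requests sequentially."""
    for _ in range(requests):
        handle_request()


def run_scenario(processes, threads_per_process, requests=REQUESTS_PER_WORKER):
    """Fork `processes` children and return the elapsed wall-clock seconds.

    threads_per_process == 0: each child handles the requests itself
    (process model). Otherwise each child starts that many worker
    threads (process-pool/thread-pool model).
    """
    start = time.perf_counter()
    pids = []
    for _ in range(processes):
        pid = os.fork()
        if pid == 0:  # child process
            exit_code = 1
            try:
                if threads_per_process == 0:
                    worker(requests)
                else:
                    threads = [
                        threading.Thread(target=worker, args=(requests,))
                        for _ in range(threads_per_process)
                    ]
                    for t in threads:
                        t.start()
                    for t in threads:
                        t.join()
                exit_code = 0
            finally:
                os._exit(exit_code)  # never fall back into the parent's code
        pids.append(pid)
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError(f"child {pid} failed with status {status}")
    return time.perf_counter() - start


if __name__ == "__main__":
    n, m, p = 8, 2, 4  # example values chosen so that N = M * P
    assert total_requests(n) == total_requests(m, p)
    print("process model:       %.3f s" % run_scenario(n, 0))
    print("process/thread pool: %.3f s" % run_scenario(m, p))
```

The values N = 8, M = 2 and P = 4 are only examples. Any combination with N = M * P keeps the workload equal across the two runs.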
Based on this prototype, I got the following results on a Fedora Core 4 Linux box with an Intel Core Duo processor and 1 GB of RAM:
As can be seen from the results, the process-pool/thread-pool model ran twice as fast as the process-per-request model.
Of course, the dynamics would vary on higher-end machines, but this served as a good indicator of relative performance, which was also borne out on the live deployment.