Multi process / thread PHP execution
UPDATE-- I've created the official PHP Multi-process script page. Further updates, the usage manual, and other information about PHP Multi-process will be found here.
I've run into the problem of needing to use multiple processes with a php script on multiple occasions. Most notably, I have some very complex data processing scripts that take several minutes to complete. If I can break these up into several parts and run them all at the same time, the execution time can be greatly reduced.

There are several methods of running additional processes with PHP. pcntl_fork is a great way to do it, but it requires PHP to be running as CGI (or from the command line) rather than as an Apache module, which is how most web hosts run it.
This script shows how to run multiple processes on a server running PHP through Apache. These processes all run concurrently, and the parent waits to exit until all of the child processes have completed.
An easy way to fork a process through php is to run a quiet exec command with no output. This will create a process that runs in the background. However, there are times that we need to know a script successfully processed, or even get some data that the script produces.
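As a rough illustration of the "quiet exec" idea, here is a minimal sketch of how such a command could be built. The helper name build_background_command is my own invention for this example and is not part of the downloadable script; it also assumes a Linux shell, where redirecting output to /dev/null and appending & lets exec return right away.

```php
<?php
// Hypothetical helper (not from the original script): build a shell command
// that runs a PHP script in the background and discards all of its output,
// so that exec() does not wait for the script to finish.
function build_background_command($script, $phpBinary = 'php')
{
    return sprintf(
        '%s %s > /dev/null 2>&1 &',
        escapeshellcmd($phpBinary),   // the PHP command line binary
        escapeshellarg($script)       // quote the script path for the shell
    );
}

// Example usage on Linux (the path is a placeholder):
// exec(build_background_command('/path/to/other_process.php'));
```

Because all output is thrown away, the parent learns nothing about the child from this call alone, which is why the rest of this post adds a cache table for reporting back.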
This solution requires tying together several moderately advanced ideas, but in the end it produces a usable and stable multiple-process PHP system. Despite its simplicity, this script took a lot of testing to complete. It's definitely in an Alpha+ stage, so if you find any bugs or problems, please let me know.
Server requirements:
- PHP 5
- Sqlite (Memcached is an alternative)
- Ability to use exec and run php from the command line
How does this work?
Using the exec function, a script (the parent) executes multiple additional PHP scripts (the children) silently in the background. The parent script neither waits for nor cares about the execution of the child scripts, so the children all process concurrently. Each child script, upon completion, updates a temporary database table used as a caching mechanism. The parent queries the cache table until every child is complete, or until the parent itself times out, and then exits. This allows multiple scripts to be spawned from a single script.
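The parent's waiting step described above could look something like the sketch below. This is not the code from multiproc.zip; wait_for_children and its parameters are hypothetical names, and the cache lookup is passed in as a callable so the loop stays independent of sqlite. In the real script that callable would be a query against the cache table.

```php
<?php
// Hypothetical sketch of the parent's polling loop (not the original code).
// $childIds      : ids of the children that were spawned
// $fetchFinished : callable returning the ids that have reported completion
//                  (assumed here; in practice a query on the cache table)
// Returns true if every child finished before the timeout, false otherwise.
function wait_for_children(array $childIds, $fetchFinished, $timeoutSeconds = 30, $pollMicroseconds = 100000)
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $finished = call_user_func($fetchFinished);
        // No spawned id is missing from the finished list: all done.
        if (count(array_diff($childIds, $finished)) === 0) {
            return true;
        }
        usleep($pollMicroseconds); // pause between checks of the cache
    } while (microtime(true) < $deadline);

    return false; // the parent gave up waiting
}
```

Polling with a short sleep keeps the parent simple, at the cost of a small delay between a child finishing and the parent noticing.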
How to use it:
Download multiproc.zip. Extract and upload the files to your server. The sqlite folder will need to be made writable.
To use the script you will need to specify the child scripts you want to process. Add these into the $processes_to_execute array. You may need to use the full path of the files depending on your server setup.
<?php
$processes_to_execute = array(
    'other_process.php',
    'other_process.php'
);
For the children, you can copy other_process.php and add any of your own PHP code between the BEGIN and END comments. Passed data does not inherently come across in $_GET variables. Do not alter the variable setting code at the top of the script unless you understand how not to break it.
/*-----------------BEGIN YOUR OWN CODE HERE------------------------*/

/*
This is a normal PHP script so you can do whatever you want.
But, don't output anything because this is running in the background.
You can return output to the primary script via the setOutput method.
$output = 'WHATEVER YOU WANT';
$script->setOutput($envelope,$pid,$output);
*/

/*-----------------END YOUR OWN CODE HERE---------------------------*/
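To show why passed data does not arrive in $_GET, here is a small sketch of one way a child started from the command line could read its parameters. This is an assumption about the general technique, not the actual variable setting code in other_process.php; parse_cli_arguments is a hypothetical name, and passing the data as a single query-string style argument is my own choice for the example.

```php
<?php
// Hypothetical helper (not the code from other_process.php): when a script is
// started from the command line, $_GET is empty, so parameters have to be
// read from $argv instead. This assumes the parent passed them as a single
// query-string style argument such as "envelope=abc&pid=3".
function parse_cli_arguments(array $argv)
{
    $params = array();
    if (isset($argv[1])) {
        parse_str($argv[1], $params); // fills $params with key => value strings
    }
    return $params;
}

// Example usage inside a child script:
// $params = parse_cli_arguments($argv);
```

Note that every value comes back as a string, so a numeric pid would need casting before being compared strictly.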
Lastly, point your browser to multi_proc.php, and let it go. It may take some altering to get this to work on your specific server setup, but I've got it working on several servers with very little or no additional configuration.
Personal Notes:
Windows
This is written for Linux running PHP from Apache, but should be able to be modified to work on Windows Apache.
Benchmarking:
I've benchmarked this script and it does deliver the performance I was expecting. Because the time spent launching the children and reading from / writing to the database is negligible, execution time drops roughly in proportion to the number of children. If a script takes 30 seconds to process, and you can split it into 2 children (or 1 parent and 1 child), the final result will be ~15 seconds.
Additionally, you could execute just about anything in each child as long as the child doesn't time out. The parent will wait for each to finish, or until its execution time limit is reached, before exiting.
Personally, I am using this on a script which originally took 12 minutes to process, now down to about 90 seconds.
Caching
This script uses sqlite as an interim caching system. APC was my original plan for a cache, but due to some technical impossibilities, discovered in 6 hours of troubleshooting, APC cannot be used for this (trust me on this one...). Memcache, however, is a method that would work very well for this.
Script Failures
One of the current issues that needs to be addressed is reporting failed children. This could easily be accomplished by creating another method to return child processes that haven't completed. This will be added into my next revision of this script along with some other error control and any bug fixes that are needed.
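Until that revision exists, the idea could be sketched roughly as follows. This is not part of the current script: find_unfinished_children is a hypothetical name, and the row shape (a 'pid' plus a 'completed' flag) is an assumption about what the cache table might hold.

```php
<?php
// Hypothetical sketch (not in the current script): given the pids that were
// spawned and the rows read back from the cache table, return the pids that
// never reported completion. Each row is assumed to look like
// array('pid' => ..., 'completed' => 0 or 1).
function find_unfinished_children(array $expectedPids, array $cacheRows)
{
    $completed = array();
    foreach ($cacheRows as $row) {
        if (!empty($row['completed'])) {
            $completed[] = $row['pid'];
        }
    }
    // Anything expected but not marked completed is failed or still running.
    return array_values(array_diff($expectedPids, $completed));
}
```

A child with no row at all is treated the same as one that has a row but never set its flag, so a crash before the first write is still reported.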
As long as exit() is used at the bottom of each child process, there shouldn't be any potential for "permanent" memory leaks. If the parent process is killed prematurely, the children will continue to run until they time out or are ended via exit(). For this reason it is imperative to set a timeout, and to end each script manually once it reaches the end.
Thanks for the awesome script! I own a domain tools site and have implemented your script on our Type-In Traffic Finder tool. Execution has gone down from 40s to 1m to consistently under 10 seconds. I’ve been looking for something like this forever since our host doesn’t let us use fork(). Thanks again!
Thanks John,
It’s definitely a rough script right now, but I’m glad it’s working for you. I was in the same place as you with a huge script execution time. This brought some huge report building scripts down to a reasonable execution time. This was more or less the realization that scripts can run concurrently in the background, and then just figuring out how to keep track of them.
Hello,
How would I use memcached as the cache for this thing?
I am now running some 130 children, and the db and filesystem seem to be the limiting factor – I still have over 60% of memory unused and very low CPU usage.
Thanks