PHP CLI & Forking children

I'm new to multi-process programming. Should one avoid forking
children from children of a parent?

I'd like to spawn 10 children from the parent and each of those
children spawns another 5 children which process chunks of data (200
rows) with heavy usage of CPU and regexp.

Sep 29 '07 #1
10 Replies


On Sat, 29 Sep 2007 03:12:19 -0700, qw*******@googlemail.com wrote:
> I'm new to multi-process programming. Should one avoid forking
> children from children of a parent?
>
> I'd like to spawn 10 children from the parent and each of those
> children spawns another 5 children which process chunks of data (200
> rows) with heavy usage of CPU and regexp.

So you're spawning 500 processes? Do you have a very large number of CPUs to
run them on? Otherwise only a few will actually be running at any time, and
you'll be losing useful throughput to overhead, surely.

--
Andy Hassall :: an**@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool
Sep 29 '07 #2

On Sep 29, 3:49 pm, Andy Hassall <a...@andyh.co.uk> wrote:
> So you're spawning 500 processes? Do you have a very large number of CPUs to
> run them on? Otherwise only a few will actually be running at any time, and
> you'll be losing useful throughput to overhead, surely.

Maybe the example I gave was bad :) How about a PHP script which launches
4 children, with each child forking another 5 children (20 processes)?

Would this cause development headaches or possible extra bugs?

Sep 29 '07 #3

qw*******@googlemail.com wrote:
> On Sep 29, 3:49 pm, Andy Hassall <a...@andyh.co.uk> wrote:
>> So you're spawning 500 processes? Do you have a very large number of CPUs to
>> run them on? Otherwise only a few will actually be running at any time, and
>> you'll be losing useful throughput to overhead, surely.
>
> Maybe the example I gave was bad :) How about a PHP script which launches
> 4 children, with each child forking another 5 children (20 processes)?
>
> Would this cause development headaches or possible extra bugs?

Just wondering - why do you need to fork processes, anyway? There's a
lot of overhead in doing it, and if they're all CPU bound anyway you
aren't going to gain anything (unless you have a potload of CPUs).

Forking is good if you have different processes using different
resources. But when they have to contend for the same resource,
performance often goes down.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 29 '07 #4

On Sep 29, 7:51 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
> Just wondering - why do you need to fork processes, anyway? There's a
> lot of overhead in doing it, and if they're all CPU bound anyway you
> aren't going to gain anything (unless you have a potload of CPUs).
>
> Forking is good if you have different processes using different
> resources. But when they have to contend for the same resource,
> performance often goes down.

Instead of writing a PHP script that downloads 2 million headers from
a newsgroup in a single connection (which will cause PHP to crash
anyway as it'll reach 500MB+ memory usage), I thought it would be
better to launch 4 processes to download it in chunks of 50,000
headers - with 4 connections to the same NNTP server.
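
A minimal sketch of the chunking described above, assuming the headers are addressed by article numbers 1 to 2,000,000 and that chunks are dealt out to the 4 processes round-robin. `chunkRanges` and `assignToWorkers` are hypothetical helper names, and the NNTP fetching itself is left out.

```php
<?php
/**
 * Split the inclusive range [$first, $last] into [start, end] pairs of at
 * most $chunkSize items each. Hypothetical helper for illustration.
 */
function chunkRanges(int $first, int $last, int $chunkSize): array
{
    if ($chunkSize < 1) {
        throw new InvalidArgumentException('chunk size must be at least 1');
    }
    $ranges = [];
    for ($start = $first; $start <= $last; $start += $chunkSize) {
        $ranges[] = [$start, min($start + $chunkSize - 1, $last)];
    }
    return $ranges;
}

/**
 * Deal the ranges out round-robin so each worker gets a similar share.
 * Returns an array indexed by worker number (0-based).
 */
function assignToWorkers(array $ranges, int $workers): array
{
    $plan = array_fill(0, $workers, []);
    foreach (array_values($ranges) as $i => $range) {
        $plan[$i % $workers][] = $range;
    }
    return $plan;
}

// Assumed numbers from the post: 2 million headers, 50,000 per chunk, 4 workers.
$plan = assignToWorkers(chunkRanges(1, 2000000, 50000), 4);
```

With these numbers there are 40 chunks, so each of the 4 processes would work through 10 of them, one chunk at a time.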

Sep 29 '07 #5

On Sep 29, 8:32 pm, qwerty...@googlemail.com wrote:
> Instead of writing a PHP script that downloads 2 million headers from
> a newsgroup in a single connection (which will cause PHP to crash
> anyway as it'll reach 500MB+ memory usage), I thought it would be
> better to launch 4 processes to download it in chunks of 50,000
> headers - with 4 connections to the same NNTP server.

I admit I should be using Perl or C for these tasks, but I know PHP
and I'm used to using its functions.

Sep 29 '07 #6

qw*******@googlemail.com wrote:
> On Sep 29, 7:51 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Just wondering - why do you need to fork processes, anyway? There's a
>> lot of overhead in doing it, and if they're all CPU bound anyway you
>> aren't going to gain anything (unless you have a potload of CPUs).
>>
>> Forking is good if you have different processes using different
>> resources. But when they have to contend for the same resource,
>> performance often goes down.
>
> Instead of writing a PHP script that downloads 2 million headers from
> a newsgroup in a single connection (which will cause PHP to crash
> anyway as it'll reach 500MB+ memory usage), I thought it would be
> better to launch 4 processes to download it in chunks of 50,000
> headers - with 4 connections to the same NNTP server.

Which means you'll be downloading 500MB+ anyway - just in different
processes.

Or you could get some headers and cache them to disk, processing them later.

But which newsgroup has 2M+ headers? Glad I don't have to read that
one! :-)

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 29 '07 #7

qw*******@googlemail.com wrote:
> On Sep 29, 8:32 pm, qwerty...@googlemail.com wrote:
>> Instead of writing a PHP script that downloads 2 million headers from
>> a newsgroup in a single connection (which will cause PHP to crash
>> anyway as it'll reach 500MB+ memory usage), I thought it would be
>> better to launch 4 processes to download it in chunks of 50,000
>> headers - with 4 connections to the same NNTP server.
>
> I admit I should be using Perl or C for these tasks, but I know PHP
> and I'm used to using its functions.

Nothing wrong with using PHP for this. It will be slower than a
compiled language like C, but most of your time will be spent waiting on
I/O anyway. So it shouldn't be that much slower.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Sep 29 '07 #8

On Sat, 29 Sep 2007 12:32:24 -0700, qw*******@googlemail.com wrote:
> On Sep 29, 7:51 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Just wondering - why do you need to fork processes, anyway? There's a
>> lot of overhead in doing it, and if they're all CPU bound anyway you
>> aren't going to gain anything (unless you have a potload of CPUs).
>>
>> Forking is good if you have different processes using different
>> resources. But when they have to contend for the same resource,
>> performance often goes down.
>
> Instead of writing a PHP script that downloads 2 million headers from
> a newsgroup in a single connection (which will cause PHP to crash
> anyway as it'll reach 500MB+ memory usage),

Well, presumably you're doing something with this data, like saving it to a
file or database? In which case you stream it from the network into the
database, rather than read it *all* into memory, and only *then* start saving
it?

> I thought it would be
> better to launch 4 processes to download it in chunks of 50,000
> headers - with 4 connections to the same NNTP server.

Yes, it may well be worth doing this to get better throughput (depending where
the bottleneck is), but I wouldn't have thought that the memory limit's the
issue, so long as you're streaming the data through.
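
A minimal sketch of the streaming approach, assuming the headers arrive one per line on an already-open stream (a socket, file, or similar). `streamInBatches` is a hypothetical helper name, and the `$flush` callback stands in for whatever saves a batch to the file or database; only one batch is held in memory at a time.

```php
<?php
/**
 * Read lines from an open stream resource and pass them to $flush in
 * batches of $batchSize. Returns the total number of lines read.
 * Hypothetical helper for illustration, not a full NNTP client.
 */
function streamInBatches($stream, int $batchSize, callable $flush): int
{
    $batch = [];
    $total = 0;
    while (($line = fgets($stream)) !== false) {
        $batch[] = rtrim($line, "\r\n");
        $total++;
        if (count($batch) >= $batchSize) {
            $flush($batch);   // e.g. one multi-row INSERT
            $batch = [];      // release the memory before reading on
        }
    }
    if ($batch !== []) {
        $flush($batch);       // final partial batch
    }
    return $total;
}
```

Memory use then depends on the batch size rather than on the total number of headers, which is why the memory limit need not be the deciding factor.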

I'm still not quite sure about the second level of forking you have in there
though; so there's 1 initial parent, 4 children reading from the server, but
then each has multiple children processing this data? Unless you have masses of
CPUs, you're unlikely to gain anything at that level; the 4 second-level processes
may as well do the processing as they stream the data in from the network?

(As always, It Depends).

Back to the general question though: when you start forking, you've got child
process management to work out. One child process is relatively easy; more than
one means you have to do a bit more work to send (and receive) signals and
other IPC stuff (since you have to work out *which* child process you're
talking to), and work out what happens if either a child or a parent process
terminates unexpectedly, or hangs. More than two processes and more than one
level of parent/child doesn't really get any more complicated as such, but
there are more processes to go wrong :-)
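
A minimal sketch of that bookkeeping for a single level of children, assuming the pcntl extension is available (CLI on a Unix-like system). `runWorkers` is a hypothetical helper name; it records which PID belongs to which worker and reaps every child so none are left as zombies. Hung children and signal handling are not covered here.

```php
<?php
/**
 * Fork up to $count workers, run $job($index) in each child, and wait for
 * all of them. Returns worker index => exit code, with -1 for a child that
 * did not exit normally (for example, killed by a signal).
 */
function runWorkers(int $count, callable $job): array
{
    if (!function_exists('pcntl_fork')) {
        throw new RuntimeException('the pcntl extension is required');
    }

    $indexByPid = [];
    for ($i = 0; $i < $count; $i++) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            break; // fork failed: stop spawning, still reap what was started
        }
        if ($pid === 0) {
            // Child: do the work and exit, so it never continues the loop.
            exit((int) $job($i));
        }
        // Parent: remember which worker this PID is.
        $indexByPid[$pid] = $i;
    }

    $results = [];
    while ($indexByPid !== []) {
        $pid = pcntl_wait($status); // blocks until any child terminates
        if ($pid <= 0) {
            break; // no children left, or the wait itself failed
        }
        if (!isset($indexByPid[$pid])) {
            continue; // a child this function did not start
        }
        $results[$indexByPid[$pid]] = pcntl_wifexited($status)
            ? pcntl_wexitstatus($status)
            : -1;
        unset($indexByPid[$pid]);
    }
    ksort($results);
    return $results;
}
```

A worker index missing from the result means that child was never started or could not be reaped. A second level of forking would repeat the same pattern inside `$job`, with each child tracking its own children.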

--
Andy Hassall :: an**@andyh.co.uk :: http://www.andyh.co.uk
http://www.andyhsoftware.co.uk/space :: disk and FTP usage analysis tool
Sep 29 '07 #9

Hello,

on 09/29/2007 07:12 AM qw*******@googlemail.com said the following:
> I'm new to multi-process programming. Should one avoid forking
> children from children of a parent?
>
> I'd like to spawn 10 children from the parent and each of those
> children spawns another 5 children which process chunks of data (200
> rows) with heavy usage of CPU and regexp.

Here you may find several classes that can simplify that task for you:

http://www.phpclasses.org/php_fork

http://www.phpclasses.org/daemon

http://www.phpclasses.org/clsdaemonize

--

Regards,
Manuel Lemos

Metastorage - Data object relational mapping layer generator
http://www.metastorage.net/

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/
Oct 4 '07 #10

On Oct 4, 1:00 am, Manuel Lemos <mle...@acm.org> wrote:
> Here you may find several classes that can simplify that task for you:
>
> http://www.phpclasses.org/php_fork
>
> http://www.phpclasses.org/daemon
>
> http://www.phpclasses.org/clsdaemonize

Thanks Manuel for the good links.

Oct 4 '07 #11

This discussion thread is closed