Can anyone explain PHP slowing down, please?


I'm using PHP 5 on Win-98 from the command line (i.e. no web server involved)

I'm processing a large csv file and when I loop through it I can process
around 275 records per second.

However at around 6,000 records this suddenly drops off to around 40
records per second.

This is a big problem as the "live" list is over 4 million records long.
I'd break it up but this is to be a regular test so that would be messy
to say the least -

Each record is 8 fields and the total length tends to be below 200 characters.
The CSV is comma-delimited with double-quote ("") enclosures.

I was wondering if anyone with strong PHP knowledge has heard of this or
could help explain it please (As you probably know I'm very new to PHP)

I've trimmed the startup code to pseudocode to make it easier to read.
Otherwise my code is as below:

Sorry if line wrap is wrong - that would be my newsreader not the code

As you can see the code grabs a field from the database - spawns a
windows (msdos command line) .exe file to test it and writes the field
out to either a good or bad result file.

I don't do any file seeking or opening and closing of files during the loop.

Tony

------------------------------ CODE START ------------------

<?php

//+++++++++++++++++++++++++++ PSeudocode start
open all new files for appending here (fopen($fin, 'a');)
open database for read-only here
Initialise all variables to 0 here

START;
get start-time
loop()
get end-time
write-statistics
close all files here
exit;
// +++++++++++++++++++++++++PSeudocode End

function loop() {
    global $fin, $fout, $fgood, $records, $fields, $good, $bad, $total,
           $dif_fcount, $nodata;
    while (($data = fgetcsv($fin, 1024, ",", "\"")) !== FALSE) {
        if ($data == '') { continue; }
        $records++;
        if (count($data) != $fields) {
            $fields = count($data);
            $dif_fcount++;
        }
        if ($data[2] == '') { $data[2] = 'NO DATA'; $nodata++; }
        $raw = $data[7];
        $star = "\"" . ($data[2]) . "\"";
        $star = $raw;
        if (checkit($star) == false) {
            fwrite($fout, $records . "," . $raw . "\r");
            $bad += 1;
        } else {
            fwrite($fgood, $star . "\r");
            $good += 1;
        }
        $total += 1;
        echo("Total checked: " . $total . "\r");
    } // while
}

function checkit($star) {
    exec("declination.exe " . $star, $aout, $returnval);
    if ($aout[0][0] === "Y") {
        return true;
    } else {
        return false;
    }
}
?>
Jun 10 '06 #1

ps - yes I know about the $star/$raw data confusion etc. that will be
debugged later.

tony
Jun 10 '06 #2
A slowdown of this magnitude typically points to some system-related
bottleneck rather than an algorithmic one. Have you checked the
process's virtual memory use? I would suspect that you are starting to
swap around the 6,000th record.

If not, I would start to place finer-grained timing around the
major I/O points (fgetcsv, fwrite) to see if they are causing the slowdown.
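
A minimal sketch of that kind of timing, assuming a plain read-then-write loop. The function name timeCsvLoop() and its parameters are made up for illustration and are not part of the original script. It accumulates the wall-clock time spent inside fgetcsv() and fwrite() separately and snapshots the running totals every so many records, so a sudden jump in one of them should stand out:

```php
<?php
// Sketch only: timeCsvLoop() and its parameters are hypothetical names.
function timeCsvLoop($inPath, $outPath, $reportEvery = 1000)
{
    $fin  = fopen($inPath, 'r');
    $fout = fopen($outPath, 'a');
    $t = array('records' => 0, 'read' => 0.0, 'write' => 0.0, 'batches' => array());

    while (true) {
        // Time the read on its own. (The explicit escape argument needs PHP 5.3+.)
        $t0   = microtime(true);
        $data = fgetcsv($fin, 1024, ',', '"', '\\');
        $t['read'] += microtime(true) - $t0;
        if ($data === false) {
            break; // end of file
        }
        $t['records']++;

        // Time the write on its own.
        $t0 = microtime(true);
        fwrite($fout, implode(',', $data) . "\r\n");
        $t['write'] += microtime(true) - $t0;

        // Snapshot the running totals so a jump between batches stands out.
        if ($t['records'] % $reportEvery === 0) {
            $t['batches'][] = array(
                'records' => $t['records'],
                'read'    => $t['read'],
                'write'   => $t['write'],
            );
        }
    }
    fclose($fin);
    fclose($fout);
    return $t;
}
```

Printing the result with print_r() and comparing consecutive entries in 'batches' would show whether the read side or the write side is the one that degrades.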

On a stylistic note, why do you use $x++ in some places and $x += 1 in
others? Also, the checkit function could be shortened by returning the
comparison directly, with no if/else (or ternary operator) needed:

function checkit($star) {
    exec('declination.exe ' . $star, $aout, $returnval);
    return ($aout[0][0] === 'Y');
}

and 'if (checkit($star) == false) {' could become
'if (!checkit($star)) {'

However, I don't think any of these would contribute to your slowdown
issue.

-david-

Jun 10 '06 #3
Thanks for the comments, David. I've now run this on both Windows and
Linux (the Linux system through Apache) and I get the same results there
too: very similar, but not identical. The Linux box is entirely different
hardware. I'll post timing differences later - they're outrageous!

I've also run the Windows version through Apache now and still get the
slowdown.

I've tried removing my call to the executable, replacing it with a simple
return, and it still slows down.

Looking at what you suggest - could it be that fgetcsv is searching from
the top of the file on every iteration?
I don't cause it to - but is that how it works?

I have discovered a Windows loss of 2 MB of system memory for every run.
That's either a Windows or a PHP problem, I don't know which - probably
Windows, and I don't think it's connected to the slowdown.

If this isn't obvious to anyone, I guess the only thing I can do is start
taking things out one by one, starting with your suggestions.

On the memory side - no, it all (surprisingly) seems to happen in RAM. Even
the test database file of 1 million records seems to go straight to RAM.
There's no thrashing or anything. (The Win-98 system has 1 GB of RAM, the
Linux system just 128 MB.)

On the style front - Dave, you wouldn't believe my working methods!
In my time I've used maybe 7 languages in anger and currently I use 3 or
4, so I'm in and out of them all the time - things get mixed up in my
head.
I always do a "comment" sweep when I'm done and tidy things up for this
very reason, but sometimes I do use ++x and += 1 because it helps me
remember what's going on and what I need to keep an eye on. (I have a bad
short-term memory problem.) My "if" statements and other stuff follow the
same routine for the same reasons - there is method in my madness - you
just have to be mad too to see it...
This particular code has been messed about with quite badly too as I try
to find the problem.

I care more about function than fancy - most PHP style I've seen is
abysmal anyway - the standard style recommended is appalling. I'm not a
fan of the way most people code - nor of the way many languages work, for
that matter. Anything other than Forth is bad form in my book ;-)

As for the ternary operator - it's a terrible construct for anyone who
doesn't use it regularly or anyone looking at someone else's code, so I
avoid it.

tony

Jun 11 '06 #4
I wouldn't be surprised at all if PHP was loading the whole file into
memory or scanning from start to finish every fgetcsv() call. Checking
the C source code would give better insight.

There are also a couple of functions that will tell you PHP's current
memory usage. Calling them on each iteration might show whether it's
running out of memory.
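
As a rough sketch of that idea, assuming memory_get_usage() is available in the build being used (in PHP 5 releases before 5.2.1 it reportedly requires PHP to be compiled with --enable-memory-limit). The helper name sampleMemory() and its parameters are invented for this example:

```php
<?php
// Sketch only: sampleMemory() and its parameters are hypothetical names.
function sampleMemory($inPath, $every = 1000)
{
    $fin     = fopen($inPath, 'r');
    $samples = array();
    $records = 0;

    while (($line = fgets($fin)) !== false) {
        $records++;
        if ($records % $every === 0) {
            // Bytes currently allocated to this script by the PHP engine.
            $samples[$records] = memory_get_usage();
        }
    }
    fclose($fin);
    return $samples; // record count => bytes in use at that point
}
```

If the numbers climb steadily from one sample to the next, something in the loop is accumulating; if they stay flat, PHP's own memory use is probably not the cause.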

Maybe it is the data confusing the function, too? An unescaped quote, a
missing newline, an overly long line, etc. Something like that might
confuse the fgetcsv() function.

Jun 11 '06 #5
Thanks, Richard - I looked at Windows memory usage but never thought about
PHP itself - it does have a limit in the .INI file, I think... I'll take a
look.

It isn't connected to the data (the CSV file itself is decoded perfectly).

I've now run the process with just the drop to shell and return back,
and it still slows down on both Windows and Linux, so it is looking like
the fgetcsv function... or the loop... (stack overflow bug?). For a while I
thought maybe it was Windows, but it doesn't seem to be. I changed it to
fgets and get the same result, so that would seem to rule out the CSV
decoding part too.

I'm going to try just an empty file read loop, but the trouble with that
is it probably isn't going to show up, as it won't be doing any work. ;-(
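
One way such a bare read loop could still be informative is to record the read rate per batch rather than one overall figure. This is only a sketch under that assumption; readThroughput() and its parameters are hypothetical names, and fgets() stands in for whichever read call is under suspicion:

```php
<?php
// Sketch only: readThroughput() and its parameters are hypothetical names.
function readThroughput($inPath, $batch = 1000)
{
    $fin   = fopen($inPath, 'r');
    $rates = array();
    $count = 0;
    $start = microtime(true);

    while (fgets($fin) !== false) {
        $count++;
        if ($count % $batch === 0) {
            $elapsed = microtime(true) - $start;
            if ($elapsed > 0) {
                $rates[$count] = $batch / $elapsed; // records per second
            } else {
                $rates[$count] = null; // timer too coarse to measure this batch
            }
            $start = microtime(true);
        }
    }
    fclose($fin);
    return $rates; // record count => rate for the batch ending there
}
```

If the rate for the batches after roughly 6,000 records drops the way the full script does, the read itself is implicated; if it stays level, the slowdown more likely comes from the work done per record.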

There also seems to be a massive "flush" of something about 20 seconds
after the PHP code ends, which slows Windows down to a crawl for about 3
seconds - again, I don't know what that is yet. Could be Windows.

The other thing I've just discovered is that PHP is taking 94% of processor
usage, and I can't seem to change its priority in Windows with the tool I
normally use for that. The number changes but PHP doesn't respond.

Can PHP be controlled for that? I don't remember reading anything in the
docs. 94% is way over the top - my real-time TV capture device only uses
25%.

PHP seems to have a default priority of 8, which is normal, so this is
somewhat confusing too, as it refuses to release to other apps.

This one is another application killer if I can't sort it ...

tony
Jun 12 '06 #6
