looping eating 100 per cent cpu

i've got a while loop that's iterating through a text file and pumping the
contents into a database. the file is quite large (over 150MB).

the looping causes my CPU load to race up to 100 per cent. Even if i remove
the mysql insert query and just loop through the file, it still hits 100
per cent CPU. This has the knock-on effect of slowing my script down so
that mysql inserts are occurring every 1/2 second or so.

here's my script:

$fd=fopen("bigt extfile.txt","r ");

while (!feof($fd) )
{
$line=fgets($fd ,4096);
$linecontents=e xplode(":",$lin e);
mysql_query("IN SERT INTO MyDatabase VALUES
'','$lineconten ts[0]','$linecontent s[1]'");
}
Jul 17 '05 #1
"kaptain kernel" <no****@nospam. gov> wrote in message
news:3f******** *************** @news.easynet.c o.uk...
i've got a while loop thats iterating through a text file and pumping the
contents into a database. the file is quite large (over 150mb).

the looping causes my CPU load to race up to 100 per cent. Even if i remove the mysql insert query and just loop through the file , it still hits 100
per cent CPU. This has the knock on effect of slowing my script down so
that mysql inserts are occuring every 1/2 second or so.

here's my script:

$fd=fopen("bigt extfile.txt","r ");

while (!feof($fd) )
{
$line=fgets($fd ,4096);
$linecontents=e xplode(":",$lin e);
mysql_query("IN SERT INTO MyDatabase VALUES
'','$lineconten ts[0]','$linecontent s[1]'");
}


That's because you are asking it to do a lot.

Regards
Richard Grove

http://shopbuilder.org - ecommerce systems
Become a Shop Builder re-seller:
http://www.affiliatewindow.com/affil...ls.php?mid=611
http://www.affiliatewindow.com/a.pl?590

Jul 17 '05 #2
kaptain kernel:
i've got a while loop that's iterating through a text file and pumping the
contents into a database. the file is quite large (over 150MB).

the looping causes my CPU load to race up to 100 per cent. Even if i
remove the mysql insert query and just loop through the file, it still
hits 100 per cent CPU. This has the knock-on effect of slowing my script
down so that mysql inserts are occurring every 1/2 second or so.

here's my script:

$fd = fopen("bigtextfile.txt", "r");

while (!feof($fd))
{
    $line = fgets($fd, 4096);
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}


That's very strange, 1/2 second between each insert is crazy. Is it possible
that you're running out of memory? (It shouldn't be, as you only read 4KB on
each iteration.) The 100% CPU usage isn't surprising though.

What if you remove the query and don't do anything with the data? What if
you write a perl or python script which does the same (again, without the
query)?
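
For what it's worth, here's a minimal sketch of that read-only test in PHP (the
filename is taken from your post; microtime(true) assumes PHP 5, on PHP 4 you
would have to add the two parts of microtime() yourself):

$start = microtime(true);

$fd = fopen("bigtextfile.txt", "r");
$count = 0;
while (!feof($fd))
{
    $line = fgets($fd, 4096); // read only, no explode(), no mysql_query()
    $count++;
}
fclose($fd);

echo "Read $count lines in " . round(microtime(true) - $start, 2) . " seconds\n";

If the read-only loop already takes ages, the problem is in the file handling;
if it flies through, the slowdown is on the MySQL side.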

André Næss
Jul 17 '05 #3
kaptain kernel wrote:

i've got a while loop that's iterating through a text file and pumping the
contents into a database. the file is quite large (over 150MB).

the looping causes my CPU load to race up to 100 per cent. Even if i remove
the mysql insert query and just loop through the file, it still hits 100
per cent CPU. This has the knock-on effect of slowing my script down so
that mysql inserts are occurring every 1/2 second or so.

here's my script:

$fd = fopen("bigtextfile.txt", "r");

while (!feof($fd))
{
    $line = fgets($fd, 4096);
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}


I wonder if it would be faster with fread()? If the average line is very small
it should drastically reduce the number of times the thing has to read from the
disk. This is UNTESTED code, so if you use it, please be careful. I don't read
from huge files often enough to be confident that it will work.

$line = "";

while (!feof($fd))
{
    $line .= fread($fd, 4096);
    $insertlist = "";
    $linecontents = explode("\n", $line);
    $last = count($linecontents) - 1;
    $line = $linecontents[$last]; // remember beginning of last (incomplete) data set to be used in next loop
    unset($linecontents[$last]); // delete last (incomplete) data set
    foreach ($linecontents as $val) {
        $fields = explode(":", $val);
        $insertlist .= "('','$fields[0]','$fields[1]'),"; // build the data portion of the mysql query statement
    }

    // insert up to 4K worth of data at once while removing trailing comma
    if ($insertlist != "") {
        mysql_query("INSERT INTO MyDatabase (col1, col2, col3) VALUES " .
        rtrim($insertlist, ","));
    }
}
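
The point of the batching is that each mysql_query() call now sends one
multi-row INSERT covering roughly 4KB of input, something like (values made up
purely for illustration):

INSERT INTO MyDatabase (col1, col2, col3) VALUES ('','foo','bar'),('','baz','qux')

so the per-statement overhead is paid once per chunk instead of once per line.
The column names are placeholders and would need to match the real table.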

Regards,
Shawn

--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #4
kaptain kernel wrote:
i've got a while loop that's iterating through a text file and pumping
the contents into a database. the file is quite large (over 150MB).

the looping causes my CPU load to race up to 100 per cent. Even if i
remove the mysql insert query and just loop through the file, it
still hits 100 per cent CPU. This has the knock-on effect of slowing
my script down so that mysql inserts are occurring every 1/2 second or
so.

here's my script:

$fd = fopen("bigtextfile.txt", "r");

while (!feof($fd))
{
    $line = fgets($fd, 4096);
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}


You should try using the built-in functions of PHP to simplify this; it
could probably speed up the process quite a bit (see
http://www.php.net/file). I am assuming that doing it your way means
reading the file _very_ many times, whereas with the example below the
whole file is read once, and then processed line by line from memory.

$bigfile = file("bigtextfile.txt");

foreach ($bigfile as $line_num => $line) {
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}

HTH

--
Suni

Jul 17 '05 #5
Juha Suni:
kaptain kernel wrote:
i've got a while loop that's iterating through a text file and pumping
the contents into a database. the file is quite large (over 150MB).

the looping causes my CPU load to race up to 100 per cent. Even if i
remove the mysql insert query and just loop through the file, it
still hits 100 per cent CPU. This has the knock-on effect of slowing
my script down so that mysql inserts are occurring every 1/2 second or
so.

here's my script:

$fd = fopen("bigtextfile.txt", "r");

while (!feof($fd))
{
    $line = fgets($fd, 4096);
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}


You should try using the built-in functions of PHP to simplify this; it
could probably speed up the process quite a bit (see
http://www.php.net/file). I am assuming that doing it your way means
reading the file _very_ many times, whereas with the example below the
whole file is read once, and then processed line by line from memory.


Ehm... that means reading a 150MB file into memory, even though you don't
need it there. Not a very good idea! His code only reads the file once, and
only 4KB at a time, so there is very little wasted memory.

André Næss
Jul 17 '05 #6
Juha Suni wrote:

kaptain kernel wrote:
i've got a while loop that's iterating through a text file and pumping
the contents into a database. the file is quite large (over 150MB).

the looping causes my CPU load to race up to 100 per cent. Even if i
remove the mysql insert query and just loop through the file, it
still hits 100 per cent CPU. This has the knock-on effect of slowing
my script down so that mysql inserts are occurring every 1/2 second or
so.

here's my script:

$fd = fopen("bigtextfile.txt", "r");

while (!feof($fd))
{
    $line = fgets($fd, 4096);
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}


You should try using the built-in functions of PHP to simplify this; it
could probably speed up the process quite a bit (see
http://www.php.net/file). I am assuming that doing it your way means
reading the file _very_ many times, whereas with the example below the
whole file is read once, and then processed line by line from memory.

$bigfile = file("bigtextfile.txt");

foreach ($bigfile as $line_num => $line) {
    $linecontents = explode(":", $line);
    mysql_query("INSERT INTO MyDatabase VALUES
    '','$linecontents[0]','$linecontents[1]'");
}


Excerpt from http://ca.php.net/manual/en/function.file.php:

//------------------------------------------
Note: Now that file() is binary safe it is 'much' slower than it used to be. If
you are planning to read large files it may be worth your while using fgets()
instead of file(). For example:

$fd = fopen("log_file.txt", "r");
while (!feof($fd))
{
    $buffer = fgets($fd, 4096);
    $lines[] = $buffer;
}
fclose($fd);

The resulting array is $lines.

I did a test on a 200,000 line file. It took seconds with fgets() compared to
minutes with file().
//------------------------------------------

My guess is, for a 150MB file, file() would take forever, if it worked at all.

Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #7
> //------------------------------------------
Note: Now that file() is binary safe it is 'much' slower than it used
to be. If you are planning to read large files it may be worth your
while using fgets() instead of file(). For example:

$fd = fopen("log_file.txt", "r");
while (!feof($fd))
{
    $buffer = fgets($fd, 4096);
    $lines[] = $buffer;
}
fclose($fd);

The resulting array is $lines.

I did a test on a 200,000 line file. It took seconds with fgets()
compared to minutes with file().
//------------------------------------------

My guess is, for a 150MB file, file() would take forever, if it
worked at all.

Shawn


Thank you for the info, I stand corrected.

I guess you really have no good options on this front. A 150 MB
text file is a huge load of data, and looping through it just seems to
be too much for your system.

First I would consider the source of the text file: where does it come
from? Is there no way the application creating the file could push the
data directly to mysql? I would also consider benchmarking with the
original data in different formats. How fast is the script freezing,
i.e. is it slow from the start or only after reading the first x lines?
(A rough sketch of such a check is at the end of this post.) Could it be
faster to process 10 x 15MB text files than a single large file?

Other than that, I can't really guess. I suppose a script in another
(compiled) language might be faster, so you could try C. A PHP optimizer
could also speed up the process quite a bit.

If nothing else helps, a system hardware upgrade might be your best
option, considering the importance of the script, of course.
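
Here is what I mean by checking where it slows down; the filename and the
10,000-line interval are just assumptions, adjust to taste:

$fd = fopen("bigtextfile.txt", "r");
$n = 0;
$t0 = time();
while (!feof($fd))
{
    $line = fgets($fd, 4096);
    // ... explode() and mysql_query() exactly as in your original script ...
    $n++;
    if ($n % 10000 == 0) {
        echo $n . " lines, " . (time() - $t0) . "s elapsed\n";
    }
}
fclose($fd);

If the per-10,000-line times grow as the script runs, something is
accumulating; if they stay flat, the script is just uniformly slow.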

--
Suni
Jul 17 '05 #8
Just out of curiosity I tried looping through a 150 MB text file with 100
byte records (i.e. ~1,500,000 lines) using fgets(). My computer cut through
it in less than 2 minutes.

My thinking is that the SQL statements generated by the script are malformed
most of the time. That would explain why it takes so long for an actual
insert to occur. If an insert occurs on every iteration through the loop,
the MySQL process would be using up most of the CPU time instead, with
Apache/PHP sitting there waiting for it to finish each query.
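
For comparison, a sketch of what a well-formed, escaped insert could look like
(the column names col1/col2/col3 are assumptions, and mysql_real_escape_string()
needs an open connection):

$line = fgets($fd, 4096);
$parts = explode(":", rtrim($line, "\r\n"));
$a = mysql_real_escape_string($parts[0]);
$b = mysql_real_escape_string(isset($parts[1]) ? $parts[1] : "");
mysql_query("INSERT INTO MyDatabase (col1, col2, col3) VALUES ('', '$a', '$b')");

Note the parentheses around the VALUES list and the quoting of each field;
without them MySQL rejects the statement and the script just spins through the
file without inserting anything.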

Uzytkownik "Shawn Wilson" <sh***@glassgia nt.com> napisal w wiadomosci
news:3F******** *******@glassgi ant.com...
kaptain kernel wrote:

i've got a while loop thats iterating through a text file and pumping the contents into a database. the file is quite large (over 150mb).

the looping causes my CPU load to race up to 100 per cent. Even if i remove the mysql insert query and just loop through the file , it still hits 100 per cent CPU. This has the knock on effect of slowing my script down so
that mysql inserts are occuring every 1/2 second or so.

here's my script:

$fd=fopen("bigt extfile.txt","r ");

while (!feof($fd) )
{
$line=fgets($fd ,4096);
$linecontents=e xplode(":",$lin e);
mysql_query("IN SERT INTO MyDatabase VALUES
'','$lineconten ts[0]','$linecontent s[1]'");
}
I wonder if it would be faster with fread()? If the average line is very
small it should drastically reduce the number of times the thing has to read
from the disk. This is UNTESTED code, so if you use it, please be careful. I
don't read from huge files often enough to be confident that it will work.

$line = "";

while (!feof($fd))
{
    $line .= fread($fd, 4096);
    $insertlist = "";
    $linecontents = explode("\n", $line);
    $last = count($linecontents) - 1;
    $line = $linecontents[$last]; // remember beginning of last (incomplete) data set to be used in next loop
    unset($linecontents[$last]); // delete last (incomplete) data set
    foreach ($linecontents as $val) {
        $fields = explode(":", $val);
        $insertlist .= "('','$fields[0]','$fields[1]'),"; // build the data portion of the mysql query statement
    }

    // insert up to 4K worth of data at once while removing trailing comma
    if ($insertlist != "") {
        mysql_query("INSERT INTO MyDatabase (col1, col2, col3) VALUES " .
        rtrim($insertlist, ","));
    }
}

Regards,
Shawn

--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com

Jul 17 '05 #9
