File() too slow

RG
I am trying to process a CSV file but am running into my host's
maximum execution time of 30 seconds.
This is how the script works at the moment:
The user uploads their CSV file.
The script reads it with file() and writes smaller chunk files.
The script then processes the smaller files, populating the
database, deleting each processed file and refreshing itself, thus starting
again.

This system works for files up to 15,000 rows, but I need to be able to
process larger files.
The bottleneck is the initial splitting, since I use the file()
function to read the entire uploaded file into memory.

Does anyone know a quicker way to split a file into smaller chunks?

TIA
RG

Jul 17 '05 #1
> This system works for files up to 15000 rows, but I need to be able to
process larger files.
The bottleneck is with the initial splitting, since I use the file()
function to read the entire uploaded file.


file() is by far NOT the bottleneck in your present problem, as it is not
any slower than reading the file with fread() and splitting it up afterwards.

The bottleneck is (it seems) your database compare / writing stuff.

It sounds like you read in e.g. 15,000 lines, break them down into chunks
and then process all the chunks line by line, checking the database,
comparing the database data with your CSV and then deciding what to do
(delete, insert, update ...).

At 15,000 entries THIS is what makes your script time out, not the file()
call.

If you do not believe it, try to manually read and explode it ...
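A manual streaming read that splits as it goes might look like this (a rough
sketch; the filenames and the 5,000-line chunk size are made-up values):

```php
<?php
// Stream the uploaded CSV line by line and write 5000-line chunk files,
// instead of loading the whole file into memory with file().
$in = fopen("upload.csv", "r");
if ($in === false) {
    die("could not open upload");
}

$chunkSize = 5000;   // lines per chunk file (made up for illustration)
$lineNo    = 0;
$chunkNo   = 0;
$out       = null;

while (($line = fgets($in, 8192)) !== false) {
    // start a new chunk file every $chunkSize lines
    if ($lineNo % $chunkSize == 0) {
        if ($out !== null) {
            fclose($out);
        }
        $chunkNo++;
        $out = fopen("chunk" . $chunkNo . ".csv", "w");
    }
    fwrite($out, $line);
    $lineNo++;
}

if ($out !== null) {
    fclose($out);
}
fclose($in);
?>
```

Memory use stays constant regardless of the input size, which also keeps each
request comfortably inside the execution-time limit.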

regards

timo

Jul 17 '05 #2
RG

"Timo Henke" <we*******@fli7e.de> wrote in message
news:bl*************@news.t-online.com...
[ snip ]

The initial splitting of the file is the bottleneck; I do not compare
anything in this procedure.
There is no problem with the database comparing etc.; it seems to take
about 2 seconds to do around 3,000 MySQL queries. This is fine.

I just need a quick way to initially split the large file into smaller
files.
TIA
RG

Jul 17 '05 #3
> The initial splitting of the file is the bottleneck, I do not compare
anything in this procedure.
There is not a problem with the database comparing etc, it seems to take
about 2 seconds to do around 3000 mysql queries. This is fine.


Let's talk about file sizes. I just tried it: reading (and splitting) a 19MB
file with 322,000 lines took 0.39843 seconds on my machine:
<?php

// start timer (PHP 4 idiom; microtime(true) needs PHP 5)
list($usec, $sec) = explode(" ", microtime());
$start = ((float)$usec + (float)$sec);

// file() reads the whole file into an array, one line per element
$indata = file("cvs");

// stop timer and report elapsed seconds
list($usec, $sec) = explode(" ", microtime());
$end = ((float)$usec + (float)$sec);

printf("%.5f", $end - $start);

?>

This is NOT the bottleneck, I believe?

Timo

Jul 17 '05 #4
RG

"Timo Henke" <we*******@fli7e.de> wrote in message
news:bl*************@news.t-online.com...
[ snip ]


Lightning fast.
I think I've found the problem: upload_max_filesize is 2M.
The upload is dying, very ungracefully.
You must have changed your php.ini?
Thanks for the pointers
RG
Jul 17 '05 #5
> Lightning fast.

yeehaa :-)
I think I've found the problem: upload_max_filesize is 2M
It is dieing, very poorly.
You must have changed your php.ini?
Yep .. changed it to 32M, because I often need to transfer bigger
data on our intranet.
Thanks for the pointers


Hope it works out well for you.

regards

timo

Jul 17 '05 #6
RG

"Timo Henke" <we*******@fli7e.de> wrote in message
news:bl*************@news.t-online.com...
[ snip ]

I don't suppose there's a workaround for this?
RG
Jul 17 '05 #7
In article <3f***********************@mercury.nildram.net>, RG's output
was...
I think I've found the problem: upload_max_filesize is 2M
It is dieing, very poorly.
You must have changed your php.ini?


jupp .. changed to 32M, because i often need to transfer bigger
data in our intranet.


I don't suppose there's a workaround for this?
RG

IIRC, you can do something along the lines of:

ini_set("upload_max_filesize", "64M");

I understand this will only change the max size for operations in that
particular script, not for the whole server / virtual server.
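For what it's worth, upload_max_filesize is a PHP_INI_PERDIR setting, so
calling ini_set() from the script itself comes too late to affect an upload.
On Apache with mod_php, a per-directory override is the usual route (a
sketch; the values are made up, and the host must allow .htaccess overrides):

```
# .htaccess in the directory serving the upload form/script
php_value upload_max_filesize 64M
php_value post_max_size       64M
php_value max_execution_time  120
```

Under CGI/FastCGI setups these directives are ignored and a per-directory
php.ini would be needed instead.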
Jul 17 '05 #9
RG

"Eto Demerzel" <et**********@fijivillage.com> wrote in message
news:MP************************@news-text.blueyonder.co.uk...
[ snip ]

Tried that out, but it seems the file is uploaded before the script is
executed, which means the file is dumped before the function is ever called.
I'm sure my host won't want to change these settings; Rackshack (cheap).
Looks like I'm gonna have to get some more expensive hosting.
Any suggestions? PHP with GD, multiple MySQL databases, password-protected
directories, 20GB/month.
Thanks
RG

Jul 17 '05 #10
On Fri, 26 Sep 2003 10:22:39 +0100, RG wrote:
jupp .. changed to 32M, because i often need to transfer bigger data in
our intranet.


[ snip ]

I don't suppose there's a workaround for this? RG

RG,

You can put a hidden field in a form:
<input type="hidden" name="MAX_UPLOAD_SIZE" value="33554432" />
This should "do the biz" before the form is whizzed off to the server for
processing =)
HTH.

Regards,

Ian

--
Ian.H [Design & Development]
digiServ Network - Web solutions
www.digiserv.net | irc.digiserv.net | forum.digiserv.net
Programming, Web design, development & hosting.

Jul 17 '05 #11
On Fri, 26 Sep 2003 10:29:15 +0000, Ian.H wrote:
<input type="hidden" name="MAX_UPLOAD_SIZE" value="33554432" />

Oops.. that should be:
<input type="hidden" name="MAX_FILE_SIZE" value="33554432" />
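Worth noting: MAX_FILE_SIZE is advisory. PHP checks it server-side after the
upload arrives, and it can only lower the accepted size, never raise it above
upload_max_filesize. For reference, the field must come before the file input
in the form (a sketch; the script name and field names are made up):

```html
<form action="import.php" method="post" enctype="multipart/form-data">
  <!-- must precede the file input; value is in bytes (32MB here) -->
  <input type="hidden" name="MAX_FILE_SIZE" value="33554432" />
  <input type="file" name="csvfile" />
  <input type="submit" value="Upload" />
</form>
```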

Regards,

Ian

--
Ian.H [Design & Development]
digiServ Network - Web solutions
www.digiserv.net | irc.digiserv.net | forum.digiserv.net
Programming, Web design, development & hosting.

Jul 17 '05 #12
RG

"Ian.H" <ia*@WINDOZEdigiserv.net> wrote in message
news:pa****************************@hybris.digiser v.net...
[ snip ]

Tried that and it didn't work.
Thanks anyway.
Anyone else?
RG


Jul 17 '05 #13
RG

"Ian.H" <ia*@WINDOZEdigiserv.net> wrote in message
news:pa****************************@hybris.digiser v.net...
On Fri, 26 Sep 2003 10:29:15 +0000, Ian.H wrote:
<input type="hidden" name="MAX_UPLOAD_SIZE" value="33554432" />

Oops.. that should be:
<input type="hidden" name="MAX_FILE_SIZE" value="33554432" />

Regards,

Tried the latter too.
Still didn't work.
Thanks though; I'm stuck here.
RG


Jul 17 '05 #14
> Tried that and it didn't work.
Thanks anyway.
Anyone else?
RG


Just another suggestion: try to upload the file in gzip format (if
possible).

This would cut the transfer and file size down to a minimum, and it can be
unpacked in next to no time by PHP's internal zlib functions.

My previously mentioned 19MB test file shrank down to 351,164 bytes.

You get the point?
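A minimal sketch of the unpacking side, assuming the client uploads something
like data.csv.gz (the filename and buffer size are made-up values):

```php
<?php
// Stream a gzipped CSV line by line using PHP's zlib functions,
// without ever holding the whole uncompressed file in memory.
$gz = gzopen("data.csv.gz", "r");
if ($gz === false) {
    die("could not open gzipped upload");
}

$lines = 0;
while (!gzeof($gz)) {
    $line = gzgets($gz, 8192);   // one decompressed line per call
    if ($line === false) {
        break;
    }
    $lines++;
    // ... hand $line to the normal CSV processing here ...
}
gzclose($gz);

echo "read $lines lines\n";
?>
```

This sidesteps the 2M upload cap for any CSV that compresses well, at the cost
of asking the client to gzip the file first.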

timo

Jul 17 '05 #15
RG

"Timo Henke" <we*******@fli7e.de> wrote in message
news:bl*************@news.t-online.com...
[ snip ]

I think I'm stuck; I can't trust non-technical clients to do this.
Thanks though
RG
Jul 17 '05 #16
RG wrote:

[ snip ]


You could probably write a tiny program in VB or something else that would
split a CSV file (file.csv) into 2MB chunks (file1.csv, file2.csv,
file3.csv ...). It would almost certainly be simple enough for your clients
to use.

Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #17
RG

"Shawn Wilson" <sh***@glassgiant.com> wrote in message
news:3F**************@glassgiant.com...
[ snip ]

That's not a bad idea. Would this be possible with JavaScript? Or maybe
VBScript?
I would ideally want the program to somehow interface with the import
scripts, so ideally browser-based.
Any ideas?
RG
Jul 17 '05 #18
RG wrote:
[ snip ]


JS, definitely not. VBScript - I doubt it, but I don't know VBScript. I
doubt you'll be able to do it from the browser. I think you'd almost have to
write a standalone executable and then use a browser to complete the upload
normally. It's kind of a clunky solution, but feasible. Alternately, you
could tell them to FTP it to a directory, then go to your program in a
browser and select the file to use from the FTP directory. Again, clunky... :(

Regards,
Shawn
--
Shawn Wilson
sh***@glassgiant.com
http://www.glassgiant.com
Jul 17 '05 #19
In article <3F***************@glassgiant.com>, Shawn Wilson's output
was...
[ snip ]

This sounds like just the sort of thing Java applets are useful for.

Perhaps have an applet in the web page which selects the file, breaks it
into smaller chunks (and/or compresses them), then calls a server-side
script to re-assemble the parts and do whatever the script was supposed to
do with the file in the first place.
You might find some useful info at http://javaboutique.internet.com/ or
in comp.lang.java
Jul 17 '05 #20
It sounds like you're using MySQL, and if loading the data is still a
bottleneck, have you tried the "LOAD DATA LOCAL INFILE ..." statement?
I saw an order-of-magnitude improvement in import speed when I went
from manually splitting my CSVs to using LOAD DATA.
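A hedged sketch of what that can look like (PHP 4-era mysql_* API; the
database credentials, table name, and file path are all made up):

```php
<?php
// One LOAD DATA statement replaces thousands of individual INSERTs,
// letting MySQL parse the CSV itself.
$db = mysql_connect("localhost", "user", "pass") or die(mysql_error());
mysql_select_db("mydb", $db);

$sql = "LOAD DATA LOCAL INFILE '/tmp/upload.csv'
        INTO TABLE products
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES";   // skip the CSV header row

mysql_query($sql, $db) or die(mysql_error());
echo mysql_affected_rows($db) . " rows imported\n";
?>
```

Note that LOCAL requires both the server and the client library to permit
local infiles.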

If you're also having problems uploading large files, you'll also need to
set PHP's post_max_size and memory_limit (if it's enabled) to something
larger than the size of the largest file you expect.

RG wrote:
Any suggestions: PHP with GD, Multiple MySQL databases, password protect
directories, 20gb month


I've had good luck with pair.com.

--Brent

Jul 17 '05 #21
"RG" <Me@NotTellingYa.com> wrote in message
news:3f***********************@mercury.nildram.net ...

[ snip ]


I believe ini_set("upload_max_filesize", "64M") needs to be set in the
script in which the FORM element is placed, not in the script where the
file is being processed.

- Virgil
Jul 17 '05 #22
There is already an existing product that has an applet that breaks
files into smaller chunks and uploads them to a server-side script (via a
Java servlet) to re-assemble the file on the server side. The product can
also send the reconstructed file to an FTP server of your choice. This
method is great for secure file transfers and for bypassing proxies and
firewalls on the client side.

For demos and to download an evaluation version
take a look at http://www.unlimitedftp.ca/uftps

Good Luck!

John
Eto Demerzel <et**********@fijivillage.com> wrote in message news:<MP************************@news-text.blueyonder.co.uk>...
[ snip ]

Jul 17 '05 #23
