I am trying to process a CSV file but am having trouble with my host's
maximum execution time of 30 seconds.
This is how the script works at the moment:
1. The user uploads their CSV file.
2. The script reads the whole file with file() and writes it out as smaller chunk files.
3. The script then processes the smaller files one at a time, populating the database, deleting each processed file and refreshing itself, thus starting again.
This system works for files up to 15000 rows, but I need to be able to
process larger files.
The bottleneck is with the initial splitting, since I use the file()
function to read the entire uploaded file.
Does anyone know a quicker way to split a file into smaller chunks?
TIA
RG
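The splitting step described above can be sketched without file(): stream the upload with fgets() and rotate to a new chunk file every N lines, so the whole CSV never sits in memory at once. A minimal sketch; file names and chunk size are hypothetical, and a tiny sample input is created so it runs standalone:

```php
<?php
// Stream the upload line by line and write out chunk files of
// $linesPerChunk lines each, instead of reading it all with file().
// (File names and chunk size are hypothetical; a tiny sample input
// is created here so the sketch runs standalone.)
file_put_contents("upload.csv", "a\nb\nc\nd\ne\n");

$linesPerChunk = 2;
$chunk = 0;
$count = 0;
$in  = fopen("upload.csv", "r");
$out = fopen("chunk0.csv", "w");
while (($line = fgets($in)) !== false) {
    fwrite($out, $line);
    if (++$count % $linesPerChunk == 0) {
        fclose($out);                                  // this chunk is full,
        $chunk++;
        $out = fopen("chunk" . $chunk . ".csv", "w");  // start the next one
    }
}
fclose($in);
fclose($out);
```

With the sample input above this produces chunk0.csv (a, b), chunk1.csv (c, d) and chunk2.csv (e); memory use stays at one line regardless of file size.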
> This system works for files up to 15000 rows, but I need to be able to process larger files. The bottleneck is with the initial splitting, since I use the file() function to read the entire uploaded file.
file() is by far NOT the bottleneck in your present problem, as it is not
any slower than reading the file with fread() and splitting it up afterwards.
The bottleneck is (it seems) your database compare / write stuff.
It sounds like you read in e.g. 15000 lines, break them down into chunks
and then process all the chunks line by line, checking the database,
comparing the database data with your CSV and then deciding what to do
(delete, insert, update ...).
At 15000 entries THIS will make your script time out, not the file()
command.
If you do not believe it, try to manually read and explode it ...
regards
timo
The initial splitting of the file is the bottleneck; I do not compare
anything in this procedure.
There is no problem with the database comparing etc.; it takes about
2 seconds to do around 3000 MySQL queries. This is fine.
I just need a quick way to initially split the large file into smaller
files.
TIA
RG
> The initial splitting of the file is the bottleneck, I do not compare anything in this procedure. There is not a problem with the database comparing etc, it seems to take about 2 seconds to do around 3000 mysql queries. This is fine.
Let's talk about file sizes. I JUST tried it. Reading (and splitting) a 19MB
file with 322000 lines took 0.39843 seconds on my machine:
<?php
// Time how long file() takes to read the whole file into an array
list($usec, $sec) = explode(" ", microtime());
$start = ((float)$usec + (float)$sec);
$indata = file("cvs"); // reads the entire file, one line per array element
list($usec, $sec) = explode(" ", microtime());
$end = ((float)$usec + (float)$sec);
printf("%.5f", $end - $start);
?>
This IS NOT the bottleneck, I believe?
Timo
Lightning fast.
I think I've found the problem: upload_max_filesize is 2M
It is dying, very poorly.
You must have changed your php.ini?
Thanks for the pointers
RG
> Lightning fast.
yeehaa :-)
> I think I've found the problem: upload_max_filesize is 2M. It is dying, very poorly. You must have changed your php.ini?
jupp .. changed to 32M, because I often need to transfer bigger
data in our intranet.
> Thanks for the pointers
hope it would work out well for you
regards
timo
I don't suppose there's a workaround for this?
RG
In article <3f***********************@mercury.nildram.net>, RG's output was...
> I think I've found the problem: upload_max_filesize is 2M. It is dying, very poorly. You must have changed your php.ini?
> > jupp .. changed to 32M, because i often need to transfer bigger data in our intranet.
> I don't suppose there's a workaround for this? RG
IIRC - you can do something along the lines of:
ini_set("upload_max_filesize", "64M");
- I understand this will only change the max size for operations in that
particular script - not for the whole server/virtual server.
Tried that out, but it seems that the file is uploaded before the script is
executed, which means the file is dumped before the function is called.
I'm sure my host won't want to change these settings, Rackshack (cheap).
Looks like I'm gonna have to get some more expensive hosting.
Any suggestions? PHP with GD, multiple MySQL databases, password-protected
directories, 20GB/month.
Thanks
RG
On Fri, 26 Sep 2003 10:22:39 +0100, RG wrote:
> > jupp .. changed to 32M, because i often need to transfer bigger data in our intranet.
[ snip ]
> I don't suppose there's a workaround for this? RG
RG,
You can put a hidden field in a form:
<input type="hidden" name="MAX_UPLOAD_SIZE" value="33554432" />
This should "do the biz" before the form is whizzed off to the server for
processing =)
HTH.
Regards,
Ian
--
Ian.H [Design & Development]
digiServ Network - Web solutions www.digiserv.net | irc.digiserv.net | forum.digiserv.net
Programming, Web design, development & hosting.
On Fri, 26 Sep 2003 10:29:15 +0000, Ian.H wrote:
> <input type="hidden" name="MAX_UPLOAD_SIZE" value="33554432" />
Oops.. that should be:
<input type="hidden" name="MAX_FILE_SIZE" value="33554432" />
Regards,
Ian
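For reference, a minimal sketch of where that field sits in a form (the action URL and field names are hypothetical); as far as I know, PHP expects MAX_FILE_SIZE to appear before the file input for it to take effect:

```html
<form action="import.php" method="post" enctype="multipart/form-data">
  <!-- MAX_FILE_SIZE must precede the file input to be honoured -->
  <input type="hidden" name="MAX_FILE_SIZE" value="33554432" />
  <input type="file" name="csvfile" />
  <input type="submit" value="Upload" />
</form>
```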
Tried that and it didn't work.
Thanks anyway.
Anyone else?
RG
Tried the latter too.
Still didn't work.
Thanks though; I'm stuck here.
RG
> Tried that and it didn't work. Thanks anyway. Anyone else? RG
Just another suggestion: try to upload the file in GZ format (if
possible).
This would cut down transfer time and file size to a minimum, and it could
be unpacked in NO TIME by PHP's internal gz functions.
My previously mentioned 19MB test file got shrunk down to 351,164 bytes.
You get the point?
timo
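A sketch of the unpacking side, assuming PHP's zlib extension is available: stream the gzipped CSV line by line so the full uncompressed file never sits in memory. The file name is hypothetical, and a tiny sample archive is created here so the sketch runs standalone:

```php
<?php
// Stream a gzipped CSV line by line with PHP's zlib functions.
// ($gzPath is hypothetical; a tiny sample archive is created here
// so the sketch runs standalone.)
$gzPath = "sample.csv.gz";
$gz = gzopen($gzPath, "w9");                // "9" = maximum compression
gzwrite($gz, "id,name\n1,alpha\n2,beta\n");
gzclose($gz);

$lines = 0;
$gz = gzopen($gzPath, "r");
while (($line = gzgets($gz, 4096)) !== false) {
    $lines++;                               // ... process the CSV line here ...
}
gzclose($gz);
unlink($gzPath);
```

gzgets() reads one decompressed line at a time, so the memory footprint stays tiny even for a large archive.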
I think I'm stuck; I can't trust clients with no knowledge to do this.
Thanks though
RG
You could probably write a tiny program in VB or something else that would split
a CSV file (file.csv) into 2MB chunks (file1.csv, file2.csv, file3.csv ...). It
would almost definitely be simple enough for your clients to use.
Shawn
--
Shawn Wilson sh***@glassgiant.com http://www.glassgiant.com
That's not a bad idea. Would this be possible with JavaScript? Or maybe
VBScript?
I would ideally want the program to somehow interface with the import
scripts, so ideally browser-based.
Any ideas?
RG
JS, no for sure. VBScript - I doubt it, but I don't know VBScript. I doubt
you'll be able to do it from the browser. I think you'd almost have to write a
standalone executable and then use a browser to complete the upload normally.
It's kind of a clunky solution, but feasible. Alternately, you could tell them
to FTP the file to a directory, then go to your program in a browser and select
the file to use from the FTP directory. Again, clunky... :(
Regards,
Shawn
--
Shawn Wilson sh***@glassgiant.com http://www.glassgiant.com
In article <3F***************@glassgiant.com>, Shawn Wilson's output was...
[ snip ]
This sounds like just the sort of thing Java applets are useful for.
Perhaps have an applet in the web-page which selects the file, breaks it
into smaller chunks (and/or compresses), then calls a server-side script
to re-assemble the parts and then do whatever the script was supposed to
do with the file in the first place.
You might find some useful info at http://javaboutique.internet.com/ or
in comp.lang.java
It sounds like you're using MySQL and if loading the data is still a
bottleneck, have you tried using the "LOAD DATA LOCAL INFILE ..." stuff?
I saw an order of magnitude improvement in import speed when I went
from manually splitting my CSVs to using LOAD DATA.
If you're also having problems uploading large files you'll also need to
set php's post_max_size and memory_limit (if it's enabled) to something
larger than the size of the largest file you expect.
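A sketch of what the LOAD DATA approach could look like. The table, column and file names are hypothetical, and the query call is left commented out since it needs a live MySQL connection:

```php
<?php
// Build a LOAD DATA statement so MySQL bulk-loads the CSV itself
// instead of the script issuing thousands of individual INSERTs.
// (Table and file names are hypothetical.)
$csvPath = "/tmp/upload.csv";
$sql = "LOAD DATA LOCAL INFILE '$csvPath'
        INTO TABLE products
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES"; // skip the CSV header row
// mysql_query($sql) or die(mysql_error()); // run against your connection
```

Because the server parses and inserts the rows itself, this typically avoids both the per-query round trips and the PHP execution-time limit for the bulk of the work.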
RG wrote:
> Any suggestions: PHP with GD, Multiple MySQL databases, password protect directories, 20gb month
I've had good luck with pair.com.
--Brent
I believe the 'ini_set("upload_max_filesize", "64M");' needs to be set in
the script in which the FORM element is placed, not in the script where the
file is being processed.
- Virgil
There is already an existing product that has an applet which breaks
files into smaller chunks and uploads those files to a server-side
script (via a Java servlet) to re-assemble the file on the server side.
The product can also send the reconstructed file to an FTP server of
your choice. This method is great for secure file transfers and for
bypassing proxies and firewalls on the client side.
For demos and to download an evaluation version
take a look at http://www.unlimitedftp.ca/uftps
Good Luck!
John