File() too slow

RG
I am trying to process a CSV file but am having trouble with my host's
maximum execution time of 30 seconds.
This is how the script works at the moment:
The user uploads their CSV file.
The script reads the whole file with file() and writes smaller chunk files.
The script then processes the smaller files one at a time: it populates the
database, deletes the processed file and refreshes itself, thus starting
again.

This system works for files up to 15000 rows, but I need to be able to
process larger files.
The bottleneck is with the initial splitting, since I use the file()
function to read the entire uploaded file.

Does anyone know a quicker way to split a file into smaller chunks?

TIA
RG

Jul 17 '05 #1
> This system works for files up to 15000 rows, but I need to be able to
> process larger files.
> The bottleneck is with the initial splitting, since I use the file()
> function to read the entire uploaded file.


file() is by far NOT the bottleneck in your present problem, as it is not
any slower than reading the file using fread() and splitting it up afterwards.

The bottleneck seems to be your database comparing / writing.

It sounds like you read in e.g. 15000 lines, break them down into chunks
and then process all the chunks line by line, checking the database,
comparing the database data with your CSV and then deciding what to do
(delete, insert, update ...)

At 15000 entries THIS will make your script time out, not the file()
call.

If you do not believe it, try to read and explode it manually ...

regards

timo
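
The comparison suggested above ("read and explode it manually") can be sketched roughly as follows. This is only an illustration: time_call() and make_sample_csv() are made-up helper names, the 50000-row sample file is generated on the fly, and file_get_contents() stands in for the fread() approach mentioned in the post. Timings will differ from machine to machine.

```php
<?php
// Illustrative benchmark; helper names and the sample data are assumptions.

// Runs $fn once and returns [elapsed seconds, return value of $fn].
function time_call(callable $fn): array
{
    $start = microtime(true);   // float seconds since the Unix epoch
    $result = $fn();
    return [microtime(true) - $start, $result];
}

// Writes a throwaway CSV file with $rows lines and returns its path.
function make_sample_csv(int $rows): string
{
    $path = tempnam(sys_get_temp_dir(), 'csv');
    $fh = fopen($path, 'wb');
    for ($i = 0; $i < $rows; $i++) {
        fwrite($fh, "$i,name$i,value$i\n");
    }
    fclose($fh);
    return $path;
}

$path = make_sample_csv(50000);

// Variant 1: file() reads and splits in a single call.
[$tFile, $linesA] = time_call(fn() => file($path, FILE_IGNORE_NEW_LINES));

// Variant 2: read everything into a string, then split it by hand.
[$tManual, $linesB] = time_call(
    fn() => explode("\n", rtrim(file_get_contents($path), "\n"))
);

printf(
    "file(): %.5f s, read + explode(): %.5f s, %d lines\n",
    $tFile,
    $tManual,
    count($linesA)
);

unlink($path);
```

If both variants finish in a fraction of a second, the time limit is being hit somewhere other than the read-and-split step.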

Jul 17 '05 #2
RG

"Timo Henke" <we*******@fli7 e.de> wrote in message
news:bl******** *****@news.t-online.com...
> > This system works for files up to 15000 rows, but I need to be able to
> > process larger files.
> > The bottleneck is with the initial splitting, since I use the file()
> > function to read the entire uploaded file.
>
> file() is by far NOT the bottleneck in your present problem, as it is not
> any slower than reading the file using fread() and splitting it up afterwards.
>
> The bottleneck seems to be your database comparing / writing.
>
> It sounds like you read in e.g. 15000 lines, break them down into chunks
> and then process all the chunks line by line, checking the database,
> comparing the database data with your CSV and then deciding what to do
> (delete, insert, update ...)
>
> At 15000 entries THIS will make your script time out, not the file()
> call.
>
> If you do not believe it, try to read and explode it manually ...
>
> regards
>
> timo

The initial splitting of the file is the bottleneck; I do not compare
anything in this procedure.
There is no problem with the database comparing etc.; it seems to take
about 2 seconds to do around 3000 MySQL queries. This is fine.

I just need a quick way to initially split the large file into smaller
files.
TIA
RG
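
If the time or memory spent in file() ever does become a concern, one alternative is to stream the upload with fgets() and write the chunk files as you go, so the whole file never has to sit in an array. A minimal sketch, assuming one row per line (quoted CSV fields that contain line breaks would be cut in two); the function name, the default chunk size and the chunk_NNNNN.csv naming are hypothetical:

```php
<?php
// Hypothetical helper: splits $source into files of at most $linesPerChunk
// lines inside $destDir and returns the chunk paths in order.
function split_file_into_chunks(string $source, string $destDir, int $linesPerChunk = 1000): array
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $source");
    }

    $chunkFiles = [];
    $out = null;
    $lineCount = 0;

    // fgets() returns one line per call, so memory use stays flat
    // however large the uploaded file is.
    while (($line = fgets($in)) !== false) {
        // Start a new chunk file every $linesPerChunk lines.
        if ($lineCount % $linesPerChunk === 0) {
            if ($out !== null) {
                fclose($out);
            }
            $path = sprintf('%s/chunk_%05d.csv', rtrim($destDir, '/'), count($chunkFiles));
            $out = fopen($path, 'wb');
            if ($out === false) {
                fclose($in);
                throw new RuntimeException("Cannot create $path");
            }
            $chunkFiles[] = $path;
        }
        fwrite($out, $line);
        $lineCount++;
    }

    if ($out !== null) {
        fclose($out);
    }
    fclose($in);

    return $chunkFiles;
}
```

Calling split_file_into_chunks($uploadedPath, $chunkDir, 3000) would return the list of chunk files to process on later page loads; both variable names are placeholders.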

Jul 17 '05 #3
> The initial splitting of the file is the bottleneck; I do not compare
> anything in this procedure.
> There is no problem with the database comparing etc.; it seems to take
> about 2 seconds to do around 3000 MySQL queries. This is fine.


Let's talk about file sizes. I JUST tried it. Reading (and splitting) a 19MB
file with 322000 lines took 0.39843 seconds on my machine:

<?php

// microtime() returns "usec sec"; add both parts to get a float timestamp
list($usec, $sec) = explode(" ", microtime());
$start = ((float)$usec + (float)$sec);

// read the whole file into an array, one element per line
$indata = file("cvs");

list($usec, $sec) = explode(" ", microtime());
$end = ((float)$usec + (float)$sec);

// elapsed time in seconds
printf("%.5f", $end - $start);

?>

This is NOT a bottleneck, I believe?

Timo

Jul 17 '05 #4
RG

"Timo Henke" <we*******@fli7 e.de> wrote in message
news:bl******** *****@news.t-online.com...
> > The initial splitting of the file is the bottleneck; I do not compare
> > anything in this procedure.
> > There is no problem with the database comparing etc.; it seems to take
> > about 2 seconds to do around 3000 MySQL queries. This is fine.
>
> Let's talk about file sizes. I JUST tried it. Reading (and splitting) a 19MB
> file with 322000 lines took 0.39843 seconds on my machine:
>
> <?php
>
> list($usec, $sec) = explode(" ", microtime());
> $start = ((float)$usec + (float)$sec);
>
> $indata = file("cvs");
>
> list($usec, $sec) = explode(" ", microtime());
> $end = ((float)$usec + (float)$sec);
>
> printf("%.5f", $end - $start);
>
> ?>
>
> This is NOT a bottleneck, I believe?
>
> Timo


Lightning fast.
I think I've found the problem: upload_max_filesize is 2M
It is dying, very poorly.
You must have changed your php.ini?
Thanks for the pointers
RG
Jul 17 '05 #5
> Lightning fast.

yeehaa :-)

> I think I've found the problem: upload_max_filesize is 2M
> It is dying, very poorly.
> You must have changed your php.ini?

Yup .. changed it to 32M, because I often need to transfer bigger
data in our intranet.

> Thanks for the pointers

Hope it works out well for you

regards

timo

Jul 17 '05 #6
RG

"Timo Henke" <we*******@fli7 e.de> wrote in message
news:bl******** *****@news.t-online.com...
> > Lightning fast.
>
> yeehaa :-)
>
> > I think I've found the problem: upload_max_filesize is 2M
> > It is dying, very poorly.
> > You must have changed your php.ini?
>
> Yup .. changed it to 32M, because I often need to transfer bigger
> data in our intranet.
>
> > Thanks for the pointers
>
> Hope it works out well for you
>
> regards
>
> timo


I don't suppose there's a workaround for this?
RG
Jul 17 '05 #7
In article <3f************ ***********@mer cury.nildram.ne t>, RG's output
was...
> > > I think I've found the problem: upload_max_filesize is 2M
> > > It is dying, very poorly.
> > > You must have changed your php.ini?
> >
> > Yup .. changed it to 32M, because I often need to transfer bigger
> > data in our intranet.
>
> I don't suppose there's a workaround for this?
> RG

IIRC - you can do something along the lines of:

ini_set("upload_max_filesize", "64M");

- I understand this will only change the max size for operations in that
particular script - not for the whole server/virtual server.
Jul 17 '05 #9
RG

"Eto Demerzel" <et**********@f ijivillage.com> wrote in message
news:MP******** *************** *@news-text.blueyonder .co.uk...
> In article <3f************ ***********@mer cury.nildram.ne t>, RG's output
> was...
> > > > I think I've found the problem: upload_max_filesize is 2M
> > > > It is dying, very poorly.
> > > > You must have changed your php.ini?
> > >
> > > Yup .. changed it to 32M, because I often need to transfer bigger
> > > data in our intranet.
> >
> > I don't suppose there's a workaround for this?
> > RG
>
> IIRC - you can do something along the lines of:
>
> ini_set("upload_max_filesize", "64M");
>
> - I understand this will only change the max size for operations in that
> particular script - not for the whole server/virtual server.

Tried that out, but it seems that the file is uploaded before the script is
executed, which means the file is discarded before the function is called.
I'm sure my host won't want to change these settings; it's Rackshack (cheap).
Looks like I'm going to have to get some more expensive hosting.
Any suggestions? I need PHP with GD, multiple MySQL databases,
password-protected directories and 20 GB a month.
Thanks
RG
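
As noted above, the upload limit is enforced before the script starts, so ini_set() comes too late. What a script can do is detect that the limit was hit and say so, using the error code PHP stores in $_FILES. A minimal sketch; describe_upload_error() and the form field name 'csv' are assumptions for illustration:

```php
<?php
// Hypothetical helper: turns the error code from $_FILES[...]['error']
// into a readable message.
function describe_upload_error(int $errorCode): string
{
    switch ($errorCode) {
        case UPLOAD_ERR_OK:
            return 'ok';
        case UPLOAD_ERR_INI_SIZE:
            // The file was larger than the upload_max_filesize setting.
            return 'file exceeds upload_max_filesize ('
                . ini_get('upload_max_filesize') . ')';
        case UPLOAD_ERR_FORM_SIZE:
            return 'file exceeds the MAX_FILE_SIZE field of the form';
        case UPLOAD_ERR_PARTIAL:
            return 'file was only partially uploaded';
        case UPLOAD_ERR_NO_FILE:
            return 'no file was uploaded';
        default:
            return 'upload failed with error code ' . $errorCode;
    }
}

// In the upload handler; 'csv' is an assumed form field name.
if (isset($_FILES['csv'])) {
    echo describe_upload_error($_FILES['csv']['error']);
}
```

One caveat: if the whole request is larger than post_max_size, $_FILES can arrive empty, so a missing 'csv' entry is worth reporting as a probable size problem too.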

Jul 17 '05 #10
