Large file handling in C#

Hi Everyone:

I have a C# program [VS2005] which loops through a number of files. In
UAT I had medium-size files and everything went OK, but in Production
the files are big and I notice that my program is very slow and is
taking 100% CPU time.

Is there any way I can improve the performance? Something like
DoEvents, etc. Any suggestion is much appreciated.

Thanks in advance.

Pargat

Feb 23 '07 #1
To improve UI responsiveness, use a BackgroundWorker component and
move the file processing there. Otherwise, I'm not sure what else you
can do to improve performance besides running on a multi-processor or
multi-core machine, or having the processing sleep once in a while. That
would free up some CPU time for other processes, but would make
your program run even slower.
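
A minimal sketch of the BackgroundWorker idea, for illustration only. `BackgroundFileJob` and `ProcessFile` are hypothetical names standing in for the real per-file work, which the thread never shows. The blocking wait at the end is only there so the sketch can run from a console; in a form you would return right after `RunWorkerAsync` and handle the result in `RunWorkerCompleted` instead.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

public static class BackgroundFileJob
{
    // Hypothetical stand-in for the real per-file processing.
    public static int ProcessFile(string path)
    {
        return path.Length;
    }

    // Processes every path on a worker thread and returns the summed results.
    public static int Run(string[] paths)
    {
        int total = 0;
        using (ManualResetEvent done = new ManualResetEvent(false))
        using (BackgroundWorker worker = new BackgroundWorker())
        {
            // DoWork runs on a thread-pool thread, not on the caller's thread.
            worker.DoWork += delegate(object sender, DoWorkEventArgs e)
            {
                int sum = 0;
                foreach (string p in (string[])e.Argument)
                    sum += ProcessFile(p);
                e.Result = sum;
            };

            // Raised when DoWork finishes, whether it succeeded or threw.
            worker.RunWorkerCompleted += delegate(object sender, RunWorkerCompletedEventArgs e)
            {
                if (e.Error == null)
                    total = (int)e.Result;
                done.Set();
            };

            worker.RunWorkerAsync(paths);

            // Console-only: block until the worker is finished.
            // Do NOT wait like this on a UI thread; it would freeze the form.
            done.WaitOne();
        }
        return total;
    }
}
```

Note that this only keeps the UI responsive. It does not make the file processing itself any faster.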

Feb 23 '07 #2
If you are ultimately putting the file data into a database, here's how I
handled it:

1. Make a work table in the database that has the same structure as the file.
2. Make a stored procedure that uses the SQL BULK INSERT statement
to slam the file into the work table. BULK INSERT is BLISTERING fast.
3. Write stored procedures to do whatever heavy work is needed on the work table.
4. Have your application call the stored procedures.

The system I developed that did that was loading delimited text files on the
order of almost a gig at a time, and maybe a dozen or so files like that.
The batch ran in an hour or so.

--
Peace & happy computing,

Mike Labosh, MCSD MCT
"Escriba coda ergo sum." -- vbSensei
Feb 23 '07 #3
Please be more explicit: what exactly do you mean by "loop through No. of files"? How are
you looping? What are you actually doing with the files?
What kind of application is this? Windows, console, other?
Is this running on a server where other applications might run?

Willy.

Feb 23 '07 #4
Hi,

More details would help.

What kind of program is it?

Can you run it as a background process, or as a job?

What does your code look like?
Feb 24 '07 #5
Thanks Willy. I have 7-8 different text files which hold the member
information, like an account file, fund file, plan file, demographic file,
etc. I also have an ABC.txt file. I loop through this file, pick each
member from it, and then I have to fetch the records for this member
from all the other files and write them to another text file.

This is a simple Windows program which is called by a batch file.
Yes, it is running on a machine where other jobs run as well.

Thanks,
Pargat

Feb 26 '07 #6
That means that you need to repeatedly search the same files for the wanted records; this is
exactly why RDBMS systems were invented ages ago. So I would suggest you put these
records in a DB.
Not sure why this has to be a Windows program, though.

Willy.
Feb 26 '07 #7
Thanks Willy.

But the problem is the files themselves. These files are created by some
other VB program, and I don't know how I can put the records in a DB,
because the data in these files is laid out in fixed positions,
like 1234567 ABC CLIENTNAME PLANNAME etc., meaning
1-20 is person_id, 20-30 is clientID, 30-80 is client_name, etc.

Thanks,
Pargat

Feb 26 '07 #8
Well, you can't just drop them in a database per se, you would have to
read the data from the files and add each record to the database.

It's a bit of work to get it into the database, but once that is done,
you can very easily fetch all data associated with a user in a fraction
of the time it takes to loop through the files for it. The time needed
to get the data would typically go from several seconds to a few
milliseconds.
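
Reading the data from a fixed-position file like the one described could look roughly like the sketch below. The column boundaries are an assumption: the ranges given in the thread (1-20, 20-30, 30-80) overlap at the edges, so this treats them as zero-based offsets 0-19, 20-29 and 30-79. `MemberRecord` and `FixedWidthParser` are hypothetical names, and the real files have more fields than the three shown.

```csharp
using System;

// Hypothetical record type holding the three fields named in the thread.
public class MemberRecord
{
    public string PersonId;
    public string ClientId;
    public string ClientName;
}

public static class FixedWidthParser
{
    // Returns the trimmed text at [start, start + length), tolerating short lines.
    static string Field(string line, int start, int length)
    {
        if (start >= line.Length) return string.Empty;
        int available = Math.Min(length, line.Length - start);
        return line.Substring(start, available).Trim();
    }

    // Assumed layout: 0-19 person_id, 20-29 clientID, 30-79 client_name.
    public static MemberRecord Parse(string line)
    {
        MemberRecord r = new MemberRecord();
        r.PersonId = Field(line, 0, 20);
        r.ClientId = Field(line, 20, 10);
        r.ClientName = Field(line, 30, 50);
        return r;
    }
}
```

Each parsed record can then be inserted into the matching table, one table per file.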

--
Göran Andersson
_____
http://www.guffa.com
Feb 26 '07 #9
To the original poster...

First, I have to agree that the best solution is probably to just copy all
of the data into a relational database, one table for each file, where the
member ID (whatever you're using...name, ID number, whatever) is the key
that ties the tables together. You can then just run through all the keys
and let the database itself handle extraction of the data. It will even
be able to return all of the data as a single record, which you can just
then rewrite out to whatever new text file you want (assuming, of course,
that even once all the data is in a relational database, you really need a
new text file).

If you really don't want to do that, then to improve performance you need
to process the data differently. My first question would be: is the data
ordered in the original files in the first place? If so, then you should
not have to scan all of the files repeatedly...you should be able to read
through them in sequence, merging matching data as you find it.
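
As a rough sketch of that sequential merge, assuming both the member list and the detail file are sorted ascending by member ID under ordinal string comparison, that the ID is the first 20 characters of each detail line, and that each ID appears only once in the member list. `SortedMerge`, `KeyOf` and `Match` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class SortedMerge
{
    // Assumption: the member ID occupies the first 20 characters of a line.
    static string KeyOf(string line)
    {
        return line.Substring(0, Math.Min(20, line.Length)).Trim();
    }

    // Walks both sorted inputs once and returns the detail lines whose
    // key matches one of the requested member IDs.
    public static List<string> Match(IEnumerable<string> memberIds,
                                     IEnumerable<string> detailLines)
    {
        List<string> output = new List<string>();
        using (IEnumerator<string> details = detailLines.GetEnumerator())
        {
            bool hasDetail = details.MoveNext();
            foreach (string id in memberIds)
            {
                // Skip detail records that sort before the current member.
                while (hasDetail && string.CompareOrdinal(KeyOf(details.Current), id) < 0)
                    hasDetail = details.MoveNext();

                // Collect every detail record belonging to the current member.
                while (hasDetail && string.CompareOrdinal(KeyOf(details.Current), id) == 0)
                {
                    output.Add(details.Current);
                    hasDetail = details.MoveNext();
                }
            }
        }
        return output;
    }
}
```

Each file is read exactly once this way, however many members there are. Because the inputs are plain `IEnumerable<string>`, they can be streamed line by line rather than loaded whole.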

If the data is not ordered in the original files, then the next best thing
would be to do what a real database would do anyway: index the files.
Scan through each of the files once, creating a new file (or an in-memory
data structure, if your original data is small enough that the indices for
all of your files will fit in memory) that contains a sorted index based
on the member ID, so that you can quickly find the records of interest. A
binary search on the index will allow you to find any given record in a
very short period of time...much better than scanning each file repeatedly.
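
A sketch of such an index, kept in memory, under several assumptions: the files are single-byte (ASCII) text, the member ID is the first `keyLength` characters of each line, and each member has at most one record per file. `FileIndex` is a hypothetical name. It stores only the key and the byte offset of each record, then uses `Array.BinarySearch` to locate a record and seeks straight to it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class FileIndex
{
    private readonly string path;
    private readonly string[] keys;   // sorted ascending, ordinal comparison
    private readonly long[] offsets;  // offsets[i] = byte position of the record for keys[i]

    // Scans the file once, recording where each line starts.
    public FileIndex(string path, int keyLength)
    {
        this.path = path;
        List<string> k = new List<string>();
        List<long> o = new List<long>();
        using (FileStream fs = File.OpenRead(path))
        {
            long lineStart = 0;
            StringBuilder line = new StringBuilder();
            int b;
            while ((b = fs.ReadByte()) != -1)
            {
                if (b == '\n')
                {
                    Add(k, o, line.ToString(), lineStart, keyLength);
                    line.Length = 0;
                    lineStart = fs.Position;
                }
                else
                {
                    line.Append((char)b);
                }
            }
            Add(k, o, line.ToString(), lineStart, keyLength); // last line without newline
        }
        keys = k.ToArray();
        offsets = o.ToArray();
        Array.Sort(keys, offsets, StringComparer.Ordinal); // sorts both arrays by key
    }

    static void Add(List<string> k, List<long> o, string text, long start, int keyLength)
    {
        if (text.Trim().Length == 0) return; // ignore blank lines
        k.Add(text.Substring(0, Math.Min(keyLength, text.Length)).Trim());
        o.Add(start);
    }

    // Returns the full record line for the key, or null if it is not indexed.
    public string Find(string key)
    {
        int i = Array.BinarySearch(keys, key, StringComparer.Ordinal);
        if (i < 0) return null;
        using (StreamReader reader = new StreamReader(File.OpenRead(path), Encoding.ASCII))
        {
            reader.BaseStream.Seek(offsets[i], SeekOrigin.Begin);
            return reader.ReadLine();
        }
    }
}
```

Reopening the file on every lookup keeps the sketch short. A real version would more likely keep one stream open per file for the duration of the batch.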

If the files you are processing are small enough (say, less than 1GB of
total data across all of the files, though this is not a hard-and-fast
limit), then you might have some success simply reading all of the files
into memory in some easily-searched data structure (eg a .NET
dictionary-list or sorted linked list). Let .NET do all the work of
looking the data up, and on a computer with enough physical RAM this
should speed things up a lot as well.
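
For the in-memory route, one possible shape is a `Dictionary<string, List<string>>` per file, keyed by member ID. This is only a sketch: `MemberLookup` and `Load` are hypothetical names, and taking the ID from the first `keyLength` characters of each line is an assumption about the file layout.

```csharp
using System;
using System.Collections.Generic;

public static class MemberLookup
{
    // Groups the lines of one file by member ID so that later lookups
    // are hash-table lookups instead of file scans.
    public static Dictionary<string, List<string>> Load(IEnumerable<string> lines, int keyLength)
    {
        Dictionary<string, List<string>> byMember =
            new Dictionary<string, List<string>>(StringComparer.Ordinal);

        foreach (string line in lines)
        {
            if (line.Trim().Length == 0) continue; // ignore blank lines

            string key = line.Substring(0, Math.Min(keyLength, line.Length)).Trim();

            List<string> bucket;
            if (!byMember.TryGetValue(key, out bucket))
            {
                bucket = new List<string>();
                byMember.Add(key, bucket);
            }
            bucket.Add(line); // a member may have several records in one file
        }
        return byMember;
    }
}
```

Load each of the 7-8 files once, then for every member in ABC.txt call `TryGetValue` on each dictionary and write out whatever comes back.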

These are all general suggestions. You haven't provided much in the way
of detail of your actual data or problem, so it's hard to provide anything
more useful than that.

Pete
Feb 28 '07 #10
