Bytes IT Community

Best way to process a file with 20,000 records

Me
Hi all

What is the best way to upload and then process a CSV file that
contains around 20,000 records?
The data needs to be validated and then added to the database. All
records that don't satisfy the validation requirements will need to be
recorded; however, the update can still go ahead for those that do.

I am using .NET 1.1 with SQL Server 2000 as the data source.

Is it a good idea to do this in a different thread, or to have a
Windows service perform this task?

Any ideas/suggestions anyone?

Jan 31 '06 #1
3 Replies


On 31 Jan 2006 00:47:28 -0800, "Me" <il***@igsoftwaresolutions.co.uk>
wrote:

> Hi all
>
> What is the best way to upload and then process a csv file that
> contains around 20,000 records?
> The data needs to be validated and then added to the database. All
> records that dont satisfy the validation requirements will need to be
> recorded. However the update can still go ahead for those that do
>
> I am using Dotnet 1.1 with Sql Server 2000 as the datasource
>
> Is it a good idea to do this in a different thread, or have a windows
> service perform this task?
>
> Any ideas/suggestions anyone?


If this is a daily scheduled job, you can write a console application
that parses the file and validates the data, and have the scheduler on
the installation machine call it. Pass the application the file name
on the command line if the name changes frequently.

I'm afraid there is no way to do automatic validation of the data. You
will have to code the validation process.

Rows which fail validation can be written out to another file with a
statement describing the failure to validate.

I do this a lot, and much larger files can be handled. In fact, the
environment I work in (mainframe and AS/400) has files like this with
millions of rows.

Here is the basic methodology:

// get the file name from the command line
// open the file
// read a line
// split it into an array with the string.split method
// loop through the array elements validating each one
// if the row is valid pass it to a method that does the insert
// if not, tag the row with a statement describing the reason it failed
// write the failed line out to the error file
// read another row and repeat

That's a pretty high level overview, but you can get the idea.
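The steps above could be fleshed out into a minimal C# sketch. The
validation rule, sample data, and rejection format below are hypothetical
stand-ins; a real job would read from the file named on the command line
and pass valid rows to an insert method instead of counting them:

```csharp
using System;
using System.IO;

class CsvLoader
{
    // Hypothetical validation: require three fields with a numeric id.
    // Returns null for a valid row, or the reason the row failed.
    static string Validate(string[] fields)
    {
        if (fields.Length != 3) return "wrong field count";
        try { int.Parse(fields[0]); } catch { return "id not numeric"; }
        return null;
    }

    static void Main(string[] args)
    {
        // Real job: StreamReader reader = new StreamReader(args[0]);
        TextReader reader = new StringReader("1,alpha,10\nx,beta,20\n2,gamma,30");
        // Real job: TextWriter errors = new StreamWriter("errors.csv");
        TextWriter errors = Console.Out;

        string line;
        int inserted = 0;
        while ((line = reader.ReadLine()) != null)
        {
            // split the line into an array and validate each element
            string[] fields = line.Split(',');
            string reason = Validate(fields);
            if (reason == null)
                inserted++; // real job: pass fields to the INSERT method
            else
                errors.WriteLine(line + ",REJECTED: " + reason);
        }
        Console.WriteLine("inserted " + inserted);
    }
}
```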
Otis Mukinfus
http://www.otismukinfus.com
http://www.tomchilders.com
Jan 31 '06 #2

I would do what Otis suggested. I process files with millions of rows
as well. Parse out the bad lines to a "NotImported.csv" file with an
error description. Then, with the good records, I issue a BULK INSERT
statement and provide the path to the CSV file to import it quickly
into SQL Server.
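A minimal sketch of such a BULK INSERT statement; the table name, file
path, and terminators here are hypothetical placeholders and need to
match your schema and file layout:

```sql
-- Load the pre-validated rows from the cleaned CSV file.
-- ImportStaging and the path below are placeholder names.
BULK INSERT ImportStaging
FROM 'C:\data\GoodRecords.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
```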

Jan 31 '06 #3

You can try the parser I sell, which will handle all the CSV parsing
and creation for you. I would basically do what the other people have
said, only I would use DTS for the bulk insert rather than the BULK
INSERT statement, since DTS can handle true CSV files (quoted fields
that may contain the delimiter) whereas BULK INSERT and bcp cannot.
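To illustrate the "true CSV" point: a quoted field may itself contain
the delimiter, which a naive split on commas mishandles. A small C#
illustration (not from the original posts):

```csharp
using System;

class QuotedCsvDemo
{
    static void Main()
    {
        // A "true" CSV row: the second field is quoted and contains a comma.
        string line = "42,\"Smith, John\",London";

        // Naive splitting breaks the quoted field into two pieces,
        // yielding 4 parts instead of the 3 logical fields.
        string[] fields = line.Split(',');
        Console.WriteLine(fields.Length);
    }
}
```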

Bruce Dunwiddie
http://www.csvreader.com

Jan 31 '06 #4
