473,324 Members | 2,248 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,324 software developers and data experts.

Best way to process a file with 20,000 records

Me
Hi all

What is the best way to upload and then process a csv file that
contains around 20,000 records?
The data needs to be validated and then added to the database. All
records that dont satisfy the validation requirements will need to be
recorded. However the update can still go ahead for those that do

I am using Dotnet 1.1 with Sql Server 2000 as the datasource

Is it a good idea to do this in a different thread, or have a windows
service perform this task?

Any ideas/suggestions anyone?

Jan 31 '06 #1
3 1324
On 31 Jan 2006 00:47:28 -0800, "Me" <il***@igsoftwaresolutions.co.uk>
wrote:
Hi all

What is the best way to upload and then process a csv file that
contains around 20,000 records?
The data needs to be validated and then added to the database. All
records that dont satisfy the validation requirements will need to be
recorded. However the update can still go ahead for those that do

I am using Dotnet 1.1 with Sql Server 2000 as the datasource

Is it a good idea to do this in a different thread, or have a windows
service perform this task?

Any ideas/suggestions anyone?


If this is a daily scheduled job?, then you can write a console
application that parses the file and validates the data, that is
called by the scheduler on the installation machine. Use the command
line to give the application the file name if it changes frequently.

I'm afraid there is no way to do automatic validation of the data. You
will have to code the validation process.

Rows which fail validation can be written out to another file with a
statement describing the failure to validate,

I do this a lot and much larger files can be handled. In fact, the
environment I work in (main frame and AS/400) has files like this with
millions of rows.

Here is the basic methodology:

// get the file name from the command line
// open the file
// read a line
// split it into an array with the string.split method
// loop through the array elements validating each one
// if the row is valid pass it to a method that does the insert
// if not, tag the row with a statement describing the reason it
failed
// write the failed line out to the error file
// read another row and repeat

That's a pretty high level overview, but you can get the idea.
Otis Mukinfus
http://www.otismukinfus.com
http://www.tomchilders.com
Jan 31 '06 #2
I would do what Otis suggested. I process files with millions of rows
as well. Parse out the bad lines to a "NotImported.csv" file with an
error description. Then with the good records, i issue a BULK INSERT
statement and provide the path to the csv file to import it quickly
into SQL server.

Jan 31 '06 #3
You can try my parser I sell that will handle all the csv file parsing
and creating for you. I would basically do what the other people have
said, only I would use dts for the bulk insert rather than bulk insert
as it can handle true csv files whereas the bulk insert and bcp cannot.

Bruce Dunwiddie
http://www.csvreader.com

Jan 31 '06 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

19
by: Peter A. Schott | last post by:
I've got a file that seems to come across more like a dictionary from what I can tell. Something like the following format: ###,1,val_1,2,val_2,3,val_3,5,val_5,10,val_10...
5
by: yvan | last post by:
Approximately once a month, a client of ours sends us a bunch of comma-delimited text files which I have to clean up and then import into their MS SQL database. All last week, I was using a Cold...
16
by: LP | last post by:
Hi, Every morning a .NET application downloads a file with cumulative data which needs to be appended to SQL Server table. This program needs to identify records that have not been previously...
1
by: Kevin Frey | last post by:
I've been spending considerable time thinking about the various implications that come into play when building a production-quality web application and this has prompted me to ask whether there are...
1
by: Shawn Ferguson | last post by:
Hello All, Before I starting writing the code, I would like to figure out what the best approach to this problem would be. I want to create a process that imports a text file (Pipe | delimited),...
0
by: =?Utf-8?B?UHVjY2E=?= | last post by:
Hi, I'm using vs2005, .net 2 for windows application. I have to add a new form to my application. This will will process several records from a datatable. In this form, 1. I need to have a log...
6
by: kamsmartx | last post by:
I'm new to programming and need some help figuring out an algorithm. I need to design some kind of algorithm which will help us define capacity for one of our portfolios....here's the problem...
14
by: Patrick A | last post by:
All, I have an Access DB. On a nightly basis, I want to look at an Other DB (not Access, but SQL) and: + Add any new records from Other.Clients into Access.Clients Is this something I...
1
by: =?Utf-8?B?QW1lbGlh?= | last post by:
Hello, Apologies for the easy coding questions but I have something I need to code in .Net and am fairly new to coding and don't want to produce bad code so after some advice from the...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
1
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 6 Mar 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, we are pleased to welcome back...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
0
by: jfyes | last post by:
As a hardware engineer, after seeing that CEIWEI recently released a new tool for Modbus RTU Over TCP/UDP filtering and monitoring, I actively went to its official website to take a look. It turned...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.