
Quickest way to load data

Hello there.

I'm having several issues with loading data into an Oracle database. I am developing in ASP.NET with C# and the source data is in a CSV file.

What I am looking for is the most efficient approach people here have found for loading the data within these constraints. I don't want to load all of the CSV data, but I do need about 70% of it, maybe a little more, and the CSV file itself holds some 30 million records.

I am finding that a file of just 750,000 records is taking an hour to insert into an Oracle table, so any suggestions on the quickest method would be greatly appreciated.

I have tried dynamic code and stored procedures with the same results, which was somewhat unexpected.

Thank you.

Mark :)
Dec 1 '08 #1
4 Replies


balabaster
Doing a quick search on Google, I found that Oracle has a tool called SQL*Loader, which is similar to SQL Server's BCP bulk upload or DTS for importing data from various sources. You can configure it to load the data in from the CSV, and given that Oracle wrote it, it's likely to be the most efficient mechanism for parsing the data into the Oracle tables...
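
For illustration, a minimal control file might look something like this (table, column and file names here are just placeholders, not from your setup):

  LOAD DATA
  INFILE 'data.csv'
  APPEND
  INTO TABLE my_table
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  (col1, col2, col3)

You would then invoke it with something along the lines of sqlldr userid=user/password control=load.ctl log=load.log.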
Dec 1 '08 #2

Hi there and thank you for the reply.

Yes, I have tried Oracle's SQL*Loader and it works super fast. The problem I have is that I am using a CSV file, and while it is easy enough to cherry-pick the columns I want from the file after parsing it, the SQL*Loader CTL file format requires a matching column line-up; as my table has more columns than this particular CSV file, the whole thing falls apart.

It is the like-for-like column matching in the control file versus the database table column ordering which is causing me problems.

Using the POSITION parameter would be good, but that applies to fixed-width records, so it is not applicable to a CSV file.

As it currently stands, the database table columns can be simplified to:

col1, col2, col3, col4, col5, col6

But my control file may have col1, col4, col5

hence if I use sqlldr with my CTL file and CSV input, the wrong data is being uploaded to the wrong columns, albeit very quickly!
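
To illustrate the ordering issue with made-up values: given a CSV line of

  a,b,c

sqlldr maps the values to the control file's field list strictly in order of appearance, first value to the first listed field, second to the second, and so on; nothing in a delimited load ties a value to a column by its position in the table. It is getting that field list to line up correctly against the six-column table that I have not managed to do.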

If anybody has a way around this, then that would be helpful indeed.

Thank you.

Mark
Dec 1 '08 #3

balabaster
Expert 100+
P: 797
Okay, could you two-step it? Bulk load into a dummy table that has a trigger attached to it, and let the trigger forward the data on into the real table?

In an ideal world it would be configurable so that you could cherry-pick your columns directly, but given that it's not, could you do something like this (sketched below)?:
  • Create a script that builds a dummy table and attaches a trigger to it that passes incoming data out to the correct columns in the real table, thus allowing you to pick and choose which fields are/aren't included.
  • Script the SQL*Loader import to grab the data from the CSV into the dummy table, which will have the same number of columns as your CSV.
  • As SQL*Loader pushes data into the dummy table, the trigger forwards it out to the real table; your script can then drop the dummy table at the end of processing.

It's not what you might call "tidy", but it would make use of the loader's performance benefits while adding the flexibility you're after that it doesn't provide out of the box, and it would be relatively easy to maintain as it's a single script.
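
Sketched against your simplified columns (names and types are placeholders, so adjust to suit):

  -- dummy table mirroring the CSV's fields, one column per field
  CREATE TABLE csv_stage (
    c1 VARCHAR2(4000), c2 VARCHAR2(4000), c3 VARCHAR2(4000)
  );

  -- trigger forwards each incoming row to the right columns of the real table
  CREATE OR REPLACE TRIGGER csv_stage_fwd
  BEFORE INSERT ON csv_stage
  FOR EACH ROW
  BEGIN
    INSERT INTO real_table (col1, col4, col5)
    VALUES (:NEW.c1, :NEW.c2, :NEW.c3);
  END;
  /

  -- control file: field list matches the dummy table one-for-one
  LOAD DATA
  INFILE 'data.csv'
  APPEND
  INTO TABLE csv_stage
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  (c1, c2, c3)

  -- at the end of processing:
  -- DROP TABLE csv_stage;

One caveat: a direct path load (direct=true) bypasses insert triggers, so this approach relies on a conventional path load.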
Dec 1 '08 #4

Have you looked at the Fast CSV Reader on CodeProject? The author, Sebastien Lorion, has tested it for performance...

-Shiva
mycodetrip.com
Dec 1 '08 #5
