Bulk Data insert excluding the duplicates

Hi,

I have a tmp_table with, say, 10,000 records and a main_table with, say, 50 million records.

A few records of tmp_table already exist in main_table, so I need to insert only the new records into main_table.

I have a primary key in both tables.

I am running a daemon which does this constantly.

Can anyone please help me find an efficient solution?

I tried PL/SQL exception handling and MINUS... Is there any other way to do it?

Thanks in Advance.

Rohit
Mar 11 '08 #1
amitpatel66
Try this:

1. Use a CURSOR to hold the records that are not yet in the main table.
2. Use a FORALL bulk INSERT to insert the records from the CURSOR into the main table (with a LIMIT of 1000).
3. COMMIT.
4. Repeat steps 2 and 3 for every 1000 records.
Mar 11 '08 #2

But how do I handle duplicate records? The first step is the problem: how do I get the set of records that are not yet inserted?
Mar 11 '08 #3
amitpatel66
How about this code:

DECLARE
    TYPE tempdata IS TABLE OF temp_table%ROWTYPE;
    td tempdata;
    -- fetch only the rows whose key is not already in main_table
    CURSOR c_new_rows IS
        SELECT * FROM temp_table
        WHERE <primary_key_column> NOT IN
            (SELECT <primary_key_column> FROM main_table);
BEGIN
    OPEN c_new_rows;
    LOOP
        FETCH c_new_rows BULK COLLECT INTO td LIMIT 1000;
        EXIT WHEN td.COUNT = 0;  -- also guards FORALL against an empty batch
        FORALL i IN 1 .. td.COUNT
            INSERT INTO main_table VALUES td(i);
        COMMIT;
    END LOOP;
    CLOSE c_new_rows;
END;
Note: Not Tested!!
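
One caveat: since the daemon runs constantly, a duplicate could still slip into main_table between the cursor check and the INSERT. A hedged variant (a sketch, also untested, using the same placeholder tables) is FORALL ... SAVE EXCEPTIONS, so rows that violate the primary key are skipped instead of aborting the whole batch:

DECLARE
    TYPE tempdata IS TABLE OF temp_table%ROWTYPE;
    td tempdata;
    bulk_errors EXCEPTION;
    PRAGMA EXCEPTION_INIT(bulk_errors, -24381);  -- ORA-24381: errors in array DML
BEGIN
    SELECT * BULK COLLECT INTO td FROM temp_table;
    FORALL i IN 1 .. td.COUNT SAVE EXCEPTIONS
        INSERT INTO main_table VALUES td(i);
    COMMIT;
EXCEPTION
    WHEN bulk_errors THEN
        -- duplicate keys raised ORA-00001 and were skipped;
        -- SQL%BULK_EXCEPTIONS holds the indexes of the failed rows
        COMMIT;
END;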
Mar 11 '08 #4
This should work; I will have to tweak it a bit.

Thanks a lot.
Mar 12 '08 #5
amitpatel66
Yes, the cursor will fetch only the records that are not already in the main table, which you can then happily insert into the main table.

Do post back in case of any further issues.
Mar 12 '08 #6
Dave44
I agree, bulk processing is a great way to go. I would just change the cursor query to use an anti-join instead of the NOT IN; the anti-join will almost always perform faster, especially with large tables.
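
For illustration, here is the earlier cursor rewritten as an anti-join using NOT EXISTS (a sketch, untested, keeping the same placeholder column name):

CURSOR c_new_rows IS
    SELECT t.*
    FROM temp_table t
    WHERE NOT EXISTS (SELECT 1
                      FROM main_table m
                      WHERE m.<primary_key_column> = t.<primary_key_column>);

Unlike NOT IN, NOT EXISTS also behaves correctly if the key column in main_table could ever contain NULLs, and Oracle can execute it as a hash anti-join.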
Mar 13 '08 #7
amitpatel66
I have noticed problems many times with an anti-join producing incorrect results. Using set operations is another way of doing it.
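
For example, a MINUS-based version (a sketch, untested, again assuming the <primary_key_column> placeholder):

INSERT INTO main_table
SELECT * FROM temp_table
WHERE <primary_key_column> IN
    (SELECT <primary_key_column> FROM temp_table
     MINUS
     SELECT <primary_key_column> FROM main_table);

The MINUS strips out the keys already present in main_table before the insert runs.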
Mar 13 '08 #8
