473,657 Members | 2,505 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

BCP Order on SQL Server

I have two SQL Server 2000 machines. The same file is sent nightly to
each server and a stored proc uses BULK INSERT to load it into a
staging table for processing.

Once I've bcp'ed it in, I put it into a temp table with an IDENTITY
column appended to it. (I need this identity column to group by later
on to remove duplicates.)

ie

select tempo.*,
IDENTITY(int, 1,1) AS ID_Num
into #test1
from tempExtract tempo

My question is : can I expect the ID_Num and the corresponding line of
the file copied to the table to be the same on each server? Ie will
each BCP into the staging table occur in the same order on both
servers given that the file, the BULK INSERT command and the indexes
are the same on each server.
Jul 20 '05 #1
10 4495
Thomas Richards (to**********@r ocketmail.com) writes:
I have two SQL Server 2000 machines. The same file is sent nightly to
each server and a stored proc uses BULK INSERT to load it into a
staging table for processing.

Once I've bcp'ed it in, I put it into a temp table with an IDENTITY
column appended to it. (I need this identity column to group by later
on to remove duplicates.)

ie

select tempo.*,
IDENTITY(int, 1,1) AS ID_Num
into #test1
from tempExtract tempo

My question is : can I expect the ID_Num and the corresponding line of
the file copied to the table to be the same on each server? Ie will
each BCP into the staging table occur in the same order on both
servers given that the file, the BULK INSERT command and the indexes
are the same on each server.


No, you would need to have the identity column on the table you load
the file into. I don't know for sure that you can trust the IDENTITY
value to match the input file exactly, and if it works, it is likely
to by mere chance. That is, there is no committment from Microsoft
that it should work, and it could change in a future version of SQL
Server.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #2
Erland Sommarskog <es****@sommars kog.se> wrote in message news:<Xn******* *************** @127.0.0.1>...
Thomas Richards (to**********@r ocketmail.com) writes:
I have two SQL Server 2000 machines. The same file is sent nightly to
each server and a stored proc uses BULK INSERT to load it into a
staging table for processing.

Once I've bcp'ed it in, I put it into a temp table with an IDENTITY
column appended to it. (I need this identity column to group by later
on to remove duplicates.)

ie

select tempo.*,
IDENTITY(int, 1,1) AS ID_Num
into #test1
from tempExtract tempo

My question is : can I expect the ID_Num and the corresponding line of
the file copied to the table to be the same on each server? Ie will
each BCP into the staging table occur in the same order on both
servers given that the file, the BULK INSERT command and the indexes
are the same on each server.


No, you would need to have the identity column on the table you load
the file into. I don't know for sure that you can trust the IDENTITY
value to match the input file exactly, and if it works, it is likely
to by mere chance. That is, there is no committment from Microsoft
that it should work, and it could change in a future version of SQL
Server.


Thanks for that. The problem that I'm trying to get round is that I
have a key field and then one or more addresses. The key field and the
fields that make up the address are all chars/varchars. I want to pick
one arbitrary address to associate with the key and put in another
table. There are no business rules (eg always take the one with the
lowest street number) that will always identify just one of the
addresses. Originally, I thought group by key and line number and pick
the one with the lowest number. I would prefer to do this as it would
get the first entry in the file which would more than likely give me
the better address details. However as you've pointed out I can't
depend on the order when bcp'ed in. Is there any other way to do this
or would I have to get line number added to the file before SQL Server
processes it?
Jul 20 '05 #3
Thomas Richards (to**********@r ocketmail.com) writes:
Thanks for that. The problem that I'm trying to get round is that I
have a key field and then one or more addresses. The key field and the
fields that make up the address are all chars/varchars. I want to pick
one arbitrary address to associate with the key and put in another
table. There are no business rules (eg always take the one with the
lowest street number) that will always identify just one of the
addresses. Originally, I thought group by key and line number and pick
the one with the lowest number. I would prefer to do this as it would
get the first entry in the file which would more than likely give me
the better address details. However as you've pointed out I can't
depend on the order when bcp'ed in. Is there any other way to do this
or would I have to get line number added to the file before SQL Server
processes it?


The only way to be sure is to add the line numbers yourself. This can be
done in two ways: 1) Manipulate the file, by running it through a program
that adds a line number. 2) Instead of writing a to file, have the program
to insert the data. In fact, you can still use bulk load, but you would
bulk from variables, using the BCP API.

However, BULK INSERT into a table with an IDENTITY gives you fairly good
odds, and as I understand your case, it does not seem to be a disaster,
if number is not what you expect. So I would go for that.
--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #4
Thanks,

I'm going to have problems adding the line numbers to the file unless
there's an easy way to do it that uses windows built-in functionality.
Unfortunately the file comes from a mainframe extract so the I can't
change the program either.

Just to be clear about what I'm doing, I bulk insert into a table that
has a KEY field and then one or more addresses eg:

KEY, ADDRESS1, ADDRESS2
-----------------------------
FRED, HOG STREET, HOGLAND
FRED, HOG STREET, HOGLANDIO

I need to take the key (ie FRED) and one address (doesn't matter
which) and put it into another table. However, I have to get the same
address on each server. That's more important than trying to get the
first one in the file.

I'm going to try out the bulk insert with identity. If I put a
clustered key on the table that is bulk inserted to on all columns, I
would assume that would force the order in the table to be the same on
both servers - what do you think?

Cheers
Tom
The only way to be sure is to add the line numbers yourself. This can be
done in two ways: 1) Manipulate the file, by running it through a program
that adds a line number. 2) Instead of writing a to file, have the program
to insert the data. In fact, you can still use bulk load, but you would
bulk from variables, using the BCP API.

However, BULK INSERT into a table with an IDENTITY gives you fairly good
odds, and as I understand your case, it does not seem to be a disaster,
if number is not what you expect. So I would go for that.

Jul 20 '05 #5
My latest thinking on this is to create a table the same as the table
holding the 'key' and address components but with an extra identity
type field.

Then insert into this table ordering by key + all columns. This will
force the sequence number to match the same row on each server and the
'key' fields to be sequentially next to each other. Then I can do a
group by, picking up the lowest sequence number.

eg

SELECT KEY,
ADDRESS1,
ADDRESS2,
IDENTITY(int,1, 1) as Seq
INTO tempTable
FROM tempExtract
WHERE 1=2

INSERT INTO tempTable
SELECT KEY,
ADDRESS1,
ADDRESS2
FROM tempExtract
ORDER BY KEY,
ADDRESS1,
ADDRESS2

-- Finally get a key with just one address
SELECT KEY,
ADDRESS1,
ADDRESS2
FROM tempTable
WHERE SEQ = (SELECT MIN (Seq)
FROM tempTable sub
WHERE sub.KEY = tempTable.KEY)

Can you see any holes in that?!

Cheers
Tom
Jul 20 '05 #6
Thomas Richards (to**********@r ocketmail.com) writes:
I'm going to have problems adding the line numbers to the file unless
there's an easy way to do it that uses windows built-in functionality.
Adding such line numbers is a very simple program that can be written
VBscript, Perl, C or whatever your preference is. The one catch is that
this is not very effcient if the file is huge.
I'm going to try out the bulk insert with identity. If I put a
clustered key on the table that is bulk inserted to on all columns, I
would assume that would force the order in the table to be the same on
both servers - what do you think?
What matters is the order that the rows hit the tables. My guess is that
a completely indexless table is better.
INSERT INTO tempTable
SELECT KEY,
ADDRESS1,
ADDRESS2
FROM tempExtract
ORDER BY KEY,
ADDRESS1,
ADDRESS2
...

Can you see any holes in that?!


The sad case is that neither is there any guarantee with an INSERT
statement that the IDENITTY value will reflect the ORDER BY statement.
But if you add OPTION (MAXDOP 1) to the query it usually works. MAXDOP 1
turns off parallelism which is the major reason the ORDER BY gets messed up.

But maybe you should rethink completely. Maybe you should bulk into one
server, remove the duplicates, and the replicate the result to the
second server. This could be done by a linked query, or bulking out and
in again.


--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #7
My latest thinking on this is to create a table the same as the table
holding the 'key' and address components but with an extra identity
type field.

Then insert into this table ordering by key + all columns. This will
force the sequence number to match the same row on each server and the
'key' fields to be sequentially next to each other. Then I can do a
group by, picking up the lowest sequence number.

eg

SELECT KEY,
ADDRESS1,
ADDRESS2,
IDENTITY(int,1, 1) as Seq
INTO tempTable
FROM tempExtract
WHERE 1=2

INSERT INTO tempTable
SELECT KEY,
ADDRESS1,
ADDRESS2
FROM tempExtract
ORDER BY KEY,
ADDRESS1,
ADDRESS2

-- Finally get a key with just one address
SELECT KEY,
ADDRESS1,
ADDRESS2
FROM tempTable
WHERE SEQ = (SELECT MIN (Seq)
FROM tempTable sub
WHERE sub.KEY = tempTable.KEY)

Can you see any holes in that?!

Cheers
Tom
Jul 20 '05 #8
Thomas Richards (to**********@r ocketmail.com) writes:
My latest thinking on this is to create a table the same as the table
holding the 'key' and address components but with an extra identity
type field.

Then insert into this table ordering by key + all columns. This will
force the sequence number to match the same row on each server and the
'key' fields to be sequentially next to each other. Then I can do a
group by, picking up the lowest sequence number.
...


That seems to the same suggestion, to which I answered once, so I
simply repear that answer:

The sad case is that neither is there any guarantee with an INSERT
statement that the IDENITTY value will reflect the ORDER BY statement.
But if you add OPTION (MAXDOP 1) to the query it usually works. MAXDOP 1
turns off parallelism which is the major reason the ORDER BY gets messed up.

But maybe you should rethink completely. Maybe you should bulk into one
server, remove the duplicates, and the replicate the result to the
second server. This could be done by a linked query, or bulking out and
in again.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #9
Sorry, what I meant to post was:

I'm thinking of doing the following to ensure I get the same row on
each server (assuming identical collations).
It is supposed to only bring back the row that matches the TOP value
of all the fields concatenated. I've tested it on identical servers
and it appears to work and the theory seems fine to me. Can you see
anything wrong with this??

Thanks
Tom

SELECT KEY,
ADDRESS1,
ADDRESS2
FROM TABLE main
WHERE ISNULL(KEY,'Z') +
ISNULL(ADDRESS1 , 'Z') +
ISNULL(ADDRESS2 , 'Z') =
(SELECT TOP 1 ISNULL(KEY, 'Z') +
ISNULL(ADDRESS1 , 'Z') +
ISNULL(ADDRESS2 , 'Z') =
FROM TABLE sub
WHERE sub.KEY = main.KEY
ORDER BY ISNULL(KEY,'Z') +
ISNULL(ADDRESS1 , 'Z') +
ISNULL(ADDRESS2 , 'Z'))
Jul 20 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
2623
by: Bennett Haselton | last post by:
I want to get an ASP.Net hosting account with my ISP, and I'm trying to find out what level of access to the server is requried in order for me to view the server in Server Explorer in Visual Studio .Net, where you can see the available database, expand it to get a list of tables, etc. What level of access is required? Of course you need to have an account with the right user rights, but is there some specific service that needs to be...
4
3840
by: Tryfon Gavriel | last post by:
Hi all I recently noticed when trying to optimise a major query of a chess website I am the webmaster of, that adding an order by for "gamenumber" which is a clustered index field as in for example "order by timeleft desc, gamenumber desc" actually speeded up the queries and reduced sql server 2000 timeouts. I have an ASP error log and I am fairly sure that a dramatic reduction in sql server timeouts is simply attributed to adding an...
8
4183
by: Bill Sonia | last post by:
I've written a Windows Service to send e-mails on events like OnStart, OnStop, OnShutDown using System.Web.Mail. It works for everything but OnShutdown. My guess is that for OnShutDown, once my send mail code is executed, other necessary Windows Services have been terminated before it can actually send the mail. I've updated my HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services registry DependOnService value including SMPTSVC and...
0
987
by: Mamatha | last post by:
Hi i have a code to retrieve table structures of SQL Server database. It gives all the table names in particular databse with the column names and datatypes also. But the output displays not in the exact order of what i am designing in SQL Server database,mainly coulmnnames are not in order. I want to displayed output exactly like what i was designed in SQL Server table.
104
10857
by: Beowulf | last post by:
I have the view below and if I use vwRouteReference as the rowsource for a combo box in an MS Access form or run "SELECT * FROM vwRouteReference" in SQL Query Analyzer, the rows don't come through sorted by Numb. Everything I've read on the web suggests that including the TOP directive should enable ORDERY BY in views. Does someone have an idea why the sorting is not working correctly for this particular view? thanks. CREATE VIEW...
13
4438
by: bevanward | last post by:
Hi All I am finding unexpected results when inserted into a newly created table that has a field of datatype int identity (1,1). Basically the order I sort on when inserting into the table is not reflected in the order of the values from the identity field. Have I been wrong in assuming that it should reflect the order from the sort?
24
1985
by: Hurricane | last post by:
When I create a view in SQL and include an ORDER BY clause i can see it in Management Studio. However, when I call the same view from an ASP page the order goes completely haywire. Any ideas?
10
4909
by: Sharif Islam | last post by:
this is my query= "SELECT i.itemid,title,SortKey from Items AS i JOIN Links AS L ON (i.ItemID=L.ItemID) WHERE L.instructorID='12232' AND courseID='12' AND type='Audio' order by CASE WHEN Sortkey is not null then 1 else 0 end" My SortKey can be NULL. Here's the output I am getting: (the || is to denote sortkey column) 37542 Tape 1 ||
3
1674
by: Shum | last post by:
Hi .. another quey... i just noticed that the entries being done to the sql server tables ( insert query in C# )are not in the correct order.. the last entry is not always the last record in the table.. How to overcome it? i want all the records in the order of insertion...
0
8420
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8842
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
8740
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
8516
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8617
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
6176
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
4173
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4330
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
1970
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.