473,581 Members | 2,789 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

format files for use with bulk insert

Ted
I used bcp to produce the apended format file.

How can it be modified to recognize the quotes that surround the text
fields and not insert the quotes along with the text? Invariably, the
first four columns have text surrounded by quotes and are terminated by
tabs. If the first column has "abc", only abc ought to be inserted
into that field in the table.

Thanks

Ted

=============== ===format file=========== =============
<?xml version="1.0" ?>
- <BCPFORMAT
xmlns="http://schemas.microso ft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http ://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
<FIELD ID="1" xsi:type="CharP refix" PREFIX_LENGTH=" 2" MAX_LENGTH="6"
COLLATION="Lati n1_General_CI_A I" />
<FIELD ID="2" xsi:type="CharP refix" PREFIX_LENGTH=" 2" MAX_LENGTH="7"
COLLATION="Lati n1_General_CI_A I" />
<FIELD ID="3" xsi:type="CharP refix" PREFIX_LENGTH=" 2" MAX_LENGTH="48"
COLLATION="Lati n1_General_CI_A I" />
<FIELD ID="4" xsi:type="Nativ eFixed" LENGTH="4" />
<FIELD ID="5" xsi:type="CharP refix" PREFIX_LENGTH=" 2" MAX_LENGTH="8"
COLLATION="Lati n1_General_CI_A I" />
<FIELD ID="6" xsi:type="Nativ ePrefix" PREFIX_LENGTH=" 1" />
</RECORD>
- <ROW>
<COLUMN SOURCE="1" NAME="f_supplie r_code" xsi:type="SQLVA RYCHAR" />
<COLUMN SOURCE="2" NAME="f_product _code" xsi:type="SQLVA RYCHAR" />
<COLUMN SOURCE="3" NAME="f_product _name" xsi:type="SQLVA RYCHAR" />
<COLUMN SOURCE="4" NAME="f_asset_c lasses_id" xsi:type="SQLIN T" />
<COLUMN SOURCE="5" NAME="f_size" xsi:type="SQLVA RYCHAR" />
<COLUMN SOURCE="6" NAME="f_dist_un it" xsi:type="SQLFL T8" />
</ROW>
</BCPFORMAT>

Aug 9 '06 #1
6 9693
Ted (r.*********@ro gers.com) writes:
I used bcp to produce the apended format file.

How can it be modified to recognize the quotes that surround the text
fields and not insert the quotes along with the text? Invariably, the
first four columns have text surrounded by quotes and are terminated by
tabs. If the first column has "abc", only abc ought to be inserted
into that field in the table.
What you have in your post is a format file for native format, and
that's not what your into. Below I repeat the quick guide to format
files, that I posted a few days ago.

And, yeah, you had a format file in XML which Microsoft seem to think
we should use these days. But there is no new functionality in the
new format what I have been able to find, so I stick to the old format.

But if a text column is consistently quoted, you can handle this with a
format file where you specify each field. A format file that fits the
second row in the example above could look like:

8.0
5
1 SQLCHAR 0 0 ";\"" 1 col1 ""
2 SQLCHAR 0 0 "\";" 2 col2 ""
3 SQLCHAR 0 0 ";" 3 col3 ""
4 SQLCHAR 0 0 ";" 4 col3 ""
5 SQLCHAR 0 0 "\r\n" 5 col3 ""

The first row is the version of the file format. Next is the number of
fields in the file. Following lines describe one field each.

First column is record number. Second column is data type of the field
in the file. For a text file this is always SQLCHAR or always SQLNCHAR
for a Unicode file. Other data types are only used with binary formats.

The third column is prefix-length, used only for binary files. Fourth
column is the length, and is used for fixed-length fields. Fifth field
is the terminator, and it is here you specify the quotes.

Six column is the database column, with 1 denoting the first column. 0
means that this field is not to be imported. Seventh column is the
column name, but it's informational. BCP/BULK INSERT does not use it.
Last colunm is the collation for the data in the file.

Overall, keep in mind that BCP/BULK INSERT reads a binary file and a
row terminator is really only the terminator for the last field.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 9 '06 #2
Ted
Thanks Erland,

I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.

Thanks

Ted

Aug 12 '06 #3
Ted (r.*********@ro gers.com) writes:
I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.
That's OK for a one-off, but certainly not you don't want to do as a
matter of routine.

I don't remember, but did you say that your first field is quoted,
like this:

"field1","field 2","field3", 4,5

Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.

In a way BCP and format files are fascinating. I still discover how to do
things, that I did not think were possible. But the collorary of that is
that the solutions are far from obvious.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 12 '06 #4
Ted

Erland Sommarskog wrote:
Ted (r.*********@ro gers.com) writes:
I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.
This is OK for the data files (loaded once only, when the database is
first created) used for initializing key tables used for parameters for
analysis of data retrieved daily.

Unfortunately, the data files retrieved every work day is formatted the
same way.
>
That's OK for a one-off, but certainly not you don't want to do as a
matter of routine.

I don't remember, but did you say that your first field is quoted,
like this:

"field1","field 2","field3", 4,5
Yes, that is correct.
Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.
Guess what I'm trying next? ;-)

This should allow me to process the daily feed without messing with its
format. :-)

Thanks.
In a way BCP and format files are fascinating. I still discover how to do
things, that I did not think were possible. But the collorary of that is
that the solutions are far from obvious.
I'll stop learning new things only when I'm dead! Don't you love a
challenge?

Thanks

Ted

Aug 13 '06 #5
Ted

Erland Sommarskog wrote:
I don't remember, but did you say that your first field is quoted,
like this:

"field1","field 2","field3", 4,5

Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.
This works great, but there is a fly in the ointment.

Suppose we have records like:

"field1","field 2","field3", 4,5

But for which field2 can be null. E.G.

"value11","valu e12","value13", 1,2
"value21",,"val ue23",12,22
"value31","valu e32","value33", 13,23
"value41",,"val ue43",14,24

One can obtain something like this, e.g., by exporting data from a
spreadsheet.

I don't see a way for a format file to handle this except possibly by
using only a comma or tab or end of record string as the delimiter or
field terminator, leaving the quotes included with the text in the
field, and then use a combination of the functions SUBSTRING and LEN to
remove the leading and trailing quote character after the data has been
loaded.

Any ideas for alternatives?

Cheers,

Ted

Aug 13 '06 #6
Ted (r.*********@ro gers.com) writes:
This works great, but there is a fly in the ointment.

Suppose we have records like:

"field1","field 2","field3", 4,5

But for which field2 can be null. E.G.

"value11","valu e12","value13", 1,2
"value21",,"val ue23",12,22
"value31","valu e32","value33", 13,23
"value41",,"val ue43",14,24
Ouch. Yes, this is a case where you lose.
I don't see a way for a format file to handle this except possibly by
using only a comma or tab or end of record string as the delimiter or
field terminator, leaving the quotes included with the text in the
field, and then use a combination of the functions SUBSTRING and LEN to
remove the leading and trailing quote character after the data has been
loaded.
And this fails is you have that delimiter in the text. I guess the
point with the quotes is to cover the case that the delimiter appears
in a field value.

If you use Perl, you can preprocess the file with:

perl -pi -e "s/\x22,,\x22/\x22,\x22\x22,\ x22/g"

although neither this is foolproof: a field may include the string
",," - but is far less probable than a single comma.

It's possible that SQL Server Integration Services can handle this
sort of situation. No, I'm not suggesting that you go back to the
Import wizard, but that you write packages of your own. Unfortunately,
I'm not into SSIS myself at all.

--
Erland Sommarskog, SQL Server MVP, es****@sommarsk og.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 13 '06 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
9758
by: Bernd Lambertz | last post by:
I have a problem with bcp and format files. We changed our databases from varchar to nvarchar to support unicode. No problems so fare with that. It is working fine. But now I need a format file for the customer table and and it is not working. It is working fine with the old DB with varchar, but with nvarchar I'm not able to copy the...
7
12109
by: iqbal | last post by:
Hi all, We have an application through which we are bulk inserting rows into a view. The definition of the view is such that it selects columns from a table on a remote server. I have added the servers using sp_addlinkedserver on both database servers. When I call the Commit API of oledb I get the following error: Error state: 1,...
7
2711
by: Bob | last post by:
Hi, I am trying to use BULK INSERT with format file. All of our data has few bytes of header in the data file which I would like to skip before doing BULK INSERT. Is it possible to write format file to skip these few bytes of header before doing BULK INSERT? For example, I have a 1 GB data file with 1000 byte header. Except for first...
2
3153
by: Tmuld | last post by:
Hello! I have data exported from a Reference Manager 11, and need to import it into and SQL database. Each record has different number of fields. It is used to cite journal articles. (more about the format at http://www.adeptscience.co.uk/kb/article/A626)
3
33221
by: moonriver | last post by:
Right now I develop an application to retrieve over 30,000 records from a binary file and then load them into a SQL Server DB. So far I load those records one by one, but the performance is very poor: it takes more than 6 hours to finish the loading. So could I make use of the bulk-insert mechanism of SQL Server to reduce the loading time in C#...
10
31688
by: Daniel P. | last post by:
How can I use ADO.NET to insert lots of records in a very fast way? Thanks!
9
9998
by: David Rysdam | last post by:
I have a large amount of data that I copy in and out of Sybase very often. Now I also want to copy this data in and out of postgres. I have an existing script that creates the entire database(s) from scratch in Sybase and then uses the Sybase bulk copy tool "bcp" to copy the data in. I already did a brute force port of this script to...
2
3717
by: KR | last post by:
I am trying to copy the data in excel file into a table using the bcp and this is the code that I have. However the bcp utility does not seem to create a format file, which I thought it should do. I am probably going about this all wrong so any help would be useful. exec master..xp_cmdshell '(FOR %i IN ("E:\WUTemp\*") DO (bcp...
0
4167
by: Peter Nofelt | last post by:
Hi all, ISSUE: ==================== In SQL 2005 (sp2) I get the following error when preforming a bulk insert with an associated xml format file: "Could not bulk insert. Unknown version of format file" Question:
0
7808
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
0
8312
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that...
1
7914
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
1
5683
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes...
0
5366
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3809
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3835
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
1410
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
1145
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.