473,320 Members | 2,104 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,320 software developers and data experts.

format files for use with bulk insert

Ted
I used bcp to produce the apended format file.

How can it be modified to recognize the quotes that surround the text
fields and not insert the quotes along with the text? Invariably, the
first four columns have text surrounded by quotes and are terminated by
tabs. If the first column has "abc", only abc ought to be inserted
into that field in the table.

Thanks

Ted

==================format file========================
<?xml version="1.0" ?>
- <BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
<FIELD ID="1" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="6"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="2" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="7"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="3" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="48"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="4" xsi:type="NativeFixed" LENGTH="4" />
<FIELD ID="5" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="8"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="6" xsi:type="NativePrefix" PREFIX_LENGTH="1" />
</RECORD>
- <ROW>
<COLUMN SOURCE="1" NAME="f_supplier_code" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="2" NAME="f_product_code" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="3" NAME="f_product_name" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="4" NAME="f_asset_classes_id" xsi:type="SQLINT" />
<COLUMN SOURCE="5" NAME="f_size" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="6" NAME="f_dist_unit" xsi:type="SQLFLT8" />
</ROW>
</BCPFORMAT>

Aug 9 '06 #1
6 9663
Ted (r.*********@rogers.com) writes:
I used bcp to produce the apended format file.

How can it be modified to recognize the quotes that surround the text
fields and not insert the quotes along with the text? Invariably, the
first four columns have text surrounded by quotes and are terminated by
tabs. If the first column has "abc", only abc ought to be inserted
into that field in the table.
What you have in your post is a format file for native format, and
that's not what your into. Below I repeat the quick guide to format
files, that I posted a few days ago.

And, yeah, you had a format file in XML which Microsoft seem to think
we should use these days. But there is no new functionality in the
new format what I have been able to find, so I stick to the old format.

But if a text column is consistently quoted, you can handle this with a
format file where you specify each field. A format file that fits the
second row in the example above could look like:

8.0
5
1 SQLCHAR 0 0 ";\"" 1 col1 ""
2 SQLCHAR 0 0 "\";" 2 col2 ""
3 SQLCHAR 0 0 ";" 3 col3 ""
4 SQLCHAR 0 0 ";" 4 col3 ""
5 SQLCHAR 0 0 "\r\n" 5 col3 ""

The first row is the version of the file format. Next is the number of
fields in the file. Following lines describe one field each.

First column is record number. Second column is data type of the field
in the file. For a text file this is always SQLCHAR or always SQLNCHAR
for a Unicode file. Other data types are only used with binary formats.

The third column is prefix-length, used only for binary files. Fourth
column is the length, and is used for fixed-length fields. Fifth field
is the terminator, and it is here you specify the quotes.

Six column is the database column, with 1 denoting the first column. 0
means that this field is not to be imported. Seventh column is the
column name, but it's informational. BCP/BULK INSERT does not use it.
Last colunm is the collation for the data in the file.

Overall, keep in mind that BCP/BULK INSERT reads a binary file and a
row terminator is really only the terminator for the last field.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 9 '06 #2
Ted
Thanks Erland,

I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.

Thanks

Ted

Aug 12 '06 #3
Ted (r.*********@rogers.com) writes:
I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.
That's OK for a one-off, but certainly not you don't want to do as a
matter of routine.

I don't remember, but did you say that your first field is quoted,
like this:

"field1","field2","field3",4,5

Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.

In a way BCP and format files are fascinating. I still discover how to do
things, that I did not think were possible. But the collorary of that is
that the solutions are far from obvious.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 12 '06 #4
Ted

Erland Sommarskog wrote:
Ted (r.*********@rogers.com) writes:
I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.
This is OK for the data files (loaded once only, when the database is
first created) used for initializing key tables used for parameters for
analysis of data retrieved daily.

Unfortunately, the data files retrieved every work day is formatted the
same way.
>
That's OK for a one-off, but certainly not you don't want to do as a
matter of routine.

I don't remember, but did you say that your first field is quoted,
like this:

"field1","field2","field3",4,5
Yes, that is correct.
Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.
Guess what I'm trying next? ;-)

This should allow me to process the daily feed without messing with its
format. :-)

Thanks.
In a way BCP and format files are fascinating. I still discover how to do
things, that I did not think were possible. But the collorary of that is
that the solutions are far from obvious.
I'll stop learning new things only when I'm dead! Don't you love a
challenge?

Thanks

Ted

Aug 13 '06 #5
Ted

Erland Sommarskog wrote:
I don't remember, but did you say that your first field is quoted,
like this:

"field1","field2","field3",4,5

Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.
This works great, but there is a fly in the ointment.

Suppose we have records like:

"field1","field2","field3",4,5

But for which field2 can be null. E.G.

"value11","value12","value13",1,2
"value21",,"value23",12,22
"value31","value32","value33",13,23
"value41",,"value43",14,24

One can obtain something like this, e.g., by exporting data from a
spreadsheet.

I don't see a way for a format file to handle this except possibly by
using only a comma or tab or end of record string as the delimiter or
field terminator, leaving the quotes included with the text in the
field, and then use a combination of the functions SUBSTRING and LEN to
remove the leading and trailing quote character after the data has been
loaded.

Any ideas for alternatives?

Cheers,

Ted

Aug 13 '06 #6
Ted (r.*********@rogers.com) writes:
This works great, but there is a fly in the ointment.

Suppose we have records like:

"field1","field2","field3",4,5

But for which field2 can be null. E.G.

"value11","value12","value13",1,2
"value21",,"value23",12,22
"value31","value32","value33",13,23
"value41",,"value43",14,24
Ouch. Yes, this is a case where you lose.
I don't see a way for a format file to handle this except possibly by
using only a comma or tab or end of record string as the delimiter or
field terminator, leaving the quotes included with the text in the
field, and then use a combination of the functions SUBSTRING and LEN to
remove the leading and trailing quote character after the data has been
loaded.
And this fails is you have that delimiter in the text. I guess the
point with the quotes is to cover the case that the delimiter appears
in a field value.

If you use Perl, you can preprocess the file with:

perl -pi -e "s/\x22,,\x22/\x22,\x22\x22,\x22/g"

although neither this is foolproof: a field may include the string
",," - but is far less probable than a single comma.

It's possible that SQL Server Integration Services can handle this
sort of situation. No, I'm not suggesting that you go back to the
Import wizard, but that you write packages of your own. Unfortunately,
I'm not into SSIS myself at all.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 13 '06 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Bernd Lambertz | last post by:
I have a problem with bcp and format files. We changed our databases from varchar to nvarchar to support unicode. No problems so fare with that. It is working fine. But now I need a format...
7
by: iqbal | last post by:
Hi all, We have an application through which we are bulk inserting rows into a view. The definition of the view is such that it selects columns from a table on a remote server. I have added the...
7
by: Bob | last post by:
Hi, I am trying to use BULK INSERT with format file. All of our data has few bytes of header in the data file which I would like to skip before doing BULK INSERT. Is it possible to write...
2
by: Tmuld | last post by:
Hello! I have data exported from a Reference Manager 11, and need to import it into and SQL database. Each record has different number of fields. It is used to cite journal articles. (more...
3
by: moonriver | last post by:
Right now I develop an application to retrieve over 30,000 records from a binary file and then load them into a SQL Server DB. So far I load those records one by one, but the performance is very...
10
by: Daniel P. | last post by:
How can I use ADO.NET to insert lots of records in a very fast way? Thanks!
9
by: David Rysdam | last post by:
I have a large amount of data that I copy in and out of Sybase very often. Now I also want to copy this data in and out of postgres. I have an existing script that creates the entire database(s)...
2
by: KR | last post by:
I am trying to copy the data in excel file into a table using the bcp and this is the code that I have. However the bcp utility does not seem to create a format file, which I thought it should...
0
by: Peter Nofelt | last post by:
Hi all, ISSUE: ==================== In SQL 2005 (sp2) I get the following error when preforming a bulk insert with an associated xml format file: "Could not bulk insert. Unknown version of...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: ryjfgjl | last post by:
ExcelToDatabase: batch import excel into database automatically...
0
by: Vimpel783 | last post by:
Hello! Guys, I found this code on the Internet, but I need to modify it a little. It works well, the problem is this: Data is sent from only one cell, in this case B5, but it is necessary that data...
1
by: PapaRatzi | last post by:
Hello, I am teaching myself MS Access forms design and Visual Basic. I've created a table to capture a list of Top 30 singles and forms to capture new entries. The final step is a form (unbound)...
0
by: CloudSolutions | last post by:
Introduction: For many beginners and individual users, requiring a credit card and email registration may pose a barrier when starting to use cloud servers. However, some cloud server providers now...
1
by: Shællîpôpï 09 | last post by:
If u are using a keypad phone, how do u turn on JavaScript, to access features like WhatsApp, Facebook, Instagram....
0
by: af34tf | last post by:
Hi Guys, I have a domain whose name is BytesLimited.com, and I want to sell it. Does anyone know about platforms that allow me to list my domain in auction for free. Thank you
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.