By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
457,696 Members | 1,512 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 457,696 IT Pros & Developers. It's quick & easy.

format files for use with bulk insert

P: n/a
Ted
I used bcp to produce the apended format file.

How can it be modified to recognize the quotes that surround the text
fields and not insert the quotes along with the text? Invariably, the
first four columns have text surrounded by quotes and are terminated by
tabs. If the first column has "abc", only abc ought to be inserted
into that field in the table.

Thanks

Ted

==================format file========================
<?xml version="1.0" ?>
- <BCPFORMAT
xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
- <RECORD>
<FIELD ID="1" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="6"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="2" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="7"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="3" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="48"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="4" xsi:type="NativeFixed" LENGTH="4" />
<FIELD ID="5" xsi:type="CharPrefix" PREFIX_LENGTH="2" MAX_LENGTH="8"
COLLATION="Latin1_General_CI_AI" />
<FIELD ID="6" xsi:type="NativePrefix" PREFIX_LENGTH="1" />
</RECORD>
- <ROW>
<COLUMN SOURCE="1" NAME="f_supplier_code" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="2" NAME="f_product_code" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="3" NAME="f_product_name" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="4" NAME="f_asset_classes_id" xsi:type="SQLINT" />
<COLUMN SOURCE="5" NAME="f_size" xsi:type="SQLVARYCHAR" />
<COLUMN SOURCE="6" NAME="f_dist_unit" xsi:type="SQLFLT8" />
</ROW>
</BCPFORMAT>

Aug 9 '06 #1
Share this Question
Share on Google+
6 Replies


P: n/a
Ted (r.*********@rogers.com) writes:
I used bcp to produce the apended format file.

How can it be modified to recognize the quotes that surround the text
fields and not insert the quotes along with the text? Invariably, the
first four columns have text surrounded by quotes and are terminated by
tabs. If the first column has "abc", only abc ought to be inserted
into that field in the table.
What you have in your post is a format file for native format, and
that's not what your into. Below I repeat the quick guide to format
files, that I posted a few days ago.

And, yeah, you had a format file in XML which Microsoft seem to think
we should use these days. But there is no new functionality in the
new format what I have been able to find, so I stick to the old format.

But if a text column is consistently quoted, you can handle this with a
format file where you specify each field. A format file that fits the
second row in the example above could look like:

8.0
5
1 SQLCHAR 0 0 ";\"" 1 col1 ""
2 SQLCHAR 0 0 "\";" 2 col2 ""
3 SQLCHAR 0 0 ";" 3 col3 ""
4 SQLCHAR 0 0 ";" 4 col3 ""
5 SQLCHAR 0 0 "\r\n" 5 col3 ""

The first row is the version of the file format. Next is the number of
fields in the file. Following lines describe one field each.

First column is record number. Second column is data type of the field
in the file. For a text file this is always SQLCHAR or always SQLNCHAR
for a Unicode file. Other data types are only used with binary formats.

The third column is prefix-length, used only for binary files. Fourth
column is the length, and is used for fixed-length fields. Fifth field
is the terminator, and it is here you specify the quotes.

Six column is the database column, with 1 denoting the first column. 0
means that this field is not to be imported. Seventh column is the
column name, but it's informational. BCP/BULK INSERT does not use it.
Last colunm is the collation for the data in the file.

Overall, keep in mind that BCP/BULK INSERT reads a binary file and a
row terminator is really only the terminator for the last field.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 9 '06 #2

P: n/a
Ted
Thanks Erland,

I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.

Thanks

Ted

Aug 12 '06 #3

P: n/a
Ted (r.*********@rogers.com) writes:
I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.
That's OK for a one-off, but certainly not you don't want to do as a
matter of routine.

I don't remember, but did you say that your first field is quoted,
like this:

"field1","field2","field3",4,5

Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.

In a way BCP and format files are fascinating. I still discover how to do
things, that I did not think were possible. But the collorary of that is
that the solutions are far from obvious.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 12 '06 #4

P: n/a
Ted

Erland Sommarskog wrote:
Ted (r.*********@rogers.com) writes:
I finally have it working, but with a bit of a kludge. I imported the
data file into Open Office Calc and then exporteds it without the
quotes. And then I modified the format file so it didn't expect quotes
around text.
This is OK for the data files (loaded once only, when the database is
first created) used for initializing key tables used for parameters for
analysis of data retrieved daily.

Unfortunately, the data files retrieved every work day is formatted the
same way.
>
That's OK for a one-off, but certainly not you don't want to do as a
matter of routine.

I don't remember, but did you say that your first field is quoted,
like this:

"field1","field2","field3",4,5
Yes, that is correct.
Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.
Guess what I'm trying next? ;-)

This should allow me to process the daily feed without messing with its
format. :-)

Thanks.
In a way BCP and format files are fascinating. I still discover how to do
things, that I did not think were possible. But the collorary of that is
that the solutions are far from obvious.
I'll stop learning new things only when I'm dead! Don't you love a
challenge?

Thanks

Ted

Aug 13 '06 #5

P: n/a
Ted

Erland Sommarskog wrote:
I don't remember, but did you say that your first field is quoted,
like this:

"field1","field2","field3",4,5

Then you need a little trick to handle the first quote:

8.0
6
1 SQLCHAR 0 0 "\"" 0 "" ""
2 SQLCHAR 0 0 "\","\" 1 col1 ""
3 SQLCHAR 0 0 "\","\" 2 col2 ""
4 SQLCHAR 0 0 "\"," 3 col3 ""
5 SQLCHAR 0 0 "," 4 col4 ""
6 SQLCHAR 0 0 "\r\n" 5 col5 ""

The trick is that you say that you say that there is an empty field before
the first quote. Then you specify 0 for the database column, meaning that
you don't import it.
This works great, but there is a fly in the ointment.

Suppose we have records like:

"field1","field2","field3",4,5

But for which field2 can be null. E.G.

"value11","value12","value13",1,2
"value21",,"value23",12,22
"value31","value32","value33",13,23
"value41",,"value43",14,24

One can obtain something like this, e.g., by exporting data from a
spreadsheet.

I don't see a way for a format file to handle this except possibly by
using only a comma or tab or end of record string as the delimiter or
field terminator, leaving the quotes included with the text in the
field, and then use a combination of the functions SUBSTRING and LEN to
remove the leading and trailing quote character after the data has been
loaded.

Any ideas for alternatives?

Cheers,

Ted

Aug 13 '06 #6

P: n/a
Ted (r.*********@rogers.com) writes:
This works great, but there is a fly in the ointment.

Suppose we have records like:

"field1","field2","field3",4,5

But for which field2 can be null. E.G.

"value11","value12","value13",1,2
"value21",,"value23",12,22
"value31","value32","value33",13,23
"value41",,"value43",14,24
Ouch. Yes, this is a case where you lose.
I don't see a way for a format file to handle this except possibly by
using only a comma or tab or end of record string as the delimiter or
field terminator, leaving the quotes included with the text in the
field, and then use a combination of the functions SUBSTRING and LEN to
remove the leading and trailing quote character after the data has been
loaded.
And this fails is you have that delimiter in the text. I guess the
point with the quotes is to cover the case that the delimiter appears
in a field value.

If you use Perl, you can preprocess the file with:

perl -pi -e "s/\x22,,\x22/\x22,\x22\x22,\x22/g"

although neither this is foolproof: a field may include the string
",," - but is far less probable than a single comma.

It's possible that SQL Server Integration Services can handle this
sort of situation. No, I'm not suggesting that you go back to the
Import wizard, but that you write packages of your own. Unfortunately,
I'm not into SSIS myself at all.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx
Aug 13 '06 #7

This discussion thread is closed

Replies have been disabled for this discussion.