473,395 Members | 1,791 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

Format File Easy one (not for me)!

I have a format file which is working but not correctly. It is, for
some reason, dropping the first line of the input .csv file. The
problem is something with the second coulumn of data having quotes in
it. Any ideas? Below is some info.

Format file (I use firstrow=2 in Bulk Import command:
8.0
6
1 SQLCHAR 0 3000 ",\"" 1 Provider_Raw_ID Latin1_General_CI_AS
2 SQLCHAR 0 3000 "\"," 0 none_name Latin1_General_CI_AS
3 SQLCHAR 0 3000 "," 0 none_Spec_orig Latin1_General_CI_AS
4 SQLCHAR 0 3000 "," 3 SpecialtyCode Latin1_General_CI_AS
5 SQLCHAR 0 3000 "," 2 Category Latin1_General_CI_AS
6 SQLCHAR 0 3000 "\r\n" 4 NetworkComparedTo Latin1_General_CI_AS

Sample input file:
ID,NAME,SPEC_ORIGINAL,SPEC,CATEGORY,NetworkCompare dTo
1,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
2,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
3,"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1
4,"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1
5,"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1
6,"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1
7,"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1
8,"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1

Thanks a lot!!!

Andrew


*** Sent via Devdex http://www.devdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #1
5 1487

"Andrew Slentz" <aj******@yahoo.com> wrote in message
news:40**********************@news.newsgroups.ws.. .
I have a format file which is working but not correctly. It is, for
some reason, dropping the first line of the input .csv file. The
problem is something with the second coulumn of data having quotes in
it. Any ideas? Below is some info.

Format file (I use firstrow=2 in Bulk Import command:
8.0
6
1 SQLCHAR 0 3000 ",\"" 1 Provider_Raw_ID Latin1_General_CI_AS
2 SQLCHAR 0 3000 "\"," 0 none_name Latin1_General_CI_AS
3 SQLCHAR 0 3000 "," 0 none_Spec_orig Latin1_General_CI_AS
4 SQLCHAR 0 3000 "," 3 SpecialtyCode Latin1_General_CI_AS
5 SQLCHAR 0 3000 "," 2 Category Latin1_General_CI_AS
6 SQLCHAR 0 3000 "\r\n" 4 NetworkComparedTo Latin1_General_CI_AS

Sample input file:
ID,NAME,SPEC_ORIGINAL,SPEC,CATEGORY,NetworkCompare dTo
1,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
2,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
3,"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1
4,"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1
5,"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1
6,"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1
7,"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1
8,"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1

Thanks a lot!!!

Andrew


*** Sent via Devdex http://www.devdex.com ***
Don't just participate in USENET...get rewarded for it!


As a previous poster noted, Erland recently posted some useful comments on
format files which might be helpful. You might also want to consider DTS,
which is usually easier to set up than a format file (at least that's my
opinion), and is reasonably 'intelligent' about delimiters and file formats.
It also has facilities for transforming data as its loaded, if you need to
handle issues such as different date formats, for example.

But DTS is a much more complex tool, and dynamically setting source file
names and destination database names is definitely awkward unless you have
some basic VBScript/COM knowledge. If you don't do much data importing, then
it might be overkill, but if you're comfortable with that sort of
programming, it could be another option to consider.

Simon
Jul 20 '05 #2
[posted and mailed, please reply in news]

Andrew Slentz (aj******@yahoo.com) writes:
I have a format file which is working but not correctly. It is, for
some reason, dropping the first line of the input .csv file. The
problem is something with the second coulumn of data having quotes in
it. Any ideas? Below is some info.
...
Sample input file:
ID,NAME,SPEC_ORIGINAL,SPEC,CATEGORY,NetworkCompare dTo
1,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
2,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
3,"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1
4,"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1
5,"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1
6,"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1
7,"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1
8,"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1


Alas, BCP is not able to handle this file, because BCP does not
know what a column header is. It thinks that the entire file is
rows and field in those rows.

I made longer post on BCP last night, and I recommend you to look it
up, because it may give you some understanding on how BCP works. You
will not that the datafile in that post, too, had column header, but
in that case BCP was able to sort out the problem. That was because
the line with the column names did not contain the delimiter for the
first field.

I don't know DTS, but it is possible that DTS could handle this better.
Else you would have some program that stripped that first line. Or
simply added a dummy delimiter to each row but the first, and then
change the format file accordingly.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #3
Would it be possible to use FIRSTROW=2? I am actually using Bulk Insert
and it's an available option. I tried it with no luck and got so far as
to have the first column in the header row in the first column of the
database but the rest of the data was right.

Also... If I could get something like a "|" or something like that
placed at the end of the header column would that work as a distinct
delimeter???

Thanks for all of your help!

Andrew

*** Sent via Devdex http://www.devdex.com ***
Don't just participate in USENET...get rewarded for it!
Jul 20 '05 #4
Andrew Slentz (aj******@yahoo.com) writes:
Would it be possible to use FIRSTROW=2? I am actually using Bulk Insert
and it's an available option. I tried it with no luck and got so far as
to have the first column in the header row in the first column of the
database but the rest of the data was right.
No, FIRSTROW=2 won't cut it. BCP looks for the delimiter for the first
column, then the delimiter for the second, the for the third, and so on
to the last, and then back to the beginning. There is no inherent meaning
of line feed.
Also... If I could get something like a "|" or something like that
placed at the end of the header column would that work as a distinct
delimeter???


But you have to have that delitimer on every record in the file. If you
put | on each line, but the header line, you could say that the first
field is terminated by |, and then say that this field does not map to
a database column.

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #5
[posted and mailed, please reply in news]

Andrew Slentz (aj******@yahoo.com) writes:
Would it be possible to use FIRSTROW=2? I am actually using Bulk Insert
and it's an available option. I tried it with no luck and got so far as
to have the first column in the header row in the first column of the
database but the rest of the data was right.

Also... If I could get something like a "|" or something like that
placed at the end of the header column would that work as a distinct
delimeter???


Here are some more suggestions how you could change the file. This was
was your latest posting, that I have seen:

ID,NAME,SPEC_ORIGINAL,SPEC,CATEGORY,NetworkCompare dTo
1,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
2,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
3,"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1
4,"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1
5,"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1
6,"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1
7,"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1
8,"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1

This could work:

ID,"NAME",SPEC_ORIGINAL,SPEC,CATEGORY,NetworkCompa redTo
1,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
2,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
3,"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1
4,"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1
5,"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1
6,"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1
7,"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1
8,"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1

This would work, because the header row now has the same delimiters
as the rest of the file. In this case you would use FIRSTROW=2 to
skip the headerline.

Here's another one:

ID,NAME,SPEC_ORIGINAL,SPEC,CATEGORY,NetworkCompare dTo
!1,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
!2,"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1
!3,"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1
!4,"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1
!5,"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1
!6,"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1
!7,"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1
!8,"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1

Here you would add one more field to the format file, which would be
termimated by ! and where the database column would be 0. In this case,
you should leave FIRSTROW=1. The entire header row, including the newline
would then be the first field of the first record, which you would simply
ignore.

Yet a twist:

NAME,SPEC_ORIGINAL,SPEC,CATEGORY,NetworkComparedTo ,ID
"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1,1
"Aaron, Arnold H, DO",Family Practice General Practice,FP,PCP,netcomp1,2
"Aaron, Arnold H, DO",General Practice,GP,PCP,netcomp1,3
"Abae, Mick, MD",Reproductive Endocrinology,OBEN,OB,netcomp1,4
"Abanilla, Fernando M, MD",Nephrology,IMNE,SPEC,netcomp1,5
"Abaunza, Ramiro J, MD",Obstetrics/Gynecology,OBGY,OB,netcomp1,6
"Abaunza-Fiallos, Yanina J, MD",Pediatrics,PD,PED,netcomp1,7
"Abbas, Rahat, MD",Internal Medicine,IM,PCP,netcomp1,8

Again, you would add one more column to the format file, this time
terminated by ", and again you should have FIRSTROW=1.

The actual format file for these cases are left as an exercise to the
reader. :-)

--
Erland Sommarskog, SQL Server MVP, es****@sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techinf...2000/books.asp
Jul 20 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

12
by: neutrino | last post by:
Greetings to the Python gurus, I have a binary file and wish to see the "raw" content of it. So I open it in binary mode, and read one byte at a time to a variable, which will be of the string...
11
by: Dale | last post by:
How to recognize whether file has XML format or not? Here is the code segment: XmlDocument* pDomDocument = new XmlDocument(); try { pDomDocument->Load(strFileName ) ; } catch(Exception* e) {
5
by: google | last post by:
first, a little background... i have a C program which preprocesses some data, and then outputs the results into a text file. that text file, in turn, is used as input to a FORTRAN...
4
by: I_have_nothing | last post by:
Hi! Is there any easy way to printf an integer in a way like 1,234,567? I know "%d" can be usd to print it as 1234567. Any type field in format specification can do that? Or any easy way to do...
4
by: Matt | last post by:
I receive a file from a client and the first 8 bytes are in Big Endian format to denote the length of the following data. This is one of the security requirements. I also have to do the same...
4
by: AWesner | last post by:
For readability sake I’m going to first state that: LF = Line Feed CHR(10) CR = Carriage Return CHR(13) Since Rich Text Format is a standard formalized by Microsoft Corporation I get to ask...
14
by: Manish | last post by:
The project I am developing doesn't involves database. I want to parse the mailbox file (.mbx) and store the summary in the text file for fast retrieval and display of information in the Inbox...
67
by: James Harris | last post by:
I have a requirement to store timestamps in a database. Simple enough you might think but finding a suitably general format is not easy. The specifics are 1) subsecond resolution - milliseconds...
0
by: Independent | last post by:
Python programmers may find the application to decoding an encrypted map image format known as Memory Map to produce a standard PNG image file interesting. Someone obviously very well versed in...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.