473,396 Members | 1,734 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

DTS question reguarding text import and tranformation


MS SQL 2000 sp4 on WinXp Pro SP2
I am very new to this so please let me know what I can do to make it
easier for you to understand the problem.
I have a non delimited text file. This text file has several columns
that for the most part are fixed length but..
The fixed format starts with a variable length number( char 10) as the
first column and a max (char 30) description field and another (char
50) long description field.

if the (char 30) description field has a measurement ( ie 4.5" ) then
the first field stats with a ". The subsequent description will end
with a "" or double quote. The fields can only work if the double
qoutes are replaced with a single space and then the leading single
quote is deleted. If this is done in this order then the fixed field
lengths work.

example of actual data***************

This example does not have any quotes in it*****

1015304 Sof Sand Block Holder RUDOLPH
INTE BH y N N nbrd EA 1.00000 .83 1.25

This example does have the quotes**********

"10154 Rud Zeb NonDisf Cush Cur ve 180Gr 7"" RUDOLPH
INTE DUZ180F5 y N N nbrd EA 1.00000 .43 .65"

This is the next several lines***************

1015401 Rud Blk Disf Cush Curved 100/180Gr RUDOLPH
INTE DUB100/1 y N N nbrd EA 1.00000 .46 .69

1015402 Rud Blk Disf Cush Curved 100Gr RUDOLPH
INTE DUB100F5 y N N nbrd EA 1.00000 .46 .69

1015405 Rud Myl Disif Violet 100 Gr RUDOLPH
INTE GLMM100F y N N nbrd EA 1.00000 .43 .65

1015406 Rud Myl Purple Disf 80Gr RUDOLPH
INTE GLMM080F y N N nbrd EA 1.00000 .43 .65

1015407 Rud Myl Light Green Disf 120Gr RUDOLPH
INTE GLMM120F y N N nbrd EA 1.00000 .43 .65

"1015408 Rud Myl Yellow Disf 180G r 7"" RUDOLPH
INTE GLMM180F y N N nbrd EA 1.00000 .43 .65"

end of example*******************

right now I am doing this by hand in notepad and then importing to
excel to generate a csv file. Can anyone suggest a better way to do
this or just point me to an example.

Thanks very much for any
Jan 24 '07 #1
2 4504
In the past I had to perform similar tasks very often to import text files
generated from mainframe systems (or other third party sources that could
not directly import) to SQL Server. Normally I have done it two ways based
on the tools available:

1) Use a pre-processing tool to clean up the data and then import to SQL
Server via BCP or DTS. Sometimes I had to write my own little utility
program or script to clean up the data, sometimes I was able to find tools
that are available out there. In your case a simple search and replace tool
should do the job (probably better to have command line interface so you can
automate the process). There are many available and if you do not feel
comfortable writing your own tool you just need to Google for "search and
replace" and you will find a few. I did a quick search and here are two
links, but please test and evaluate yourself:
http://www.microsoft.com/technet/scr...5/hey0208.mspx
http://www.thefreecountry.com/progra...dreplace.shtml

2) The second approach is to use BCP or DTS to import the text file to a
staging table with a single varchar (or nvarchar if you have UNICODE
characters in the data) column. This works well if you can fit one row of
the text file into the size of varchar(8000) or nvarchar(4000). Otherwise
you can still split it into multiple columns but processing afterwards
becomes more complex. Then you can use the various string functions in SQL
Server to perform the data clean up, split into columns and insert the final
results into your production table. In your case the REPLACE and SUBSTRING
functions could do it.

Hope this helps.

Regards,

Plamen Ratchev
http://www.SQLStudio.com

Jan 24 '07 #2
To add on to Plamen's response, you can include an ActiveX script to task in
your DTS package to remove the extraneous quotes and then process the
cleaned up file in your transformation. You can change your transformation
to process this file as fixed-length instead of delimited.

Below is the body of an ActiveX script that will correct the sample data you
posted:

Const ForReading = 1
Const ForWriting = 2
Dim FSO, inFile, outFile, record

Set FSO = CreateObject("Scripting.FileSystemObject")
Set inFile = FSO.OpenTextFile( _
"C:\MyFiles\MyInputFile.txt", ForReading)
Set outFile = FSO.OpenTextFile( _
"C:\MyFiles\MyOutputFile.txt", ForWriting, True)

Do While inFile.AtEndOfStream = False
record = InFile.ReadLine()
record = Replace(record, """""", " ") 'replace "" with single space
record = Replace(record, """", "") 'remove "
outFile.WriteLine record
Loop

inFile.Close
outFile.Close
Set inFile = Nothing
Set outFile = Nothing
Set FSO = Nothing

--
Hope this helps.

Dan Guzman
SQL Server MVP

"lwhite" <an***********@excite.comwrote in message
news:rb********************************@4ax.com...
>
MS SQL 2000 sp4 on WinXp Pro SP2
I am very new to this so please let me know what I can do to make it
easier for you to understand the problem.
I have a non delimited text file. This text file has several columns
that for the most part are fixed length but..
The fixed format starts with a variable length number( char 10) as the
first column and a max (char 30) description field and another (char
50) long description field.

if the (char 30) description field has a measurement ( ie 4.5" ) then
the first field stats with a ". The subsequent description will end
with a "" or double quote. The fields can only work if the double
qoutes are replaced with a single space and then the leading single
quote is deleted. If this is done in this order then the fixed field
lengths work.

example of actual data***************

This example does not have any quotes in it*****

1015304 Sof Sand Block Holder RUDOLPH
INTE BH y N N nbrd EA 1.00000 .83 1.25

This example does have the quotes**********

"10154 Rud Zeb NonDisf Cush Cur ve 180Gr 7"" RUDOLPH
INTE DUZ180F5 y N N nbrd EA 1.00000 .43 .65"

This is the next several lines***************

1015401 Rud Blk Disf Cush Curved 100/180Gr RUDOLPH
INTE DUB100/1 y N N nbrd EA 1.00000 .46 .69

1015402 Rud Blk Disf Cush Curved 100Gr RUDOLPH
INTE DUB100F5 y N N nbrd EA 1.00000 .46 .69

1015405 Rud Myl Disif Violet 100 Gr RUDOLPH
INTE GLMM100F y N N nbrd EA 1.00000 .43 .65

1015406 Rud Myl Purple Disf 80Gr RUDOLPH
INTE GLMM080F y N N nbrd EA 1.00000 .43 .65

1015407 Rud Myl Light Green Disf 120Gr RUDOLPH
INTE GLMM120F y N N nbrd EA 1.00000 .43 .65

"1015408 Rud Myl Yellow Disf 180G r 7"" RUDOLPH
INTE GLMM180F y N N nbrd EA 1.00000 .43 .65"

end of example*******************

right now I am doing this by hand in notepad and then importing to
excel to generate a csv file. Can anyone suggest a better way to do
this or just point me to an example.

Thanks very much for any
Jan 24 '07 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Doug van Vianen | last post by:
Hi, I am working on an Applet which provides some mouse practice for new computer users in our local seniors' computer club. The applet contains several cards, in a card layout, which are...
7
by: Jane Austine | last post by:
As you add more items, say text lines, in Text widget, it gets too slow and almost impractical to use on. Take idle for example. If the text gets bigger(e.g. print...
1
by: Tomislav Jakopec | last post by:
Hi to all! I have one (or more) problem. I set up a system to provide web access to legacy application. Legacy application run at RISC mashine, web server is tomcat an linux. Response time is...
1
by: DCM Fan | last post by:
Access 2K, SP3 on Windows 2K, SP4 All, I have an import spec set up with quoted Identifiers and comma-separated values. The text file is produced by a 3rd-party program of which I have no...
5
by: Johnny Meredith | last post by:
I have seven huge fixed width text file that I need to import to Access. They contain headers, subtotals, etc. that are not needed. There is also some corrupt data that we know about and can...
1
by: ghadley_00 | last post by:
Hi, I have a MS access database table for which I regularly need to import fixed width text data. At present I have to to cut and paste the text data from its source to a text file, save the...
13
by: DH | last post by:
Hi, I'm trying to strip the html and other useless junk from a html page.. Id like to create something like an automated text editor, where it takes the keywords from a txt file and removes them...
4
by: chimambo | last post by:
I have 2 problems: 1. I want to import a single text file into an access table using a Macro. I am however getting an error that I need to put a specification name argument. What does this mean?...
16
by: Wayne | last post by:
I've read that one method of repairing a misbehaving database is to save all database objects as text and then rebuild them from the text files. I've used the following code posted by Lyle...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.