469,148 Members | 1,257 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,148 developers. It's quick & easy.

Data Load Query

Hi,

I'm extracting data from a mainframe application with a view to loading
it into a MS SQL database. I'm trying to determine the most efficient
way to format the mainframe extract file to make loading into the
database easier.

The problem I have is that the existing record structure includes an
array that can vary between 1 to 50. If I include this array in a
single record the table I use to import the data would need 50 columns
though not all these would be populated. There is a field in the record
to identify how many occurances of the array there are.

Current Record Structure :
Account Number
Account Name
Other Account Details
TotalNumberOfArrayFieldsPopulated
Array :
Value1
Value2
Value3
....
up to Value50 (if required)

i.e.

12344,Mr Agent,$29.95,2,BX123,BX124
12345,Mr Jones,$14.95,3,XX123,XX124,XX125
12345,Mr Jones,$14.00,1,XY123
12345,Mr Jones,$15.95,2,XZ124,XZ125
12346,Mr Smith,$19.95,3,AX123,AX124,AX125
12346,Mr Smith,$19.00,1,BY123
12347,Mr Acant,$99.95,7,CX123,CX124,CX125,CX126,CX127,CX128 ,CX129

There may be up to 3 records created for each Account Number with
different values in the array fields.

Am I better to break this file into two files .. one with the core
customer information and a second file with a row for each array value
which has a link to the customer information file.
Or
Is there a way to efficiently process the original file once it is
loaded into the staging tables in the database ?

i.e.

File 1 - Core Customer Information
====================================
Current Record Structure :
Record Number
Account Number
Account Name
Other Account Details
TotalNumberOfArrayFieldsPopulated

File 2 - Array Information
====================================
Record Number
Array :
Value1
Value2
Value3
....
up to Value50 (if required)

File 1
========================
12344,Mr Agent,$29.95,2
12345,Mr Jones,$14.95,3
12345,Mr Jones,$14.00,1
12345,Mr Jones,$15.95,2
12346,Mr Smith,$19.95,3
12346,Mr Smith,$19.00,1
12347,Mr Acant,$99.95,7

File 2
========================
12344,BX123
12344,BX124
12345,XX123
12345,XX124
12345,XX125
12345,XY123
12345,XZ124
12345,XZ125
12346,AX123
12346,AX124
12346,AX125
12346,BY123
12347,CX123
12347,CX124
12347,CX125
12347,CX126
12347,CX127
12347,CX128
12347,CX129

At times the individual array values will be used for look ups though
essentially the Customer Information record will be the primary lookup
data.

I'm leaning toward changing my COBOL code and creating the 2nd output
unless someone can suggest a simple way to process the information once
loaded into the table.

Any help that could be suggested would be greatly appreciated.

Sep 21 '05 #1
1 1680
For a varying number of fields per record, you might consider XML, as
it's a good format for that kind of data. But I have to say that I have
very limited experience of importing XML data into MSSQL myself - check
out OPENXML in Books Online, the SQLXML tools from Microsoft (which
include an XML bulk load COM component), and you could also post in
microsoft.public.sqlserver.xml to get some feedback on that approach.

Simon

Sep 21 '05 #2

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

1 post views Thread by Aaron | last post: by
reply views Thread by Montagna, Dan | last post: by
reply views Thread by Donald Tyler | last post: by
2 posts views Thread by johnny | last post: by
4 posts views Thread by JIM.H. | last post: by
reply views Thread by zhoujie | last post: by
1 post views Thread by Mortomer39 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.