Processing a Flat File

Hi guys, I have a flat file that I would like to load into a database. The problem is that the file format is not standard: each record differs from the others depending on which fields are present. I have tried some ETL tools, but they usually expect a file in a specific format.

The file has a format similar to:

Variable 1=Blah|Variable2=2|Variable3=woot|Variable4=hoop
Variable 1=Blah|Variable2=2|Variable4=hoop|Variable5=Hive
Variable 1=Blah|Variable2=2|Variable3=woot

The part before the = names the variable, and the order in which the variables appear is not guaranteed.

Any idea how to get this done? I'm sure it can be done with a scripting language like Perl, but I don't have much experience with Perl programming, so if you could point me in the right direction it would be appreciated.
Apr 16 '07 #1
KevinADC
4,059 Expert 2GB
It looks like | is the field delimiter, so you would use the split function to return an array of the fields:

    my @array = split(/\|/);
Apr 16 '07 #2
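A minimal sketch of how the split suggestion could be extended to the whole problem, assuming one record per line in the sample format from the question. The helper names `parse_record` and `normalize` and the `@columns` list are hypothetical and only for illustration. Each record is split on the pipe, each piece is split again on the first `=`, and the result is written out in a fixed column order so that missing fields become empty strings.

```perl
use strict;
use warnings;

# Hypothetical output column order; replace with the real field names.
my @columns = ('Variable 1', 'Variable2', 'Variable3', 'Variable4', 'Variable5');

# Turn one "name=value|name=value" record into a hash of name => value.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my %record;
    for my $field (split /\|/, $line) {
        next if $field eq '';    # skip empty pieces, e.g. from a leading pipe
        # A limit of 2 keeps any further '=' characters inside the value.
        my ($name, $value) = split /=/, $field, 2;
        $record{$name} = defined $value ? $value : '';
    }
    return \%record;
}

# Emit every column in a fixed order, using '' for fields that are absent.
sub normalize {
    my ($record) = @_;
    return join '|', map { defined $record->{$_} ? $record->{$_} : '' } @columns;
}

my $line = 'Variable 1=Blah|Variable2=2|Variable4=hoop|Variable5=Hive';
print normalize(parse_record($line)), "\n";    # prints "Blah|2||hoop|Hive"
```

Because the fields are looked up by name, their order in the input does not matter, and every output line has the same number of columns.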
larre
3
Thanks, I got it working now.
Apr 19 '07 #3
larre
3
Thanks, I got it working now, but I'm facing another issue with a regex.

As stated, the file is in the format |field=blah|field2=|field3=3 and so on. The format has these properties:

1. The variables are not in any specific order. (The first field is always the same, but that is not required.)

2. A field can be in the record with a null value (field1=xxx|field2=|field3=yyy).

3. A variable can be missing from the record entirely, so |field2=| would not even be there (Field1=xxx|field3=yyyy).

I have the following code:

    if ($a =~ /\|(BAL=.+?)\|/i) {
        print OUTFILE "$1 \|";
    }
    else {
        print OUTFILE "BAL=  \| ";
    }
And so on...

The code above returns the correct value when the field has a value (BAL=xxxx) or when it is not in the record at all. But when the field exists in the record with a null value, i.e. (|BAL=|), it returns (|BAL=|) as well as the following fields in the record, up to the next field with a value, i.e. something=xxx|.

This is throwing off my formatting, as I need a standardised format to load into MySQL.

How do I get around this issue and ensure that, even if there is no value, it just returns BAL=| for all fields?
Apr 19 '07 #4
KevinADC
4,059 Expert 2GB
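Avoid using $a and $b in your code except with sort(); they are special variables that Perl uses internally for sorting lists. Your regexp is probably failing because you have used the '+' (one or more) quantifier instead of '*' (zero or more). With '+', the '.' has to consume at least one character, so for |BAL=| it swallows the closing pipe and keeps going until the next pipe:

    if ($n =~ /\|(BAL=.*?)\|/i )

but you may need to allow for the last pipe '|' being absent, for example when BAL is the last field on the line:

    if ($n =~ /\|(BAL=.*?)(?:\||$)/i )

The group (?:\||$) matches either a pipe or the end of the line. Simply making the last pipe optional with '\|?' (the '?' in that context means zero or one) would not help, because the lazy '.*?' would then stop immediately and capture only BAL=.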
Apr 19 '07 #5
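A small sketch of the three cases from the question (value present, value empty, field absent), using made-up records with hypothetical ID and NAME fields. It swaps in a negated character class, `[^|]*`, as an alternative to the lazy `.*?` from the reply: the class can never match a pipe, so the capture cannot run into the next field. The helper name `extract_bal` is an assumption for illustration.

```perl
use strict;
use warnings;

# Return "BAL=<value>" for a record, or "BAL=" when the field is empty or absent.
sub extract_bal {
    my ($record) = @_;
    # (?:^|\|) lets BAL be the first field or follow a pipe.
    # [^|]* matches zero or more non-pipe characters, so it stops at the next pipe.
    if ($record =~ /(?:^|\|)(BAL=[^|]*)/i) {
        return $1;
    }
    return 'BAL=';
}

print extract_bal('|ID=1|BAL=100|NAME=x|'), "\n";    # BAL=100
print extract_bal('|ID=2|BAL=|NAME=y|'),    "\n";    # BAL=
print extract_bal('|ID=3|NAME=z|'),         "\n";    # BAL=

# For comparison, the original pattern on the empty-value record:
if ('|ID=2|BAL=|NAME=y|' =~ /\|(BAL=.+?)\|/i) {
    print "original pattern captured: $1\n";         # BAL=|NAME=y
}
```

Since every call returns exactly one BAL= token, the output keeps the same shape whether the value is present, empty, or missing.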
