Need help with parsing a multilined log file into objects

Hello,

I have a log file that contains many multi-line messages. What is the
best approach for extracting data out of each message and populating
object properties, with the objects stored in an ArrayList? I have tried
looping through the log file using regex, if statements and flags to
find the start and end of each message, but I do not see a good point in
this process to create a new instance of my Message object. While
messing around with it I tried creating a new instance in different
places in the loops, but when I try to populate it I can access the
object from within some if statements but not from others. I am wondering
if there is a better approach, perhaps one that is more abstract and
takes advantage of OOP. Has anyone done something like this before? If
so, I'd love to hear about your approach.

thanks

Jan 10 '07 #1
"Paulers" <Su*******@gmail.com> wrote in
news:1168395270.670571.150900@77g2000hsv.googlegroups.com:
> [original post snipped]
Do you have a known delimiter? How do you determine one row from another?
Jan 10 '07 #2
What is the message delimiter and what is the field delimiter?

In addition, please post a sample of the logfile and the definition of the
target object.
Jan 10 '07 #3
The messages I want to extract from the log look like this (they are
type 101 messages), but not all messages in the log are in the same
format. I would like to extract the values, populate an object that has
matching properties, and store the objects in a collection that I can
iterate through. I just do not know how to get the values for each of
these type 101 messages into their own objects.

03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (=
Message Type 101); Message Length 921 bytes
Get Call (= Subtype 9); DialogueID: (2007) 000007d7;
SendSeqNo: (1)00000001
Trunk Group ID: (1) 00000001
Trunk Number: 3
Service ID: (0) 00000000
Dialed Number: INTELLI_CARE
ANI: 1111111111
Called Number: 8000025225
DNIS: 8000025225

Stephany Young wrote:
> What is the message delimiter and what is the field delimiter?
>
> In addition, please post a sample of the logfile and the definition of the
> target object.
Jan 10 '07 #4
To recap:

Does the data in the log file look EXACTLY like that?

Is there actually a newline between 'Control Message (=' and 'Message
Type'?

Does the timestamp ('03:34:06') mark the start of the message?

Does a timestamp like '03:34:06' mark the start of EVERY message?

Is the timestamp ALWAYS in the SAME format?

Please post the definition of the target object.
Jan 10 '07 #5
Yes, I pasted this straight from the log file. All messages begin with a
timestamp. The timestamp is always in the same format
(\d{2}:\d{2}:\d{2}).

I'm sorry, I don't understand what you mean by the definition of the target
object. I basically have a class with properties (setters and getters) that I
would like to populate for each object. Is that what you would like to
see?

thanks for your help! :)
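
Since every message starts with a timestamp in the \d{2}:\d{2}:\d{2} format, detecting a message boundary can be a single anchored match per line. The thread is about .NET, but the idea is sketched here in Python; the name is_message_start is made up for illustration, and the requirement of whitespace after the timestamp is an assumption based on the one sample shown.

```python
import re

# Assumption: a new message begins on any line that starts with HH:MM:SS
# followed by whitespace, as in the posted sample.
MESSAGE_START = re.compile(r"^\d{2}:\d{2}:\d{2}\s")


def is_message_start(line: str) -> bool:
    """Return True when the line opens a new log message."""
    return MESSAGE_START.match(line) is not None


print(is_message_start("03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (="))
print(is_message_start("Trunk Number: 3"))
```

Anchoring at the start of the line matters: a time value that merely appears somewhere inside a continuation line would not be mistaken for a new message.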
Jan 10 '07 #6
I would do something like the following.

1) Read through the entire log file and look for the start
of each section (each group of lines that starts with your time stamp).

2) Read each of these sections in to an array list.
ArrayList1

3) Add each individual array list into another array list
ArrayList2 (you now have an array list of array lists)

4) Iterate through each element of ArrayList2 and grab
each ArrayList1, then pass that through to another function that parses the data
from that one segment.

5) Each segment can be gone through line by line and be
taken apart for its individual sections using a mixture of String.Split,
String.Substring, etc.

6) This parsing function would return a new instance of
your data structure with all fields populated as needed.

Keep in mind that this is simpler as long as each line in each section is
always in the same format and always in the same order. If they are not in
the same order (I tend to think they would be, since this appears to be a CDR
log from a PBX), you would perhaps need to use a regex to match each
line up with the right parsing code.

All this is not hard, just tricky and takes some work.

Good luck. I had to do some of this just a short while ago and it was very
interesting.
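
Steps 4 to 6 above could look roughly like the following. This is a Python sketch rather than the .NET code the thread is about, and it makes several assumptions: the Message class is hypothetical (the real target class was never posted), a section is already a list of lines, and fields are simple "Label: value" pairs, optionally separated by ';' on one line. Splitting is done with plain string operations, in the spirit of String.Split.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Message:
    """Hypothetical target object; the thread never shows the real class."""
    timestamp: str = ""
    message_type: Optional[int] = None
    fields: Dict[str, str] = field(default_factory=dict)


TYPE_RE = re.compile(r"Message Type (\d+)")


def parse_section(section: List[str]) -> Message:
    """Build one Message from the lines of one section (one log message)."""
    msg = Message(timestamp=section[0].split()[0])
    # The header wraps across two lines in the sample, so search the joined text.
    type_match = TYPE_RE.search(" ".join(section))
    if type_match:
        msg.message_type = int(type_match.group(1))
    for line in section[1:]:
        # A line may hold several ';'-separated pieces; keep the "Label: value" ones.
        for piece in line.split(";"):
            label, sep, value = piece.partition(":")
            if sep and value.strip():
                msg.fields[label.strip()] = value.strip()
    return msg


SAMPLE = [
    "03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 921 bytes",
    "Get Call (= Subtype 9); DialogueID: (2007) 000007d7;",
    "SendSeqNo: (1)00000001",
    "Trunk Group ID: (1) 00000001",
    "Trunk Number: 3",
    "Service ID: (0) 00000000",
    "Dialed Number: INTELLI_CARE",
    "ANI: 1111111111",
    "Called Number: 8000025225",
    "DNIS: 8000025225",
]

message = parse_section(SAMPLE)
print(message.timestamp, message.message_type, message.fields["Dialed Number"])
```

Because a fresh Message is created at the top of parse_section and returned at the end, the question of where in the loop to create the instance goes away: one call, one section, one object. Keeping the fields in a dictionary keyed by label also means the order of lines within a section does not matter.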

Jan 10 '07 #7
Yes! - Any instance of that class is your target object.
Jan 10 '07 #8
Paulers,

How many people need this information each day? In my opinion, that should
be the basis of your decision.

Whether it is OOP, Poop or whatever is less important.

Just my thought,

Cor
Jan 10 '07 #9
Thanks for the wonderful advice, I really appreciate it. I was wondering
if you had any pointers on how to extract just the lines of the
messages that I need so I can add them to the ArrayList. I can loop
through the file and grab all the first lines with the time stamp using a
regular expression, but what is the thought process behind obtaining the
rest of the lines of the message I am parsing without grabbing lines of
the next message?

thanks!

Ray Cassick wrote:
> [snip]
Jan 10 '07 #10
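
The thread ends without an answer to that last question. One common way to think about it: a line belongs to the current message until the next timestamp line appears, so the previous message is finished at exactly the moment a new timestamp is seen (or at end of file). Below is a minimal Python sketch of that idea, not the poster's .NET code. The names iter_messages and iter_type_101 are invented for illustration, and the second and third messages in SAMPLE_LOG are made up, since only one real message was posted.

```python
import re
from typing import Iterable, Iterator, List

START = re.compile(r"^\d{2}:\d{2}:\d{2}\s")   # timestamp opens a message
WANTED = re.compile(r"Message Type 101\b")    # the only type of interest


def iter_messages(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield each message as a list of its lines."""
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if START.match(line):
            if current:          # a new timestamp closes the previous message
                yield current
            current = [line]
        elif current and line.strip():
            current.append(line)  # continuation line of the open message
        # lines before the first timestamp and blank lines are ignored
    if current:                   # end of input closes the last message
        yield current


def iter_type_101(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield only the messages whose header says Message Type 101."""
    for block in iter_messages(lines):
        if WANTED.search(" ".join(block)):
            yield block


SAMPLE_LOG = [
    "stray line before the first timestamp",
    "03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 921 bytes",
    "Trunk Number: 3",
    "",
    # Hypothetical messages below, invented to show the boundary handling.
    "03:34:07 server12 Trace: [ 385]abox1->bbox1: Control Message (=",
    "Message Type 102); Message Length 40 bytes",
    "03:34:08 server12 Trace: [ 386]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 921 bytes",
    "Trunk Number: 4",
]

for block in iter_type_101(SAMPLE_LOG):
    print(block[0][:8], len(block), "lines")
```

An open file object can be passed in place of SAMPLE_LOG, since the functions only need something that yields lines. Each yielded block can then be handed to whatever function builds the Message object, so the file never has to be held in memory as a whole.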
