Need help with parsing a multilined log file into objects

Hello,

I have a log file that contains many multi-line messages. What is the
best approach for extracting data out of each message and populating
the properties of objects that are then stored in an ArrayList? I have
tried looping through the log file using regex, if statements and flags
to find the start and end of each message, but I do not see a good point
in this process to create a new instance of my Message object. While
experimenting, I tried creating the new instance in different places in
the loops, but when I try to populate it I can access the object from
within some if statements but not from others. I am wondering
if there is a better approach, perhaps one that is more abstract and
takes advantage of OOP. Has anyone done something like this before? If
so, I'd love to hear about your approach.

thanks

Jan 10 '07 #1
Do you have a known delimiter? How do you determine one row from another?
Jan 10 '07 #2
What is the message delimiter and what is the field delimiter?

In addition, please post a sample of the logfile and the definition of the
target object.
Jan 10 '07 #3
The type of messages I want to extract from the log look like this
(type 101), but not all messages in the log are in the same format. I would
like to extract the values, populate an object that has matching properties,
and store the objects in a collection that I can iterate through. I just do
not know how to get the values for each of these type 101 messages into
their own objects.

03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (=
Message Type 101); Message Length 921 bytes
Get Call (= Subtype 9); DialogueID: (2007) 000007d7;
SendSeqNo: (1)00000001
Trunk Group ID: (1) 00000001
Trunk Number: 3
Service ID: (0) 00000000
Dialed Number: INTELLI_CARE
ANI: 1111111111
Called Number: 8000025225
DNIS: 8000025225

Stephany Young wrote:
What is the message delimiter and what is the field delimiter?

In addition, please post a sample of the logfile and the definition of the
target object.
Jan 10 '07 #4
To recap:

Does the data in the log file look EXACTLY like that?

Is there actually a newline between 'Control Message (=' and 'Message
Type'?

Does the timestamp ('03:34:06') mark the start of the message?

Does the timestamp ('03:34:06') mark the start of EVERY message?

Is the timestamp ALWAYS in the SAME format?

Please post the definition of the target object.
Jan 10 '07 #5
Yes, I pasted this straight from the log file. All messages begin with a
timestamp. The timestamp is always in the same format:
(\d{2}:\d{2}:\d{2})

I'm sorry, I don't understand what you mean by the definition of the target
object. I basically have a class with property setters and getters that I
would like to populate for each object. Is that what you would like to
see?

thanks for your help! :)
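
For illustration, the timestamp pattern quoted above can serve as the test for "does this line start a new message?". This is a minimal sketch in Python rather than .NET; `is_message_start` is a made-up helper name, and it assumes the timestamp always sits at the very beginning of the line, followed by whitespace.

```python
import re

# Assumption: every message opens with HH:MM:SS at column 0, followed by
# whitespace, as described in the post above.
MESSAGE_START = re.compile(r"^\d{2}:\d{2}:\d{2}\s")


def is_message_start(line: str) -> bool:
    """Return True if the line opens a new log message."""
    return MESSAGE_START.match(line) is not None


print(is_message_start("03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (="))  # True
print(is_message_start("Trunk Number: 3"))  # False
```

Anchoring the pattern with `^` matters here: field values such as phone numbers contain digits too, but only a line that begins with the timestamp should count as a message boundary.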
Jan 10 '07 #6
I would do something like the following.

1) Read through the entire log file and look for the start
of each section (each group of lines that starts with your time stamp).

2) Read each of these sections into an array list.
ArrayList1

3) Add each individual array list to another array list
ArrayList2 (you now have an array list of array lists)

4) Iterate through each element of ArrayList2, grab the inner
ArrayList1, and pass that to another function that parses the data
from that one segment.

5) Each segment can be gone through line by line and be
taken apart into its individual fields using a mixture of String.Split,
String.Substring, etc.

6) This parsing function would return a new instance of
your data structure with all fields populated as needed.

Keep in mind that this is simpler as long as each line in each section is
always in the same format and always in the same order. If they are not in
the same order (I tend to think they would be, since this appears to be a CDR
log from a PBX), you would perhaps need to use a regex to match each
line up with the right parsing code.

All this is not hard, just tricky and takes some work.
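
The numbered steps above can be sketched roughly as follows. This is Python rather than .NET, purely to show the shape of the approach; `Message`, `split_sections` and `parse_section` are hypothetical names (the thread never shows the real target class), and the second message in the sample data is invented. Only plain "Name: value" lines are stored as fields; the wrapped header and the line containing semicolons are skipped.

```python
import re
from dataclasses import dataclass, field

MESSAGE_START = re.compile(r"^(\d{2}:\d{2}:\d{2})\s")


@dataclass
class Message:
    # Hypothetical target class standing in for the poster's own.
    timestamp: str
    header: str
    fields: dict = field(default_factory=dict)


def split_sections(lines):
    """Steps 1-3: group the lines into one list per message (a list of lists)."""
    sections = []
    for line in lines:
        line = line.rstrip("\n")
        if MESSAGE_START.match(line):
            sections.append([line])       # a timestamp line opens a new section
        elif sections and line.strip():
            sections[-1].append(line)     # anything else belongs to the open section
    return sections


def parse_section(section):
    """Steps 5-6: build and return a new, populated Message for one section."""
    timestamp = MESSAGE_START.match(section[0]).group(1)
    msg = Message(timestamp=timestamp, header=section[0])
    for line in section[1:]:
        name, sep, value = line.partition(":")
        if sep and ";" not in line:       # keep only plain "Name: value" lines
            msg.fields[name.strip()] = value.strip()
    return msg


sample = [
    "03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 921 bytes",
    "Get Call (= Subtype 9); DialogueID: (2007) 000007d7;",
    "SendSeqNo: (1)00000001",
    "Trunk Number: 3",
    "ANI: 1111111111",
    # Invented second message, only to show where one section ends.
    "03:34:09 server12 Trace: [ 385]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 100 bytes",
    "Trunk Number: 4",
]

# Step 4: walk the list of lists and parse each section into its own object.
messages = [parse_section(section) for section in split_sections(sample)]
print(len(messages), messages[0].fields["ANI"])
```

The new object is created inside the parsing function, once per section, so there is no question of where in the read loop to instantiate it.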

Good luck. I had to do some of this just a short while ago and it was very
interesting.
Jan 10 '07 #7
Yes! - Any instance of that class is your target object.
Jan 10 '07 #8
Paulers,

How many people need this information in a day? In my view, that should be
the basis of your decision.

Whether it is OOP, Poop or whatever is less important.

Just my thought,

Cor
Jan 10 '07 #9
Thanks for the wonderful advice, I really appreciate it. I was wondering
if you had any pointers on how to extract just the lines of the
messages that I need so I can add them to the ArrayList. I can loop
through the file and grab all the first lines with the time stamp using a
regular expression, but what is the thought process behind obtaining the
rest of the lines of the message I am parsing without grabbing lines of
the next message?

thanks!
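
One possible way to think about this question, sketched in Python for illustration: a message has no explicit end marker, so it ends when the next timestamp line (or the end of the file) is reached. Keep one "current" buffer, append every non-timestamp line to it, and hand the buffer off when a new timestamp shows up. `iter_messages` and `is_type_101` are hypothetical names, the type 102 message and the second type 101 message are invented sample data, and the type check assumes the header really can wrap onto a second line as in the pasted sample.

```python
import re

MESSAGE_START = re.compile(r"^\d{2}:\d{2}:\d{2}\s")
TYPE_101 = re.compile(r"Message Type 101\)")


def iter_messages(lines):
    """Yield each message as a list of lines, closing it when the next timestamp appears."""
    current = None                        # no message is open yet
    for line in lines:
        line = line.rstrip("\n")
        if MESSAGE_START.match(line):
            if current is not None:
                yield current             # the previous message is now complete
            current = [line]              # start collecting the new message
        elif current is not None and line.strip():
            current.append(line)          # continuation line of the open message
    if current is not None:
        yield current                     # the last message has no successor


def is_type_101(message_lines):
    # The header may wrap across two lines, so search the joined text.
    return TYPE_101.search(" ".join(message_lines[:2])) is not None


log = [
    "03:34:06 server12 Trace: [ 384]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 921 bytes",
    "ANI: 1111111111",
    "DNIS: 8000025225",
    # The next two messages are invented for illustration.
    "03:34:07 server12 Trace: [ 385]abox1->bbox1: Control Message (=",
    "Message Type 102); Message Length 40 bytes",
    "Status: OK",
    "03:34:08 server12 Trace: [ 386]abox1->bbox1: Control Message (=",
    "Message Type 101); Message Length 921 bytes",
    "ANI: 2222222222",
]

wanted = [m for m in iter_messages(log) if is_type_101(m)]
print(len(wanted))
```

Because the boundary test only looks at the current line, the lines of the next message never end up in the buffer of the previous one, and unwanted message types are simply filtered out after each message is complete.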

Ray Cassick wrote:
I would do something like the following.

Jan 10 '07 #10
