Converting LF/FF delimited logs to XML w/ Python?

This is a very noob-ish question, so I apologize in advance, but I'm
hoping to get some input and advice before I get in over my head.

I'm trying to convert some log files from a formfeed- and
linefeed-delimited form into XML. I'd been thinking of using Python to
do this, but I'll be honest and say that I'm very inexperienced with
Python, so before I dive in I wanted to see whether some more
experienced minds thought I was choosing the right tool.

Basically, what I want to do is convert instant messaging logs
produced by CenterIM, which look like this (where "^L" represents ASCII
12, the formfeed character):

^L
IN
MSG
1190126325
1190126325
hi
^L
OUT
MSG
1190126383
1190126383
hello

To an XML-based format* like this:

<chat account="joeblow" service="AIM" version="0.4">
<message sender="janedoe" time="1190126325">hi</message>
<message sender="joeblow" time="1190126383">hello</message>
</chat>

Obviously there's information in the bottom example not present in the
top (account names, protocol), but I'll grab those from the file name or
prompt the user.

Given that I'd be learning as I go along, is Python a good tool for
doing this? (Am I totally insane to be trying this as a beginner?) And
if so, where should I start? I'd like to avoid massive
wheel-reinvention if at all possible.

I'm not afraid to RTFM, but there's a lot of information around on Python
and I'm not sure what's most relevant. Suggestions on what to read,
books to buy, etc., are all welcome.

Thanks in advance,
Kadin.

* For the curious, this is sort of a poor attempt at the "Universal Log
Format" as used by Adium on OS X.

--
http://kadin.sdf-us.org/
Dec 5 '07 #1
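For what it's worth, the record layout in the question (a formfeed, then direction, type, two timestamp lines, then the message text) can probably be split apart with plain string methods. The following is only a rough sketch under a few assumptions that the post does not confirm: that the message body never contains a formfeed itself, that everything after the fourth line is message text, and that the first of the two timestamps is the one to keep. The names `Record` and `parse_log` are made up for illustration.

```python
from typing import NamedTuple


class Record(NamedTuple):
    direction: str   # "IN" or "OUT"
    kind: str        # e.g. "MSG"
    timestamp: int   # first of the two timestamp lines (assumption)
    text: str        # message body, may span several lines


def parse_log(data: str) -> list[Record]:
    """Split formfeed-delimited log text into records."""
    records = []
    # Each record starts with a formfeed (ASCII 12, written "\f" in Python).
    for chunk in data.split("\f"):
        lines = chunk.strip("\n").split("\n")
        if len(lines) < 4:
            continue  # empty chunk before the first formfeed, or a truncated record
        direction, kind, first_ts, _second_ts = lines[:4]
        # Assumption: everything after the fourth line is the message text.
        records.append(Record(direction, kind, int(first_ts), "\n".join(lines[4:])))
    return records


sample = (
    "\f\nIN\nMSG\n1190126325\n1190126325\nhi\n"
    "\f\nOUT\nMSG\n1190126383\n1190126383\nhello\n"
)
for record in parse_log(sample):
    print(record)
```

Splitting on the formfeed first and on linefeeds second means a multi-line message stays inside one record, which a line-by-line loop would have to track by hand.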
I've used lxml and DOM/minidom. Both took me a while to figure out, and
I still don't always understand them. Anyway, lxml is similar to the
method Chris mentioned.

http://docs.python.org/lib/module-xml.dom.html
http://www.oreilly.com/catalog/pytho...pter/ch01.html
http://pyxml.sourceforge.net/topics/

Mike
Dec 5 '07 #2
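As a rough illustration of the minidom route mentioned in the reply, the sketch below builds the `<chat>` / `<message>` structure from the question using the standard-library `xml.dom.minidom` module. It is not taken from the thread: the function name `build_chat_xml` and the `buddy` parameter are hypothetical, and it assumes that "IN" messages come from the other party and "OUT" messages from the account owner, which matches the sample but is not stated outright.

```python
from xml.dom.minidom import Document


def build_chat_xml(messages, account, service, buddy):
    """Build a <chat> document from (direction, timestamp, text) tuples."""
    doc = Document()
    chat = doc.createElement("chat")
    chat.setAttribute("account", account)
    chat.setAttribute("service", service)
    chat.setAttribute("version", "0.4")  # value copied from the example in the question
    doc.appendChild(chat)
    for direction, timestamp, text in messages:
        message = doc.createElement("message")
        # Assumption: "IN" is the other party, "OUT" is the account owner.
        message.setAttribute("sender", buddy if direction == "IN" else account)
        message.setAttribute("time", str(timestamp))
        # Text nodes hold raw text; characters such as < and & are escaped on output.
        message.appendChild(doc.createTextNode(text))
        chat.appendChild(message)
    return doc


messages = [("IN", 1190126325, "hi"), ("OUT", 1190126383, "hello")]
doc = build_chat_xml(messages, account="joeblow", service="AIM", buddy="janedoe")
print(doc.documentElement.toprettyxml(indent="  "))
```

Going through a DOM (or lxml, or ElementTree) rather than pasting strings together means the library handles escaping, so a message containing "<" or "&" should not break the output file.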
