handling xml embedded within xml

I have a log file that contains a dump of an XML message:

.... rubbish
///asd laksj aslf
<nif_DEBUG time="Fri, 16 May 2008 13:40:17, 330">
<?xml version="1.0" encoding="UTF-8"?>
<ns>
<PDQ Lang="fr-FR" ID="XM;1928">content</PDQ>
</ns>
</nif_DEBUG>
... more junk
.... then more xml
""")
This example is, of course, abbreviated.

I want to write a streaming filter that throws out all the junk
and returns each complete XML message as a clean string. Ideally I
also want to select only the messages I am interested in.

For example, the output from the above would be:
<?xml version="1.0" encoding="UTF-8"?>
<ns>
<PDQ Lang="fr-FR" ID="XM;1928">content</PDQ>
</ns>

There are two problems:
1. Clearing away junk that is nothing like XML.
2. Handling the <?xml declaration that lies inside the other XML
tags.

The first I can handle relatively simply by reading through the string
until I find what looks like a valid XML tag. I can then pass the rest
on to an XML parser such as xml.sax. However, the parser then raises:
XMLSyntaxError: XML declaration allowed only at the start of the
document
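
For illustration, here is a minimal sketch of that failure and one possible workaround. It assumes Python 3 and uses the standard-library xml.etree.ElementTree rather than the parser that produced the message above, so the exact exception type and wording differ. The helper name strip_inner_declarations and the regular expression are hypothetical and are not part of any library.

```python
import re
import xml.etree.ElementTree as ET

# Matches an XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
XML_DECL = re.compile(r"<\?xml\b[^>]*\?>")

def strip_inner_declarations(text):
    """Remove every XML declaration so the wrapper element can be parsed."""
    return XML_DECL.sub("", text)

wrapped = (
    '<nif_DEBUG time="Fri, 16 May 2008 13:40:17, 330">\n'
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<ns>\n"
    '<PDQ Lang="fr-FR" ID="XM;1928">content</PDQ>\n'
    "</ns>\n"
    "</nif_DEBUG>\n"
)

# Parsing the wrapper as-is fails because of the embedded declaration.
try:
    ET.fromstring(wrapped)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# With the declaration removed, the same text parses normally.
root = ET.fromstring(strip_inner_declarations(wrapped))
pdq = root.find("ns/PDQ")
```

This only sidesteps the error; it still requires the whole wrapper element to be available before parsing, which may not suit a stream.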

I would like a more forgiving parser that reports bad XML through a
callback, from which I can simply tell it to carry on.
Bear in mind also that I probably will not have the end of the stream
when processing starts.
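
One way to approach the requirements above without a forgiving parser is to cut the messages out of the stream as text first and parse them afterwards. The sketch below assumes Python 3 and assumes that every message is wrapped in a <nif_DEBUG ...> ... </nif_DEBUG> pair, as in the sample; the names extract_messages and keep are hypothetical. It buffers incoming chunks, so it does not need the end of the stream up front.

```python
import re

# Assumption: every message sits inside <nif_DEBUG ...> ... </nif_DEBUG>,
# as in the sample log; other wrapper names would need another pattern.
OPEN_TAG = "<nif_DEBUG"
MESSAGE = re.compile(r"<nif_DEBUG\b[^>]*>\s*(.*?)\s*</nif_DEBUG>", re.DOTALL)

def extract_messages(chunks, keep=lambda message: True):
    """Yield each complete embedded message from an iterable of text chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        consumed = 0
        for match in MESSAGE.finditer(buffer):
            consumed = match.end()
            if keep(match.group(1)):
                yield match.group(1)
        buffer = buffer[consumed:]
        start = buffer.find(OPEN_TAG)
        if start == -1:
            # Only junk so far: keep a short tail in case the opening
            # tag is split across two chunks.
            buffer = buffer[-(len(OPEN_TAG) - 1):]
        else:
            # An unfinished wrapper: drop the junk in front of it.
            buffer = buffer[start:]

log = (
    ".... rubbish\n///asd laksj aslf\n"
    '<nif_DEBUG time="Fri, 16 May 2008 13:40:17, 330">\n'
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<ns>\n"
    '<PDQ Lang="fr-FR" ID="XM;1928">content</PDQ>\n'
    "</ns>\n"
    "</nif_DEBUG>\n"
    "... more junk\n"
)

# Simulate a stream that arrives in small, arbitrary pieces.
chunks = [log[i:i + 7] for i in range(0, len(log), 7)]
messages = list(extract_messages(chunks))
```

Each yielded string starts with its own <?xml declaration, so it can be handed to a parser as a standalone document. Passing a keep function, for example one that checks for a particular Lang value, filters the messages before they are yielded.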

All suggestions and pointers welcome
Andrew
Jun 27 '08 #1