Python & XML & DTD (warning: noob attack!)

Hello all,

I have an XML file with an internal DTD which looks roughly like this:

<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (node)*>
<!ELEMENT node (description, info, node*)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT info EMPTY>
<!ATTLIST info
text CDATA #REQUIRED
>
]>
<root>
<node>
<description>node 1</description>
<info text="info 1" />
<node>
<description>node 1-1</description>
<info text="info 1-1" />
</node>
</node>
<node>
<description>node 2</description>
<info text="info 2" />
<node>
<description>node 2-1</description>
<info text="info 2-1" />
</node>
<node>
<description>node 2-2</description>
<info text="info 2-2" />
</node>
</node>
</root>

I want to parse this file into my application, modify the data (this includes
maybe creating and/or deleting nodes), and write it back into the file --
including the DTD. (It doesn't necessarily need validation, though.)

I tried xml.dom.ext.PrettyPrint, but it produces only

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE root>
<root>
...
</root>

actually lacking the document type definition.

Any help is appreciated!

Thanks in advance & cheers =)
*igor*
Jul 18 '05 #1
Igor Fedorow wrote:

> I have an XML file with an internal DTD which looks roughly like this:
> [snip]
> I want to parse this file into my application, modify the data (this includes
> maybe creating and/or deleting nodes), and write it back into the file --
> including the DTD. (It doesn't necessarily need validation, though.)
>
> I tried xml.dom.ext.PrettyPrint, but it produces only
> [snip]
> actually lacking the document type definition.
>
> Any help is appreciated!


Unfortunately I don't know of any way you could generate the DTD again,
and I've never seen a package which supports what you ask for (not that
it isn't possible, mind you).

On the other hand, are you sure you need the DTD? We use XML in
dozens of ways and absolutely have never benefited from attempts
to use DTDs, and don't appear to suffer from the lack thereof.

Also, aren't DTDs sort of considered either obsolete or at least
vastly inferior to the newer approaches such as XML Schema, or both?

So my recommendation is to ditch the DTD and see if any problems
arise as a result.

-Peter
Jul 18 '05 #2
On Thu, 29 Jan 2004 15:44:25 +0100, Peter Hansen wrote:
> Igor Fedorow wrote:
>
>> I have an XML file with an internal DTD which looks roughly like this:
>> [snip]
>> I want to parse this file into my application, modify the data (this
>> includes maybe creating and/or deleting nodes), and write it back into the
>> file -- including the DTD. (It doesn't necessarily need validation,
>> though.)
>>
>> I tried xml.dom.ext.PrettyPrint, but it produces only [snip] actually
>> lacking the document type definition.
>>
>> Any help is appreciated!
>
> Unfortunately I don't know of any way you could generate the DTD again, and
> I've never seen a package which supports what you ask for (not that it isn't
> possible, mind you).
>
> On the other hand, are you sure you need the DTD? We use XML in dozens of
> ways and absolutely have never benefited from attempts to use DTDs, and
> don't appear to suffer from the lack thereof.
>
> Also, aren't DTDs sort of considered either obsolete or at least vastly
> inferior to the newer approaches such as XML Schema, or both?
>
> So my recommendation is to ditch the DTD and see if any problems arise as a
> result.
>
> -Peter


Actually, I don't really *need* it, but I would simply like to have it -- which
obviously isn't possible...

Anyway, thank you for your help!

Cheers =)
*igor*
Jul 18 '05 #3
On 2004-01-29 07:04:44 -0500, Igor Fedorow <ig*********@obda.net> said:
> I tried xml.dom.ext.PrettyPrint, but it produces only
>
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE root>
> <root>
> ...
> </root>
>
> actually lacking the document type definition.


Why not simply use that, then replace the <!DOCTYPE root> with the DTD? I'm
sure you can parse it out from the original file.
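
A rough sketch of that splice in Python, using only the `re` module. The function name `restore_doctype` and both patterns are made up for illustration. The patterns are deliberately naive: they assume a DOCTYPE with no PUBLIC or SYSTEM identifier, and an internal subset that never contains a `]` followed by `>` (for example inside a comment).

```python
import re

# Naive patterns (assumptions, see the note above): a DOCTYPE with a
# bracketed internal subset, and a bare DOCTYPE with only a root name.
DOCTYPE_WITH_SUBSET = re.compile(r"<!DOCTYPE\s+[^\s\[>]+\s*\[.*?\]\s*>", re.DOTALL)
BARE_DOCTYPE = re.compile(r"<!DOCTYPE\s+[^\s\[>]+\s*>")


def restore_doctype(original_text, serialized_text):
    """Copy the DOCTYPE (with its internal subset) from original_text
    over the bare DOCTYPE found in serialized_text."""
    match = DOCTYPE_WITH_SUBSET.search(original_text)
    if match is None:
        # Nothing to restore: the original had no internal subset.
        return serialized_text
    doctype = match.group(0)
    # A function replacement keeps backslashes in the DTD from being
    # interpreted as regex escape sequences.
    return BARE_DOCTYPE.sub(lambda _m: doctype, serialized_text, count=1)


original = (
    '<?xml version="1.0"?>\n'
    "<!DOCTYPE root [\n"
    "<!ELEMENT root (node)*>\n"
    "]>\n"
    "<root/>\n"
)
serialized = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    "<!DOCTYPE root>\n"
    "<root>\n"
    "</root>\n"
)
result = restore_doctype(original, serialized)
print(result)
```

For anything beyond a simple DTD like the one in the question, reading the subset through a parser is likely to be more robust than a regular expression.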

--
"Remember, no matter where you go, there you are."
-Buckaroo Banzai
Kevin Ballard
http://kevin.sb.org
Jul 18 '05 #4
Peter Hansen <pe***@engcorp.com> wrote:
> Unfortunately I don't know of any way you could generate the DTD again

It is possible to preserve the internal subset in DOM Level 3. You can
read it from the property DocumentType.internalSubset, and it will be
included in documents serialised by an LSSerializer.
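
As a minimal sketch of the same idea without pxdom: the standard library's `xml.dom.minidom` is not a DOM Level 3 implementation and has no LSSerializer, but it does expose `DocumentType.internalSubset` and writes it back out from `toxml()`. The shortened sample document and the `add_node` helper below are hypothetical, made up for illustration.

```python
from xml.dom import minidom

# Shortened version of the document from the question (illustrative only).
SOURCE = """<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (node)*>
<!ELEMENT node (description, info, node*)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT info EMPTY>
<!ATTLIST info
text CDATA #REQUIRED
>
]>
<root>
<node>
<description>node 1</description>
<info text="info 1" />
</node>
</root>
"""


def add_node(doc, parent, description, info_text):
    """Hypothetical helper: append a <node> with the children the DTD expects."""
    node = doc.createElement("node")
    desc = doc.createElement("description")
    desc.appendChild(doc.createTextNode(description))
    info = doc.createElement("info")
    info.setAttribute("text", info_text)
    node.appendChild(desc)
    node.appendChild(info)
    parent.appendChild(node)
    return node


doc = minidom.parseString(SOURCE)

# The raw text between "[" and "]" of the DOCTYPE declaration.
subset = doc.doctype.internalSubset

# Modify the tree, then serialise; the DOCTYPE is written with its subset.
add_node(doc, doc.documentElement, "node 2", "info 2")
output = doc.toxml()
print(output)
```

No validation happens here: minidom only carries the subset through as text, so nothing checks the modified tree against the DTD.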

It is not, however, possible to write to the internalSubset, and you can't
create a new DocumentType object with a non-empty internalSubset, for some
reason. So the only standard way to copy an internalSubset would be to make
the new document by parsing something with the same value, e.g.:

# Copy the internal subset by parsing a stub document that carries it.
dtd = oldDocument.doctype.internalSubset
# 1 is DOMImplementationLS.MODE_SYNCHRONOUS; None means no schema type.
parser = oldDocument.implementation.createLSParser(1, None)
lsInput = oldDocument.implementation.createLSInput()
lsInput.stringData = '<!DOCTYPE x [%s]><x/>' % dtd
newDocument = parser.parse(lsInput)

> I've never seen a package which supports what you ask for

Plug time: the only package I know of to support DOM Level 3 is my own:

http://www.doxdesk.com/software/py/pxdom.html

Currently this is based on the November 2003 CR spec; there have been a
number of fixes and changes to L3 functionality since, but I'm waiting for
W3C to publish the next draft (presumably Proposed Recommendation) before
releasing 1.0.

> Also, aren't DTDs sort of considered either obsolete or at least
> vastly inferior to the newer approaches such as XML Schema, or both?


Certainly they have their drawbacks: they're namespace-ignorant, not
flexible enough for some purposes, and they're a legacy bag on the side of
XML rather than something built on top of it in XML syntax.

Still, they're well-understood and widely supported, and simpler to learn
than Schema at least.

--
Andrew Clover
mailto:an*@doxdesk.com
http://www.doxdesk.com/
Jul 18 '05 #5
