
Programmatic Alteration of Internal DTD Subset

Chris W
#1: Oct 10 '08
Hi All,

I have hundreds of small XML files of the form (extraneous stuff removed):

<?xml version="1.0"?>
<!DOCTYPE page PUBLIC "-//LOCAL//DTD PAGE 0.1//EN" "page.dtd">
<page>
<graphic boardno="entityname1" />
<graphic boardno="entityname2" />
</page>

that I would like to process into this form:

<?xml version="1.0"?>
<!DOCTYPE page [
<!ENTITY entityname1 SYSTEM "entityname1.gif" NDATA gif>
<!ENTITY entityname2 SYSTEM "entityname2.gif" NDATA gif>
<!NOTATION gif SYSTEM "image/gif">
]>
<page>
<graphic boardno="entityname1" />
<graphic boardno="entityname2" />
</page>

That is, I'd like to load each file, find all the boardno attributes,
insert an ENTITY declaration, insert a NOTATION declaration, and write
the result to a file. The XML markup is unchanged, just the internal
DTD is altered. Finding the boardno attributes in a DOM is trivial, but
manipulating the internal DTD subset and getting it to file is eluding me.
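
For illustration, the steps described above might be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive solution: the standard DOM interfaces expose DocumentType as read-only, so the sketch reads the boardno values with a parser but builds the new internal subset as a string and swaps it in for the old DOCTYPE declaration. The names rebuild_doctype and convert_file are hypothetical, as are the assumptions that every entity maps to "<name>.gif" and that no identifier in the old DOCTYPE contains ">" or "[".

```python
import re
from pathlib import Path
import xml.etree.ElementTree as ET

# Matches one DOCTYPE declaration, with or without an internal subset.
# Assumption: the public/system identifiers contain no '>' or '['.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^>\[]*(\[.*?\]\s*)?>", re.DOTALL)


def rebuild_doctype(xml_text, attr="boardno", ext="gif", mime="image/gif"):
    """Return xml_text with its DOCTYPE replaced by an internal subset
    declaring one unparsed entity per distinct `attr` value."""
    root = ET.fromstring(xml_text)

    # Collect attribute values in document order, skipping repeats.
    names = []
    for elem in root.iter():
        value = elem.get(attr)
        if value and value not in names:
            names.append(value)

    decls = [f'<!ENTITY {n} SYSTEM "{n}.{ext}" NDATA {ext}>' for n in names]
    decls.append(f'<!NOTATION {ext} SYSTEM "{mime}">')
    # Assumption: the root element is not namespaced, so root.tag is its name.
    doctype = f"<!DOCTYPE {root.tag} [\n" + "\n".join(decls) + "\n]>"

    # A function replacement avoids backslash processing in `doctype`.
    new_text, count = DOCTYPE_RE.subn(lambda m: doctype, xml_text, count=1)
    if count == 0:
        raise ValueError("no DOCTYPE declaration found")
    return new_text


def convert_file(src, dst):
    """Read src, rewrite its DOCTYPE, and write the result to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(rebuild_doctype(text), encoding="utf-8")
```

Everything outside the DOCTYPE declaration is copied through untouched, so the element markup stays byte-for-byte the same. A batch run could loop over Path(".").glob("*.xml") and call convert_file on each file.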

Apart from doing the DTD manipulation as a text file, are there any suggested
tool sets or approaches? Perl, Python, Java, whatever.

Regards,
Chris W



Mukul Gandhi
#2: Oct 11 '08


I explored a similar issue some time back.

You can look at my findings at

http://gandhimukul.tripod.com/xml/xml.html

Please see item no. 6.

Regards,
Mukul

Chris W
#3: Oct 11 '08


Mukul Gandhi wrote:
> I explored a similar issue some time back.
>
> You can look at my findings at
>
> http://gandhimukul.tripod.com/xml/xml.html
>
> Please see item no. 6.
>
> Regards,
> Mukul

Thank you, sir. Most helpful.

Chris W