DTD or Schema -- Ignore Undefined Tags

I am wondering if there is a way to use a DTD or Schema to instruct an
XML parser to ignore tags that are not defined.

That is, if my list of acceptable tags is <body> and <content>, then in
the following example:

<body>
We may have some text <b>and some <u>other tags</u></b>
<content>but I want the text and undefined tags to be part of the
text-node
of the body tag.
</content>
</body>

So the tree would be like:
<body>
#Text
<content>
#Text
</content>
</body>

I want the first text node to contain "We may have some text <b>and
some <u>other tags</u></b>"

Is there some way of doing this with Schemas or DTDs? Or perhaps using
a stylesheet?

Using a stylesheet I would need to find a way of matching all tags
that aren't in a certain list and then rewriting them with &lt;
entities, I suppose, but I'm really not sure what the best way to do
this is.

Any help is appreciated,

Greg

Dec 14 '06 #1
gr************@gmail.com wrote:
> Using a stylesheet I would need to find a way of matching all tags
> that aren't in a certain list and then rewriting them with &lt;
> entities, I suppose
Not a good solution. Elements are semantically meaningful; &lt;foo&gt;
is NOT the same thing as <foo>.
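
To make the difference concrete, here is a small sketch in Python using the standard library's xml.etree.ElementTree. Python is only my choice for illustration (the thread names no language), and the sample strings are made up. It parses the same content once as real markup and once escaped, and shows that only the first has a <b> element in its tree.

```python
import xml.etree.ElementTree as ET

# Real markup: <b> becomes an element node in the tree.
as_markup = ET.fromstring("<body>some <b>bold</b> text</body>")

# Escaped markup: the parser sees only characters, never a <b> element.
as_text = ET.fromstring("<body>some &lt;b&gt;bold&lt;/b&gt; text</body>")

print([child.tag for child in as_markup])  # one child element named "b"
print([child.tag for child in as_text])    # no child elements at all
print(as_text.text)                        # literal characters, angle brackets included
```

Any downstream tool (a stylesheet, a query, a renderer) would treat these two documents differently, which is the point being made above.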

If you're working with schemas, you can use xsd:any with lax validation
to indicate that the contents of certain elements should be accepted
even if not valid.

Another alternative, of course, is to insist only on well-formed
documents and not attempt to validate them.
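
As a rough illustration of the well-formed-only route, the sketch below feeds the example from the question to Python's xml.etree.ElementTree, which checks well-formedness but does not validate against a DTD or schema. The choice of Python and the shortened text inside <content> are my assumptions, not something from the thread.

```python
import xml.etree.ElementTree as ET

# The example from the question, with the tags written out in full.
doc = (
    "<body>We may have some text <b>and some <u>other tags</u></b>"
    "<content>text inside content</content></body>"
)

# No DTD or schema is consulted, so the undeclared <b> and <u>
# are accepted like any other element.
root = ET.fromstring(doc)
tags = [el.tag for el in root.iter()]
print(tags)

# A document that is not well-formed is still rejected.
try:
    ET.fromstring("<body><b>unclosed</body>")
    well_formed = True
except ET.ParseError:
    well_formed = False
print(well_formed)
```

Note that <b> and <u> still show up as elements in the tree; skipping validation accepts them, it does not fold them into the surrounding text node.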

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Dec 14 '06 #2
gr************@gmail.com wrote:
> I am wondering if there is a way to use a DTD or Schema to instruct an
> XML parser to ignore tags that are not defined.
No. A schema or DTD is for doing exactly the reverse: enforcing the use
only of elements that have been declared.

BTW elements, not "tags": see http://xml.silmaril.ie/authors/makeup/
> That is, if my list of acceptable tags is <body> and <content>, then in
> the following example:
>
> <body>
> We may have some text <b>and some <u>other tags</u></b>
> <content>but I want the text and undefined tags to be part of the
> text-node
> of the body tag.
> </content>
> </body>
>
> So the tree would be like:
> <body>
> #Text
> <content>
> #Text
> </content>
> </body>
If you want to do this, process the XML in non-validated mode, just
well-formed but with no DTD or schema.
> I want the first text node to contain "We may have some text <b>and
> some <u>other tags</u></b>"
>
> Is there some way of doing this with Schemas or DTDs? Or perhaps using
> a stylesheet?
XSLT is your friend.
> Using a stylesheet I would need to find a way of matching all tags
> that aren't in a certain list and then rewriting them with &lt;
> entities, I suppose, but I'm really not sure what the best way to do
> this is.
Whoah! This is a different question entirely. Are you implying that you
still want to *keep* the otherwise unrecognised element markup? Your
example above implied that you wanted to discard it.
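
The two readings (discard the unrecognised markup, or keep it as literal text) can be compared side by side. The sketch below is not XSLT; it is the same idea written in Python with xml.etree.ElementTree, purely as an illustration. The names ALLOWED, _add_text and flatten are hypothetical, and the whitelist is taken from the question. One known limitation: an allowed element nested inside an unrecognised one is flattened along with its parent.

```python
import copy
import xml.etree.ElementTree as ET

ALLOWED = {"body", "content"}  # hypothetical whitelist, from the question


def _add_text(dst, text):
    """Append text at the current end of dst (its text or last child's tail)."""
    if not text:
        return
    if len(dst):
        dst[-1].tail = (dst[-1].tail or "") + text
    else:
        dst.text = (dst.text or "") + text


def flatten(src, keep_markup):
    """Return a copy of src in which elements outside ALLOWED become text."""
    dst = ET.Element(src.tag, src.attrib)
    _add_text(dst, src.text)
    for child in src:
        if child.tag in ALLOWED:
            dst.append(flatten(child, keep_markup))
        elif keep_markup:
            bare = copy.copy(child)  # shallow copy, so the original keeps its tail
            bare.tail = None         # tostring() would otherwise append the tail
            _add_text(dst, ET.tostring(bare, encoding="unicode"))
        else:
            _add_text(dst, "".join(child.itertext()))
        _add_text(dst, child.tail)
    return dst


root = ET.fromstring(
    "<body>We may have some text <b>and some <u>other tags</u></b>"
    "<content>inner</content></body>"
)
kept = flatten(root, keep_markup=True)
dropped = flatten(root, keep_markup=False)
print(kept.text)     # markup preserved as characters in the text node
print(dropped.text)  # markup discarded, character data kept
```

Serialising `kept` writes the angle brackets out as &lt;...&gt;, so the keep-the-markup variant ends up as exactly the escaped form that both replies advise against.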

You definitely don't want to fiddle with making them all &lt;...&gt; --
that way madness lies. See http://xml.silmaril.ie/authors/html/

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Dec 14 '06 #3
