Function to return a valid element name

Hi!
Is there any function that converts a string containing characters
that are invalid for use in an element name to a valid one?

Thanks,
Andreas

Feb 26 '07 #1
ad****@cs.tu-berlin.de wrote:
> Is there any function that converts a string containing characters
> that are invalid for use in an element name to a valid one?

Which programming language/framework are you using? The Microsoft .NET
framework has
XmlConvert.EncodeName
<http://msdn2.microsoft.com/en-us/library/system.xml.xmlconvert.encodename.aspx>
--

Martin Honnen
http://JavaScript.FAQTs.com/
Feb 26 '07 #2
On 26 Feb., 13:30, Martin Honnen <mahotr...@yahoo.de> wrote:
> adu...@cs.tu-berlin.de wrote:
>> Is there any function that converts a string containing characters
>> that are invalid for use in an element name to a valid one?
>
> Which programming language/framework are you using? The Microsoft .NET
> framework has
> XmlConvert.EncodeName
> <http://msdn2.microsoft.com/en-us/library/system.xml.xmlconvert.encode...>
>
> --
> Martin Honnen
> http://JavaScript.FAQTs.com/

Aah yes, sorry, I have not been precise. I am looking for an XML
function like translate() or replace().

Feb 26 '07 #3
ad****@cs.tu-berlin.de wrote:
>> Is there any function that converts a string containing characters
>> that are invalid for use in an element name to a valid one?
> Aah yes, sorry, I have not been precise. I am looking for an XML
> function like translate() or replace().

In that case, I believe the answer is... translate(), or implement your
own recursive string processing if single-character substitutions aren't
sufficient for you. There's nothing standardized for this purpose, since
it isn't something commonly done.
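
For anyone who wants to see what single-character substitution amounts to, here is a minimal sketch in Python (not XSLT) that mimics the XPath 1.0 translate() rules: each character of the second argument is replaced by the character at the same position in the third, and characters with no counterpart are dropped. The function name xpath_translate and the sample string are made up for illustration.

```python
def xpath_translate(text: str, from_chars: str, to_chars: str) -> str:
    """Mimic XPath 1.0 translate(): per-character replacement or removal."""
    table = {}
    for index, char in enumerate(from_chars):
        if ord(char) in table:
            continue  # a repeated character keeps its first mapping
        # Characters with no counterpart in to_chars are deleted.
        table[ord(char)] = to_chars[index] if index < len(to_chars) else None
    return text.translate(table)


# Space becomes "_"; "(" and ")" have no counterpart and are removed.
print(xpath_translate("unit price (EUR)", " ()", "_"))  # prints unit_price_EUR
```

Anything that needs a multi-character replacement falls outside this model, which is where the recursive string processing comes in.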
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 26 '07 #4
On 26 Feb., 14:06, Joe Kesselman <keshlam-nos...@comcast.net> wrote:
> adu...@cs.tu-berlin.de wrote:
>>> Is there any function that converts a string containing characters
>>> that are invalid for use in an element name to a valid one?
>> Aah yes, sorry, I have not been precise. I am looking for an XML
>> function like translate() or replace().
>
> In that case, I believe the answer is... translate(), or implement your
> own recursive string processing if single-character substitutions aren't
> sufficient for you. There's nothing standardized for this purpose, since
> it isn't something commonly done.
>
> --
> () ASCII Ribbon Campaign | Joe Kesselman
> /\ Stamp out HTML e-mail! | System architexture and kinetic poetry

Okay, thank you anyway.

Feb 26 '07 #5
One more observation: There are a heck of a lot of characters that are
valid in element names (just about any alphanumeric in just about any
language, plus some punctuation), since XML's defined in terms of
Unicode. Simply checking whether all the characters in an element name
are legal is something of a pain; figuring out what to replace the
(many!) other Unicode characters with is going to be (ahem) interesting.
The simplest solution would probably be to invent some sort of escaping
syntax (and then, as usual with such things, also escape the
escape-introduction sequence so the conversion is reliably unique and
reversible).

Unless you control ALL names in the document, that does introduce the
risk that a name created by someone else will contain something that
looks like an escape sequence.
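
To make the escaping idea concrete, here is a rough Python sketch of one possible reversible scheme. Everything in it is an assumption for illustration: the `_xHHHH_` escape form, the function names, and especially the deliberately conservative ASCII-only notion of a "safe" name character, which is far narrower than what XML really allows. The underscore is always escaped, so every underscore in the output belongs to an escape sequence and decoding stays unambiguous.

```python
import re
import string

# Simplifying assumption: only ASCII letters may start a name, and only
# ASCII letters, digits, "." and "-" may follow. "_" is reserved as the
# escape introducer, so it is never passed through unescaped.
NAME_START = set(string.ascii_letters)
NAME_CHARS = NAME_START | set(string.digits) | {".", "-"}


def escape_name(raw: str) -> str:
    """Replace every character outside the safe set with _xHHHH_."""
    out = []
    for position, char in enumerate(raw):
        allowed = NAME_START if position == 0 else NAME_CHARS
        out.append(char if char in allowed else f"_x{ord(char):04X}_")
    return "".join(out)


def unescape_name(escaped: str) -> str:
    """Turn each _xHHHH_ sequence back into the character it stands for."""
    return re.sub(r"_x([0-9A-F]{4,6})_",
                  lambda match: chr(int(match.group(1), 16)), escaped)


name = escape_name("unit price (EUR)")
print(name)  # prints unit_x0020_price_x0020__x0028_EUR_x0029_
assert unescape_name(name) == "unit price (EUR)"
```

This only round-trips names that went through escape_name in the first place; a name written by someone else that happens to contain `_x0020_` would be decoded too, which is exactly the risk described above.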

BUT... frankly, you really don't *WANT* element names being made up on
the fly, since they're what describes the structure of your document.
Consider putting your non-XML descriptor in _content_, e.g. an attribute
value, rather than an element name. Among other things, XML already has
the ability to escape characters in text content.
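
As a small illustration of the attribute approach, here is a sketch using Python's standard xml.etree.ElementTree. The element name "result", the attribute name "name" and the sample descriptor are hypothetical; the point is only that the element name stays fixed while the free-form text travels as an attribute value, which the serializer escapes for you.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor that could never be used as an element name.
descriptor = 'unit price <EUR> & "tax"'

# Fixed, known element name; the descriptor is carried as content.
entry = ET.Element("result", {"name": descriptor})
entry.text = "42"

serialized = ET.tostring(entry, encoding="unicode")
print(serialized)

# Parsing the serialized form gives the original string back unchanged.
assert ET.fromstring(serialized).get("name") == descriptor
```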

(You still won't be able to use every possible character, even after
escaping it, if you're working in XML 1.0. I believe XML 1.1 -- which is
rarely used -- expanded the legal character set, but you may not want to
make support for 1.1 a prerequisite. The alternative is to fall back to
inventing your own escaping mechanism, e.g. by doing a base-64 encoding
of the UTF-8 data.)
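
A minimal Python sketch of the base-64 fallback, assuming the encoded string ends up as text or attribute content rather than as a name (the base-64 alphabet includes "+", "/" and "=", which are not legal in element names). The function names and the sample string are made up for illustration.

```python
import base64


def encode_for_xml(raw: str) -> str:
    """Encode arbitrary text as base-64 over its UTF-8 bytes."""
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")


def decode_from_xml(encoded: str) -> str:
    """Reverse encode_for_xml()."""
    return base64.b64decode(encoded).decode("utf-8")


# U+0007 and U+000C are control characters that XML 1.0 cannot carry,
# not even as character references.
raw = "bell\x07 and form feed\x0c"
encoded = encode_for_xml(raw)
assert decode_from_xml(encoded) == raw
```

The cost is that the content is no longer human-readable in the XML, so this is probably only worth it when such characters really occur.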

In other words: What problem are you really trying to solve, and is the
rather ugly kluge you proposed really necessary and/or sufficient?
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 27 '07 #6
Hi!
Thank you for your extended thoughts on this. As you might have
guessed, I'm pretty new to XML. In my case, one tool in a toolchain can
export its results as an XML file. Until now this feature has not been
used, but now we want to use it and import the file into another tool.
As you can imagine, the output is not compatible with what the second
tool can import, so I'm currently writing an XSL transformation. In
order to do this, some element values will become element names in the
output XML. Meanwhile I have found that the problem I was facing when I
posted this is not characters that are illegal in XML (except some
spaces), but the fact that the second tool doesn't accept a whole
bunch of characters used in the source XML. Consequently it seems to
me that translate() is my choice. If you can advise otherwise, please
tell me!
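
The kind of transformation described here could be prototyped roughly as follows. This is a Python sketch rather than XSLT, and the input layout (entry/key/value), the root name "results" and the list of rejected characters are all invented for illustration, since the real export format is not shown in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical export from the first tool.
SOURCE = """<results>
  <entry><key>max speed</key><value>120</value></entry>
  <entry><key>fuel (l/100km)</key><value>6.5</value></entry>
</results>"""

# Assumed rule set: spaces become "_", the other rejected characters are dropped.
REJECTED = str.maketrans({" ": "_", "(": None, ")": None, "/": None})


def convert(source_xml: str) -> ET.Element:
    """Turn each <key> value into the name of an element holding <value>."""
    source = ET.fromstring(source_xml)
    target = ET.Element("results")
    for entry in source.findall("entry"):
        name = entry.findtext("key", default="").translate(REJECTED)
        child = ET.SubElement(target, name)
        child.text = entry.findtext("value")
    return target


print(ET.tostring(convert(SOURCE), encoding="unicode"))
```

Note that ElementTree does not check whether the generated names are legal, so a key that starts with a digit, or is empty after cleaning, would still need separate handling.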

Regards,
Andreas

Feb 27 '07 #7
