Handling &quot; entity in attribute value

Hi,

I'd like to ask how XML parsers should handle attributes whose value
contains the &quot; entity. I know XML allows both single and double
quotes as attribute value delimiters. That's clear.
But how should a parser react in this situation:

I have a COORDSYS element with a string attribute whose value contains
many &quot; entities:

<COORDSYS
string="GEOGCS[&quot;GCS_WGS_1984&quot;,DATUM[&quot;WGS84&quot;,SPHEROID[&quot;WGS84&quot;,6378137,298.257223563]],PRIMEM[&quot;Greenwich&quot;,0],UNIT[&quot;Degree&quot;,0.0174532925199433]]"/>

So, when I read it into a DOM and, after some operations, try to save it
to a file, the parser replaces the double-quote value delimiters with
single quotes, as follows:

<COORDSYS
string='GEOGCS[&quot;GCS_WGS_1984&quot;,DATUM[&quot;WGS84&quot;,SPHEROID[&quot;WGS84&quot;,6378137,298.257223563]],PRIMEM[&quot;Greenwich&quot;,0],UNIT[&quot;Degree&quot;,0.0174532925199433]]'/>

Please explain how a parser is expected to handle this element in a
save operation.

Best regards

--
Mateusz Loskot
http://mateusz.loskot.net

Oct 21 '05 #1
"Mateusz Loskot" <ma*****@loskot .net> wrote:
> I'd like to ask how XML parsers should handle attributes whose value
> contains the &quot; entity.

As data that contains the ASCII quotation mark.

> I have a COORDSYS element with a string attribute whose value contains
> many &quot; entities:

OK.

> So, when I read it into a DOM and, after some operations, try to save it
> to a file, the parser replaces the double-quote value delimiters with
> single quotes, as follows:


That's external to XML parsing. You are no longer processing XML but
data constructed by parsing an XML document and representing it as a tree.
What happens then depends on the tools you use. Most probably the internal
representation contains neither the enclosing quotation marks nor the entity
references, only the parsed attribute values as strings. When you later output
the data in some format, perhaps linearizing it as XML, the results depend
on how you do that.
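
A minimal sketch of that point, in Python rather than TinyXML. The standard-library xml.etree.ElementTree module stands in for the DOM tool, and the shortened COORDSYS value is made up for illustration. After parsing, the attribute value is an ordinary string; delimiters and entity references only reappear when a serializer writes the tree back out.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened input modelled on the COORDSYS element above.
source = '<COORDSYS string="GEOGCS[&quot;GCS_WGS_1984&quot;,DATUM[&quot;WGS84&quot;]]"/>'

element = ET.fromstring(source)

# The tree holds the parsed value as plain text: no enclosing
# delimiters and no entity references, just literal quotation marks.
value = element.get("string")
print(value)

# Serializing is a separate step. The serializer chooses the delimiters
# and the escaping itself, independent of how the input was spelled.
print(ET.tostring(element, encoding="unicode"))
```

Whatever delimiters a given serializer picks, parsing its output again should give back the same attribute value.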

If all occurrences of ASCII quote and ASCII apostrophe in the attribute
values are "escaped" using entity or character references, it does not
matter whether you use quotes or apostrophes as delimiters when converting
the data back to XML format. (Naturally you need to use matching
delimiters, i.e. the same character as opening and as closing delimiter.)

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Oct 21 '05 #2
Jukka K. Korpela wrote:
> "Mateusz Loskot" <ma*****@loskot .net> wrote:
>> So, when I read it into a DOM and, after some operations, try to save it
>> to a file, the parser replaces the double-quote value delimiters with
>> single quotes, as follows:
>
> That's external to XML parsing. You are not processing XML any more but
> data constructed by parsing an XML document and representing it as a tree.

Yes, I know.

> What happens then depends on the tools you use.

Yes, I use the TinyXML DOM parser.

> Most probably the internal
> representation does not contain the enclosing quotation marks or the entity
> references but the parsed attribute values as strings. When you later output
> the data in some format, perhaps linearizing it as XML, the results depend
> on how you do that.


I did some investigation and now I know the internals of TinyXML. During
the Save operation TinyXML checks whether the attribute value contains a
double-quote character ("); if it does, it encloses the attribute value in
single quotes ('). Certainly, that's correct from the XML spec point of view.
The check is simply made using (let's say) a find('\"') call on the
attribute value.
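
The check described above can be sketched roughly as follows. This is not TinyXML's code: the function names escape_attribute and write_attribute are hypothetical, the sketch is in Python, and it assumes the value passed in is the already-decoded string (literal quotes, no entity references).

```python
def escape_attribute(value: str, delimiter: str) -> str:
    """Escape markup characters plus whichever quote is the delimiter."""
    escaped = value.replace("&", "&amp;").replace("<", "&lt;")
    if delimiter == '"':
        return escaped.replace('"', "&quot;")
    return escaped.replace("'", "&apos;")


def write_attribute(name: str, value: str) -> str:
    """Serialize one attribute, picking the delimiter from the value."""
    # Same idea as the find('"') check: a value containing a double
    # quote is enclosed in apostrophes, anything else in double quotes.
    delimiter = "'" if '"' in value else '"'
    return f"{name}={delimiter}{escape_attribute(value, delimiter)}{delimiter}"


print(write_attribute("string", 'GEOGCS["WGS84"]'))
print(write_attribute("note", 'it\'s "quoted"'))
```

Escaping the chosen delimiter inside the value keeps the output well-formed even when the value contains both kinds of quote.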

TinyXML can be compiled in, let's say, C-style, in which case it uses its
own string class, or with STL support, in which case it uses std::string.
When TinyXML is compiled in C-style, all &quot; entities are
"visible" to the parser as double quotes, so if you printf the value of my
'string' attribute as it is held by TinyXML, you get
double quotes instead of &quot; entities. But when TinyXML is compiled
with STL support, everything works fine: TinyXML holds the 'string'
attribute with &quot; entities and does not convert them to double quotes
internally.

Here is the longer story, with some source code:
http://sourceforge.net/forum/forum.p...orum_id=172103

I'm not sure if this approach is correct. I'm also not sure if this is
a TinyXML bug. That's why I've asked this question.
I'm going to discuss this further with the TinyXML development team.

Thanks a lot

--
Mateusz Loskot
http://mateusz.loskot.net

Oct 21 '05 #3
"Mateusz Loskot" <ma*****@loskot .net> wrote:
> During the Save operation TinyXML checks whether the attribute value
> contains a double-quote character ("); if it does, it encloses the
> attribute value in single quotes ('). Certainly, that's correct from
> the XML spec point of view.

It is, but if the attribute value contains _both_ an ASCII quotation
mark " _and_ an ASCII apostrophe ' (which is admittedly rare), then
at least one of them (the one used as the delimiter) _must_ be "escaped".

> I'm not sure if this approach is correct.


I still don't know what the problem or question is about. You are saying
that the output format is correct. The internal format is not really an XML
issue and mostly a practical question: you need to know the internal format
in order to play with it.

What we _can_ say is that in processing XML data, &quot; and " (assuming a
context where " may appear) must be treated as identical. The distinction
should normally be lost in parsing, but if it is preserved in the internal
format, it should not affect processing of the data as XML. (The
distinction could be retained e.g. in order to be able to print out the
original XML source verbatim for some purpose.)
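
As an illustration of "must be treated as identical", here is a small check using Python's standard-library xml.etree.ElementTree as a stand-in parser (the element and attribute names are made up). Four spellings of the same attribute value are parsed and compared.

```python
import xml.etree.ElementTree as ET

variants = [
    '<e a="say &quot;hi&quot;"/>',   # predefined entity reference
    '<e a="say &#34;hi&#34;"/>',     # decimal character reference
    '<e a="say &#x22;hi&#x22;"/>',   # hexadecimal character reference
    "<e a='say \"hi\"'/>",           # literal quotes, apostrophe delimiters
]

# Each spelling should parse to the same string: say "hi"
values = [ET.fromstring(doc).get("a") for doc in variants]
print(values)
```

A search for " in the parsed value therefore matches regardless of which spelling the source document used.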

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Oct 23 '05 #4
In article <Xn************ *************** **@193.229.0.31 >,
Jukka K. Korpela <jk******@cs.tu t.fi> wrote:
> It is, but if the attribute value contains _both_ an ASCII quotation
> mark " _and_ an ASCII apostrophe ' (which is admittedly rare)


Not that rare: in an XSLT stylesheet an XPath may well contain a
string containing a quote. If you want an XPath string containing
both you're stuck!
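
One workaround that is often used, though it is not mentioned in this thread: XPath 1.0 string literals have no escape mechanism, but the concat() function can glue together pieces that each use a different delimiter. A sketch in Python, with a hypothetical helper name xpath_literal:

```python
def xpath_literal(text: str) -> str:
    """Return an XPath 1.0 expression that evaluates to `text`."""
    if '"' not in text:
        return f'"{text}"'
    if "'" not in text:
        return f"'{text}'"
    # Both kinds of quote occur: split at each double quote, wrap the
    # pieces in double quotes, and put '"' (in apostrophes) between them.
    pieces = []
    for index, part in enumerate(text.split('"')):
        if index:
            pieces.append("'\"'")
        if part:
            pieces.append(f'"{part}"')
    return "concat(" + ", ".join(pieces) + ")"


print(xpath_literal("a\"b'c"))
```

When the resulting expression is placed in an XSLT attribute, it still has to be escaped for XML like any other attribute value.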

-- Richard
Oct 23 '05 #5

Jukka K. Korpela wrote:
> "Mateusz Loskot" <ma*****@loskot .net> wrote:
>> I'm not sure if this approach is correct.
>
> I still don't know what the problem or question is about. You are saying
> that the output format is correct. The internal format is not really an XML
> issue and mostly a practical question: you need to know the internal format
> in order to play with it.
>
> What we _can_ say is that in processing XML data, &quot; and " (assuming a
> context where " may appear) must be treated as identical.


Yes, I understand that. The problem seems to be more technical and
implementation-related:

http://sourceforge.net/forum/forum.p...orum_id=172103

You can see that the TinyXML parser behaves differently depending on
whether it is built in C-style or with STL support.

What we can be sure of is that, with any XML parser, if I search an XML
element for ", both &quot; and a literal " (double quote) are expected
to match.

Cheers

--
Mateusz Loskot
http://mateusz.loskot.net

Oct 23 '05 #6
