Hi,
I'd like to ask how XML parsers should handle attributes whose value
contains &quot; entities. I know XML allows both single and
double quotes as attribute value delimiters. That's clear.
But how should a parser react in this situation?
I have a COORDSYS element with a string attribute whose value
contains many &quot; entities:
<COORDSYS
string="GEOGCS[&quot;GCS_WGS_1984&quot;,DATUM[&quot;WGS84&quot;,SPHEROID[&quot;WGS84&quot;,6378137,298.257223563]],PRIMEM[&quot;Greenwich&quot;,0],UNIT[&quot;Degree&quot;,0.0174532925199433]]"/>
So, when I read it into a DOM and, after some operations, try to save it to
a file, the parser replaces the double-quote value delimiters with single
quotes, as follows:
<COORDSYS
string='GEOGCS[&quot;GCS_WGS_1984&quot;,DATUM[&quot;WGS84&quot;,SPHEROID[&quot;WGS84&quot;,6378137,298.257223563]],PRIMEM[&quot;Greenwich&quot;,0],UNIT[&quot;Degree&quot;,0.0174532925199433]]'/>
Please explain how a parser is expected to handle this element in a
save operation.
Best regards
--
Mateusz Loskot http://mateusz.loskot.net
"Mateusz Loskot" <ma*****@loskot.net> wrote:
I'd like to ask how XML parsers should handle attributes which consists of &quot; entity as value.
As data that contains the ASCII quotation mark.
I have COORDSYS element with string attribute which consists of value with many &quot; entities:
OK.
So, when I read it to DOM and after some operations I try to save it to file parsers replaces double-quote value terminators to single-quote as follows:
That's external to XML parsing. You are not processing XML any more but
data constructed by parsing an XML document and representing it as a tree.
What happens then depends on the tools you use. Most probably the internal
representation does not contain the enclosing quotation marks or the entity
references but only the parsed attribute values as strings. When you later output
the data in some format, perhaps linearizing it as XML, the results depend
on how you do that.
If all occurrences of ASCII quote and ASCII apostrophe in the attribute
values are "escaped" using entity or character references, it does not
matter whether you use quotes or apostrophes as delimiters when converting
the data back to XML format. (Naturally you need to use matching
delimiters, i.e. the same character as opening and as closing delimiter.)
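
The point about delimiters can be checked with a small sketch. It is written in Python rather than against TinyXML, and `escape_attr` and `write_element` are hypothetical helper names introduced only for this illustration. Assuming every quote and apostrophe in the value is escaped, either delimiter should parse back to the same string:

```python
import xml.etree.ElementTree as ET


def escape_attr(value: str) -> str:
    """Escape markup characters and both quote characters in an attribute value."""
    # Hypothetical helper: '&' must be replaced first so that the
    # entities produced by the later replacements are not escaped again.
    return (value.replace("&", "&amp;")
                 .replace("<", "&lt;")
                 .replace(">", "&gt;")
                 .replace('"', "&quot;")
                 .replace("'", "&apos;"))


def write_element(tag: str, name: str, value: str, delimiter: str) -> str:
    """Serialize an empty element with one attribute, using the given delimiter."""
    if delimiter not in ('"', "'"):
        raise ValueError("delimiter must be a quote or an apostrophe")
    # The same character is used as opening and closing delimiter.
    return f"<{tag} {name}={delimiter}{escape_attr(value)}{delimiter}/>"


value = 'GEOGCS["GCS_WGS_1984",UNIT["Degree",0.0174532925199433]]'
with_quotes = write_element("COORDSYS", "string", value, '"')
with_apostrophes = write_element("COORDSYS", "string", value, "'")

# The two serializations differ as text but carry the same attribute value.
parsed_a = ET.fromstring(with_quotes).get("string")
parsed_b = ET.fromstring(with_apostrophes).get("string")
```

Escaping both characters unconditionally is the simplest choice here; a serializer could instead escape only the character that matches the chosen delimiter.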
--
Yucca, http://www.cs.tut.fi/~jkorpela/
Jukka K. Korpela wrote:
"Mateusz Loskot" <ma*****@loskot.net> wrote:
So, when I read it to DOM and after some operations I try to save it to file parsers replaces double-quote value terminators to single-quote as follows:
That's external to XML parsing. You are not processing XML any more but data constructed by parsing an XML document and representing it as a tree.
Yes, I know
What happens then depends on the tools you use.
Yes, I use TinyXML DOM parser.
Most probably the internal representation does not contain the enclosing quotation marks or the entity references but only the parsed attribute values as strings. When you later output the data in some format, perhaps linearizing it as XML, the results depend on how you do that.
I did some investigation and now I know the internals of TinyXML. During
a save operation, TinyXML checks whether the attribute value contains a
double-quote character ("), and if so
it encloses the attribute value in single quotes ('). Certainly, that's
correct from the XML spec's point of view.
The check is made simply by calling (let's say, a function) find('\"') on
the attribute value.
TinyXML can be compiled in, let's say, C style, in which case it uses its own
string class, or with STL support, in which case it uses std::string.
When TinyXML is compiled in C style, all &quot; entities are
"visible" to the parser as double quotes, so if you printf the value of my
'string' attribute as TinyXML holds it, you get
double quotes instead of &quot; entities. But when TinyXML is compiled
with STL support, everything works fine: TinyXML holds the 'string'
attribute with &quot; entities and does not convert them to double quotes
internally.
Here is longer story with some source code: http://sourceforge.net/forum/forum.p...orum_id=172103
I'm not sure if this approach is correct. I'm also not sure if this is
a TinyXML bug. That's why I've asked this question.
I'm going to discuss this further with the TinyXML development team.
Thanks a lot
--
Mateusz Loskot http://mateusz.loskot.net
"Mateusz Loskot" <ma*****@loskot.net> wrote:
During Save operation TinyXML checks if attribute value contains double-quote character (") then it encloses attribute value in single-quotes ('). Certainly, it's correct from XML spec point of view.
It is, but if the attribute value contains _both_ an ASCII quotation
mark " _and_ an ASCII apostrophe ' (which is admittedly rare), then
either of them _must_ be "escaped".
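
As an illustration of this rule, Python's standard library function `xml.sax.saxutils.quoteattr` follows a similar strategy: it switches to apostrophe delimiters when the value contains a quotation mark, and escapes the quotation marks when the value contains both characters. This is Python's behaviour, shown only as a sketch of the idea; it says nothing about how TinyXML handles the case where both characters occur.

```python
from xml.sax.saxutils import quoteattr

# Only a quotation mark in the value: apostrophes are used as delimiters.
only_quote = quoteattr('say "hi"')

# Only an apostrophe in the value: quotation marks are used as delimiters.
only_apostrophe = quoteattr("it's")

# Both characters in the value: quotation marks are used as delimiters
# and every quotation mark inside the value is escaped as &quot;.
both = quoteattr('say "it\'s"')

for serialized in (only_quote, only_apostrophe, both):
    print(serialized)
```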
I'm not sure if this approach is correct.
I still don't know what the problem or question is about. You are saying
that the output format is correct. The internal format is not really an XML
issue and mostly a practical question: you need to know the internal format
in order to play with it.
What we _can_ say is that in processing XML data, &quot; and " (assuming a
context where " may appear) must be treated as identical. The distinction
should normally be lost in parsing, but if it is preserved in the internal
format, it should not affect processing of the data as XML. (The
distinction could be retained e.g. in order to be able to print out the
original XML source verbatim for some purpose.)
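
A minimal sketch of that equivalence, assuming Python's `xml.dom.minidom` as the DOM implementation (not TinyXML) and a shortened version of the COORDSYS value from the original question: the entity form and the literal form should yield the same attribute string once parsed.

```python
from xml.dom import minidom

# Same attribute value written two ways: with &quot; entities inside
# quotation-mark delimiters, and with literal quotation marks inside
# apostrophe delimiters.
escaped = ('<COORDSYS string="DATUM[&quot;WGS84&quot;,'
           'SPHEROID[&quot;WGS84&quot;,6378137,298.257223563]]"/>')
literal = ("<COORDSYS string='DATUM[\"WGS84\","
           "SPHEROID[\"WGS84\",6378137,298.257223563]]'/>")

value_escaped = minidom.parseString(escaped).documentElement.getAttribute("string")
value_literal = minidom.parseString(literal).documentElement.getAttribute("string")

# After parsing, the entity references are gone; a search for the
# quotation mark character finds the same matches in both values.
quote_count = value_escaped.count('"')
same_value = value_escaped == value_literal
```

A parser that kept the entity text in its internal string would make `same_value` false, which is the kind of build-dependent difference described for TinyXML earlier in the thread.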
--
Yucca, http://www.cs.tut.fi/~jkorpela/
In article <Xn************ *************** **@193.229.0.31 >,
Jukka K. Korpela <jk******@cs.tut.fi> wrote:
It is, but if the attribute value contains _both_ an ASCII quotation mark " _and_ an ASCII apostrophe ' (which is admittedly rare)
Not that rare: in an XSLT stylesheet an XPath may well contain a
string containing a quote. If you want an XPath string containing
both you're stuck!
-- Richard
Jukka K. Korpela wrote:
"Mateusz Loskot" <ma*****@loskot.net> wrote:
I'm not sure if this approach is correct.
I still don't know what the problem or question is about. You are saying that the output format is correct. The internal format is not really an XML issue and mostly a practical question: you need to know the internal format in order to play with it.
What we _can_ say is that in processing XML data, &quot; and " (assuming a context where " may appear) must be treated as identical.
Yes, I understand it. The problem seems to be more technical and
implementation related: http://sourceforge.net/forum/forum.p...orum_id=172103
You can see that the TinyXML parser works differently depending on whether
it is built in C style or with C++ STL support.
We can be sure that, with any XML parser, if I search an XML element
for " then both &quot; and " (double quotes) are expected to be
matched.
Cheers
--
Mateusz Loskot http://mateusz.loskot.net