[XSLT] Vanishing tab character in attribute value

Hello,

when using this "identity" processing sheet:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="iso-8859-1" />

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
on this XML instance document:

<?xml version="1.0" encoding="iso-8859-1" ?>
<element attr="a&#9;tab" />
the result is:

<?xml version="1.0" encoding="iso-8859-1"?>
<element attr="a tab"/>
                ^^
Tabulator(0x9)--^^

i.e. the numeric character reference from the input document is not
recreated at serialization time; the serializer simply writes the literal
character, a tab, in its place.

Unfortunately, this means that re-applying the identity stylesheet above
to this output replaces the tab character with a single space character,
according to the Attribute-Value Normalization rules
(<http://www.w3.org/TR/REC-xml#AVNormalize>):

<?xml version="1.0" encoding="iso-8859-1"?>
<element attr="a tab"/>
                ^
Space(0x20)-----^

In short: the above "identity" stylesheet does not deliver a semantically
identical document once its output is serialized. If it did, the tab
character in the attribute value would have to be written as a numeric
character reference, so that a later parser would recreate the tab
character in the attribute value (instead of normalizing it away to a
single space).
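A small diagnostic stylesheet can make the difference visible. The sketch below is not part of the original post; it assumes any XSLT 1.0 processor and simply reports, for each attribute, whether the parsed value still contains a tab. The tab is written as &#9; inside the test expression for the same reason discussed above: a literal tab there would itself be normalized to a space by the parser reading the stylesheet.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Visit every attribute in the document and report on its value -->
  <xsl:template match="/">
    <xsl:for-each select="//@*">
      <xsl:value-of select="name()"/>
      <xsl:text>: </xsl:text>
      <xsl:choose>
        <!-- &#9; survives parsing as a real tab; a literal tab would not -->
        <xsl:when test="contains(., '&#9;')">contains a tab</xsl:when>
        <xsl:otherwise>no tab</xsl:otherwise>
      </xsl:choose>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run against the original input (with the character reference), the attribute should be reported as containing a tab. Run against the serialized output shown above (with the literal tab), it should be reported as containing none, since the parser normalizes the literal tab to a space before XSLT ever sees the value.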

I'm using the Xalan J2 2.5D1 XSLT processor. Is this a bug in that
implementation (or rather in its XML serializer)?

Regards,
Christian
--
Christian Roth
Mac.Java.Pasta.Sopranosax.Single.
Jul 20 '05 #1

"Christian Roth" <ro**@visualclick.de> wrote in message
news:1g000ah.cqv8s26o5x8gN%ro**@visualclick.de...
> In short: The above "identity" processing sheet does not deliver a
> semantically identical document.


XSLT is a language defining a transformation on a *tree*. It takes as input
a tree (regardless of what the parser did with the input stream of
characters) and produces as its output again a tree (then a serializer will
produce a string output -- only if necessary).

The identity transformation is really an identity -- assuming that we are
transforming a tree into a tree.

Your problem arises due to the fact that you serialize the result of the
first identity transformation and feed the second transformation with
something you consider different.

The solution is not to serialize the intermediate results.
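As a sketch of what keeping everything in memory could look like in XSLT 1.0, the two passes below run inside one stylesheet, with the first result held in a variable and converted back to a node-set. This assumes the processor supports the EXSLT exsl:node-set() extension function (XSLT 1.0 itself has no standard way to do this); the mode names pass1 and pass2 are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    extension-element-prefixes="exsl">
  <xsl:output method="xml" encoding="iso-8859-1"/>

  <xsl:template match="/">
    <!-- Pass 1: the result is held as a result tree fragment, never serialized -->
    <xsl:variable name="pass1">
      <xsl:apply-templates select="node()" mode="pass1"/>
    </xsl:variable>
    <!-- Pass 2: operates directly on the in-memory tree from pass 1 -->
    <xsl:apply-templates select="exsl:node-set($pass1)/node()" mode="pass2"/>
  </xsl:template>

  <!-- Identity transformation, first pass -->
  <xsl:template match="@*|node()" mode="pass1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="pass1"/>
    </xsl:copy>
  </xsl:template>

  <!-- Identity transformation, second pass -->
  <xsl:template match="@*|node()" mode="pass2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="pass2"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The tab survives between the two passes because no parser is involved there. The final output is still serialized, so whoever reads that document back faces the same question about how the tab was written.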

Also, XSLT 2.0 has so-called "character maps".
(http://www.w3.org/TR/xslt20/#character-maps)
Using a character map one can specify that a specific character should be
substituted by a specified string of characters during serialization.
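A minimal sketch of such a character map, assuming an XSLT 2.0 processor; the map name tab-as-reference is made up for illustration. It asks the serializer to write every tab as the string &#9; instead of the literal character.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Attach the character map to the serializer -->
  <xsl:output method="xml" encoding="iso-8859-1"
              use-character-maps="tab-as-reference"/>

  <!-- Write each tab character as the five-character string &#9; -->
  <xsl:character-map name="tab-as-reference">
    <xsl:output-character character="&#9;" string="&amp;#9;"/>
  </xsl:character-map>

  <!-- Identity transformation -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

A character map applies to text nodes as well as attribute values, so tabs in element content would also come out as references. That is harmless for a parser, but worth knowing before mapping further characters such as line feeds.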

=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL
Jul 20 '05 #2
Dimitre Novatchev <dn********@yahoo.com> wrote:
> Your problem arises due to the fact that you serialize the result of the
> first identity transformation and feed the second transformation with
> something you consider different.
>
> The solution is not to serialize the intermediate results.


I probably misphrased my statement: it should not say that the XSLT
processor itself is at fault, but specifically the XML serializer (which
I think you are pointing to as well).

However, it seems that the default serializers of many widely available
XSLT processors are what I consider buggy: the serializer ought to escape
the tab character as a numeric character reference when serializing an
attribute value. Otherwise, when the XML is read back, the internal tree
differs from what it was just before serialization, which I consider the
serializer's fault.

Am I overlooking something?

Regards, Christian.
--
Christian Roth
Mac.Java.Pasta.Sopranosax.Single.
Jul 20 '05 #3
In article <1g*************************@visualclick.de>,
Christian Roth <ro**@visualclick.de> wrote:
> I probably misphrased my statement, it should not read that the XSLT
> processor itself is at fault, but specifically the XML serializer (which
> I think you are pointing to, as well).


The definition of the XML output method in the XSLT spec - which
basically says that if you read it in again you should get the same
data model - implies that the serializer should use a character
reference in this case.

So although it is a bug in the serialization, it is still a violation
of the XSLT spec.

-- Richard
--
Spam filter: to mail me from a .com/.net site, put my surname in the headers.

FreeBSD rules!
Jul 20 '05 #4
>> I probably misphrased my statement, it should not read that the XSLT
>> processor itself is at fault, but specifically the XML serializer (which
>> I think you are pointing to, as well).
>
> The definition of the XML output method in the XSLT spec - which
> basically says that if you read it in again you should get the same
> data model - implies that the serializer should use a character
> reference in this case.
>
> So although it is a bug in the serialization, it is still a violation
> of the XSLT spec.


Perhaps "the same data model" means that two representations of an XML
document, one of which is the normalised version of the other, count as
the same?

Is there any clarity about this?
=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL
Jul 20 '05 #5
I know I made this point in an earlier reply, but I believe that a tab
character reference is semantically the same as a tab character, i.e. any
XML parser or serialiser is free to replace one with the other.

"Dimitre Novatchev" <dn********@yahoo.com> wrote in message
news:bi************@ID-152440.news.uni-berlin.de...
>>> I probably misphrased my statement, it should not read that the XSLT
>>> processor itself is at fault, but specifically the XML serializer (which
>>> I think you are pointing to, as well).
>>
>> The definition of the XML output method in the XSLT spec - which
>> basically says that if you read it in again you should get the same
>> data model - implies that the serializer should use a character
>> reference in this case.
>>
>> So although it is a bug in the serialization, it is still a violation
>> of the XSLT spec.
>
> Probably "the same data model" means that two representations of an xml
> document, one of which is the normalised version of the other, are the
> same?
>
> Is there any clarity about this?
>
> =====
> Cheers,
>
> Dimitre Novatchev.
> http://fxsl.sourceforge.net/ -- the home of FXSL
Jul 20 '05 #6
Andy Fish <aj****@blueyonder.co.uk> wrote:
> I know I made this point in an earlier reply, but I believe that a tab
> entity reference is semantically the same as a tab character i.e. any XML
> parser or serialiser is free to replace one by the other.


No, it is not, at least not in attribute values of XML 1.0 conforming
documents. See <http://www.w3.org/TR/REC-xml#AVNormalize> for the reason
why; for the problem that arises if the two are treated as
interchangeable, see my original post in this thread.

Regards, Christian.
--
Christian Roth
Mac.Java.Pasta.Sopranosax.Single.
Jul 20 '05 #7
Dimitre Novatchev <dn********@yahoo.com> wrote:
> Probably "the same data model" means that two representations of an xml
> document, one of which is the normalised version of the other, are the same?


The problem is that both of these documents have already been normalized
according to the XML 1.0 normalization rules for attributes, and even
then they do not match. So they are different, which is a bug, IMO.

--
Christian Roth
Mac.Java.Pasta.Sopranosax.Single.
Jul 20 '05 #8
Richard Tobin <ri*****@cogsci.ed.ac.uk> wrote:
> So although it is a bug in the serialization, it is still a violation
> of the XSLT spec.


Thank you for the confirmation. So I'll proceed with filing a bug against
Xalan J2 2.5D1.

Regards, Christian.
--
Christian Roth
Mac.Java.Pasta.Sopranosax.Single.
Jul 20 '05 #9
Yes -- a few serialize to &#9; (e.g. xsltProc)
=====
Cheers,

Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL
"Bob Foster" <bo********@comcast.net> wrote in message
news:8Ld1b.170443$Oz4.44405@rwcrnsc54...
> Did anyone think to try this with other XSLT processors, just to get an
> idea how other implementations interpret the standard?
>
> Bob
>
> "Christian Roth" <ro**@visualclick.de> wrote in message
> news:1g01ih8.1odbz691nlcz9mN%ro**@visualclick.de...
>> Richard Tobin <ri*****@cogsci.ed.ac.uk> wrote:
>>> So although it is a bug in the serialization, it is still a violation
>>> of the XSLT spec.
>>
>> Thank you for the confirmation. So I'll proceed filing a bug against
>> Xalan J2 2.5D1.
>>
>> Regards, Christian.
>> --
>> Christian Roth
>> Mac.Java.Pasta.Sopranosax.Single.


Jul 20 '05 #10
