XSLT entity problem in attribute

Namaste, Y'all!

I've got a problem I'm hoping one of you can solve. My stylesheet has
the following bit of code:

<xsl:template match="web|WEB">
<xsl:variable name="url">
<xsl:value-of select='translate(.," &#x9;&#xa;&#xd;", "")'/>
</xsl:variable>
<a><xsl:attribute name="href"><xsl:if
test="not(starts-with($url,'http'))">http://</xsl:if><xsl:value-of
select="$url"/></xsl:attribute>
<xsl:value-of select="$url"/>
</a>
</xsl:template>

Given the following XML:

<othinfo>
<italic>New York Times</italic>, (Available at <WEB
URL="http://www.mat.
jhu.edu/&#8764;sormani/affirm-impact.html">http://www.mat.jhu.ed
u/&#8764;sormani/affirm-impact.html</WEB>)
</othinfo>

the resultant HTML is:
<em>New York Times</em>, (Available at <a
href="http://www.mat.jhu.edu/%E2%88%BC
sormani/affirm-impact.html">http://www.mat.jhu.edu/&sim;sormani/affirm-impact.ht
ml</a>)

As you can see, &#8764; is correctly translated to &sim; in the text
part of the resultant <a> element, but is for some reason translated to
%E2%88%BC in the href attribute. If I remove the translate function,
and just have
<xsl:variable name="url"><xsl:value-of select='.'/></xsl:variable>
the result is identical.

- Paul Lieberman

Mar 27 '06 #1
Bloody Viking wrote:
> Namaste, Y'all!
>
> I've got a problem I'm hoping one of you can solve. My stylesheet has
> the following bit of code:
>
> <xsl:template match="web|WEB">
> <xsl:variable name="url">
> <xsl:value-of select='translate(.," &#x9;&#xa;&#xd;", "")'/>
> </xsl:variable>

OK, so you're squashing out all spaces, TABs, CRs, and LFs.

> <a><xsl:attribute name="href"><xsl:if
> test="not(starts-with($url,'http'))">http://</xsl:if><xsl:value-of
> select="$url"/></xsl:attribute>
> <xsl:value-of select="$url"/>
> </a>
> </xsl:template>

OK, except I'm not clear why two innocuous slashes have been given
as numeric references. Makes no difference to the program, just
makes it harder to read.

> Given the following XML:

Is there a reason why some normal characters have been given as
numeric references here as well?

> <othinfo>
> <italic>New York Times</italic>, (Available at <WEB
> URL="http://www.mat.
> jhu.edu/&#8764;sormani/affirm-impact.html">http://www.mat.jhu.ed
> u/&#8764;sormani/affirm-impact.html</WEB>)
> </othinfo>

What is that 8764 doing there? It should be a tilde (&#x7E;).

> the resultant HTML is:
> <em>New York Times</em>, (Available at <a
> href="http://www.mat.jhu.edu/%E2%88%BC
> sormani/affirm-impact.html">http://www.mat.jhu.edu/&sim;sormani/affirm-impact.ht
> ml</a>)
>
> As you can see, &#8764; is correctly translated to &sim; in the text

Yes, but a &sim; is not what you want. You want a tilde there. A sim
is a math graphic character and has no place in a URI.

> part of the resultant <a> element, but is for some reason translated to
> %E2%88%BC in the href attribute.

That's because the 8764 shouldn't be there in the first place.

At a guess:

Your converter has re-encoded it as three bytes representing the
Unicode for a sim character because you can't use Unicode characters
in URIs (yet, wait until the Chinese proposal goes through :-)

It didn't convert it in the text because Unicode might be allowed
there, eventually (depending on the actual output encoding).

> If I remove the translate function,
> and just have
> <xsl:variable name="url"><xsl:value-of select='.'/></xsl:variable>
> the result is identical.

No, get rid of the 8764 from the input: it's just bad data. Or
translate it to a tilde character, which is correct. And maybe
turn those 47s back into slashes to make it easier to check.

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Mar 27 '06 #2
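
Peter's second suggestion, translating the stray character to a tilde, could be sketched in XSLT 1.0 roughly as follows. This is only an illustration, not a fix confirmed anywhere in the thread: it assumes every U+223C (&#8764;) inside a web/WEB element was meant to be "~", and the xsl:stylesheet wrapper and xsl:output settings are assumptions added so the fragment stands alone. It relies on translate() replacing a character that has a counterpart in the third argument and deleting any that do not.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed output settings; the original post does not show them. -->
  <xsl:output method="html" encoding="UTF-8"/>

  <xsl:template match="web|WEB">
    <!-- translate() pairs characters by position: U+223C (the 8764) is
         mapped to "~"; space, TAB, LF and CR have no counterpart in the
         third argument, so they are deleted. -->
    <xsl:variable name="url"
        select="translate(., '&#8764; &#x9;&#xa;&#xd;', '~')"/>
    <a>
      <xsl:attribute name="href">
        <!-- Prepend a scheme only when the text does not already have one. -->
        <xsl:if test="not(starts-with($url, 'http'))">http://</xsl:if>
        <xsl:value-of select="$url"/>
      </xsl:attribute>
      <xsl:value-of select="$url"/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input from the first post, both the href and the link text should then carry a plain ASCII tilde, which leaves the html output method nothing to escape. Cleaning the 8764 out of the source data, as Peter recommends first, remains the more thorough fix.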
In article <11*********************@u72g2000cwu.googlegroups.com>,
Bloody Viking <pa************@alum.mit.edu> wrote:
> As you can see, &#8764; is correctly translated to &sim; in the text
> part of the resultant <a> element, but is for some reason translated to
> %E2%88%BC in the href attribute.

This is what the HTML output method is supposed to do. The XSLT 1.0
spec says:

The html output method should escape non-ASCII characters in URI
attribute values using the method recommended in Section B.2.1 of
the HTML 4.0 Recommendation.

This should not be a problem for you: it represents the same
character. URIs are not allowed to contain non-ASCII characters, and
a web browser would do this translation itself if it were not already
done in the HTML.

-- Richard
Mar 27 '06 #3
In article <48************@individual.net>,
Peter Flynn <re*********@m.from.email.address> wrote:
> Your converter has re-encoded it as three bytes representing the
> Unicode for a sim character because you can't use Unicode characters
> in URIs (yet, wait until the Chinese proposal goes through :-)

No need to wait for a Chinese proposal, we already have IRIs:

http://www.ietf.org/rfc/rfc3987.txt

-- Richard
Mar 27 '06 #4
