XSLT: probably beginner-type question

Yo!

Mostly as a finger-exercise (and because I'm annoyed again and again at how
bad the existing solutions are), I'm hacking up a web-based forum (yes,
the 64832nd one, I know).

I want to allow some simplified HTML as input language, and use some xsl
to render that afterward. I do simple verification with a DTD (no
problem in that), but the rendering into html is where I'm stuck
currently.

The articles are stored as:
=====
<article>
<head>
<author>squiggle</author>
<title>Quack</title>
<date>2006-02-26 15:07:12</date>
</head>
<body>Says the Duck.</body>
</article>
=====

The body can be plain text, but mostly it should be simple HTML. I
decided on the following feature set: <em>, <tt>, <p>, <a href="">,
<img alt="" src=""> should be passed through unchanged. <title> should
be changed to <h3>, and <blockquote cite="url"> should be changed to
something like <div><blockquote cite="url">'the
quote'</blockquote><div><a href="url">url</a></div></div> if a URL is
present, or only the blockquote otherwise.

The whole should be wrapped within some <div>'s, showing article title,
author, date etc.

Now I've written some small piece of XSL:
=====
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>

<!-- title becomes h3 -->
<xsl:template match="title">
<h3><xsl:apply-templates/></h3>
</xsl:template>

<!-- blockquote gets visible citation -->
<xsl:template match="blockquote">
<xsl:element name="blockquote">
<xsl:attribute name="cite">
<xsl:value-of select="@cite"/>
</xsl:attribute>
<div class="text"><xsl:apply-templates/></div>
<div class="cite">
<xsl:element name="a">
<xsl:attribute name="href">
<xsl:value-of select="@cite"/>
</xsl:attribute>
<xsl:value-of select="@cite"/>
</xsl:element>
</div>
</xsl:element>
</xsl:template>

<!-- a, em, img, p, tt go through unchanged -->
<xsl:template match="a|em|img|p|tt">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

<!-- the complete head is stripped out -->
<xsl:template match="head">
</xsl:template>
</xsl:stylesheet>
=====
And I'm right now extracting article titles etc. manually from the perl
code, and calling the xsl only on the body element. But this is not
really what I want... The point where I'm stuck is with the rules
which part of the xsl is called at which point on which part of the xml
file. And also: I'll certainly need to filter the img and a tags,
because I *only* want to allow the attributes I've talked about above.
(OTOH this is already taken care of by the DTD, so maybe I just won't
care.)

Pointers welcome, but most of the tutorials I've found so far stopped
where things got really interesting. Or maybe I just looked at the
problem from the wrong angle...

anyway, thanks in advance
-- vbi
--
Today is Prickle-Prickle, the 26th day of Discord in the YOLD 3172

Apr 9 '06 #1

Adrian von Bidder wrote:
> The point where I'm stuck is with the rules
> which part of the xsl is called at which point on which part of the xml
> file.

Well, first of all, if you want HTML output then use
<xsl:output method="html" />
(If you left the method unspecified and the result had an html root
element, the html method would be chosen automatically.)
As for the rules: processing starts at the root node, looks for a
matching template, and takes it from there. There are built-in templates
which simply process child nodes recursively.
If you want to control that, you can write a template for article and,
inside it, process only the nodes you want to process, e.g.
<xsl:template match="article">
<xsl:apply-templates select="head" />
<xsl:apply-templates select="body" />
</xsl:template>
Of course you can put literal result elements (e.g. div) anywhere in
the template too.
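
To make that concrete, here is a minimal sketch of an article template that wraps the output in literal div elements and renders the head instead of stripping it. The class names and the choice of h2 for the article title are assumptions made for illustration; they are not from the original post.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Wrap each article; the class names are invented for this sketch -->
  <xsl:template match="article">
    <div class="article">
      <xsl:apply-templates select="head"/>
      <div class="body">
        <xsl:apply-templates select="body"/>
      </div>
    </div>
  </xsl:template>

  <!-- Render the metadata. xsl:value-of is used on purpose so that
       head/title is not routed through a generic title-to-h3 template -->
  <xsl:template match="head">
    <div class="head">
      <h2><xsl:value-of select="title"/></h2>
      <span class="author"><xsl:value-of select="author"/></span>
      <xsl:text> </xsl:text>
      <span class="date"><xsl:value-of select="date"/></span>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

There is no template for the source body element here, so the built-in rule simply processes its children; templates for em, p, blockquote and so on can be added alongside these two.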
> And also: I'll certainly need to filter the img and a tags,
> because I *only* want to allow the attributes I've talked about above.


Example
<xsl:template match="img">
<xsl:copy>
<xsl:apply-templates select="@alt | @src" />
</xsl:copy>
</xsl:template>
(Then you need a template for attributes to copy themselves.)

<xsl:template match="a">
<xsl:copy>
<xsl:apply-templates select="@href | node()" />
</xsl:copy>
</xsl:template>
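
The attribute template mentioned in the parenthetical can be sketched as follows, shown together with the two element templates so the stylesheet is complete. Without it, the built-in rule for attributes would emit each attribute's value as text inside the element rather than as an attribute.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- img: keep only alt and src; every other attribute is dropped -->
  <xsl:template match="img">
    <xsl:copy>
      <xsl:apply-templates select="@alt | @src"/>
    </xsl:copy>
  </xsl:template>

  <!-- a: keep only href, then continue with the link's content -->
  <xsl:template match="a">
    <xsl:copy>
      <xsl:apply-templates select="@href | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any attribute that is explicitly selected copies itself -->
  <xsl:template match="@*">
    <xsl:copy/>
  </xsl:template>
</xsl:stylesheet>
```

Because the element templates select only the allowed attributes, the match="@*" template never sees the others, so it is safe to keep it generic.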
--

Martin Honnen
http://JavaScript.FAQTs.com/
Apr 9 '06 #2
Adrian von Bidder wrote:
> And I'm right now extracting article titles etc. manually from the perl
> code, and calling the xsl only on the body element. But this is not
> really what I want...

Define a template that matches "/" (the root of the document), and begin
processing from there ... for example, by invoking apply-templates
selecting //body, if that's what you want. (The root template is
generally also responsible for producing the root element of the output
document - the <html> element in this case.)
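
A root template along those lines might look like the sketch below. It assumes one article per input document, as in the sample, and the page skeleton is only a guess at what the forum would need.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Entry point: matches the document root, not the article element -->
  <xsl:template match="/">
    <html>
      <head>
        <!-- Pull the page title straight from the article's head -->
        <title><xsl:value-of select="article/head/title"/></title>
      </head>
      <body>
        <!-- Only the source body elements are processed from here on -->
        <xsl:apply-templates select="//body"/>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The html, head, title and body elements inside the template are literal result elements; they are written to the output and are never matched against templates themselves.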

> The point where I'm stuck is with the rules
> which part of the xsl is called at which point on which part of the xml
> file.

The rule is that apply-templates invokes the template whose match
attribute matches the node it's being applied to. There are a few
(deliberately) EXTREMELY primitive rules for breaking conflicts, but
mostly it's your responsibility to avoid conflicts and/or to
explicitly specify priority so the system knows which template to select.
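
As an illustration of those conflict rules, the sketch below has three templates that can all match the same a element. The default priorities in the comments are the ones XSLT 1.0 assigns; the rule that unwraps javascript: links is a hypothetical filter added for this example, not something requested in the thread.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- Plain name test, default priority 0:
       an a without href is unwrapped, only its content is kept -->
  <xsl:template match="a">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Name test with a predicate, default priority 0.5:
       beats match="a" whenever an href is present -->
  <xsl:template match="a[@href]">
    <a href="{@href}"><xsl:apply-templates/></a>
  </xsl:template>

  <!-- Also default priority 0.5, so it would tie with a[@href];
       the explicit priority makes the choice unambiguous -->
  <xsl:template match="a[starts-with(@href, 'javascript:')]" priority="1">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
```

Without the explicit priority, the last two templates would conflict for such links; a processor may report that as an error or quietly pick the one that comes last in the stylesheet.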
> And also: I'll certainly need to filter the img and a tags,
> because I *only* want to allow the attributes I've talked about above.
> (OTOH this is already taken care of by the DTD, so maybe I just won't
> care.)


Create a more specific template for those, and rather than doing
apply-templates against @* (all attributes) do it only for the
attributes you want to copy through. Then do the apply against node() to
continue processing with the children.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Apr 9 '06 #3
Adrian von Bidder wrote:
> Mostly as a finger-exercise (and because I'm annoyed again and again at how
> bad the existing solutions are), I'm hacking up a web-based forum (yes,
> the 64832nd one, I know).
>
> I want to allow some simplified HTML as input language, and use some xsl
> to render that afterward. I do simple verification with a DTD (no
> problem in that), but the rendering into html is where I'm stuck
> currently.
>
> The articles are stored as:
> =====
> <article>
> <head>
> <author>squiggle</author>
> <title>Quack</title>
> <date>2006-02-26 15:07:12</date>
> </head>
> <body>Says the Duck.</body>

Don't do that. Put the text into at least something like <p></p>
Otherwise it will be impossible to infer it afterwards.

> </article>
> =====
>
> The body can be plain text, but mostly it should be simple HTML. I
> decided on the following feature set: <em>, <tt>, <p>, <a href="">,
> <img alt="" src=""> should be passed through unchanged. <title> should
> be changed to <h3>, and <blockquote cite="url"> should be changed to
> something like <div><blockquote cite="url">'the
> quote'</blockquote><div><a href="url">url</a></div></div> if a URL is
> present, or only the blockquote otherwise.
>
> The whole should be wrapped within some <div>'s, showing article title,
> author, date etc.
>
> Now I've written some small piece of XSL:
> =====
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> <xsl:output method="xml" omit-xml-declaration="yes"/>
> <xsl:strip-space elements="*"/>
>
> <!-- title becomes h3 -->
> <xsl:template match="title">
> <h3><xsl:apply-templates/></h3>
> </xsl:template>
>
> <!-- blockquote gets visible citation -->
> <xsl:template match="blockquote">

<blockquote cite="{@cite}">
....

> <xsl:element name="blockquote">
> <xsl:attribute name="cite">
> <xsl:value-of select="@cite"/>
> </xsl:attribute>
> <div class="text"><xsl:apply-templates/></div>
> <div class="cite">
> <xsl:element name="a">
> <xsl:attribute name="href">
> <xsl:value-of select="@cite"/>
> </xsl:attribute>
> <xsl:value-of select="@cite"/>
> </xsl:element>

<a href="{@cite}">

> </div>
> </xsl:element>
> </xsl:template>
>
> <!-- a, em, img, p, tt go through unchanged -->
> <xsl:template match="a|em|img|p|tt">
> <xsl:copy>
> <xsl:apply-templates select="@*|node()"/>
> </xsl:copy>
> </xsl:template>
>
> <!-- the complete head is stripped out -->
> <xsl:template match="head">
> </xsl:template>

Do that and it will never see author, title, or date.

> </xsl:stylesheet>
> =====
> And I'm right now extracting article titles etc. manually from the perl
> code, and calling the xsl only on the body element. But this is not
> really what I want... The point where I'm stuck is with the rules
> which part of the xsl is called at which point on which part of the xml
> file. And also: I'll certainly need to filter the img and a tags,

If you think that tags are the same as elements, you may already be
in trouble. See http://xml.silmaril.ie/authors/makeup/

> because I *only* want to allow the attributes I've talked about above.
> (OTOH this is already taken care of by the DTD, so maybe I just won't
> care.)


You're doing well, but you have to get the data model right first.
An XSLT processor works on the document as a tree of nodes and triggers
a matching template (if any) for each node it is asked to process,
starting at the root node and going as deep as it can before surfacing
for air and going down again.
In your example this means the element nodes will be presented for
matching in the order article, head, author, title, date; and then
body, etc. If you explicitly specify no actions for a template,
then nothing further in any node that matches it will ever get
processed, unless it is referenced from some other template.
If no matching template is found for a node, its contents are
in effect fed back through the works, and the process continues,
acting on templates or passing down to the next level, until
nothing is left but text nodes (character data), which are
output as-is.
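
The "fed back through the works" behaviour comes from the built-in template rules of XSLT 1.0. Written out as ordinary templates they are roughly equivalent to the sketch below; you never need to include them yourself, they are shown only to make the default behaviour visible.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root node and elements: emit nothing, just descend into children -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: emit their string value -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: emit nothing -->
  <xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
```

This is why an element with no template of its own loses its tags but keeps its text, and why an empty template such as the one for head silences everything beneath it.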

[Actually there is more to it than that, but that'll do]
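
Putting the attribute value templates suggested above together with the original requirement (a visible citation only when a URL is present), the blockquote handling could be sketched like this. The class names are assumptions, and the test for a non-empty cite is one possible reading of "if a URL is present".

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>

  <!-- No usable cite: pass the quote through as a bare blockquote -->
  <xsl:template match="blockquote">
    <blockquote><xsl:apply-templates/></blockquote>
  </xsl:template>

  <!-- Non-empty cite: wrap the quote and add a visible link.
       The predicate gives this template the higher default priority -->
  <xsl:template match="blockquote[@cite != '']">
    <div class="quote">
      <!-- {@cite} is an attribute value template, evaluated per node -->
      <blockquote cite="{@cite}"><xsl:apply-templates/></blockquote>
      <div class="cite">
        <a href="{@cite}"><xsl:value-of select="@cite"/></a>
      </div>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

The curly-brace form replaces the longer xsl:element and xsl:attribute construction in the quoted stylesheet and produces the same attributes.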

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Apr 9 '06 #4
[...]

Martin, Joe, Peter - thanks for your input. I'll have a go at this when
I get time and come back with the next step in due time... :-)

(Right now: does the term 'cargo cult programming' mean anything to you?
I freely admit that I'm at that stage where XSLT is concerned ;-)

cheers
-- vbi

--
get my gpg key here: http://fortytwo.ch/gpg/92082481

Apr 11 '06 #5
