Quoting literals properly

Hi,

Consider the following XML document:

<article>
This is a sample <literal>document</literal>.
Some <literal>words</literal>, from some reason, are tagged with the
<literal>literal</literal> tag.
</article>

I'm using the following XSL to output html, where each literal is
surrounded by double quotes:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html>
<xsl:apply-templates/>
</html>
</xsl:template>

<xsl:template match="literal">
<xsl:text>&quot;</xsl:text>
<xsl:apply-templates/>
<xsl:text>&quot;</xsl:text>
</xsl:template>

</xsl:stylesheet>

The result is the following:
<html>
This is a sample "document". Some "words", from some reason, are tagged
with the "literal" tag.
</html>

Problem is that this output doesn't obey the (somewhat strange) rules
of American English. The proper way to quote those literals would be:

This is a sample "document." Some "words," from some reason, are tagged
with the "literal" tag.
Notice that periods and commas that follow a literal end up being
inside the quotes.

My question is: Can anyone help me write an XSL that would do this
quoting job correctly?

Thanks!

Yoni

Feb 15 '06 #1
> My question is: Can anyone help me writing an XSL that would do this
> quoting job correctly?


See recent discussion. XSLT is tuned much more for operating on nodes as
units than munging the contents of nodes. You can do it, but it isn't
going to be very pretty.

Might be easier to do what you're doing now, then run the output through
an appropriate sed script.

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 16 '06 #2
yoni wrote:
> Problem is that this output doesn't obey the (somewhat strange) rules
> of American English. The proper way to quote those literals would be:
>
> This is a sample "document." Some "words," from some reason, are tagged
> with the "literal" tag.
> Notice that periods and commas that follow a literal end up being
> inside the quotes.


Then stop using this bizarre American (?) construction and use the
Queen's English. It's right, it's rational, it's defensible by citation
and it's much easier to implement.

http://en.wikipedia.org/wiki/Wikiped...uotation_marks

"When punctuating quoted passages include the mark of punctuation
inside the quotation marks only if the sense of the mark of punctuation
is part of the quotation. This is the style used in Australia, New
Zealand, and Britain, for example."

Feb 16 '06 #3
yoni wrote:
> Hi,
>
> Consider the following XML document:
>
> <article>
> This is a sample <literal>document</literal>.
> Some <literal>words</literal>, from some reason, are tagged with the
> <literal>literal</literal> tag.
> </article>

An excellent, if possibly unintentional, example of tag abuse.

<literal> (at least in DocBook) is for identifying strings which must
be used character-for-character as they are shown, despite the possible
ambiguity of the surrounding text. They are normally displayed in a
monospace (typewriter) font to ensure that they are visually distinct
and that there is no l/1/I or 0/O misinterpretation.

DocBook (again) has the <wordasword> element type, which is probably
closer to what your example intends.

> I'm using the following XSL to output html, where each literal is
> surrounded by double quotes:
>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>
> <xsl:template match="/">
> <html>
> <xsl:apply-templates/>
> </html>
> </xsl:template>
>
> <xsl:template match="literal">
> <xsl:text>&quot;</xsl:text>
> <xsl:apply-templates/>
> <xsl:text>&quot;</xsl:text>
> </xsl:template>

Except that that will output typewriter-style unidirectional quotes.
Use &#x2018; and &#x2019; if you want typographic ("curly") single
quotes, or &#x201C; and &#x201D; for double ones.

> </xsl:stylesheet>
>
> The result is the following:
> <html>
> This is a sample "document". Some "words", from some reason, are tagged
> with the "literal" tag.
> </html>
>
> Problem is that this output doesn't obey the (somewhat strange) rules
> of American English. The proper way to quote those literals would be:
>
> This is a sample "document." Some "words," from some reason, are tagged
> with the "literal" tag.

Yet another good reason why the MLA is wrong :-) It's incredible
that supposedly intelligent people can continue to peddle this
canard year after year. Sadly, there seems to be no way to stop
them.

> Notice that periods and commas that follow a literal end up being
> inside the quotes.

Which is why using the right markup is actually quite important.

> My question is: Can anyone help me write an XSL that would do this
> quoting job correctly?


Ewww. Gag. Spit. :-) Sure, no problem...

<?xml version="1.0" encoding="iso-8859-1" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html>
<xsl:apply-templates/>
</html>
</xsl:template>

<xsl:template match="literal">
<xsl:text>&#x2018;</xsl:text>
<xsl:apply-templates/>
<xsl:choose>
<!-- add more tests for other punctuation in the first parens -->
<xsl:when
test="(substring(following-sibling::text()[1],1,1)='.' or
substring(following-sibling::text()[1],1,1)=',') and
generate-id(following-sibling::node()[1])=
generate-id(following-sibling::text()[1])">
<xsl:value-of
select="substring(following-sibling::text()[1],1,1)"/>
<xsl:text>&#x2019;</xsl:text>
<xsl:apply-templates
select="following-sibling::text()[1]" mode="mla"/>
</xsl:when>
<xsl:otherwise>
<xsl:text>&#x2019;</xsl:text>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

<!-- duplicate the test for any additional punctuation here also -->
<xsl:template
match="text()[substring(.,1,1)='.' or substring(.,1,1)=',']
[name(preceding-sibling::node()[1])='literal']"/>

<xsl:template match="text()" mode="mla">
<xsl:value-of select="substring(.,2)"/>
</xsl:template>

</xsl:stylesheet>
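
If keeping the punctuation test in two places becomes a nuisance, one possible restructuring of the same idea is sketched below. It is not the stylesheet above, only a variant under assumptions: it targets XSLT 1.0, and the global variable name `movable` is made up for illustration. The literal template emits the first character of the immediately following text node when that character is in the list, and a second template strips that character from the text node.

```xml
<?xml version="1.0" encoding="iso-8859-1" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Hypothetical name: every character listed here moves inside the quotes -->
<xsl:variable name="movable" select="'.,'"/>

<xsl:template match="/">
<html>
<xsl:apply-templates/>
</html>
</xsl:template>

<xsl:template match="literal">
<!-- First character of the text node right after this element, or '' -->
<xsl:variable name="first"
select="substring(following-sibling::node()[1][self::text()],1,1)"/>
<xsl:text>&#x2018;</xsl:text>
<xsl:apply-templates/>
<!-- contains() is true for an empty string, hence the extra guard -->
<xsl:if test="$first != '' and contains($movable, $first)">
<xsl:value-of select="$first"/>
</xsl:if>
<xsl:text>&#x2019;</xsl:text>
</xsl:template>

<!-- Text right after a literal: drop the character already emitted above -->
<xsl:template match="text()[preceding-sibling::node()[1][self::literal]]">
<xsl:variable name="first" select="substring(.,1,1)"/>
<xsl:choose>
<xsl:when test="$first != '' and contains($movable, $first)">
<xsl:value-of select="substring(.,2)"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="."/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>
```

With the sample article, the first two literals should come out as ‘document.’ and ‘words,’ while the third is left alone because a space follows it. Adding another punctuation character then means editing only the `movable` string.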

///Peter
--
XML FAQ: http://xml.silmaril.ie/
Feb 16 '06 #4
Peter Flynn wrote:
> Ewww. Gag. Spit. :-) Sure, no problem...


Good job of typing while holding your nose... <smile/>

Of course this solution doesn't handle ellipses properly. I suppose
another case could be added to handle that; "it's just a Simple Matter
Of Programming".

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 17 '06 #5
Thanks for the solution, Peter. It works great!
> An excellent, if possibly unintentional, example of tag abuse.
>
> <literal> (at least in DocBook) is for identifying strings which must
> be used character-for-character as they are shown, despite the possible
> ambiguity of the surrounding text. They are normally displayed in a
> monospace (typewriter) font to ensure that they are visually distinct
> and that there is no l/1/I or 0/O misinterpretation.
>
> DocBook (again) has the <wordasword> element type, which is probably
> closer to what your example intends.


My example document is probably not the best. The article documents
that I'm working on are used for an on-line help for a web application.
The words that are marked as literals would actually be UI elements
such as button names and screen names.
Initially, we had them displayed in bold, later it was decided to use
quotes instead, and that's how I came upon this problem.

Thanks again,

Yoni

Feb 17 '06 #6
yoni wrote:
> Thanks for the solution, Peter. It works great!
>
>> An excellent, if possibly unintentional, example of tag abuse.
>>
>> <literal> (at least in DocBook) is for identifying strings which must
>> be used character-for-character as they are shown, despite the possible
>> ambiguity of the surrounding text. They are normally displayed in a
>> monospace (typewriter) font to ensure that they are visually distinct
>> and that there is no l/1/I or 0/O misinterpretation.
>>
>> DocBook (again) has the <wordasword> element type, which is probably
>> closer to what your example intends.
>
> My example document is probably not the best. The article documents
> that I'm working on are used for an on-line help for a web application.
> The words that are marked as literals would actually be UI elements
> such as button names and screen names.
> Initially, we had them displayed in bold, later it was decided to use
> quotes instead, and that's how I came upon this problem.


For that application, I'd definitely say "Ignore the rules, or go back
to bold -- or, better, actually render the buttons as buttons."

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Feb 17 '06 #7
Ignoring the rules is not an option in this case. Mainly because the
tech writer will kill me, claiming that people will think that she
doesn't know the rules...
Personally, I thought the bolding looked good, but our graphic designer
claims that it looks too much like the headers.
But hey, the code that Peter sent worked great. And I also learned a
couple of new XSL tricks.

Feb 17 '06 #8
Joe Kesselman wrote:
> Peter Flynn wrote:
>> Ewww. Gag. Spit. :-) Sure, no problem...
> Good job of typing while holding your nose... <smile/>

A double brandy solved the problem.

> Of course this solution doesn't handle ellipses properly. I suppose
> another case could be added to handle that; "it's just a Simple Matter
> Of Programming".


Hah! Depends whether the user has inserted a UTF-8 horizontal ellipsis
character (quâ character or &hellip; or &#x02026;) or if they've
just typed dot dot dot into a dumb editor. And I'm not sure if even the
MLA require "word..." rather than "word"..., although I wouldn't put it
past them.
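
As a sketch of how the ellipsis case might be added, assuming (and it is only an assumption, given the doubt above) that the house style wants the ellipsis inside the quotes too: a named template, here called `leading-punct` (a hypothetical name), returns whichever run of punctuation starts a string. It checks for three typed dots and for the U+2026 character before falling back to a single period or comma.

```xml
<?xml version="1.0" encoding="iso-8859-1" ?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html>
<xsl:apply-templates/>
</html>
</xsl:template>

<!-- Hypothetical helper: the leading punctuation of $s, or nothing.
Longer matches are tested first so '...' is not taken for '.' -->
<xsl:template name="leading-punct">
<xsl:param name="s"/>
<xsl:choose>
<xsl:when test="starts-with($s,'...')"><xsl:text>...</xsl:text></xsl:when>
<xsl:when test="starts-with($s,'&#x2026;')"><xsl:text>&#x2026;</xsl:text></xsl:when>
<xsl:when test="starts-with($s,'.')"><xsl:text>.</xsl:text></xsl:when>
<xsl:when test="starts-with($s,',')"><xsl:text>,</xsl:text></xsl:when>
</xsl:choose>
</xsl:template>

<xsl:template match="literal">
<!-- Punctuation at the start of the immediately following text node -->
<xsl:variable name="punct">
<xsl:call-template name="leading-punct">
<xsl:with-param name="s"
select="following-sibling::node()[1][self::text()]"/>
</xsl:call-template>
</xsl:variable>
<xsl:text>&#x2018;</xsl:text>
<xsl:apply-templates/>
<xsl:value-of select="$punct"/>
<xsl:text>&#x2019;</xsl:text>
</xsl:template>

<!-- Text right after a literal: skip whatever was moved inside the quotes -->
<xsl:template match="text()[preceding-sibling::node()[1][self::literal]]">
<xsl:variable name="punct">
<xsl:call-template name="leading-punct">
<xsl:with-param name="s" select="."/>
</xsl:call-template>
</xsl:variable>
<xsl:value-of select="substring(., string-length($punct) + 1)"/>
</xsl:template>

</xsl:stylesheet>
```

Whether the dots were typed or entered as a single character, the text template removes exactly as many characters as the literal template emitted, because both call the same helper.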

///Peter
Feb 17 '06 #9
yoni wrote:
> Thanks for the solution, Peter. It works great!
>> An excellent, if possibly unintentional, example of tag abuse.
>>
>> <literal> (at least in DocBook) is for identifying strings which must
>> be used character-for-character as they are shown, despite the possible
>> ambiguity of the surrounding text. They are normally displayed in a
>> monospace (typewriter) font to ensure that they are visually distinct
>> and that there is no l/1/I or 0/O misinterpretation.
>>
>> DocBook (again) has the <wordasword> element type, which is probably
>> closer to what your example intends.
> My example document is probably not the best. The article documents
> that I'm working on are used for an on-line help for a web application.

All the more reason to use markup designed for the purpose.

> The words that are marked as literals would actually be UI elements
> such as button names and screen names.

What vocabulary are you using? I'd be interested to know if there was
a specific reason you didn't pick DocBook for the job.

> Initially, we had them displayed in bold, later it was decided to use
> quotes instead, and that's how I came upon this problem.


Interesting...I wonder if the MLA insist on the period after a bold
keyword being a bold period? :-)

///Peter
Feb 17 '06 #10
Joe Kesselman wrote:
> For that application, I'd definitely say "Ignore the rules, or go back
> to bold -- or, better, actually render the buttons as buttons."


I did exactly that for a book once. Spent a while doing a neat LaTeX
macro so that DocBook <keycap>x</keycap> became \keycap{x} and made
a little curved-sided key, shaded in light and dark grey, with an
italic x on top just where the Mac keyboard had it. Publisher didn't
believe it was automated and asked me to confirm that I had got the
"copyright" agreement over the "image" I had used :-) And then the
printer screwed it all up by over-inking on poor paper so they all
came out as muddy blotches :-)

*Never* trust a publisher.

///Peter

Feb 17 '06 #11
yoni wrote:
> Ignoring the rules is not an option in this case. Mainly because the
> tech writer will kill me, claiming that people will think that she
> doesn't know the rules...

Yes, that's a big problem. Maybe ask her if you can jointly write a
disclaimer for the foreword, explaining why the MLA is wrong :-)

> Personally, I thought the bolding looked good, but our graphic designer
> claims that it looks too much like the headers.


Medium sans can be good for this, if the body copy is in a conventional
serif typeface.

///Peter
Feb 17 '06 #12
> What vocabulary are you using? I'd be interested to know if there was
> a specific reason you didn't pick DocBook for the job.


I actually did not know of DocBook before. I had to build an on-line
help framework for a new web application, so I figured that XML would be
a good choice. I have 3 types of documents: 'glossary', 'FAQ' and 'help
articles'. The first 2 are quite straightforward: the glossary is a
list of terms, each with a name and a definition, and the FAQ is a list
of questions and answers.
For the articles, the vocabulary is based on HTML documents that I got
from the tech writer. Looking at the class names she was using and
comparing it to DocBook, I do see some resemblance. I suspect that the
previous company she was working for did have their schema based on
DocBook.

Feb 18 '06 #13
