Re: XHTML 1.0 Strict and the Apostrophe

Jukka K. Korpela wrote:
Scripsit Andy Dingley:
>As to the difference between ' or ’, we had a long thread on
this fairly recently (few months), centred on the fact that "single
quote" and "apostrophe" are really not very clearly defined as
distinct in the available character sets, even Unicode.

I don't know who these "we" are, but the references denote distinct
characters without doubt, and the only confusion is around the
unfortunate _names_. The _Unicode name_ of the Ascii apostrophe, ' (or
'), is APOSTROPHE, but that's just a name, an identifier, and not
descriptive of meaning (actually, it's misleading, but Unicode names
will never be changed).
>With much less consensus, the general outcome was that you can
reasonably use whichever you like, neither is ever "wrong" (except
that ‛ should be paired with ’, but not with ') and
that you'd quite reliably get a visually different glyph for each,
either straight or curly. Apart from that, there's no hard-and-fast
rule ' is only ever an "apostrophe" and never a "quote".

Sorry, but that paragraph has far too much confusion to be analyzed.

Here's the picture:

The ASCII apostrophe ' works fairly universally in text, but it's almost
never the _right_ character for anything, except in computer languages.
Consider it as a poor man's excuse for a surrogate for a large
collection of characters. Use it that way if you are lazy or have made
an informed decision (a compromise), but don't you ever be proud of
that.
IBTD. For example, in English it is customary (and AIUI expected) to use
the character that &rsquo; represents to delimit a quotation
within direct speech (which itself should be delimited by &ldquo; and
&rdquo;). (I gathered that from reading several English books.)

I think you would agree that it would make especially English text with
quotations in direct speech (say, in a novel where one person tells another
what a third said) quite badly legible if somewhere there is an apostrophe
represented by ’ in the inner quotation, because you would have to
look very hard at the character and the context to see whether the inner
quotation ends or there is just an apostrophe in it. (BTDT, but YMMV if you
are a speaker of English as first language.)

Since apostrophes appear to occur quite often in English texts, I have
therefore decided that in my English texts, ' (the straight apostrophe,
&apos; or ') is the appropriate character for all apostrophes as it
is clearly distinguishable from "the curly one" using the standard fonts
provided by common UIs. If you want to call that a compromise -- I call
it an informed design decision in support of usability (that should have
been made by the Unicode people instead if what you say below is correct).

To be proud about that is yet another thing. But what reasonable
alternative to the aforementioned approach would you suggest instead?
For other characters, consult the applicable language and style guides
(for _human_ languages).

Note that ’ _should_ have a curly (curved) glyph but it's similar
to a prime (yard symbol) in some fonts. It is explicitly recommended as
punctuation apostrophe in the Unicode standard, and the standard also
explicitly says that it is the same character as the right single
quotation mark.
So it would seem that the standard recommends nonsense, or at least
something not universally applicable, here.
PointedEars
--
Use any version of Microsoft Frontpage to create your site.
(This won't prevent people from viewing your source, but no one
will want to steal it.)
-- from <http://www.vortex-webdesign.com/help/hidesource.htm>
Jun 27 '08 #1
Scripsit Thomas 'PointedEars' Lahn:
I think you would agree that it would make especially English text
with quotations in direct speech (say, in a novel where one person
tells another what a third said) quite badly legible if somewhere
there is an apostrophe represented by ’ in the inner quotation,
No I wouldn't. Such usage is _standard_ English, to the extent anything
is standard in English. Consult the applicable style guide and then the
Unicode Standard, which identifies the punctuation marks at the level of
coded characters.
Since apostrophes appear to occur quite often in English texts, I have
therefore decided that in my English texts, ' (the straight
apostrophe, &apos; or ') is the appropriate character for all
apostrophes
That's computerese or typewriterese - abhorred, disliked, and frowned
upon by typographers and grammars.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Jun 27 '08 #2
On 2008-04-14, Jukka K. Korpela <jk******@cs.tut.fi> wrote:
Scripsit Thomas 'PointedEars' Lahn:
>I think you would agree that it would make especially English text
with quotations in direct speech (say, in a novel where one person
tells another what a third said) quite badly legible if somewhere
there is an apostrophe represented by ’ in the inner quotation,

No I wouldn't. Such usage is _standard_ English, to the extent anything
is standard in English. Consult the applicable style guide and then the
Unicode Standard, which identifies the punctuation marks at the level of
coded characters.
>Since apostrophes appear to occur quite often in English texts, I have
therefore decided that in my English texts, ' (the straight
apostrophe, &apos; or ') is the appropriate character for all
apostrophes

That's computerese or typewriterese - abhorred, disliked, and frowned
upon by typographers and grammars.
Style guides and grammar books just tell you when to use apostrophes.
They don't say anything about whether you should use U+0027 or U+2019 to
represent them.

PointedEars is right that using U+2019 to write an apostrophe is
obviously illogical, although I don't agree that it causes any real
ambiguity or legibility problems for human readers.

But do you know _why_ the Unicode Standard recommends using U+2019?

U+0027 is called the "apostrophe" but then the description says "neutral
(vertical) glyph with mixed usage" (whatever that's supposed to mean-- I
thought we were talking about a character not a glyph) and then goes on
about how the wonderful U+2019 is preferred for practically everything.

Typographers may be right that a curlier glyph looks better, but then why
not just map the curlier glyph to both U+2019 and U+0027 in the font?

I don't understand the case against U+0027.
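
For reference, the names and general categories of the two code points discussed above can be looked up programmatically. This is a minimal sketch, assuming Python 3 and its standard unicodedata module; the helper name describe is hypothetical and only used for illustration.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, Unicode name, general category) for one character.

    `describe` is a hypothetical helper, not part of any library.
    """
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


# The two characters under discussion in this thread.
for ch in ("\u0027", "\u2019"):
    print(*describe(ch))
```

The category codes only say how the Unicode Character Database classifies each character (Po is "other punctuation", Pf is "final quote punctuation"); they do not settle which one a writer should use for an apostrophe.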
Jun 27 '08 #3
Ben C <sp******@spam.eggs> writes:
Style guides and grammar books just tell you when to use apostrophes.
They don't say anything about whether you should use U+0027 or U+2019 to
represent them.
Pardon? Style guides *can* and sometimes actually just *do* that.
U+0027 is called the "apostrophe"
For starters, Unicode names have no semantics (they cannot even be
changed if considered ambiguous or even outright wrong later).
Typographers may be right that a curlier glyph looks better, but then why
not just map the curlier glyph to both U+2019 and U+0027 in the font?
That would be funny, but not very practical. Have you ever copied and
pasted programming code from one of those auto-smart-quoting comment
systems on techno web logs, by the way?
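
The copy-and-paste problem mentioned above is usually worked around by mapping the typographic quotes back to ASCII before compiling. A rough sketch, assuming Python 3; the table and function names are hypothetical, and the mapping covers only the four common curly quotes:

```python
# Map typographic quotes back to the ASCII characters that compilers expect.
# Only the four common "smart" quotes are handled; this is an assumption,
# not an exhaustive list of what a blog engine might substitute.
STRAIGHTEN = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201C": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201D": '"',   # RIGHT DOUBLE QUOTATION MARK
})


def straighten_quotes(code: str) -> str:
    """Return `code` with curly quotes replaced by their ASCII counterparts."""
    return code.translate(STRAIGHTEN)


# A line of C as it might look after a comment system "improved" it.
pasted = "char c = \u2018x\u2019; puts(\u201Chi\u201D);"
print(straighten_quotes(pasted))
```

This is only safe for program text; applied to prose it would throw away exactly the distinction the thread is arguing about.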
--
||| hexadecimal EBB
o-o decimal 3771
--oOo--( )--oOo-- octal 7273
205 goodbye binary 111010111011
Jun 27 '08 #4
Thomas 'PointedEars' Lahn wrote:
I think you would agree that it would make especially English text with
quotations in direct speech (say, in a novel where one person tells another
what a third said) quite badly legible if somewhere there is an apostrophe
represented by ’ in the inner quotation, because you would have to
look very hard at the character and the context to see whether the inner
quotation ends or there is just an apostrophe in it. (BTDT, but YMMV if you
are a speaker of English as first language.)
I think that's an exaggeration. Except in rare cases where the
apostrophe is at the end of a word it's quite easy to distinguish them
from closing single quotes, which are always at the end of a word or
after a punctuation mark.
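
The positional rule described above can be sketched as a rough heuristic: a U+2019 that is directly followed by a letter is taken to be an apostrophe, and anything else is left as ambiguous (a closing quote or the rare word-final apostrophe). This assumes Python 3; the function name and labels are hypothetical.

```python
def classify_rsquo(text: str) -> list[tuple[int, str]]:
    """Label every U+2019 in `text` by what follows it.

    Heuristic only: "apostrophe" if the next character is a letter,
    otherwise "ambiguous" (closing quote or word-final apostrophe).
    """
    results = []
    for i, ch in enumerate(text):
        if ch != "\u2019":
            continue
        nxt = text[i + 1] if i + 1 < len(text) else ""
        kind = "apostrophe" if nxt.isalpha() else "ambiguous"
        results.append((i, kind))
    return results


sample = "\u2018He\u2019s losing it.\u2019"
for index, kind in classify_rsquo(sample):
    print(index, kind)
```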
>
Since apostrophes appear to occur quite often in English texts, I have
therefore decided that in my English texts, ' (the straight apostrophe,
&apos; or ') is the appropriate character for all apostrophes as it
is clearly distinguishable from "the curly one" using the standard fonts
provided by common UIs. If you want to call that a compromise -- I call
it an informed design decision in support of usability (that should have
been made by the Unicode people instead if what you say below is correct).
Like it or not, this isn't a problem that's newly sprung. Even before
computers, this was the convention in printed material (where the ugly
little ASCII apostrophe didn't exist--it was confined to the typewriter
and, later, to computer programming), and it didn't cause massive
difficulties. So there isn't a massive need to "fix" it now with a
mongrelization of two unrelated practices.
To be proud about that is yet another thing. But what reasonable
alternative to the aforementioned approach would you suggest instead?
The expected one, the familiar one, the one that's been in use for a
very long time.
>For other characters, consult the applicable language and style guides
(for _human_ languages).

Note that ’ _should_ have a curly (curved) glyph but it's similar
to a prime (yard symbol) in some fonts. It is explicitly recommended as
punctuation apostrophe in the Unicode standard, and the standard also
explicitly says that it is the same character as the right single
quotation mark.

So it would seem that the standard recommends nonsense,
No, it recommends existing mainstream practice.
or at least
something not universally applicable, here.
Jun 27 '08 #5
Ben C wrote:
On 2008-04-14, Jukka K. Korpela <jk******@cs.tut.fi> wrote:
>Scripsit Thomas 'PointedEars' Lahn:
>>I think you would agree that it would make especially English text
with quotations in direct speech (say, in a novel where one person
tells another what a third said) quite badly legible if somewhere
there is an apostrophe represented by ’ in the inner quotation,
No I wouldn't. Such usage is _standard_ English, to the extent anything
is standard in English. Consult the applicable style guide and then the
Unicode Standard, which identifies the punctuation marks at the level of
coded characters.
>>Since apostrophes appear to occur quite often in English texts, I have
therefore decided that in my English texts, ' (the straight
apostrophe, &apos; or ') is the appropriate character for all
apostrophes
That's computerese or typewriterese - abhorred, disliked, and frowned
upon by typographers and grammars.

Style guides and grammar books just tell you when to use apostrophes.
They don't say anything about whether you should use U+0027 or U+2019 to
represent them.

PointedEars is right that using U+2019 to write an apostrophe is
obviously illogical, although I don't agree that it causes any real
ambiguity or legibility problems for human readers.
It's "illogical" in the semantic sense but since the single closing
quote and the apostrophe are assigned the same appearance by convention
it isn't any more illogical than using the Unicode exclamation point for
factorials, instead of setting off a separate code point for the
factorial mark so that some day a wacko type designer can design a font
in which the factorial symbol looks different from the exclamation point.
But do you know _why_ the Unicode Standard recommends using U+2019?

U+0027 is called the "apostrophe"
Well, that's what it was called when it was the typewriter apostrophe
and there were no curly quotes to be seen anywhere--and their use as
single quotes was infrequent.
but then the description says "neutral
(vertical) glyph with mixed usage" (whatever that's supposed to mean-- I
thought we were talking about a character not a glyph) and then goes on
about how the wonderful U+2019 is preferred for practically everything.

Typographers may be right that a curlier glyph looks better, but then why
not just map the curlier glyph to both U+2019 and U+0027 in the font?
Because we don't want no curly apostrophes in our stinkin' C++.
I don't understand the case against U+0027.
Jun 27 '08 #6
Harlan Messinger <hm*******************@comcast.net> writes:
[...] it isn't any more illogical than using the Unicode
exclamation point for factorials, instead of setting off a separate
code point for the factorial mark so that some day a wacko type
designer can design a font in which the factorial symbol looks
different from the exclamation point.
:)

Still, there’s a lot left to be desired in digital typography. The next
best practical enemy of good taste I can think of is the hyphen; U+002D
is as bad a substitute for it as U+0027 is for a proper apostrophe (this
would become apparent with typefaces that feature a distinctive canted
hyphen; I don’t personally care much for proper minus signs, though, I’m
not really geek enough to know what to look for ;).

I don’t think it is safe to use it on the web, but it has been quite
a while since I checked.

--
||| hexadecimal EBB
o-o decimal 3771
--oOo--( )--oOo-- octal 7273
205 goodbye binary 111010111011
Jun 27 '08 #7
On 2008-04-14, Harlan Messinger <hm*******************@comcast.net> wrote:
Ben C wrote:
[...]
>PointedEars is right that using U+2019 to write an apostrophe is
obviously illogical, although I don't agree that it causes any real
ambiguity or legibility problems for human readers.

It's "illogical" in the semantic sense but since the single closing
quote and the apostrophe are assigned the same appearance by convention
it isn't any more illogical than using the Unicode exclamation point for
factorials, instead of setting off a separate code point for the
factorial mark so that some day a wacko type designer can design a font
in which the factorial symbol looks different from the exclamation point.
Well not quite the same, because there isn't a separate factorial code
point.

Suppose there were. Then it would be like being told that in spite of
that we were supposed to use the exclamation mark even for factorials
and to ignore the despicable factorial code point altogether.
>But do you know _why_ the Unicode Standard recommends using U+2019?

U+0027 is called the "apostrophe"

Well, that's what it was called when it was the typewriter apostrophe
and there were no curly quotes to be seen anywhere--and their use as
single quotes was infrequent.
I think Bednarz may have a hint at the true explanation when he said
something about how the names cannot ever be changed.
>but then the description says "neutral
(vertical) glyph with mixed usage" (whatever that's supposed to mean-- I
thought we were talking about a character not a glyph) and then goes on
about how the wonderful U+2019 is preferred for practically everything.

Typographers may be right that a curlier glyph looks better, but then why
not just map the curlier glyph to both U+2019 and U+0027 in the font?

Because we don't want no curly apostrophes in our stinkin' C++.
Then you'd just use a horrible font for your stinkin' C++ in which they
appeared as nasty abhorrent typewriterized neutral vertical glyphs.

Anyway there are no apostrophes in C++, only single quotes, for which
you use apostrophes.
Jun 27 '08 #8
Jukka K. Korpela wrote:
Scripsit Thomas 'PointedEars' Lahn:
>I think you would agree that it would make especially English text
with quotations in direct speech (say, in a novel where one person
tells another what a third said) quite badly legible if somewhere
there is an apostrophe represented by ’ in the inner quotation,

No I wouldn't. Such usage is _standard_ English, to the extent anything
is standard in English. Consult the applicable style guide and then the
Unicode Standard, which identifies the punctuation marks at the level of
coded characters.
Well, compare

| Paul took two deep breaths. “She said a thing.” He closed his eyes,
| calling up the words, and when he spoke his voice unconsciously took on
| some of the old woman's tone: “ ‘You, Paul Atreides, descendant of kings,
| son of a Duke, you must learn to rule. It's something none of your
| ancestors learned.’ ” Paul opened his eyes, said: “That made me angry and
| I said my father rules an entire planet. And she said, ‘He's losing it.’
| And I said my father was getting a richer planet. And she said, ‘He'll
| lose that one, too.’ And I wanted to run and warn my father, but she said
| he'd already been warned—by you, by Mother, by many people.”
(from: Frank Herbert, Dune, book 1, chapter 4)

against

| Paul took two deep breaths. “She said a thing.” He closed his eyes,
| calling up the words, and when he spoke his voice unconsciously took on
| some of the old woman’s tone: “ ‘You, Paul Atreides, descendant of kings,
| son of a Duke, you must learn to rule. It’s something none of your
| ancestors learned.’ ” Paul opened his eyes, said: “That made me angry and
| I said my father rules an entire planet. And she said, ‘He’s losing it.’
| And I said my father was getting a richer planet. And she said, ‘He’ll
| lose that one, too.’ And I wanted to run and warn my father, but she said
| he’d already been warned—by you, by Mother, by many people.”

Which one do you consider more legible?
>Since apostrophes appear to occur quite often in English texts, I have
therefore decided that in my English texts, ' (the straight
apostrophe, &apos; or ') is the appropriate character for all
apostrophes

That's computerese or typewriterese - abhorred, disliked, and frowned
upon by typographers and grammars.
IBTD. At least as for regular grammars, having the straight apostrophe only
as the apostrophe and ’ only for closing single quote makes it a lot
easier to parse the text.
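
The parsing argument can be illustrated with a small sketch, assuming Python 3 and a deliberately naive regular expression (the pattern and function name are hypothetical): an inner quotation is taken to be everything between U+2018 and the next U+2019.

```python
import re

# Naive pattern: U+2018, then anything that is not a single curly quote,
# then U+2019. It is an illustrative assumption, not a full quotation parser.
INNER_QUOTE = re.compile("\u2018([^\u2018\u2019]*)\u2019")


def inner_quotes(text: str) -> list[str]:
    """Return the contents of all inner quotations found by the naive pattern."""
    return INNER_QUOTE.findall(text)


straight = "And she said, \u2018He's losing it.\u2019"       # apostrophe is U+0027
curly = "And she said, \u2018He\u2019s losing it.\u2019"     # apostrophe is U+2019

print(inner_quotes(straight))
print(inner_quotes(curly))
```

With the straight apostrophe the pattern captures the whole inner quotation; with the curly apostrophe it stops at the first U+2019 and captures only "He". That shows the difficulty for a simple regular grammar, not for a human reader.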
PointedEars
--
realism: HTML 4.01 Strict
evangelism: XHTML 1.0 Strict
madness: XHTML 1.1 as application/xhtml+xml
-- Bjoern Hoehrmann
Jun 27 '08 #9
Scripsit Thomas 'PointedEars' Lahn:
>No I wouldn't. Such usage is _standard_ English, to the extent
anything is standard in English. Consult the applicable style guide
and then the Unicode Standard, which identifies the punctuation
marks at the level of coded characters.

Well, compare
Which style guide did you consult?
(from: Frank Herbert, Dune, book 1, chapter 4)

against
Which style was used in the printed book? I haven't read it, but I think
I know the answer.
IBTD. At least as for regular grammars, having the straight
apostrophe only as the apostrophe and ’ only for closing single
quote makes it a lot easier to parse the text.
And using "." only as a full stop and never as a decimal point or an
abbreviation point would make parsing even easier. But that's
completely irrelevant here. It's not feasible to resolve ambiguities
that way, especially since the world around won't listen to your
rationalizing arguments.

The only relevant thing in HTML perspective is that &apos; (when
implemented at all) denotes the Ascii apostrophe and - against common
superstition of unknown origin - not the typographically and
orthographically correct apostrophe of English and other human
languages. This entity reference is best forgotten: it's almost never
needed, and should you need it, the character reference is much safer.
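
What each reference resolves to can be checked with a short sketch, assuming Python 3 and its standard html module, which decodes references according to the HTML5 entity table (not necessarily what a 2008 browser implemented). The helper name resolve is hypothetical.

```python
import html


def resolve(reference: str) -> str:
    """Decode one character or entity reference and return its code point."""
    ch = html.unescape(reference)
    return f"U+{ord(ch):04X}"


# Entity and numeric references for the two characters in question.
for ref in ("&apos;", "&#39;", "&rsquo;", "&#8217;"):
    print(ref, "->", resolve(ref))
```

Both &apos; and &#39; decode to U+0027, while &rsquo; and &#8217; decode to U+2019, which matches the point that &apos; is not the typographic apostrophe.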

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Jun 27 '08 #10
