Markup, Punctuation and Text-to-Speech

Lachlan Hunt
#1: Jul 24 '05
Hi,
I have recently downloaded and experimented with IBM HPR 3.0, and
Opera 8 with text-to-speech, and have come to realise some fairly
annoying issues regarding punctuation marks.

I've found that when a punctuation mark occurs directly after an
element, both HPR and Opera 8 will read the punctuation mark.

For example, the following:
<p><abbr title="...">HTML</abbr> is an application of
<abbr title="...">SGML</abbr>. However, ...</p>
<p>HTML should be served as <code>text/html</code>.</p>

Will be spoken as:
"HTML is an application of SGML dot However..."
"HTML should be served as text/html dot"

The same result occurs no matter which elements or punctuation marks are
used (except that it speaks "comma" for ",", etc.).

The only solution I can think of to help aural browsers read the
sentence correctly, without unnecessarily speaking the punctuation mark,
is to include it within the element like this:
<abbr title="...">SGML.</abbr>
<code>text/html.</code>

However, I usually don't include punctuation within elements like that
because it's not usually part of the abbreviation or code itself, but
rather part of the sentence as a whole, and it doesn't seem semantically
correct to do so.

So, from a semantic point of view and from an accessibility point of
view, is it better to always include any punctuation within the element
regardless, or is there another way to inform aural browsers of the
correct way to read it, perhaps with stylesheets?

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Tim
#2: Jul 24 '05


On Sun, 10 Apr 2005 16:51:07 +1000,
Lachlan Hunt <spam.my.gspot@gmail.com> posted:

> I've found that when a punctuation mark occurs directly after an
> element, both HPR and Opera 8 will read the punctuation mark.
>
> The only solution I can think of to help aural browsers read the
> sentence correctly, without unnecessarily speaking the punctuation mark,
> is to include it within the element like this:
> <abbr title="...">SGML.</abbr>
> <code>text/html.</code>

As you point out, it doesn't belong there. Don't put it there.

> However, I usually don't include punctuation within elements like that
> because it's not usually part of the abbreviation or code itself, but
> rather part of the sentence as a whole, and it doesn't seem semantically
> correct to do so.

Correct. Write things correctly. Let faulty software behave faultily.
You'll go mad trying to pander to all of the faults in all of the
browsers (there's masses of them, faults and browsers), and you'll end up
with a page full of crap to accommodate them.

Write to the authors of the software, and point out this fault. They
mightn't have noticed. If they, then, tell you to munge the content to
suit their foibles, bluntly tell them that's wrong.

--
If you insist on e-mailing me, use the reply-to address (it's real but
temporary). But please reply to the group, like you're supposed to.

This message was sent without a virus, please delete some files yourself.

jake
#3: Jul 24 '05


In message
<4258ccd9$0$6671$5a62ac22@per-qv1-newsreader-01.iinet.net.au>, Lachlan
Hunt <spam.my.gspot@gmail.com> writes
>Hi,
> I have recently downloaded and experimented with IBM HPR 3.0, and
>Opera 8 with text-to-speech, and have come to realise some fairly
>annoying issues regarding punctuation marks.
>
>I've found that when a punctuation mark occurs directly after an
>element, both HPR and Opera 8 will read the punctuation mark.
>
>For example, the following:
> <p><abbr title="...">HTML</abbr> is an application of
> <abbr title="...">SGML</abbr>. However, ...</p>
> <p>HTML should be served as <code>text/html</code>.</p>
>
>Will be spoken as:
> "HTML is an application of SGML dot However..."
> "HTML should be served as text/html dot"
>
>The same result occurs no matter which elements or punctuation marks
>are used (except that it speaks "comma" for ",", etc.).
>
>The only solution I can think of to help aural browsers read the
>sentence correctly, without unnecessarily speaking the punctuation
>mark, is to include it within the element like this:
> <abbr title="...">SGML.</abbr>
> <code>text/html.</code>
>
>However, I usually don't include punctuation within elements like that
>because it's not usually part of the abbreviation or code itself, but
>rather part of the sentence as a whole, and it doesn't seem
>semantically correct to do so.
>
>So, from a semantic point of view and from an accessibility point of
>view, is it better to always include any punctuation within the element
>regardless, or is there another way to inform aural browsers of the
>correct way to read it, perhaps with stylesheets?
>
What do you gain by not including the punctuation mark within the range
of the element as you know it will be spoken if you don't?

Personally, I take the pragmatic approach of *always* including the
punctuation mark within the range of the element to prevent it being
spoken, as I prefer to make it as easy as possible on the listening
audience. Not quite 'correct'? Maybe. But I can live with that.

regards,
--
Jake

Alan J. Flavell
#4: Jul 24 '05


On Sun, 10 Apr 2005, jake wrote:

> > "HTML is an application of SGML dot However..."
> > "HTML should be served as text/html dot"

> What do you gain by not including the punctuation mark within the
> range of the element

The dot is not part of that piece of content, and therefore does not
belong inside the element. Most assuredly HTML should not be served
as "text/html.", and there would be a serious problem if anyone
attempted it. Also the abbreviation for SGML is not normally written
with a trailing dot - so the dot is *not* part of its abbreviation, in
fact it marks the end of the sentence, and so the dot should not be
inside the abbreviation markup. It's all part of the general rule to
mark content up consistently, rather than pander to the shortcomings
of one or other browser.

> as you know it will be spoken if you don't?

As it happens, I used IBM HPR earlier, and indeed noticed this
shortcoming - but I believe it would be wrong to let the markup be
significantly influenced by it. That way lies madness, and one day
HPR will be improved.

> Personally, I take the pragmatic approach of *always* including the
> punctuation mark within the range of the element to prevent it being
> spoken, as I prefer to make it as easy as possible on the listening
> audience. Not quite 'correct'? Maybe. But I can live with that.

For informal contexts, that might be relatively harmless; but when
things get technical, the positioning of a dot could be critical. Just
try (for example) accessing a URL with a superfluous trailing dot (or
conversely, try accessing one which needs a trailing dot, if your
client is too clever for its own good and removes it for you!).

I'd suggest it can often be an improvement both for the visual reader
and the HPR user, if the sentence can be recast so as not to produce
the problematic juxtaposition.

Instead of

HTML should be served as text/html.

one can write something like:

Serve HTML with text/html as [its] content type.

Now there's no doubt about whether the dot belongs as part of the
content type (wrong), or is there as marking the end of the sentence
(as it was intended, but it's not properly clear in the earlier
formulation, and that goes for visual readers too).
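
The trailing-dot point can be checked mechanically. What follows is a minimal Python sketch, not part of the original exchange: it uses the standard library's urllib.parse on made-up example.com addresses (an assumption for illustration only) to show that a dot pulled into the marked-up text becomes part of the address itself.

```python
from urllib.parse import urlsplit

# Hypothetical URLs; example.com is a reserved documentation domain.
plain = urlsplit("http://www.example.com/docs/index.html")
dotted = urlsplit("http://www.example.com/docs/index.html.")

# The sentence-ending dot is now part of the path, so a server
# would be asked for a differently named resource.
print(plain.path)
print(dotted.path)
print(plain.path == dotted.path)

# A trailing dot on the host name is kept as well; there it is
# meaningful and must not be added or dropped casually.
dotted_host = urlsplit("http://www.example.com./")
print(dotted_host.hostname)
```

The two paths compare as unequal, which is the practical reason a dot that only ends the sentence should stay outside the element.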

Jukka K. Korpela
#5: Jul 24 '05


Lachlan Hunt <spam.my.gspot@gmail.com> wrote:

> I've found that when a punctuation mark occurs directly after an
> element, both HPR and Opera 8 will read the punctuation mark.

That's unfortunate. However, it's a user agent flaw that we cannot
solve in our authoring. It is of such a general nature that even if you
and I (and a few other authors who know and care) took extra trouble
to work around the problem by reformulating our texts or using
illogical markup, the users of such user agents would encounter the
problem on billions of other pages. Besides, it is an inconvenience
rather than an obstacle. So the best, and IMHO only, hope of a cure is
that the people who work with the software will fix the problem.

> <p><abbr title="...">HTML</abbr> is an application of
> <abbr title="...">SGML</abbr>. However, ...</p>
> <p>HTML should be served as <code>text/html</code>.</p>

The examples are somewhat different from each other. In the latter
paragraph, the <code> markup affects, in practice, the rendering, and
it may help an automatic translator like BabelFish to avoid trying to
translate "text/html". The <abbr> markup, on the other hand, is more
questionable: usually no tangible benefits, but some potential
problems.

In particular, if you use <abbr> just in order to use a title="..."
attribute to specify the expansion, you are mostly doing things the wrong
way. Explanations should be explicit, or given with links.
Unfortunately the WCAG 1.0 tells us to use <abbr> or <acronym> for all
abbreviations and acronyms, which is worse than a waste of time.

Avoiding redundant or harmful markup doesn't solve the original problem
of course; there are many situations where a markup element is needed
in a context where it is immediately followed by a punctuation mark.
Trying to avoid them or putting the mark inside the element would
create more problems than it would solve. (For example, the element
would often be an <a> element that sets up a link. Putting a
punctuation mark at the end of a link text when the mark does not
belong there would create several small problems.)

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Lachlan Hunt
#6: Jul 24 '05


Jukka K. Korpela wrote:
> Lachlan Hunt <spam.my.gspot@gmail.com> wrote:
>
>>I've found that when a punctuation mark occurs directly after an
>>element, both HPR and Opera 8 will read the punctuation mark.
>
> That's unfortunate. However, it's a user agent flaw that we cannot
> solve in our authoring.

Ok, good point. I guess, just like I normally wouldn't try to fix a
rendering bug for IE with markup hacks, I shouldn't try it with any
other browser either. Although, since this one seems to affect
accessibility a bit more, I thought it would be nice to at least attempt
a fix.

> the users of such user agents would encounter the problem on billions
> of other pages. Besides, it is an inconvenience rather than an obstacle.

Very true, so I'd guess they'd be used to it and they would at least
understand my intent, despite the minor annoyance.

> In particular, if you use <abbr> just in order to use a title="..."
> attribute to specify the expansion, you are mostly doing things the wrong
> way. Explanations should be explicit, or given with links.

Do you mean that:

<abbr title="Standard Generalised Markup Langauge">SGML</abbr>

is an incorrect use of abbr, unless I make it a useful link or have
provided further explanation within the document?

> Unfortunately the WCAG 1.0 tells us to use <abbr> or <acronym> for all
> abbreviations and acronyms, which is worse than a waste of time.

I agree with your point about not using <abbr> for every single
abbreviation. To do so is more annoying than beneficial, not to
mention a pain when marking up a document with so many.

In fact had I done so with my latest blog entry, I would have ended up
with around 100 <abbr> elements in the whole document, given all the
times I mentioned SGML, XML, HTML, XHTML, etc. I now tend to mark up
abbreviations only where the intended audience is not expected to
understand what they are, and generally only for the first occurrence of
each in the document.
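
The "first occurrence only" habit described above is easy to automate. The following is a rough Python sketch under stated assumptions: the function name mark_first_occurrences, the expansions dictionary, and the sample sentence are all invented for illustration, and the input is assumed to be plain text with no markup of its own.

```python
import re
from html import escape


def mark_first_occurrences(text, expansions):
    """Wrap only the first occurrence of each known abbreviation in <abbr>.

    Assumes `text` is plain text that contains no markup of its own.
    """
    if not expansions:
        return text
    seen = set()
    # \b keeps "HTML" from matching inside "XHTML".
    names = sorted(expansions, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")

    def replace(match):
        word = match.group(1)
        if word in seen:
            return word  # later occurrences stay unmarked
        seen.add(word)
        title = escape(expansions[word], quote=True)
        return f'<abbr title="{title}">{word}</abbr>'

    return pattern.sub(replace, text)


# Hypothetical input for illustration.
expansions = {
    "HTML": "HyperText Markup Language",
    "SGML": "Standard Generalized Markup Language",
}
text = "HTML is an application of SGML. XHTML is a reformulation of HTML."
marked = mark_first_occurrences(text, expansions)
print(marked)
```

Note that the sentence-ending full stop after SGML stays outside the generated element, in line with the markup preferred earlier in the thread.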

> (For example, the element would often be an <a> element that sets
> up a link. Putting a punctuation mark at the end of a link text when
> the mark does not belong there would create several small problems.)

What small problems could it create for a link? Of all the elements, I
thought <a> would be one that could get away with such a hack, without
being too semantically incorrect.

--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox

Zifud
#7: Jul 24 '05


Lachlan Hunt wrote:
> Jukka K. Korpela wrote:
>
[...]
>> (For example, the element would often be an <a> element that sets
>> up a link. Putting a punctuation mark at the end of a link text when
>> the mark does not belong there would create several small problems.)
>
> What small problems could it create for a link? Of all the elements, I
> thought <a> would be one that could get away with such a hack, without
> being too semantically incorrect.

Gotta agree with you there. The content of an element is irrelevant
in nearly every case; it is the attributes that are important.

While purists may rage against the semantics of putting periods where
they aren't 'supposed' to be, your users may well appreciate your
efforts.

When all else fails, use Alan's solution.


--
Zif

Jukka K. Korpela
#8: Jul 24 '05


Lachlan Hunt <spam.my.gspot@gmail.com> wrote:

> Do you mean that:
>
> <abbr title="Standard Generalised Markup Langauge">SGML</abbr>
>
> is an incorrect use of abbr, unless I make it a useful link or have
> provided further explanation within the document?

More or less so; though I would say "useless" or "a little worse than
useless" rather than "incorrect". The harm comes from the author's idea
that the attribute _helps_ users. Actually, it is invisible to the vast
majority of users (including all IE users). And writing such attributes
in effect makes authors escape the task of actually explaining the
abbreviations they use or at least pointing to explanations.

Besides, it's "Standard Generalized Markup Language", so you made two
typos. This is rather symptomatic: attribute values contain many more
typos than normal text - partly because even the author himself does
not normally see them as document content.
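
One way to make such hidden text visible for proofreading is to list it. This is only a sketch, assuming Python's standard html.parser module; the class and function names (TitleCollector, collect_titles) are made up here, and the fragment reuses the markup quoted above.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collect (tag, title) pairs so attribute text can be proofread."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the start tag.
        for name, value in attrs:
            if name == "title" and value:
                self.titles.append((tag, value))


def collect_titles(html):
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return parser.titles


fragment = '<p><abbr title="Standard Generalised Markup Langauge">SGML</abbr> is old.</p>'
for tag, title in collect_titles(fragment):
    print(f"<{tag}> title: {title}")
```

Printed out as ordinary text, the title can be read and spell-checked like the rest of the document.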

>> Unfortunately the WCAG 1.0 tells us to use <abbr> or <acronym> for
>> all abbreviations and acronyms, which is worse than a waste of time.
>
> I agree with your point about not using <abbr> for every single
> abbreviation.

Besides, WCAG 1.0 does not really tell us whether we should do so or
just for first occurrences. The general idea might be that browsers are
expected to understand things from the markup for the first occurrence.
I don't think any browser makes any attempt at that. Moreover, WCAG 1.0
or the W3C in general still hasn't told us what the **** is the
difference between acronym and abbr. But the W3C HTML working group
tends to throw away acronym. So you can choose between a tag that is
not supported and a tag that will probably become obsolete.

Or you can just regard acronym and abbr as useless, or use them for fun
only (with some risks).

>> (For example, the element would often be an <a> element that sets
>> up a link. Putting a punctuation mark at the end of a link text
>> when the mark does not belong there would create several small
>> problems.)
>
> What small problems could it create for a link? Of all the
> elements, I thought <a> would be one that could get away with such
> a hack, without being too semantically incorrect.

Links are what makes HTML hypertext. They should be used properly, not
hacked. Think about a speech-based user agent reading all links on a
page, on user request, saying "period" or "full stop" at the end of
some links. Think about an indexing program that needs to treat
"SGML" and "SGML." as separate links (since we know that "." can be
part of an expression and make a difference). Think about user
confusion if the user is careful enough to look at the link text and
wonder why the punctuation is included. He won't know your reasons, and
should not need to ask.
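
The indexing concern can be illustrated with a short Python sketch. It is not any real indexer: LinkTextCollector, link_texts, and the two sample fragments are hypothetical, and the code relies only on the standard html.parser module to gather the text of each <a> element.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect the text content of every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self.link_texts = []  # finished link texts, in document order
        self._current = None  # text pieces of the <a> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.link_texts.append("".join(self._current))
            self._current = None


def link_texts(html):
    parser = LinkTextCollector()
    parser.feed(html)
    parser.close()
    return parser.link_texts


# Hypothetical fragments: same target, punctuation outside vs. inside the link.
outside = 'Read about <a href="sgml.html">SGML</a>. HTML is based on <a href="sgml.html">SGML</a>, as noted.'
inside = 'Read about <a href="sgml.html">SGML.</a> HTML is based on <a href="sgml.html">SGML,</a> as noted.'

print(set(link_texts(outside)))
print(set(link_texts(inside)))
```

With the punctuation outside, both links share one text; with it inside, the same target is listed under two different texts, neither of which is plain "SGML".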

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html
