Stragglers in 4.01 strict

Why were <i>, <b>, <tt>, <big>, and <small> retained in HTML 4.01 Strict?
How about the cellpadding, cellspacing, rules, and frame attributes for the
<table> tag?

I would expect the answer to be "backwards compatibility", but I thought
that's what Transitional was for. Or is it that non-structural elements have
been divided conceptually into two layers of backwards compatibility, one
consisting of features thought to be in greater need of continuing support
than the features allocated to the other layer?

--
Harlan Messinger
Remove the first dot from my e-mail address.
Veuillez ôter le premier point de mon adresse de courriel.

Jul 20 '05 #1
55 Replies


Harlan Messinger wrote:
Why were <i>, <b>, <tt>, <big>, and <small> retained in HTML 4.01 Strict?
How about the cellpadding, cellspacing, rules, and frame attributes for the
<table> tag?


i, b, small and big should obviously have been taken out (though small
is the only problematic one). I can see the use of tt (typed), though
that seems to be the function of kbd. And about the table attributes,
maybe it was because HTML 4.0 came out in 1997 and CSS 2 (which
introduced these capabilities) wasn't until 1998. HTML 4.01, dated
1999, was merely a bug-fix release.

But for whatever rational answer one devises, the real reason is the
HTML WG, in their indefinite hypocrisies, decided so.

Jul 20 '05 #2

On Thu, 29 Jan 2004, Keith Bowes wrote:
Why were <i>, <b>, <tt>, <big>, and <small> retained in HTML 4.01 Strict?


i, b, small and big should obviously have been taken out


Why? And why "obviously"? I don't get it.
On the contrary, I don't understand what's wrong with <u> that it was
taken out from HTML 4 Strict.

Jul 20 '05 #3

Andreas Prilop wrote:

i, b, small and big should obviously have been taken out

Why? And why "obviously"? I don't get it.
On the contrary, I don't understand what's wrong with <u> that it was
taken out from HTML 4 Strict.


Because people are too inclined to use i instead of em, for example.
And therefore make accurate indexing and archiving difficult (is the
information emphasized or just italicized for looks?).

And all uses of underlining for things other than links must be
abolished! Furthermore, there was no derived intrinsic meaning, like
there is for, for example, s and tt. Is it a title, inserted text, or
simply presentation?

But if you really don't understand, either keep using HTML 3.2 (or your
tag soup du jour) or do some reading.

And just so you know, "obviously" is for reasons of consistency. It's
inconsistent to take out some presentational contamination, yet keep others.

Jul 20 '05 #4

"Andreas Prilop" <nh******@rrzn-user.uni-hannover.de> wrote in message
news:Pine.GSO.4.44.0401291832380.5861-100000@s5b004...
On Thu, 29 Jan 2004, Keith Bowes wrote:
Why were <i>, <b>, <tt>, <big>, and <small> retained in HTML 4.01
Strict?
i, b, small and big should obviously have been taken out


Why? And why "obviously"? I don't get it.
On the contrary, I don't understand what's wrong with <u> that it was
taken out from HTML 4 Strict.


Because they are presentational, not structural... and presentation should
be handled by CSS, not HTML.

Regards,
Peter Foti


Jul 20 '05 #5

Keith Bowes:
And all uses of underlining for things other than links must be
abolished!


Are you saying underlining should be removed from CSS, or that CSS
should be changed to make it impossible to use underlining for anything
else than links?

--
Bertilo Wennergren <be******@gmx.net> <http://www.bertilow.com>
Jul 20 '05 #6


"Andreas Prilop" <nh******@rrzn-user.uni-hannover.de> wrote in message
news:Pine.GSO.4.44.0401291832380.5861-100000@s5b004...
On Thu, 29 Jan 2004, Keith Bowes wrote:
Why were <i>, <b>, <tt>, <big>, and <small> retained in HTML 4.01
Strict?
i, b, small and big should obviously have been taken out
Why? And why "obviously"? I don't get it.


I thought HTML 4.01 was largely about removing presentational aspects from
the content.
On the contrary, I don't understand what's wrong with <u> that it was
taken out from HTML 4 Strict.


And on the double contrary, the fact that it was removed makes it even more
strange to me that <i> and <b> were left in.

Jul 20 '05 #7

On Thu, 29 Jan 2004 18:35:21 +0100, Andreas Prilop
<nh******@rrzn-user.uni-hannover.de> wrote:
On the contrary, I don't understand what's wrong with <u> that it was
taken out from HTML 4 Strict.


Underlined text is not only presentational, it's confusing. Underlined
text resembles links, and it pisses off the user.
Jul 20 '05 #8

Neal <ne*****@spamrcn.com> wrote:
Underlined text is not only presentational, it's confusing. Underlined
text resembles links, and it pisses off the user.


I need an underlined "s"
<http://www.unics.uni-hannover.de/nhtcapri/arabic.html6>
<http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html>
and I continue to write <u>s</u>.
Jul 20 '05 #9

On Thu, 29 Jan 2004 23:28:08 +0100, Andreas Prilop
<nh******@rrzn-user.uni-hannover.de> wrote:
Neal <ne*****@spamrcn.com> wrote:
Underlined text is not only presentational, it's confusing. Underlined
text resembles links, and it pisses off the user.


I need an underlined "s"
<http://www.unics.uni-hannover.de/nhtcapri/arabic.html6>
<http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html>
and I continue to write <u>s</u>.

Well, in your context the established transliteration includes the
underline, so I agree it's a problem. I suppose the Persian/Urdu speaking
community could petition for the underlined S and Z to be available. I
have no idea how to go about that, though.

In the meantime, if the <u> markup is deprecated, there are other options,
like styling with text-decoration: underline. Not that I can come up with
a foolproof solution.
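
A minimal sketch of the text-decoration option mentioned above. The class name "translit" is hypothetical (it does not appear in the original posts) and stands for transliterated letters that need a line below:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Underline via CSS (sketch)</title>
<style type="text/css">
/* "translit" is a hypothetical class name chosen for this example */
span.translit { text-decoration: underline; }
</style>
</head>
<body>
<p>Transliterated letter: <span class="translit">s</span></p>
</body>
</html>
```

This is not foolproof, as noted: when style sheets are disabled or unsupported, the underline disappears and the letter reads as a plain "s".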
Jul 20 '05 #10

On Thu, 29 Jan 2004 12:52:31 -0500, "Peter Foti"
<pe***@Idontwantnostinkingemailfromyou.com> declared in
comp.infosystems.www.authoring.html:
Because they are presentational, not structural... and presentation should
be handled by CSS, not HTML.


I agree, it should. But in the case of <i> in particular, the line
between content and presentation is somewhat blurred. Take ship names
for example.

See the archives of this group for various discussions on this topic.
:-)

--
Mark Parnell
http://www.clarkecomputers.com.au
Jul 20 '05 #11

In article <op**************@news.rcn.com>, Neal <ne*****@spamrcn.com>
wrote:
On Thu, 29 Jan 2004 23:28:08 +0100, Andreas Prilop
<nh******@rrzn-user.uni-hannover.de> wrote:
Neal <ne*****@spamrcn.com> wrote:
Underlined text is not only presentational, it's confusing. Underlined
text resembles links, and it pisses off the user.


I need an underlined "s"
<http://www.unics.uni-hannover.de/nhtcapri/arabic.html6>
<http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html>
and I continue to write <u>s</u>.

Well, in your context the established transliteration includes the
underline, so I agree it's a problem. I suppose the Persian/Urdu speaking
community could petition for the underlined S and Z to be available. I
have no idea how to go about that, though.


The Unicode Consortium <http://www.unicode.org> would be the group to petition. In theory it should already be possible
to render any needed character with HTML entities, rather than resorting to
abuse of markup's semantic intent. Has this character not been provided by
Unicode? According to the charts, 005F is an Urdu "low line". If (and
that's a big assumption since I know zilch about Urdu) that's the character
Neal is referring to, then "&3X005F;" would be its HTML entity.

--
CC
Jul 20 '05 #12

Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote in message news:<290120042328086236%nh******@rrzn-user.uni-hannover.de>...
Neal <ne*****@spamrcn.com> wrote:
Underlined text is not only presentational, it's confusing. Underlined
text resembles links, and it pisses off the user.


I need an underlined "s"
<http://www.unics.uni-hannover.de/nhtcapri/arabic.html6>
<http://www.unics.uni-hannover.de/nhtcapri/persian-alphabet.html>
and I continue to write <u>s</u>.


The proper way to do this would be with combining diacritical marks -
unicode 0320, 0331 or 0332.
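
In HTML source these marks could be written as numeric character references placed immediately after the base letter. This is only a sketch; whether the mark is drawn correctly depends on browser and font support:

```html
<p>
  s&#x0320; <!-- s followed by U+0320 COMBINING MINUS SIGN BELOW -->
  s&#x0331; <!-- s followed by U+0331 COMBINING MACRON BELOW -->
  s&#x0332; <!-- s followed by U+0332 COMBINING LOW LINE -->
</p>
```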

--- Safalra (Stephen Morley) ---
http://www.safalra.com/hypertext
Jul 20 '05 #13

On Fri, 30 Jan 2004, CC Zona wrote:

[reformatted to usenet conventions...]
The Unicode Consortium <http://www.unicode.org> would be the group
to petition. In theory it should already be possible to render any
needed character with HTML entities,
                           ^^^^^^^^

I think you mean "numerical character references".
rather than resorting to abuse of markup's semantic intent.
Agreed.
Has this character not been provided by Unicode? According to the
charts, 005F is an Urdu "low line".
U+005F (to use the Unicode notation, &#x5F; to use HTML notation,
although the decimal equivalent is slightly more accessible to
browsers) is the ASCII underscore, pedantically denoted "low line" in
the Unicode charts. But it's not a combining character, so it isn't
what you'd want here. The combining diacritical marks are in the
U+03xx area, as another posting has already mentioned.
If (and that's a big assumption since I know zilch about Urdu)
that's the character Neal is referring to, then "&3X005F;" would be
                                                  ^^^typo
its HTML entity.


Character entities on the one hand (such as &uuml;), and numerical
character references on the other hand, both use ampersand in their
notation in HTML, but it only causes confusion IMHO if the terms
aren't kept distinct. I know that the phrase *"numerical entities"
is widely used, but it isn't part of the official HTML terminology,
and I'd suggest it's best avoided.
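
To illustrate the distinction, here is a small sketch showing the same character written both ways, plus the underscore from the quoted post in its correct hexadecimal and decimal forms:

```html
<p>
  &uuml;  <!-- character entity reference: a name defined by HTML -->
  &#252;  <!-- numerical character reference, decimal, for U+00FC -->
  &#xFC;  <!-- numerical character reference, hexadecimal, for U+00FC -->
  &#x5F;  <!-- hexadecimal reference for U+005F LOW LINE -->
  &#95;   <!-- the same U+005F in decimal -->
</p>
```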

h.t.h
Jul 20 '05 #14

On 30 Jan 2004, Safalra wrote:
I need an underlined "s"
and I continue to write <u>s</u>.


The proper way to do this would be with combining diacritical marks -
unicode 0320, 0331 or 0332.


Have you tested it? On different operating systems? With various browsers?
Actually, support for the "fancier" characters from
<http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata1E.html>
either written as such or with combining diacritical marks
is poorer than poor.

For the letters with line below, <u>s</u> is a convenient alternative.
Funny that there is no
<http://www.google.com/search?q=%22s+with+line+below%22> but
<http://www.google.com/search?q=%22z+with+line+below%22>

Jul 20 '05 #15

On Fri, 30 Jan 2004, Andreas Prilop wrote:
On 30 Jan 2004, Safalra wrote:
The proper way to do this would be with combining diacritical marks -
unicode 0320, 0331 or 0332.
Have you tested it?


But the theory remains true, whether it works in practice or not :-{
On different operating systems? With various browsers?
Actually, support for the "fancier" characters from
<http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata1E.html>
either written as such or with combining diacritical marks
is poorer than poor.


As my introductory page says - if you want to test combining
diacriticals, you're better looking at Alan Wood's page,
http://www.alanwood.net/unicode/comb...cal_marks.html

Support for the combining marks does indeed appear to be horribly
inadequate, when looking at a range of browsers.

U+1Exx looks quite a bit better, given appropriate fonts.

Seems to me there's a difference between using exotic characters in
pages meant for a general readership (who would neither know nor care
to get it right if it didn't work "out of the box"), and pages aimed
at a specialist audience where there is a generally agreed way of
representing their specialist content - such an audience might be
presumed to be willing and able to put in the extra effort to get that
"generally agreed" method working in their browser.

cheers
Jul 20 '05 #16

Keith Bowes wrote:
Andreas Prilop wrote:

i, b, small and big should obviously have been taken out


Why? And why "obviously"? I don't get it.
On the contrary, I don't understand what's wrong with <u> that it was
taken out from HTML 4 Strict.


Because people are too inclined to use i instead of em, for example.
And therefore make accurate indexing and archiving difficult (is the
information emphasized or just italicized for looks?).

[snip]

<i> is very useful for species names. This is a global typographical
convention.

--
Barry Pearson
http://www.Barry.Pearson.name/photography/
http://www.BirdsAndAnimals.info/
http://www.ChildSupportAnalysis.co.uk/
Jul 20 '05 #17

On Fri, 30 Jan 2004 23:58:31 -0000, Barry Pearson
<ne**@childsupportanalysis.co.uk> wrote:

<i> is very useful for species names. This is a global typographical
convention.


There is the Achilles' heel of the content/presentation divide, I'm afraid.
There are times that italics are presentational.

However, consider this. What if the user agent does not do italics at all?
Problem? Using <i> markup fails in these environments.

Best solution is to use CSS.

<span class="species">lepidae</span> (just made that up...)

The following could also feasibly be marked up with <em>

<span class="shipname">U.S.S. Constitution</span>
<span class="worktitle">Paradise Theater</span>
Jul 20 '05 #18

Neal <ne*****@spamrcn.com> wrote in news:op**************@news.rcn.com:
On Fri, 30 Jan 2004 23:58:31 -0000, Barry Pearson
<ne**@childsupportanalysis.co.uk> wrote:
<i> is very useful for species names. This is a global typographical
convention.

There is the Achilles' heel of the content/presentation divide, I'm
afraid. There are times that italics are presentational.

However, consider this. What if the user agent does not do italics at
all? Problem? Using <i> markup fails in these environments.


Presumably the user agent will have some way to render the marked-up
element in a way that calls attention to the distinction: for example, a
character-cell browser might surround the content with underscores, as is a
common convention in Usenet posts.

Best solution is to use CSS.

<span class="species">lepidae</span> (just made that up...)


The problem with that scenario is that CSS is always supposed to be
optional, so the distinction is lost completely if it isn't being used. In
fact, in the character-cell situation, using <i> would result in a rendered
indication, whereas using CSS wouldn't.
Jul 20 '05 #19

Neal wrote:
However, consider this. What if the user agent does not do italics at
all? Problem? Using <i> markup fails in these environments.

Best solution is to use CSS.


I don't agree with you. Both have pros and cons.

Markup with <i>, <b>, <u> gives some effect in, for example, Lynx, as do
<cite> and the like. <span> doesn't do a thing in Lynx.
Besides that, if CSS fails for whatever reason, <i> and others might
still do something.
And on top of that, I like using <i>foo</i> instead of <span
class="foo">bar</span>, just because it takes so many fewer characters in
my code.

If it is very important to 'mark' a word with something special, one can
always use <i class="foo"> :-)
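
A sketch of that last idea, assuming a hypothetical class name "species" in place of "foo". The <i> element still gives a fallback rendering when CSS is absent, while the class gives a style sheet something to hook onto:

```html
<!-- in the document head -->
<style type="text/css">
/* "species" is a hypothetical class name; this rule restates the usual default for i */
i.species { font-style: italic; }
</style>

<!-- in the document body -->
<p>The house sparrow, <i class="species">Passer domesticus</i>, is common in towns.</p>
```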

--

Barbara

http://home.wanadoo.nl/b.de.zoete/html/weblog.html
http://home.wanadoo.nl/b.de.zoete/html/webontwerp.html

Jul 20 '05 #20

On 31 Jan 2004 09:57:32 GMT, Eric Bohlman <eb******@earthlink.net> wrote:
Neal <ne*****@spamrcn.com> wrote in news:op**************@news.rcn.com:
However, consider this. What if the user agent does not do italics at
all? Problem? Using <i> markup fails in these environments.


Presumably the user agent will have some way to render the marked-up
element in a way that calls attention to the distinction: for example, a
character-cell browser might surround the content with underscores, as
is a
common convention in Usenet posts.


Right on. Point is, <i> does not mean the user will see italics.
Best solution is to use CSS.

<span class="species">lepidae</span> (just made that up...)


The problem with that scenario is that CSS is always supposed to be
optional, so the distinction is lost completely if it isn't being used.
In
fact, in the character-cell situation, using <i> would result in a
rendered
indication, whereas using CSS wouldn't.


We can divide browsers into 2 categories:

1) Modern browsers which support CSS, which will display italics when CSS
is suggested (unless user countermands it, in which case the user accepts
responsibility for defeating these effects). Even the dinosaur NN4 can
render italics from CSS, provided you don't import every style.

2) Text browsers, speech readers, and other user agents which cannot
render italic text. No markup will cause italics to be rendered. It is
true that some other visual effect might be used in a text browser, but if
the goal is to get italics, it's impossible.

Therefore, the <i> element isn't really needed. Using a styled span with a
meaningful class name will fail in an italics-capable browser only if the
author applies it poorly or if the user opts out of the style declaration
through their settings.
Jul 20 '05 #21

In article <op**************@news.rcn.com>,
Neal <ne*****@spamrcn.com> writes:
However, consider this. What if the user agent does not do italics at all?
Problem? Using <i> markup fails in these environments.
No it doesn't. It just falls back.
<span class="species">lepidae</span> (just made that up...)
Span soup loses all semantics, and is an indication of an author with a
mindset geared to presentation just as much as the well-known HTML
abuses. It may be less damaging than <font>, but is worse than the
usual mild abuse of <b>, <i>, or even <blockquote>.
The following could also feasibly be marked up with <em>


Indeed, that adds semantics, though it's rather weak (the actual
semantics being "proper name", not "emphasis").

As a rule of thumb, <span> should always be carefully considered and
avoided unless you're happy that it *doesn't* convey meaning.
The only exception that springs to mind is <span lang=...> for
phrases in foreign languages.

--
Nick Kew
Jul 20 '05 #22

Neal <ne*****@spamrcn.com> wrote:
Best solution is to use CSS.
<span class="species">lepidae</span> (just made that up...)


<i class="species">lepidae</i>
where <span> isn't really necessary.
Jul 20 '05 #23

ni**@hugin.webthing.com (Nick Kew) wrote:
Span soup loses all semantics, and is an indication of an author with a
mindset geared to presentation just as much as the well-known HTML
abuses.
What can you offer instead of <span> in the following example?
<http://www.unics.uni-hannover.de/nhtcapri/temp/parentheses.html>
As a rule of thumb, <span> should always be carefully considered and
avoided unless you're happy that it *doesn't* convey meaning.
The only exception that springs to mind is <span lang=...> for
phrases in foreign languages.


To which I add <span dir=...> and <span title=...>.
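
For illustration, a sketch of <span> carrying only lang, dir, or title. The sample text and the title wording are assumptions made up for this example, not taken from the linked pages:

```html
<!-- lang and dir on a span: a Hebrew word inside an English sentence -->
<p>The word <span lang="he" dir="rtl">שלום</span> means "peace".</p>

<!-- title on a span: advisory text for a phrase that has no more specific element -->
<p>See <span title="Hypothetical explanatory note">this phrase</span> for details.</p>
```

In each case the span adds an attribute to a run of text without claiming any particular meaning for the run itself.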

--
Top-posting.
What's the most irritating thing on Usenet?
Jul 20 '05 #24

In article <31*************************@rrzn-user.uni-hannover.de>,
Andreas Prilop <nh******@rrzn-user.uni-hannover.de> writes:
What can you offer instead of <span> in the following example?
<http://www.unics.uni-hannover.de/nhtcapri/temp/parentheses.html>
I don't think I'm qualified to comment on that (though I note my
browser - konqueror - places all the apostrophes in what you
say are the right places).

But AFAICS what you really have is a series of tables of English names
against rendered characters. But I guess marking it up as such would
defeat your purpose?
To which I add <span dir=...> and <span title=...>.


Hmmm, maybe.

dir= looks like your example. But the reason for span there is that
you've marked up a table as a paragraph, leaving you with two
different languages and charsets interleaved. With table markup
that would become <td dir=...>.

As for title=, there may be uses for <span>, but I suspect other
elements (such as abbr/acronym, q, or even code) would be more
appropriate in most cases. As with <foo class=...>, span should be
seen as exceptional.

--
Nick Kew
Jul 20 '05 #25

ni**@hugin.webthing.com (Nick Kew) wrote:
<http://www.unics.uni-hannover.de/nhtcapri/temp/parentheses.html>
But AFAICS what you really have is a series of tables of English names
against rendered characters. But I guess marking it up as such would
defeat your purpose?


Indeed. The purpose of this test document is to study the line wrapping
of parentheses in mixed bidirectional text. The names of the letters
are just dummy text.
dir= looks like your example. But the reason for span there is that
you've marked up a table as a paragraph, leaving you with two
different languages and charsets interleaved. With table markup
that would become <td dir=...>.
It's just my example dummy text - it is no table. In the real world,
you would have mixed English and Hebrew or mixed English and Arabic
text in one paragraph.
As for title=, there may be uses for <span>, but I suspect other
elements (such as abbr/acronym, q, or even code) would be more
appropriate in most cases.


<http://www.unics.uni-hannover.de/nhtcapri/arabic.html6> is an
example for <span title>.
Jul 20 '05 #26

In article <op**************@news.rcn.com>, Neal <ne*****@spamrcn.com>
wrote:
Presumably the user agent will have some way to render the marked-up
element in a way that calls attention to the distinction: for example, a
character-cell browser might surround the content with underscores, as
is a
common convention in Usenet posts.


Right on. Point is, <i> does not mean the user will see italics.


But he is more likely to see them with <i> than with <span> or <em>.

--
Kris
<kr*******@xs4all.netherlands> (nl)
<http://www.cinnamon.nl/>
Jul 20 '05 #27

Neal wrote:
On 31 Jan 2004 09:57:32 GMT, Eric Bohlman <eb******@earthlink.net>
wrote:
Neal <ne*****@spamrcn.com> wrote in
news:op**************@news.rcn.com:
However, consider this. What if the user agent does not do italics
at all? Problem? Using <i> markup fails in these environments.


Presumably the user agent will have some way to render the marked-up
element in a way that calls attention to the distinction: for
example, a character-cell browser might surround the content with
underscores, as is a
common convention in Usenet posts.


Right on. Point is, <i> does not mean the user will see italics.

[snip]

True - indeed, a blind user won't see anything!

But shouldn't the mark-up get as close as possible to what the "real world"
demands? The global convention isn't that species names should be marked-up as
"species". It is that they should be rendered as italics. Perhaps if we could
wind back time a century or so we could change that. But I can't do that.

Do we believe the web can sensibly ignore the principles established by
scholars and others over a long time? Perhaps when it has grown up it can lay
claim to be moving this in a better direction. But, at the moment, it is a
medium of communication between people who want to say things about species
names, and people who want to read about them, with *both* parties accepting
certain useful conventions.

The web is primarily about communication between publishers and their target
audiences. Where they collectively agree, that is a very good baseline.

--
Barry Pearson
http://www.Barry.Pearson.name/photography/
http://www.BirdsAndAnimals.info/
http://www.ChildSupportAnalysis.co.uk/
Jul 20 '05 #28

On Sun, 1 Feb 2004 19:19:07 -0000, Barry Pearson
<ne**@childsupportanalysis.co.uk> wrote:
But shouldn't the mark-up get as close as possible to what the "real
world"
demands? The global convention isn't that species names should be
marked-up as
"species". It is that they should be rendered as italics. Perhaps if we
could
wind back time a century or so we could change that. But I can't do that.
But you're leaving a step out. When we see italics, we don't say, "Oh,
it's a species." We get the fact that it's a species from the linguistic
context. The italics only serve to impress this further - in short, they
are a presentation convention that happens to support a contextual meaning.

The need wasn't to put species names in italics, it was to mark them
somehow to be sure they are recognized as the proper type of name they
are. Italics were chosen to mark this. But this does not mean that italics
should be construed as species, but that italics used in a context which
suggests a species name is being communicated should be construed as such.

If I were to make a page discussing an ancient lettering or numbering
system, I don't have the typography on the web to do the job justice. All
systems have limitations, and this is one. The lack of meaningful markup
for a species name is another. Using <i>, a classed <span> or even <em>
are but corrective measures to putty the cracks in the system, just as
using images to represent my ancient numbers is.

However, in my example the meaning relies totally on the expression of the
appearance of these characters. In the species example, only the
agreed-upon presentation of the name is lost, the meaning remains intact.
As such, it's not so critical to force the typography. But if you want to
do this, it's better to do so in CSS than in the HTML.

As I've said, the only browsers which will not apply the CSS italics will
be those which are either unable to express italics in the first place or
are configured purposely by the user to not observe CSS italics rules. In
either case, ANY italics are unrendered, so from a rendering standpoint
there is no difference between styling as italics and marking up with <i>
- so there is no benefit to using <i> over the CSS, especially as HTML is
intended to allow the UA to provide markup based on the author-described
meaning, not to carry out the author's rendering preferences.

Should the mark-up get as close as possible to what the "real world"
demands? Sure. But do you really purport to know what the real world
demands in each case? Seems to me they demand conflicting things more
often than not. The one common thread I see is that the real world demands
*control*. Unless you disagree with that, it's only sensible to author
from the perspective that the rendering is a user-end task, not an
author-end task. The user and his agent render your code according to the
agent's defaults, perhaps superseded by user preference, perhaps
superseded by author rules, and possibly superseded again by the
user's rules.

It's a big lie to pretend you are doing the rendering, or that anything
you do can ultimately guarantee what will happen when the UA renders your
document. The user, knowing it or not, has all the control over what your
page looks like. I suspect that as more and more users become aware of
that, they'll be very appreciative of having been made the arbiters of
their own display. The responsible web author respects the user's domain
and designs the page with that firmly in mind.
Do we believe the web can sensibly ignore the principles established by
scholars and others over a long time?
Can you sensibly assume the web can both assign user control and
accommodate every typographic principle through history?
The web is primarily about communication between publishers and their
target
audiences. Where they collectively agree, that is a very good baseline.


But the collective agreement is that the author is in charge of sending
the content, and the user is responsible for commanding the rendering.

If a species name isn't rendered in italics on some browser, the meaning
still is there, we only lose a small amount of conventional enhancement of
meaning. If we allow the author to mandate any style on the user, even if
it is conventional to do so, it's my opinion we lose far more.
Jul 20 '05 #29

Neal <ne*****@spamrcn.com> wrote:
But you're leaving a step out. When we see italics, we don't say,
"Oh, it's a species." We get the fact that it's a species from the
linguistic context.
Well, textual context, not linguistic. And we also need to know the
convention (which is very useful indeed, since not following this
little convention when writing about species immediately reveals that
one hasn't read many biology books or even serious articles ;-)).
The italics only serve to impress this further
- in short, they are a presentation convention that happens to
support a contextual meaning.
It rarely changes the meaning of text to remove the italics, but the
italics still carry a semantic message. And sometimes you really
wouldn't tell a common name from a scientific name without italics.
Using <i>,
a classed <span> or even <em> are but corrective measures to putty
the cracks in the system
Agreed, but <em> is a _wrong_ measure (when you don't really mean
emphasis). What's so great in using <span class="species">, which is
semantically absolutely empty, as compared with <i class="species">,
which at least tells that the element - whose semantic meaning is not
revealed by the markup - is meant to be shown in italics. This means a
large variety of possibilities, but it's still more informative than
<span>.

Similarly, if the only reason for using <span> would be to make it
appear in red, wouldn't it be more honest (and more effective) to use
<font class="red" color="red"> than <span class="red">?

In fact, I'm getting sceptical about this Strict stuff. Has the world
stopped using deprecated markup? Will it stop just because the W3C says
so? We're seeing a lot of wasted work when people convert <font
face="Arial"> to <span class="Arial">. _Educated_ authors should know
the _reasons_ for avoiding presentational markup when possible,
and uneducated should learn the idea. Just arguing whether this or that
markup is acceptable in Strict is pointless. Actually the W3C decision
in deprecation is not that bad _if_ you think this Strict/Transitional
business makes sense. It was a partly illogical move - it had to be,
since the very idea was somewhat illogical.
As I've said, the only browsers which will not apply the CSS
italics will be those which are either unable to express italics in
the first place or are configured purposely by the user to not
observe CSS italics rules.


Or configured to ignore all author style sheets because the user needs
to impose his own rules easily. Or with CSS switched completely off
since the author's style sheet _as a whole_ is broken and this results
in a crash or mess on some browser. There are lots of possibilities why
CSS may not take effect and <i> still work.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #30

Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
<i class="species">lepidae</i>
where <span> isn't really necessary.


I agree, but I wonder why nobody has mentioned that the example is
wrong.

I can't resist the temptation to remark that _if_ there were an element
for scientific names of species, say <species> (though I would really
prefer <taxon class="species">), then a good authoring tool could spot
three errors in <species>lepidae</species>:
- a species name must be two words (unless you are talking about the
species part of a real binominal name - a part that should not be
used alone)
- it must begin with a capital letter
- it is (probably) not suitable as either part of a species name,
since -idae normally indicates a _family_ name.
(And naturally a very good authoring tool would also check whether the
name exists at all in the biological nomenclature and could ask whether
he really meant the family /Leporidae/ or the genus /Lepidium/. :-))

Naturally a user agent could do similar things too, i.e. spot errors in
documents and tell the user that some purported biology pages are not
worth reading.

Yes, I should be sleeping now, but my point is that if we had simple
really semantic markup and authors used it, wonderful things could be
built upon it. Compared to this, <span> vs. <i> or the deprecation or
non-deprecation of <small> is rather uninteresting.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #31

P: n/a
On Wed, 4 Feb 2004 00:50:33 +0000 (UTC), Jukka K. Korpela
<jk******@cs.tut.fi> wrote:
Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
<i class="species">lepidae</i>
where <span> isn't really necessary.

Oh, I noted I made it up on the spot when doing the example. I have no
idea about species or families or anything like that.
Jul 20 '05 #32

P: n/a
"Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
We're seeing a lot of wasted work when people convert <font
face="Arial"> to <span class="Arial">.


The class Arial might be
.Arial {font-family: Arial, Helvetica, ... }
which is a wee bit better.
Jul 20 '05 #33

P: n/a
Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
"Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
We're seeing a lot of wasted work when people convert <font
face="Arial"> to <span class="Arial">.


The class Arial might be
.Arial {font-family: Arial, Helvetica, ... }
which is a wee bit better.


Or maybe not, depending on what Helvetica and other fonts look like
and why you would suggest Arial.

Besides, regarding browsers that conform to CSS specifications, you
could just as well use the <font> markup and
font[face="Arial"] {font-family: Arial, Helvetica, sans-serif }

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #34

P: n/a
In message <je**************@newsfep3-gui.server.ntli.net>, Barry
Pearson <ne**@childsupportanalysis.co.uk> writes

<i> is very useful for species names. This is a global typographical
convention.


as used on, for example:

<http://www.westmidlandbirdclub.com/biblio/birdwatch/2002-04.htm>

with mark-up like:

<i class="sci">Larus glaucoides</i>

and CSS:

.sci {font-style: italic;}

the "i" ensures that it degrades gracefully when browsers don't have CSS
capability.
--
Andy Mabbett
"The Internet is a reflection of our society[ ...]. If we do not like what we
see in that mirror the problem is not to fix the mirror, we have to fix
society." Vint Cerf
Jul 20 '05 #35

P: n/a
In message <Xn*****************************@193.229.0.31>, Jukka K.
Korpela <jk******@cs.tut.fi> writes
I can't resist the temptation to remark that _if_ there were an element
for scientific names of species

[...]
I have, previously, proposed a language code for Scientific Latin...
--
Andy Mabbett
"The Internet is a reflection of our society[ ...]. If we do not like what we
see in that mirror the problem is not to fix the mirror, we have to fix
society." Vint Cerf
Jul 20 '05 #36

P: n/a
Andy Mabbett <us**********@pigsonthewing.org.uk> wrote:
I can't resist the temptation to remark that _if_ there were an
element for scientific names of species

[...]

I have, previously, proposed a language code for Scientific
Latin...


Have you? The correct procedure to propose language codes is through
IANA registration. It seems (IMHO unfortunately) that they will
register virtually anything, when submitted using the correct
procedure.

The scientific names of taxons are, IMHO, only superficially Latin.
About half of them are actually just Greek in Latin clothes.

Actually I don't think that the binominal nomenclature qualifies as a
human language. It is a special-purpose code. Its strings have been
taken from Latin, Greek, and other languages, obey a few Latin grammar
rules, and are pronounced in varying ways of pronouncing Latin.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #37

P: n/a
On Sat, 7 Feb 2004 19:15:47 +0000 (UTC), "Jukka K. Korpela"
<jk******@cs.tut.fi> wrote:
Have you? The correct procedure to propose language codes is through
IANA registration.
This is described in RFC 3066 (or RFC 1766 at one time)
http://www.ietf.org/rfc/rfc3066.txt

Language codes come from ISO, as ISO 639-2 codes
http://www.loc.gov/standards/iso639-2/langhome.html
and making use of ISO 3166 for the country codes
http://www.din.de/gremien/nas/nabd/iso3166ma/

IANA codes have the i- prefix and are to be avoided when there's a
viable ISO 639 alternative. AIUI at present, the only non-deprecated
IANA codes are i-klingon and possibly i-elvish. All others that were
previously i- codes have now been recognised under their ISO 3166
countries (including en-geordie and en-scouse)

There's also the alternative of an entirely private x- space, which is
how I personally do markup for species names <i lang="x-lat" >.
The scientific names of taxons are, IMHO, only superficially Latin.
About half of them are actually just Greek in Latin clothes.


Around a third I'd suggest - Species names, especially for new
species, are increasingly latinised adjectival versions of the
discoverer's (frequently English) surname.

--
Die Gotterspammerung - Junkmail of the Gods
Jul 20 '05 #38

P: n/a
In message <Xn*****************************@193.229.0.31>, Jukka K.
Korpela <jk******@cs.tut.fi> writes
I have, previously, proposed a language code for Scientific
Latin...
Have you?


Yes. See above.
The correct procedure to propose language codes is through IANA
registration. It seems (IMHO unfortunately) that they will
register virtually anything, when submitted using the correct
procedure.
I used the correct procedure; it got no further :-(
The scientific names of taxons are, IMHO, only superficially Latin.
About half of them are actually just Greek in Latin clothes.

Actually I don't think that the binominal nomenclature qualifies as a
human language. It is a special-purpose code. Its strings have been
taken from Latin, Greek, and other languages, obey a few Latin grammar
rules, and are pronounced in varying ways of pronouncing Latin.


Hence my proposal, which was for either a new language code, or a new
"subset of Latin" code, for what I described as a "pseudo language":

Tag to be registered : SC (or possibly "LA-sci")

English name of language : Scientific names (aka "Latin names") of
living things ("Scientific Latin")

I said then:

The use of the tag "LA" for Latin, while it may act as a useful
guide for pronunciation in some cases, is clearly inappropriate
for many such names, which will not occur in regular Latin
dictionaries.

The [proposed] tag will allow clients to be aware that they
should NOT translate Scientific names when translating the text
of a document in which they are included; Homo sapiens is Homo
sapiens in French, German, English or Serbo-Croat.

Also, user agents might know to parse the "H" in "'Homo sapiens has a
bigger brain than H. erectus'" as "Homo".

--
Andy Mabbett
"The Internet is a reflection of our society[ ...]. If we do not like what we
see in that mirror the problem is not to fix the mirror, we have to fix
society." Vint Cerf
Jul 20 '05 #39

P: n/a
In message <il********************************@4ax.com>, Andy Dingley
<di*****@codesmiths.com> writes
Species names, especially for new species, are increasingly latinised
adjectival versions of the discoverer's (frequently English) surname.


These (real!) examples might amuse:

Brachypelma albopilosum
(Brachypelma, from the Greek)

Ekgmowechashala philotae
(the North American Lakota language)

Uluops uluops
(from "ulu", an Eskimo knife)

Linnaea borealis
(in honour of Linnaeus)

Ardeola grayii
(in honour of John Edward Gray, a
biologist)

Nepenthes sumatrana
(from the place name "Sumatra")

Phyllidia polkadotsa
("polka-dotted")

Draculoides bramstokeri
(in honour of the character Dracula and
its author, Bram Stoker)

Calponea harrisonfordi
(in honour of Harrison Ford, the actor)

Ba humbugi
(a quote from Dickens' 'A Christmas
Carol')

Ytu brutus
(a play on Shakespeare's "Et tu,
Brute?")

Polemistus chewbacca
(a character from the film 'Star Wars')

Crex crex
(onomatopoeia)

Phthiria relativitae
(a play on "The Theory of Relativity")

Abra cadabra
(a magical pun)

Orizabus subaziro
(a palindrome)

Agra vation
(a play on "aggravation")

Bombylius aureocookae
(a play on "oreo cookie")

Heerz lukenatcha
(a play on "here's looking at you")

Cyclocephala nodanotherwon
(a play on "not another one")

Zyzzyx chilensis
(???!!!)
--
Andy Mabbett
"The Internet is a reflection of our society[ ...]. If we do not like what we
see in that mirror the problem is not to fix the mirror, we have to fix
society." Vint Cerf
Jul 20 '05 #40

P: n/a
Andy Mabbett <us**********@pigsonthewing.org.uk> wrote:
In message <il********************************@4ax.com>, Andy Dingley
<di*****@codesmiths.com> writes

Species names, especially for new species, are increasingly latinised
adjectival versions of the discoverer's (frequently English) surname.


These (real!) examples might amuse:


[snip]

My favourite daft example is the dinosaur Tianchisaurus
nedegoapeferima, which is named after the cast of Jurassic Park - sam
NEill, laura DErn, jeff GOldblum, richard Attenborough, bob PEck,
martin FErrero, ariana RIchards and joseph MAzzello.

cheers,
Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st***@pugh.net> <http://steve.pugh.net/>
Jul 20 '05 #41

P: n/a
Andy Mabbett <us**********@pigsonthewing.org.uk> wrote:
I used the correct procedure; it got no further :-(
Interesting. There seems to have been some discussion on it:
<http://eikenes.alvestrand.no/pipermail/ietf-languages/2003-February/thread.html>
Hence my proposal, which was for either a new language code, or a
new "subset of Latin" code, for what I described as a "pseudo
language":
IANA is not authorized to assign a primary language code (or "primary
subtag" in RFC 3066 parlance) like "sc". It would have been authorized
to assign "la-sci".

But it seems that the proposal was rejected basically for the reason I
mentioned: scientific names do not constitute a language. There's a
widespread confusion around the word "language" due to its metaphoric
use, which has obscured the fundamental difference between real
languages and different systems of notations, signals, etc. In any real
language you can say "I love you", "What time is it?", and "Let's call
this 'foo'".

But scientific taxon names would deserve _markup_ in markup systems
(confusingly called "markup languages") like HTML.
The [proposed] tag will allow clients to be aware that they
should NOT translate Scientific names when translating the
text of a document in which they are included; Homo sapiens
is Homo sapiens in French, German, English or Serbo-Croat.
This actually proves that they do not constitute a language any more
than chemical formulas or music notations do. In fact, for some
dimensions of language markup, they should be tagged as being of no
language. For the purposes of pronunciation, mostly Latin, though the
ones derived from proper nouns are difficult; should "Ardeola grayii" be
actually marked up as <i lang="la">Ardeola <span lang="en-US"
title="Gray">gray</span>ii</i>, in theory?

(In the ietf-languages discussion, the possibility of using xml:lang=""
was mentioned. It seems that this is often suggested for reasons
unknown to me, apparently frowning upon the idea of using "und" as a
language code. But I find it odd to assign a specific meaning to an
empty code.)
Also, user agents might know to parse the "H" in "'Homo sapiens has
a bigger brain than H. erectus'" as "Homo".


By the way, that's a good example of a situation where the <abbr> markup
could make sense, as a hint to a reader who might need it:
<i>Homo sapiens</i> has a bigger brain than
<i><abbr title="Homo">H.</abbr> erectus</i>.
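As a rough sketch of how such markup might be generated rather than typed by hand, the following Python wraps an abbreviated genus in <abbr>, given a list of genus names the document is known to use. The function name mark_up_abbreviated_genus and the idea of passing the genera in explicitly are assumptions of this example; no authoring tool or user agent is claimed to work this way.

```python
import re
from html import escape


def mark_up_abbreviated_genus(text, genera):
    """Wrap 'G. epithet' in <i><abbr> markup for known genus initials.

    Hypothetical helper: `genera` is a list of full genus names
    supplied by the caller, e.g. ["Homo"].
    """
    # Map each initial letter to its full genus name.
    by_initial = {genus[0]: genus for genus in genera}

    def replace(match):
        initial, epithet = match.group(1), match.group(2)
        genus = by_initial.get(initial)
        if genus is None:
            # Unknown initial: leave the text exactly as it was.
            return match.group(0)
        return '<i><abbr title="{}">{}.</abbr> {}</i>'.format(
            escape(genus, quote=True), initial, epithet
        )

    # A capital letter, a full stop, a space, then a lowercase word.
    return re.sub(r"\b([A-Z])\. ([a-z]+)\b", replace, text)
```

The pattern is deliberately naive: it would also match a person's initial followed by a lowercase word, which is why it only acts on initials that match a genus supplied by the caller.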

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #42

P: n/a
Jukka K. Korpela:
This actually proves that they [Scientific names] do not constitute
a language any more than chemical formulas or music notations do.
In fact, for some dimensions of language markup, they should be
tagged as being of no language.


That sounds right. But then, just how do we do that?

--
Bertilo Wennergren <be******@gmx.net> <http://www.bertilow.com>
Jul 20 '05 #43

P: n/a
On Sat, 07 Feb 2004 21:35:30 +0000, Andy Dingley
<di*****@codesmiths.com> wrote:
IANA codes have the i- prefix and are to be avoided when there's a
viable ISO 639 alternative. AIUI at present, the only non-deprecated
IANA codes are i-klingon and possibly i-elvish. All others that were
previously i- codes have now been recognised under their ISO 3166
countries (including en-geordie and en-scouse)


& zh-hak, which used to be i-hakka IIRC.

Cheers,
Philip
--
Philip Newton <no***********@gmx.li>
That really is my address; no need to remove anything to reply.
If you're not part of the solution, you're part of the precipitate.
Jul 20 '05 #44

P: n/a
On Sun, 08 Feb 2004 08:08:43 +0100, Philip Newton
<pn*************@newton.digitalspace.net> wrote:
All others that were
previously i- codes have now been recognised under their ISO 3166
countries (including en-geordie and en-scouse)


& zh-hak, which used to be i-hakka IIRC.


There's a whole bunch of zh- codes that have gone the same way, not
just i-hakka.

Jul 20 '05 #45

P: n/a
On Sat, 7 Feb 2004 22:05:05 +0000, Andy Mabbett
<us**********@pigsonthewing.org.uk> wrote:
I used the correct procedure; it got no further :-(
Tag to be registered : SC (or possibly "LA-sci")


No, you suggested a language code based on the country code for Lao !
There _is_ a structure to these things.

en-scientificlatin or even va-lat (Vatican City) might have flown,
but la-sci is just completely arse-about-face.

I presume va-lat is already in widespread use with a meaning of
ecclesiastical latin, and we're not really talking about the same
thing. Shame really - it would be amusing to describe evolution in
terms of the language of an old superstition.

Jul 20 '05 #46

P: n/a
In message <si********************************@4ax.com>, Andy Dingley
<di*****@codesmiths.com> writes
Tag to be registered : SC (or possibly "LA-sci")


No, you suggested a language code based on the country code for Lao !
There _is_ a structure to these things.


Indeed; that's why I took advice from the person who wrote the RFC.

Nobody on the mailing list raised the concern you have.
--
Andy Mabbett
"The Internet is a reflection of our society[ ...]. If we do not like what we
see in that mirror the problem is not to fix the mirror, we have to fix
society." Vint Cerf
Jul 20 '05 #47

P: n/a
Andy Dingley <di*****@codesmiths.com> wrote:
Tag to be registered : SC (or possibly "LA-sci")
No, you suggested a language code based on the country code for Lao !


No, "la-sci" was technically a well-formed code. And "sc" was
inadequate for the reason I gave before - IANA has no authority over
primary language codes. Besides, "sc" is currently assigned to
Sardinian.
There _is_ a structure to these things


Indeed. And in the language codes that IANA registers, and that we have
been told to use in HTML, the structure is that the primary language
code comes first, then the secondary code, which is a country code if
it consists of two letters. As in "en-US". The first part never means a
country (though it may occasionally coincide with a country code, even
for a country where the language indicated is dominant, for obvious
reasons).
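The structure described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post (primary code first, and a two-letter second part read as a country code); the function name split_language_tag and the dictionary layout are invented for the example, and it does no checking against the ISO or IANA registries.

```python
def split_language_tag(tag):
    """Split a language tag such as "en-US" into its parts.

    Hypothetical helper following the rule of thumb in the post:
    the first subtag is the primary language code, and the second
    subtag is taken as a country code only if it has two letters.
    """
    subtags = tag.split("-")
    result = {
        "primary": subtags[0].lower(),  # never a country code
        "country": None,
        "other": [],
    }

    if len(subtags) > 1:
        second = subtags[1]
        if len(second) == 2 and second.isalpha():
            result["country"] = second.upper()
        else:
            result["other"].append(second.lower())

    # Any further subtags are kept as they come, lowercased.
    result["other"].extend(subtag.lower() for subtag in subtags[2:])
    return result
```

So "en-US" yields primary "en" with country "US", whereas "la-sci" yields primary "la" (Latin) with "sci" as a non-country subtag, and no country at all.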

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #48

P: n/a
On Sun, 8 Feb 2004 21:47:42 +0000 (UTC), "Jukka K. Korpela"
<jk******@cs.tut.fi> wrote:
No, "la-sci" was technically a well-formed code.


Sorry - total brain-fart on my part and I was reading it backwards.

Jul 20 '05 #49

P: n/a
Bertilo Wennergren <be******@gmx.net> wrote:
This actually proves that they [Scientific names] do not
constitute a language any more than chemical formulas or music
notations do. In fact, for some dimensions of language markup,
they should be tagged as being of no language.


That sounds right. But then, just how do we do that?


If you ask me, lang="und" would be the theoretically correct method.
But many people seem to think that lang="", or rather xml:lang="", is
the way to go. Either way, no software probably pays any attention.

But this would only apply to some dimensions of language markup. If I
write a document about Perl programming and include the Perl program
print "Ave, munde!\n"
then in one dimension, this is of no human language. For example,
spelling checking, if performed, should not be based on any language,
but might be performed according to Perl syntax rules.*) On the other
hand, when read aloud, part of the data should be treated as English,
part of it as Latin. Translation would be an interesting question, but
as the first approximation, the text should be left untranslated.
Probably as the second approximation too, since sample strings (or
comments) in source programs should remain unchanged in translation
but get changed in localization.

*) It would actually be very natural to have an attribute like
notation="..." in the <code> element. This would open interesting
possibilities, like checking an HTML document about programming in
several ways...

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #50
