
February 25th, 2006, 08:45 PM
utf-8 or UTF-8?
How is this for correct HTML 4.01 headers?:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="zh-tw"><head>
<meta http-equiv="Content-Type" content=
"text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="zh-tw">
And for my English pages, en-us instead of zh-tw.
Did I screw up any details? utf-8 or UTF-8 like Google?
The page should work via http:// or file:///.

February 25th, 2006, 11:15 PM
Re: utf-8 or UTF-8?
Dan Jacobson <jidanni@jidanni.org> wrote:
> How is this for correct HTML 4.01 headers?:
You actually didn't reveal the _headers_, namely the HTTP headers,
which are what really matter. If they specify the encoding
("charset"), they trump any <meta> tags (as explained so often in this
group).
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
> "http://www.w3.org/TR/html4/strict.dtd">
> <html lang="zh-tw"><head>
> <meta http-equiv="Content-Type" content=
> "text/html; charset=utf-8">
OK, but real HTTP headers still take precedence. (Some people would
prefer zh-Hant to zh-tw, since the subcode is really about variant of
writing system rather than geographic area, but that's mostly
politics.)
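The precedence described above can be sketched in Python. This is only an illustration, not how any particular browser is implemented: the function names `charset_of` and `effective_charset` are made up for this example, and real user agents apply further fallbacks (byte order marks, user settings) that are left out here.

```python
from email.message import Message


def charset_of(content_type):
    """Extract the charset parameter from a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset lowercased, or None if absent.
    return msg.get_content_charset()


def effective_charset(http_content_type, meta_content_type):
    """A charset in the real HTTP header wins; the <meta> value is a fallback."""
    return charset_of(http_content_type) or charset_of(meta_content_type)


# The server says ISO-8859-1, the <meta> tag says utf-8: the header wins.
print(effective_charset("text/html; charset=ISO-8859-1",
                        "text/html; charset=utf-8"))
# The header carries no charset, so the <meta> tag is used.
print(effective_charset("text/html", "text/html; charset=utf-8"))
```

When a page is opened via file:// there is no HTTP header at all, which corresponds to passing None as the first argument; in that case the <meta> tag is the only declaration left.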
> <meta http-equiv="Content-Language" content="zh-tw">
Do you know of _any_ software that actually _uses_ the information in
such a <meta> tag, as opposed to just emitting it?
> And for my English pages, en-us instead of zh-tw.
That's fine in principle, if the pages are really in US English.
> utf-8 or UTF-8 like Google?
There's no difference. Names of encodings are by definition case
insensitive. For what it's worth, the official registry of names of
encodings uses "UTF-8" in uppercase:
http://www.iana.org/assignments/character-sets
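As a quick way to see the case insensitivity in practice, here is a small Python check. It relies on Python's own codec registry, which is just one implementation and not the IANA registry itself; the helper name `same_encoding` is invented for this example.

```python
import codecs


def same_encoding(label_a, label_b):
    """Return True if both labels resolve to the same Python codec."""
    return codecs.lookup(label_a).name == codecs.lookup(label_b).name


# Lowercase, uppercase and mixed-case spellings name the same encoding.
print(same_encoding("utf-8", "UTF-8"))
print(same_encoding("Utf-8", "utf-8"))
```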
> The page should work via http:// or file:///.
Nothing works via file:// on the World Wide Web; the file:// URLs are
by definition system-dependent and work (at most) inside a computer or
across similar computers in a local network.
--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

February 27th, 2006, 08:15 PM
Re: utf-8 or UTF-8?
Dan Jacobson wrote:
> How is this for correct HTML 4.01 headers?:
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
> "http://www.w3.org/TR/html4/strict.dtd">
> <html lang="zh-tw"><head>
> <meta http-equiv="Content-Type" content=
> "text/html; charset=utf-8">
> <meta http-equiv="Content-Language" content="zh-tw">
> And for my English pages, en-us instead of zh-tw.
> Did I screw up any details? utf-8 or UTF-8 like Google?
> The page should work via http:// or file:///.
Did you save the file as UTF-8? I often forget that ;-)
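One rough way to check this is to try decoding the file's bytes as UTF-8. The sketch below is an assumption-laden helper (the name `is_valid_utf8` is made up here); note that it can only prove a file is *not* UTF-8, since pure ASCII and some other byte sequences also decode cleanly.

```python
def is_valid_utf8(data):
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Text really saved as UTF-8 passes the check.
print(is_valid_utf8("zh-tw page: \u4e2d\u6587".encode("utf-8")))
# Bytes from a legacy multi-byte encoding typically fail it.
print(is_valid_utf8(b"\xa4\xa4\xa4\xe5"))
```

For a file on disk, the same function can be applied to the result of `open(path, "rb").read()`, where `path` is whatever file you want to test.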

February 28th, 2006, 03:25 PM
Re: utf-8 or UTF-8?
Jukka K. Korpela wrote:
> Dan Jacobson <jidanni@jidanni.org> wrote:
>
>> How is this for correct HTML 4.01 headers?:
>
> You actually didn't reveal the _headers_, namely the HTTP headers,
> which are what really matter. If they specify the encoding
> ("charset"), they trump any <meta> tags (as explained so often in this
> group).
Is it wrong to refer to the HEAD element of an HTML document as an HTML
header?
What I noticed is:
- doesn't an HTML document have only one HTML header (if such
terminology is valid)?
- the snippet includes things that aren't part of the HEAD
- it isn't complete - no TITLE
>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
>> "http://www.w3.org/TR/html4/strict.dtd">
>> <html lang="zh-tw"><head>
>> <meta http-equiv="Content-Type" content=
>> "text/html; charset=utf-8">
>
> OK, but real HTTP headers still take precedence. (Some people would
> prefer zh-Hant to zh-tw, since the subcode is really about variant of
> writing system rather than geographic area, but that's mostly
> politics.)
>
>> <meta http-equiv="Content-Language" content="zh-tw">
>
> Do you know of _any_ software that actually _uses_ the information in
> such a <meta> tag, as opposed to just emitting it?
<snip>
Specifying the language of an HTML document certainly has its uses.
http://webtips.dan.info/language.html
Where different programs look for this information is, of course,
another matter....
Stewart.
--
My e-mail is valid but not my primary mailbox. Please keep replies on
the 'group where everyone may benefit.

February 28th, 2006, 04:55 PM
Re: utf-8 or UTF-8?
In our last episode,
<du1pbt$g7a$1@sun-cc204.lut.ac.uk>,
the lovely and talented Stewart Gordon
broadcast on comp.infosystems.www.authoring.html:
> Jukka K. Korpela wrote:
>> Dan Jacobson <jidanni@jidanni.org> wrote:
>>
>>> How is this for correct HTML 4.01 headers?:
>>
>> You actually didn't reveal the _headers_, namely the HTTP headers,
>> which are what really matter. If they specify the encoding
>> ("charset"), they trump any <meta> tags (as explained so often in this
>> group).
> Is it wrong to refer to the HEAD element of an HTML document as an HTML
> header?
It is likely to lead to confusion with the HTTP headers.
> What I noticed is:
> - doesn't an HTML document have only one HTML header (if such
> terminology is valid)?
An HTML document may have only one HEAD element.
> - the snippet includes things that aren't part of the HEAD
> - it isn't complete - no TITLE
>>> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
>>> "http://www.w3.org/TR/html4/strict.dtd">
>>> <html lang="zh-tw"><head>
>>> <meta http-equiv="Content-Type" content=
>>> "text/html; charset=utf-8">
>>
>> OK, but real HTTP headers still take precedence. (Some people would
>> prefer zh-Hant to zh-tw, since the subcode is really about variant of
>> writing system rather than geographic area, but that's mostly
>> politics.)
>>
>>> <meta http-equiv="Content-Language" content="zh-tw">
>>
>> Do you know of _any_ software that actually _uses_ the information in
>> such a <meta> tag, as opposed to just emitting it?
> <snip>
> Specifying the language of an HTML document certainly has its uses.
> http://webtips.dan.info/language.html
> Where different programs look for this information is, of course,
> another matter....
> Stewart.
--
Lars Eighner usenet@larseighner.com http://www.larseighner.com/

February 28th, 2006, 05:15 PM
Re: utf-8 or UTF-8?
Stewart Gordon wrote:
> Jukka K. Korpela wrote:
>
>> Dan Jacobson <jidanni@jidanni.org> wrote:
>>
>>> How is this for correct HTML 4.01 headers?:
>>
>> You actually didn't reveal the _headers_, namely the HTTP headers,
>> which are what really matter. If they specify the encoding
>> ("charset"), they trump any <meta> tags (as explained so often in this
>> group).
>
> Is it wrong to refer to the HEAD element of an HTML document as an HTML
> header?
Yes, because in a document, "headers" are the introductory bits of text
at the beginning of the different sections of the content, represented
in HTML documents by the tags H1 through H6.
In a communication protocol, such as HTTP, "headers" are the attributes
of the communication itself, preceding the content and telling the
receiving application what it needs to know to process the
communication. My use of the term "receiving application" here is
narrow. In the case of a browser, I don't mean all of the modules in the
browser, including the HTML renderer. I mean just the part that is party
to the communication: the HTTP-processing component.
> What I noticed is:
> - doesn't an HTML document have only one HTML header (if such
> terminology is valid)?
It has any number of headers (H1 through H6 elements). It has only one
*head*.
> - the snippet includes things that aren't part of the HEAD
> - it isn't complete - no TITLE
It contained the DOCTYPE declaration and the HTML tag, neither of which
is part of the head. It also doesn't contain the complete head, because
the title is missing, as you observe. It also doesn't contain the
closing </head> tag, but that's technically not required.

February 28th, 2006, 05:45 PM
Re: utf-8 or UTF-8?
On Tue, 28 Feb 2006 15:14:05 +0000, Stewart Gordon
<smjg_1998@yahoo.com> wrote:
>Is it wrong to refer to the HEAD element of an HTML document as an HTML
>header?
Hey Stewart. Perhaps right or wrong is not so relevant, as opposed to
what will cause other coders to call foul. I've always avoided
referring to the <head> section as a header, just for this very
reason. I've no idea if it's actually correct or not, one way or the
other.
Ian
--
http://sundry.ws/

February 28th, 2006, 11:45 PM
Re: utf-8 or UTF-8?
Stewart Gordon wrote:
>>> <meta http-equiv="Content-Language" content="zh-tw">
>>
>> Do you know of _any_ software that actually _uses_ the information in
>> such a <meta> tag, as opposed to just emitting it?
>
> <snip>
>
> Specifying the language of an HTML document certainly has its uses.
>
> http://webtips.dan.info/language.html
Dan's page on specifying language is great, even though actual
utilization of such information is fairly limited at present, as Dan
mentions.
My point was about the use of a <meta> tag to specify language. A <meta>
tag like this is just a surrogate for an HTTP header. The header is in
this case somewhat debatable (by the HTTP specification, Content-Language
indicates the language(s) of the intended _audience_, though this is
admittedly almost splitting hairs). More importantly, does any user agent
actually make some use of the Content-Language header, whether sent as an
actual header or simulated via <meta>?
In any case, by HTML specs, the lang attribute takes precedence over the
HTTP header, so the <meta> tag is pointless if you use the lang
attribute in <html>.
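To make that order of precedence concrete, here is a minimal Python sketch. It is a simplification under stated assumptions: the function name `effective_language` and its parameters are invented for illustration, the lang values are passed in by hand rather than parsed from a document, and a Content-Language header that lists several languages is not split up.

```python
def effective_language(lang_chain, http_content_language=None, default=None):
    """Pick the language that applies to an element.

    lang_chain holds the lang attribute values of the element and its
    ancestors, innermost first; None marks an element without lang.
    """
    for lang in lang_chain:
        if lang:
            # The nearest lang attribute wins over everything else.
            return lang
    # Only when no lang attribute is found does the header (or a <meta>
    # surrogate for it) come into play, and after that a default.
    return http_content_language or default


# <html lang="zh-tw"> beats a Content-Language header of "en-us".
print(effective_language([None, "zh-tw"], http_content_language="en-us"))
# Without any lang attribute, the header value is used.
print(effective_language([None, None], http_content_language="en-us"))
```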

February 28th, 2006, 11:55 PM
Re: utf-8 or UTF-8?
Harlan Messinger wrote:
>> Is it wrong to refer to the HEAD element of an HTML document as an
>> HTML header?
>
> Yes, because in a document, "headers" are the introductory bits of text
> at the beginning of the different sections of the content, represented
> in HTML documents by the tags H1 through H6.
I would say that calling the HEAD element a header is misleading, but on
other grounds. The serious confusion here is between data in the HEAD
element and data in HTTP headers, especially since some data in the HEAD
element actually "simulates" HTTP headers but does _not_ take precedence
over actual HTTP headers.
The elements H1 through H6 are called headings in HTML specs, and I'd
keep them that way.
We also have THEAD (table header part) and TH (table header cell), so we
run out of terms and have to use the same word about two rather
different constructs. But what we can do is that we distinguish between
a) HTTP headers
b) HEAD part of an HTML document
c) headings in the BODY element of an HTML document
> It has any number of headers (H1 through H6 elements).
Technically, yes. But it is normally good practice to have a single H1
element, since you rarely have meaningful use for two or more
_top-level_ headings. (A bilingual document containing parallel texts
could be an exception.)

March 1st, 2006, 06:55 PM
Re: utf-8 or UTF-8?
Jukka K. Korpela wrote:
> The elements H1 through H6 are called headings in HTML specs, and I'd
> keep them that way.
Ack, that always gets me. To me "header" and "heading" are virtually the
same word, but since we're being precise here I realize I should have
thought of that.
> We also have THEAD (table header part) and TH (table header cell), so we
> run out of terms and have to use the same word about two rather
> different constructs. But what we can do is that we distinguish between
> a) HTTP headers
> b) HEAD part of an HTML document
> c) headings in the BODY element of an HTML document
>
>> It has any number of headers (H1 through H6 elements).
>
> Technically, yes. But it is normally good practice to have a single H1
> element, since you rarely have meaningful use for two or more
> _top-level_ headings. (A bilingual document containing parallel texts
> could be an exception.)
I had a feeling someone would bring that up. Note that I didn't say at
each level! I meant in the aggregate.

March 1st, 2006, 11:16 PM
Re: utf-8 or UTF-8?
Jukka K. Korpela wrote:
> Technically, yes. But it is normally good practice to have a single H1
> element, since you rarely have meaningful use for two or more
> _top-level_ headings. (A bilingual document containing parallel texts
> could be an exception.)
I use multiple H1 all the time,
e.g.
<h1>first heading</h1>
....
<h1>second heading</h1>
...
<h1>third heading</h1>
....
<h1>fourth heading</h1>
....
Is that bad?

March 1st, 2006, 11:35 PM
Re: utf-8 or UTF-8?
On Thu, 2 Mar 2006 12:08:55 +1300, "windandwaves"
<winandwaves@coldmail.com> wrote:
>I use multiple H1 all the time,
>
>Is that bad?
Not terribly, IMO, but I'm not schooled too well in the semantic
aspects of HTML ... it's a pretty complex subject. I would say <h1> is
more for a top-level heading, and <h2> more for section headings. Why,
who knows. This post is in case someone like Jukka doesn't jump in
with the official verdict. :-)
What you said reminded me of an HTML version that never took off,
ISO-HTML, and what they have to say about headings:
https://www.cs.tcd.ie/15445/UG.HTML#H1
I guess the idea is that we want the structure of the document to be
logical. 'Fraid I can't explain more, as I don't know more. I'm
curious to see what others say on the subject.
Ian
--
http://sundry.ws/

March 2nd, 2006, 05:35 AM
Re: utf-8 or UTF-8?
"windandwaves" <winandwaves@coldmail.com> wrote:
> I use multiple H1 all the time,
- -
> Is that bad?
H1 means first level heading. How many first levels has your document
got?
The first level heading is a heading for the document as a whole,
since "level" refers to division into structural parts. If your
document does not contain such a heading but only headings for parts
of the document, the logical move is to make them H2 elements.
On the practical side - perhaps even too practical to some people's
taste -, using H1 elements illogically results in poor _default_
rendering of the document. The default rendering is typically in very
large font and bolded. As an author, you can suggest a different
rendering, but browsers may ignore some or all of your suggestions.
--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

March 2nd, 2006, 11:25 PM
Re: utf-8 or UTF-8?
On Thu, 2 Mar 2006 12:08:55 +1300, "windandwaves"
<winandwaves@coldmail.com> wrote:
>I use multiple H1 all the time,
>Is that bad?
No - because W3C HTML has no semantics defined for "sibling" headers or
for their permitted nesting. ISO HTML does define this, although I think
they permit multiple siblings, they just don't permit <h1>...<h3>
directly. ISO get it wrong here, because HTML isn't theirs to define
and they certainly can't change the semantics from the "real" version
like this.
It might not be a recommended best practice to have multiple <h1>
elements, but it's not demonstrably wrong.

March 3rd, 2006, 09:17 AM
Headings (H1, H2, ...) - semantics and syntax
Under Subject: Re: utf-8 or UTF-8?
Andy Dingley wrote:
>>I use multiple H1 all the time,
>
>>Is that bad?
>
> No - because W3C HTML has no semantics defined for "sibling" headers or
> for their permitted nesting.
Sorry, but you are very confused now (and confusing). The HTML
specifications define the semantics (meaning) of H1, H2, etc. - not very
exactly, but still. They rigorously define permitted nesting: headings
must not be nested (no H2 inside H1 for example). What you actually mean
by "nesting" is a different issue - and a syntactic question. The specs
also recommend against skipping heading levels, though this more or less
follows from the semantics (you don't go from level 2 to level 4 without
going through level 3).
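The "no skipped levels" idea is easy to check mechanically. The following Python sketch uses the standard html.parser module; the class name `HeadingCollector` and the function `skipped_levels` are made up for this example, and treating a document that starts below H1 as a skip from level 0 is an assumption of the sketch, not something the specs spell out.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Record the level of every H1..H6 start tag in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(html):
    """Return (previous, current) pairs where a heading level was skipped."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    skips = []
    previous = 0  # nothing seen yet, so the first heading should be H1
    for level in collector.levels:
        if level > previous + 1:
            skips.append((previous, level))
        previous = level
    return skips


# Going from H2 straight to H4 is reported; several H1s are not.
print(skipped_levels("<h1>a</h1><h2>b</h2><h4>c</h4>"))
print(skipped_levels("<h1>a</h1><h1>b</h1><h2>c</h2>"))
```

Note that the check deliberately says nothing about multiple H1 elements, since that is a matter of practice rather than of level skipping.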
> ISO HTML does define this,
Nobody really cares about ISO HTML. And it takes a long way in its
attempt to express formally a simple principle about headings, and fails.
> although I think they permit multiple siblings,
If I cared about ISO HTML, I would actually check what it says instead
of writing "I think". It's on the www and probably easily googleable.
> ISO get it wrong here, because HTML isn't theirs to define
> and they certainly can't change the semantics from the "real" version
> like this.
What you write is about syntax (specifically, an attempt to add a
syntactic constraint in a rigorous way), not semantics.
> It might not be a recommended best practice to have multiple <h1>
> elements, but it's not demonstrably wrong.
That statement has very little content. Compare: It might not be a
recommended best practice to have an empty <title> element (or a <title>
element with 1,000 characters in it), but it's not demonstrably wrong.
(For some values of "demonstrably" and "wrong", as in your statement.)