W3C Validator Update

P: n/a
There's a new beta of the W3C Markup Validation Service now live at
<URL:http://validator.w3.org:8001/>

Probably the most important change is verbose output, including attempts
to explain the validator errors. Other changes include improved
display of error messages, and a choice of parse modes.
Currently - but probably not for long - it includes an
"interesting" default setting!

Some of the changes have been the subject of much debate. We need
to widen that to include users: please tell us what you like or
dislike in the new service! Quick feedback may catch Terje while
he's still hacking this release:-)

As always, problem and bug reports are welcome, but please check
first whether they're already known.

--
Nick Kew

In urgent need of paying work - see http://www.webthing.com/~nick/cv.html
Jul 20 '05 #1
24 Replies


P: n/a
Jim Ley wrote:
On Thu, 28 Aug 2003 19:11:44 +0000 (UTC), "Jukka K. Korpela"
<jk******@cs.tut.fi> wrote:
ni**@fenris.webthing.com (Nick Kew) wrote:

First of all - well done and thanks for the efforts all of you have put into
this.

This beta release has been defaulted to an extended mode as an
oversight, that's clearly wrong, do you feel a fussy mode should not
exist?
There's certainly benefit in a fussy mode. Whether that should still fall
under the naming of "validator" is a different question. I certainly don't
feel qualified to answer that question.

An SGML validator certainly, is the CSS validator also useless? If
you don't like the non-technical use of validator, what do you propose
such a QA tool be called?
Quality Assist. QA Assist. Its tools are about improving the Quality of the
markup, so why not focus on the Quality aspect? Businesses tend to like
words like Quality.
Yes the beta is wrong to claim valid document invalid, yes the beta is
wrong to default to fussy mode - I think everyone has acknowledged
that. Do you see anything else wrong in the beta you could report?


Nothing wrong, but confused me a bit:
<http://validator.w3.org:8001/check?uri=http%3A%2F%2Fwww.isolani.co.uk%2Fblog%2F&verbose=1&fussy=1>
(Validating my blog page)

I'm a little surprised that the "unescaped" & in the main text went
unnoticed by validation - but I'm glad the new fussy checker picked it up
(multiple occurrences of Marks & Spencer within the text). It is probably
more my lack of understanding of what a validator checks rather than the
tool itself - but it's good enough to convince me of the benefits of a fussy
checker.
Good work!
--
Iso.
FAQs: http://html-faq.com http://alt-html.org http://allmyfaqs.com/
Recommended Hosting: http://www.affordablehost.com/
Web Standards: http://www.webstandards.org/
Jul 20 '05 #2
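
A note on the bare "&" case raised in the post above: as far as I understand SGML, an ampersand followed by a space is not recognized as the start of an entity reference, which would explain why plain validation let "Marks & Spencer" through. A rough way to spot such ampersands outside the validator is sketched below in Python. The regular expression and both function names are illustrative assumptions, not anything the W3C validator is known to use, and the sketch makes no attempt to treat script or style content specially.

```python
import re

# Hypothetical pattern: an "&" that does NOT begin something shaped like
# a reference (&name;  &#123;  &#x1F;). Illustrative only.
BARE_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def find_bare_ampersands(markup):
    """Return 1-based (line, column) pairs for every bare '&'."""
    hits = []
    for lineno, line in enumerate(markup.splitlines(), start=1):
        for match in BARE_AMP.finditer(line):
            hits.append((lineno, match.start() + 1))
    return hits


def escape_bare_ampersands(markup):
    """Rewrite every bare '&' as '&amp;', leaving existing references alone."""
    return BARE_AMP.sub("&amp;", markup)
```

Note that the same pattern also flags the "&" separating query parameters inside an href value, which is usually worth escaping as well.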

P: n/a
ji*@jibbering.com (Jim Ley) wrote:
Yes the beta is wrong to claim valid document invalid, yes the beta is
wrong to default to fussy mode - I think everyone has acknowledged
that.
So why hasn't it been fixed? And how _did_ they manage to make such
elementary errors? If you ask me, it was just the culmination of the
approach that created "fussy mode" in the first place.
Do you see anything else wrong in the beta you could report?


Should I report something else than the fact that all the announced new
features are nonsense?

We would have needed a good tag soup checker years ago. Turning a
validator to a very one-sided tag soup checker helps nobody.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #3

P: n/a
On Thu, 28 Aug 2003 20:12:49 +0000 (UTC), "Jukka K. Korpela"
<jk******@cs.tut.fi> wrote:
ji*@jibbering.com (Jim Ley) wrote:
Yes the beta is wrong to claim valid document invalid, yes the beta is
wrong to default to fussy mode - I think everyone has acknowledged
that.
So why hasn't it been fixed?


Hmm, QA processes suggest to me that fixes should be tested first,
before being rolled out onto production machines. The beta validator
does run on the same machine as the real validator, so hack and patch
as you go probably isn't particularly wise, is it?
And how _did_ they manage to make such
elementary errors?
Elementary errors such as what? Defaulting to a wrong mode is pretty
simple to do, surely? The text of the message saying valid - well, one
word can easily be overlooked; it is mostly aesthetic when you know
what you actually mean. I can at least understand both bugs creeping
in with my authoring processes, which is why we have betas etc.
Do you see anything else wrong in the beta you could report?


Should I report something else than the fact that all the announced new
features are nonsense?


Of course not, you're very right to report the problems here. I was
asking if you had seen any other issues.
We would have needed a good tag soup checker years ago. Turning a
validator to a very one-sided tag soup checker helps nobody.


What do you mean by "one sided" ?

Jim.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Jul 20 '05 #4

P: n/a
ji*@jibbering.com (Jim Ley) wrote:
We would have needed a good tag soup checker years ago. Turning a
validator to a very one-sided tag soup checker helps nobody.


What do you mean by "one sided" ?


It apparently applies a collection of simple syntactic rules, generally
aimed at tag verbosity in XHTML style, trying to force them upon HTML,
and generating cascades of "error messages". There's surely much else
that could and should be checked in HTML markup.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #5

P: n/a
On Thu, 28 Aug 2003 22:30:12 +0200, "Alan J. Flavell"
<fl*****@mail.cern.ch> wrote:
I think the problem here is that - as long as there have been people
who took the W3C's assertion "HTML is an application of SGML" in any
way seriously - the term "validator" has had a specific technical
meaning, namely the meaning which it inherits from SGML.
Yes, and within that scope it's fine, but I don't fully agree that the
W3 has been within that scope for some time, nor is it particularly
useful for it to be, no-one has been taking the application of SGML
seriously because there's too many holes in that.
So, within the scope of its
implementation limitations, it gives an unambiguous answer, and all
validators must give the same answer, even if they express it in
different words.
Certainly, which is all a default v.w3.org should do, having more
useful QA options is a good thing, I feel and think Jukka is being
much too negative in his criticism, by suggesting to me that the
entire process is flawed. I'd love to see some alternative QA
approach that didn't use Jukka's "one-sided" approach, I believe
linting of attributes is pencilled in for a 0.7.0 release of v.w3.org.
These sort of things are useful, if you want a validator you've got
one, web authors though need QA tools, not pointless pedantry on HTML
origins.

Criticise the defaults, criticise the claim that it's invalid, they're
horrible bugs *, but I'd like to see the idea of web-QA taken
seriously, SGML validation doesn't do that.
but I *would* like to see the distinction made more clearly, that's
all I'm saying.


I agree, much of the supporting commentary isn't there yet.

Jim.

* I'm beginning to wonder if the fussy default wasn't a bug but an
intentional way to get the beta talked about and looked at, but I
don't actually think Terje is quite that mad.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Jul 20 '05 #6

P: n/a
Andreas Prilop wrote:

Isofarro <sp*******@spamdetector.co.uk> wrote:
There's certainly benefit in a fussy mode.


What exactly is a "fussy mode"?


It's explained in the announcement:
<http://lists.w3.org/Archives/Public/www-validator/2003Aug/0105.html>
Matthias
Jul 20 '05 #7

P: n/a
Follow the fun at:

http://lists.w3.org/Archives/Public/...3Aug/0105.html

James Pickering
http://www.jp29.org/
(Validates in v0.6.5 [Beta #1] "fussy" mode)
Jul 20 '05 #8

P: n/a
In article <ms***********@sidious.isolani.co.uk>, one of infinite monkeys
at the keyboard of Isofarro <sp*******@spamdetector.co.uk> wrote:
Nick Kew wrote:
There's a new beta of the W3C Markup Validation Service now live at
<URL:http://validator.w3.org:8001/>


This URL doesn't display any of the form elements in Konqueror 3.0.0 on Suse
8.0:

http://www.isofarro.freeserve.co.uk/temp/w3val.png

Not sure why - possibly the fieldset and legend need to be inside the form
elements?


I've seen Konq 3 on deadrat 7.3 do that with fieldset. I'd say that's a
serious bug (Konq 2.2 is better). If only I had the time and kit to hack
browsers ....
--
Nick Kew

In urgent need of paying work - see http://www.webthing.com/~nick/cv.html
Jul 20 '05 #9

P: n/a
Nick Kew wrote:
In article <ms***********@sidious.isolani.co.uk>, one of infinite monkeys
at the keyboard of Isofarro <sp*******@spamdetector.co.uk> wrote:
This URL doesn't display any of the form elements in Konqueror 3.0.0 on
Suse 8.0:

http://www.isofarro.freeserve.co.uk/temp/w3val.png

Not sure why - possibly the fieldset and legend need to be inside the
form elements?


I've seen Konq 3 on deadrat 7.3 do that with fieldset. I'd say that's a
serious bug (Konq 2.2 is better). If only I had the time and kit to hack
browsers ....


FWIW, the same browser on the same OS used to crash on
<http://www.w3.org/Style/CSS/> which did not impress me in the least. 3.1.3
has been much better except for one rather annoying bug (always-present
horizontal scrollbar).

--
Shawn K. Quinn
Jul 20 '05 #10

P: n/a
In article <Xn*****************************@193.229.0.31> in
comp.infosystems.www.authoring.html, Jukka K. Korpela
<jk******@cs.tut.fi> wrote:
I wrote my immediate comments to the www-validator list, saying much the
same as here, and that list would have been suitable for fixing the
errors. But if the bogosity is announced on a wider forum, I think it
needs to be pointed out what it is.


I am not a subscriber to the www-validator list, but if what you
wrote there is much the same as what you wrote here, I don't know
how they'll know what you meant. I've read your article twice and I
understand that you hate the new validator, but I don't understand
why.

The one specific complaint I understood was that a lot of the
explanations of error messages don't exist and instead have annoying
placeholders in small type. (Those would annoy me too.)

I can't say whether your other criticisms of the validator are right
or wrong, because I have no idea what they are. You say it "invents
its own rules" and "intentionally claims that a valid page is not
valid", but give no examples or explanations.

If someone gave you such a criticism for your work, what would you
do with it? How would that guide you in making improvements?

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
validator: http://jigsaw.w3.org/css-validator/
Jul 20 '05 #11

P: n/a
Stan Brown th************@fastmail.fm wrote (with reference to a Jukka K.
Korpela posting):
I am not a subscriber to the www-validator list, but if what you
wrote there is much the same as what you wrote here, I don't know
how they'll know what you meant ..... <snip>


You can follow the interchange at:

http://lists.w3.org/Archives/Public/...hread.html#105

James Pickering
http://www.jp29.org/
Pickering Pages
Computer programming since 1951 (IBM 407)
Jul 20 '05 #12

P: n/a
On Fri, Aug 29, Nick Kew inscribed on the eternal scroll:
at the keyboard of "Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
more serious than claiming undefined character references to be markup
errors).


Do you have a reference for that? My reading of it is that undefined
references are indeed errors, and that reporting them as such was in
fact fixing a bug in early versions.


Some years ago (I guess around 1995?) I was assured by an SGML
specialist (I can only dimly recollect who it was, and at the moment I
cannot find the correspondence, so please excuse me for not trying to
name him) that in SGML an undefined numerical character reference is
not an error, and it is open to the parties to reach mutual agreement
on a meaning for them.

Google reveals this discussion document from 1996:
http://www.y12.doe.gov/sgml/wg8/document/1875.htm

This sets the context:

The US national body recommends that the following be considered
during the ongoing review of ISO 8879:

Under 3(b), one can read:

There are 3 classes of characters: those that can be entered directly
or through character references, those that can be entered only
through character references, and those that are prohibited. The
first class is declared with a minimum literal or base set character
number, the second by "UNUSED", and the third by omitting the
^^^^^^^^^^^^^^^^^^^^^^
character number from the described character set portions. In
addition to clarifying the interpretation of the character set
declaration, this approach provides users with a way to prohibit
references to undefined characters.

I think it's fair to deduce from what it says there, that at the time
of writing, references to undefined characters were not ipso facto an
error, otherwise there would have been no need to discuss providing
users with ways to prohibit them.

If you want an authoritative answer, I guess you need to look at any
subsequent amendments to the SGML standard. It's beyond my current
expertise, and I have too much fire-fighting to do on other fronts to
be able to research it further, sorry.

good luck
Jul 20 '05 #13

P: n/a
Nick Kew wrote:

[snip]
There are longstanding FAQs concerning SHORTTAGS and NET-enabling tags[1],
and the fact that a strict SGML parser permits them is not helpful to
most users (see the list archives for examples of confusion it causes).
That's why the WDG Validator and Page Valet default to parse modes
that complain about them. Do you consider that wrong?

[snip]

It depends on the nature of the complaint. If it claims that the document
is _invalid_, then yes, that would be wrong. If, however, it claims that
the document is valid but may have serious compatibility issues, I'd
consider that to be good behaviour. I see no problem with including
linting behaviour in a validator as long as it clearly separates the
concept of "compatibility issues" from actual mistakes with the markup.

--
Jim Dabell

Jul 20 '05 #14
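
The separation asked for in the post above (validity errors on one side, compatibility issues on the other) can be sketched as a small report structure. This is a minimal Python sketch under my own assumptions: the class names, the two severity levels and the summary wording are all hypothetical and are not taken from the W3C validator, the WDG Validator or Page Valet.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # breaks the DTD: the only thing that affects validity
    WARNING = "warning"  # valid markup that may cause compatibility trouble


@dataclass
class Finding:
    severity: Severity
    line: int
    message: str


@dataclass
class Report:
    findings: list = field(default_factory=list)

    def add(self, severity, line, message):
        self.findings.append(Finding(severity, line, message))

    @property
    def is_valid(self):
        # Warnings never change the verdict; only real errors do.
        return not any(f.severity is Severity.ERROR for f in self.findings)

    def summary(self):
        errors = sum(f.severity is Severity.ERROR for f in self.findings)
        warnings = len(self.findings) - errors
        verdict = "valid" if self.is_valid else "not valid"
        return f"{verdict}: {errors} error(s), {warnings} compatibility warning(s)"
```

With this shape, a document that only trips "fussy" checks is still reported as valid, and the lint findings are listed alongside rather than folded into the verdict.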

P: n/a
Jacqui or (maybe) Pete <po****@spamcop.net> wrote:
<URL:http://validator.w3.org:8001/>


Speaking as a humble html hacker, I like the fussy mode. It picks up
things (eg unclosed elements) that while strictly valid html 4.01 strict
generally indicate that I've cocked something up.


First, I am confused (as a non-native speaker of English) by the
term "fussy". There is also the word "fuzzy" in the English language.
I suggest finding a better, more technical word to describe it.

Second, this "validator" labels one and the same document in
<h2>big letters</h2> as "valid" or as "not valid" - just depending
on this "fussy" parameter. Now, is the page valid according to the
HTML specifications or not? Yes or No?

This "validator" always requires <tbody> although the HTML 4.01
specifications clearly permit tables without <tbody>.
Suppose I have a <table> with 100 <tr> but without <tbody>.
This "validator" gives me 100 "errors"! But in fact it is only
one "error", namely the omission of <tbody>. How silly!

--
http://www.unics.uni-hannover.de/nhtcapri/plonk.txt
Jul 20 '05 #15
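
The "100 errors for one omission" complaint in the post above is a reporting problem as much as a parsing one. One possible remedy, sketched here in Python, is to collapse identical messages into a single entry with a count. The function name and the (line, text) message format are assumptions made for illustration; the validator's real message format is not shown in this thread.

```python
def collapse_repeats(messages):
    """Group (line, text) pairs by text.

    Returns a list of (first_line, text, count) tuples in order of
    first appearance, so a cascade of identical messages becomes one row.
    """
    grouped = {}
    for line, text in messages:
        entry = grouped.setdefault(text, {"first_line": line, "count": 0})
        entry["count"] += 1
    return [
        (entry["first_line"], text, entry["count"])
        for text, entry in grouped.items()
    ]
```

Fed one hundred identical messages, this yields a single row that says where the problem first appears and how often it repeats.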

P: n/a
On Fri, 29 Aug 2003 17:08:21 +0000 (UTC), "Jukka K. Korpela"
<jk******@cs.tut.fi> wrote:
"Alan J. Flavell" <fl*****@mail.cern.ch> wrote:
If it presented itself as - I don't know what, let's say for the sake
of argument the "W3C Markup Quality Inspector", with a tagline saying
that it comprised formal validation +and+ practical checking options
I'm afraid they will invent something like that some day, if they admit
that it ain't no validator no more. This will create some new confusion,
partly much worse than validation. After all, "quality assured" sells
much better than "valid".


What value do you see in a valid HTML badge? Being valid HTML is of
little relevance to your actual effort in QA. Do you care more about
being conformant to some joke statement about HTML as an application
of SGML, or do you care about people actually authoring quality
documents?

If you only care about SGML validation of HTML, then to me you're
living in an irrelevant utopian world. No-one is disagreeing that
"they"* are wrong in using the term valid, or in defaulting to it, yet
you continue to deny there's any value in the validator at all, as far
as I can see.

I don't know if you're actually intending to claim there's no value in
the validator at all, but that's the impression you're giving. The
impression I get is that strict SGML validation of HTML is of value,
but anything beyond that is useless. I can't agree with that
statement: it's trivial to show that, given there are no SGML-compliant
HTML implementations (I'd agree with you 100% if you complained that
therefore HTML 4.01 should not exist, as W3 process requires
implementations; W3 process is certainly often flawed), SGML validation
alone can leave numerous situations which break in the real world.

I'd like to see some constructive comments on how a QA tool could
exist on the web, at the moment you seem to be nothing but negative.
So we would see clueless bosses requiring that
sites get QA stamps, no matter what, and reject any criticism and
questions on the grounds that our site has been Quality Approved by The
Consortium, or something like that.


The W3 makes no such claims about its validator, and in any case how
is that behaviour any different from SGML validation, which is just as
pointless in having in the real world, in fact to me it's considerably
worse, since we know SGML validation is a load of crap which achieves
nothing.

Jim.

* Whoever "they" are, the authors of the validator are pretty well
known, and there's even pretty easy methods to actually communicate
with them directly should you wish.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Jul 20 '05 #16

P: n/a
On Fri, 29 Aug 2003 22:02:09 +0000 (UTC), "Jukka K. Korpela"
<jk******@cs.tut.fi> wrote:
ji*@jibbering.com (Jim Ley) wrote:
Whoever "they" are, the authors of the validator are pretty well
known, and there's even pretty easy methods to actually communicate
with them directly should you wish.
For some value of "directly", yes.


Well, IRC is relatively direct. What do you want, phone numbers? I
imagine an actual teleconference could even be arranged should you
have constructive things to add.
And I actually did try that, hoping
that they would realize the big mistake - and then I saw the beta
announced here and couldn't resist the temptation to comment on it.
Which was good, you're still lacking in any positive comments on how
it could be improved, other than "just do SGML validation" which is
rather pointless in the real world.
It's pretty pointless to discuss the issue here, too, so I will now try
to resist the temptation to comment on your provocations


Please do, I'll certainly be passing on any relevant comments from
ciwah to the validator folks, so don't feel that your comments would
be wasted.

Jim.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Jul 20 '05 #17

P: n/a
On Fri, 29 Aug 2003 15:47:49 +0100, Jim Dabell
<ji********@jimdabell.com> wrote:
Nick Kew wrote:

[snip]
There are longstanding FAQs concerning SHORTTAGS and NET-enabling tags[1],
and the fact that a strict SGML parser permits them is not helpful to
most users (see the list archives for examples of confusion it causes).
That's why the WDG Validator and Page Valet default to parse modes
that complain about them. Do you consider that wrong?

[snip]

It depends on the nature of the complaint. If it claims that the document
is _invalid_, then yes, that would be wrong. If, however, it claims that
the document is valid but may have serious compatibility issues, I'd
consider that to be good behaviour. I see no problem with including
linting behaviour in a validator as long as it clearly separates the
concept of "compatibility issues" from actual mistakes with the markup.


I was thinking of a way to express a similar reservation about "fussy
mode," but you have expressed it much better than I would have.

The example I had in mind was validating a page with a simple table
(without a <tbody> element). I don't find it helpful if the
validator/linter/ whatever you want to call it now treats it the same
as a page with actual markup errors.

Nick

--
Nick Theodorakis
ni******************@urmc.rochester.edu
Jul 20 '05 #18
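
For the simple-table example in the post above, a linter could note the missing explicit <tbody> once per table rather than treating it as a markup error (the HTML 4.01 DTD makes the TBODY start and end tags optional). Below is a rough Python sketch using the standard library's html.parser. The class name and message text are hypothetical, and the sketch assumes that any thead, tbody or tfoot present is closed with an explicit end tag.

```python
from html.parser import HTMLParser

ROW_GROUPS = ("thead", "tbody", "tfoot")


class ImpliedTbodyFinder(HTMLParser):
    """Collect one note per <table> whose rows sit outside any row group."""

    def __init__(self):
        super().__init__()
        self.tables = []  # stack with one state dict per open <table>
        self.notes = []   # (line of <table>, message)

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append(
                {"line": self.getpos()[0], "in_group": False, "noted": False}
            )
        elif not self.tables:
            return
        elif tag in ROW_GROUPS:
            self.tables[-1]["in_group"] = True
        elif tag == "tr":
            table = self.tables[-1]
            if not table["in_group"] and not table["noted"]:
                table["noted"] = True
                self.notes.append(
                    (table["line"], "table relies on an implied <tbody>")
                )

    def handle_endtag(self, tag):
        if not self.tables:
            return
        if tag in ROW_GROUPS:
            self.tables[-1]["in_group"] = False
        elif tag == "table":
            self.tables.pop()
```

Because the note is attached to the table rather than to each row, a table with 100 rows produces one note, which a report could then file as a compatibility remark instead of an error.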

P: n/a
ni**@fenris.webthing.com (Nick Kew) wrote:
There's a new beta of the W3C Markup Validation Service now live at
<URL:http://validator.w3.org:8001/>


My document
<http://www.unics.uni-hannover.de/nhtcapri/temp/no-tbody.html>
gives *100* errors. Why? If you think (incorrectly) that TBODY
is required, then you should report it as *one* error.

Silly!

I will prefer <http://uk.htmlhelp.com/tools/validator/>

--
Top posting.
What's the most irritating thing on Usenet?
Jul 20 '05 #19

P: n/a
In article <30*************************@rrzn-user.uni-hannover.de>,
Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
My document
<http://www.unics.uni-hannover.de/nhtcapri/temp/no-tbody.html>
gives *100* errors. Why? If you think (incorrectly) that TBODY
is required, then you should report it as *one* error.
Ah, this is a great example. May I use it as a test case?

(I'll assume your questions were rhetorical)

I will prefer <http://uk.htmlhelp.com/tools/validator/>


Good for you. Liam has done an excellent job on the WDG's Validator; as
Nick has done for WebThing's Valet tools <http://valet.webthing.com/>.

--
T.E.R.J.E. - Technician Engineered for Repair and Justified Exploration
B.L.E.S.S. - Biomechanical Lifeform Engineered for Scientific Sabotage
Jul 20 '05 #20

P: n/a

Terje Bless <li*******@pobox.com> wrote:
<http://www.unics.uni-hannover.de/nhtcapri/temp/no-tbody.html>


Ah, this is a great example. May I use it as a test case?


Of course!
Jul 20 '05 #21

P: n/a
On Sat, 30 Aug 2003 20:25:03 +0100, Steve Pugh <st***@pugh.net> wrote:
Check where? There's been announcements in at least two places now.
The W3 Bugzilla perhaps http://www.w3.org/Bugs/Public/query.cgi
It seems to check two different sets of things.
On one hand it checks for SGML related issues that may be valid but
which can cause problems (SHORTTAG, etc.);


It's purely SGML related.

Jim.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Jul 20 '05 #22

P: n/a
ji*@jibbering.com (Jim Ley) wrote:
On Sat, 30 Aug 2003 20:25:03 +0100, Steve Pugh <st***@pugh.net> wrote:
Check where? There's been announcements in at least two places now.


The W3 Bugzilla perhaps http://www.w3.org/Bugs/Public/query.cgi
It seems to check two different sets of things.
On one hand it checks for SGML related issues that may be valid but
which can cause problems (SHORTTAG, etc.);


It's purely SGML related.


Then the insisting on the presence of <tbody> is a bug?

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st***@pugh.net> <http://steve.pugh.net/>
Jul 20 '05 #23

P: n/a
On Sat, 30 Aug 2003 20:59:44 +0100, Steve Pugh <st***@pugh.net> wrote:
ji*@jibbering.com (Jim Ley) wrote:
On Sat, 30 Aug 2003 20:25:03 +0100, Steve Pugh <st***@pugh.net> wrote:
Check where? There's been announcements in at least two places now.
The W3 Bugzilla perhaps http://www.w3.org/Bugs/Public/query.cgi
It seems to check two different sets of things.
On one hand it checks for SGML related issues that may be valid but
which can cause problems (SHORTTAG, etc.);


It's purely SGML related.


Then the insisting on the presence of <tbody> is a bug?


No, but it's due to a change of the SGML declaration, just as the
SHORTTAG handling is done.

See:
<URL:
http://www.w3.org/mid/f02000101-1026...3.157.66.23%5D

for a description.

Jim.
--
comp.lang.javascript FAQ - http://jibbering.com/faq/

Jul 20 '05 #24

P: n/a
ji*@jibbering.com (Jim Ley) wrote:
On Sat, 30 Aug 2003 20:59:44 +0100, Steve Pugh <st***@pugh.net> wrote:
ji*@jibbering.com (Jim Ley) wrote:
On Sat, 30 Aug 2003 20:25:03 +0100, Steve Pugh <st***@pugh.net> wrote:

Check where? There's been announcements in at least two places now.

The W3 Bugzilla perhaps http://www.w3.org/Bugs/Public/query.cgi

It seems to check two different sets of things.
On one hand it checks for SGML related issues that may be valid but
which can cause problems (SHORTTAG, etc.);

It's purely SGML related.


Then the insisting on the presence of <tbody> is a bug?


No, but it's due to a change of the SGML declaration, just as the
SHORTTAG handling is done.

See:
<URL:
http://www.w3.org/mid/f02000101-1026...3.157.66.23%5D

for a description.


Thus re-inforcing my earlier point that you need to add documentation
to the validator site. To make this a useful QA tool for HTML authors
rather than just SGML authors it must be understandable by users with
no SGML knowledge.

Steve

--
"My theories appal you, my heresies outrage you,
I never answer letters and you don't like my tie." - The Doctor

Steve Pugh <st***@pugh.net> <http://steve.pugh.net/>
Jul 20 '05 #25
