Arbitrary definition of class names by user agents

Stefan Ram wrote (in "More than one language in a page"):

In this case, one might even use Google's new attribute value:

<p lang="en">The word
<q><span lang="fr" class="notranslate">chef</span></q>
is of French origin.</p>

See

http://googlewebmastercentral.blogsp...ge-barrier.htm

Is this a new trend of user-agent writers (Microformats, and now Google)
staking claims on the @class namespace? I'm surely not the only one
disturbed by this. Somehow, an author publishing on the web, with no
control over which user agents will access his page, has to avoid
clashes with the union of all names deemed special by all those user
agents, now and in the future?

I suppose the proponents justify this practice by a line in the HTML
spec (HTML4.01 §7.5.2), that class names are also for "general purpose
processing by user agents" as well as stylesheet selectors. It doesn't
go into any further detail, but I don't think it was the intention that
applications which the author has no control over (e.g. once a page is
published) should define class names willy-nilly. More likely, the
author would have opted in to some scheme, such as a company's internal
robot to do some advanced indexing on all its own pages.

Here are some ideas for external interpretation, i.e. by some 'third
party' such as Google:

* Opt in to a third party's scheme. Register one's URIs with Google,
so they know that 'notranslate' means what they think on those
pages. I don't fancy doing that with a lot of third parties, though.
* Third parties register class names with an authority (e.g. W3C).
But still, authors have to watch out for future uses of names.
And third parties shouldn't have to register with W3C when they've
already registered (for example) DNS names.
* Define a sub-namespace not used by CSS to form DNS-like names,
e.g. ':com:google:notranslate'. Okay, but potentially verbose if
used a lot. And it doesn't generally sidestep non-CSS mechanisms
of defining class names.
* Use head/@profile with a URI owned by the third party. This is
what Microformats seem to be doing, but I don't think it is
adequate. Independent microformats used in the same page still
have to avoid clashing with each other, which means going back to
some authority's third-party register. Plus, the author doesn't
have control over the class names - it's all or nothing for a
particular format.
* Extend CSS with properties not related to style. There's nothing
in the framework of CSS that limits it to just style (right?). I
favour this, and shall elaborate on it...

Google could define a CSS property which turns translation on or off,
and the author could associate any class he chooses (indeed, any CSS
selector) with that property:

.notranslate { /* Okay, so he chose the same one after all! ;-) */
  -google-translation: disable;
}

Then, to avoid Google having to scan his stylesheets just to find this
rule, the author links it in with:

<link rel="stylesheet" media="translator" href="...">

Other user agents won't touch it, because they don't recognise
"translator". Google won't touch other stylesheets because they're not
labelled with "translator".
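
A minimal sketch of how a whole page might use this, with the sheet inlined in a style element so the example is self-contained. Everything specific here is hypothetical: "translator" is not a registered media type, -google-translation is not a real property, and the class name "keep-as-is" is made up. The point is only that the author chooses the selectors and no class name is reserved:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Translation hint sketch</title>
  <!-- Hypothetical media type: only a translating user agent
       would be expected to read this sheet. -->
  <style media="translator">
    /* Any selector will do, not just a class. */
    q > span[lang="fr"] { -google-translation: disable; }

    /* A class name of the author's own choosing. */
    .keep-as-is { -google-translation: disable; }
  </style>
</head>
<body>
  <p>The word <q><span lang="fr">chef</span></q> is of French origin.</p>
  <p>Our product is called <span class="keep-as-is">Chef Express</span>.</p>
</body>
</html>
```

Note that the markup itself carries no translation-specific names; renaming "keep-as-is" only requires changing the sheet and the author's own pages.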

A few issues raised by this approach are:

* It's not style/presentation, which is what CSS was designed for.
But I think this is a superficial problem - just regard the name
"CSS" and rel="stylesheet" as historical accidents, and CSS
becomes an application of arbitrary properties, that happens to
include ones related to style.
* It's now invading the CSS-property and media-type namespaces. But
both of these could go the same way as XML namespaces and
link/@rel schemas, if necessary.

To summarise: Rather than user agents stomping over the heretofore
author-defined namespace of class names, they should fit into it in the
same way that CSS properties do. This would scale better, and would be
less intrusive on the author's ability to choose.

Oct 26 '08 #1


Jukka K. Korpela

Steven Simpson wrote:

Is this a new trend of user-agent writers (Microformats, and now
Google) staking claims on the @class namespace?

It surely is, and all the warnings seem to get ignored. The idea of
assigning fixed meanings to class names sounds _so_ cool and useful, and you
don't need anybody's permission or time-wasting discussions!

And it probably looks obvious that "notranslate" won't accidentally be used
for something else by someone else, so it looks safe to define it as you
like. It might be different with shorter and more vague class names like
"date" - does it refer to date notations, or dating, or something else? You
cannot possibly know what the string "date" might intuitively mean to
billions of people speaking hundreds of different languages. So by
declaring, say, "date" as predefined, you would assign arbitrary meanings to
an unknown number of constructs in documents, meanings that need not have
anything to do with the intentions of their authors.

In fact, "notranslate" is potentially very risky too. It is true that in any
existing document, it probably relates to someone's intentions of not having
something translated. But it might also mean that something _has not_ been
translated. Or it might mean 'do not translate (the content)' in a very
specific and limited technical meaning, _not_ a universal declaration that
the content should not be translated. For example, in some bilingual site
maintenance approach, it might be an instruction to human translators to
leave the content untranslated, since it shall be the same in both
languages - without meaning that it should be the same in _all_ languages.

The only sensible approach in using class attributes for purposes like
"notranslate" in the Google technique would have been to use a class name
that is syntactically malformed by existing specifications. That way, no
legitimate existing usage of the string as class attribute would have been
affected.

Even better, a new attribute (or element) should have been introduced.

Someone might say that from the viewpoint of generalized markup, a
processing instruction might have been the most adequate approach. But
generalized markup is water under the bridge, and we live with tag sets that
everyone can use as he likes and sees fit.

And on the realistic side, translation instructions should not really be
merged into markup. They are process-oriented, not data-oriented or
structure-oriented. You typically have words or phrases that should not be
translated, and would you really like to be forced to add
non-translatability markup into each and every occurrence in each document,
instead of having e.g. a site-wide glossary of terms that specifies them,
among other things?

Besides, the most common case for non-translatability that I can imagine
right now is English words and phrases in non-English text. For them, common
sense might say that it should suffice to declare their language as English.
When translating, say, some text from Dutch to French, you are normally not
supposed to translate any English words and phrases in them. If they are OK
in the original, they're usually the right choice in the translation as
well. So the only thing needed would be language markup.

Google could define a CSS property which turns translation on or off,

That would be even more wrong than using "predefined" class names, since
translation issues are not presentational in the sense that CSS is supposed
to be.

* It's not style/presentation, which is what CSS was designed for.
But I think this is a superficial problem - just regard the name
"CSS" and rel="stylesheet" as historical accidents, and CSS
becomes an application of arbitrary properties, that happens to
include ones related to style.

Excuse me while I fall into despair.

To summarise: Rather than user agents stomping over the heretofore
author-defined namespace of class names, they should fit into it in
the same way that CSS properties do.

I cannot recognize parody any more, sorry.

--
Yucca, http://www.cs.tut.fi/~jkorpela/

Oct 26 '08 #2

Ben Bacarisse

"Jukka K. Korpela" <jk******@cs.tut.fi> writes:

Steven Simpson wrote:

>Is this a new trend of user-agent writers (Microformats, and now
>Google) staking claims on the @class namespace?

<snip>

In fact, "notranslate" is potentially very risky too. It is true that
in any existing document, it probably relates to someone's intentions
of not having something translated. But it might also mean that
something _has not_ been translated. Or it might mean 'do not
translate (the content)' in a very specific and limited technical
meaning, _not_ a universal declaration that the content should not be
translated. For example, in some bilingual site maintenance approach,
it might be an instruction to human translators to leave the content
untranslated, since it shall be the same in both languages - without
meaning that it should be the same in _all_ languages.

Agreed. It could also relate to the other meaning of "translate" --
the geometric one. A paragraph which is to be left in its normal
position, not translated in any direction, might well be marked
"notranslate".

--
Ben.

Oct 26 '08 #3

Steven Simpson

Jukka K. Korpela wrote:

Steven Simpson wrote:
>Google could define a CSS property which turns translation on or off,

That would be even more wrong than using "predefined" class names,
since translation issues are not presentational in the sense that CSS
is supposed to be.

> * It's not style/presentation, which is what CSS was designed for.
>   But I think this is a superficial problem - just regard the name
>   "CSS" and rel="stylesheet" as historical accidents, and CSS
>   becomes an application of arbitrary properties, that happens to
>   include ones related to style.

Excuse me while I fall into despair.

What's wrong? I'm not suggesting that we abandon the distinction
between content and presentation, merely recognising that only two
things constrain CSS technically to presentation:

* the set of properties defined by various specs,
* the media type/query filter,

...and by extending these together, you get a framework still capable of
separating presentation from content, but also capable of separating
other kinds of (erm) "interpretation" from content.

Looking at it another way, if you wanted to devise a framework for the
latter separation, you could easily come up with one identical to that
used for the former, except that:

* the file format's property set would differ from CSS's,
* you'd have a different set of @media,
* you wouldn't call the format CSS,
* your @rel type wouldn't mention 'style'.

It would be technically sufficient to continue using @rel="stylesheet",
and rely on @media to distinguish between presentation and 'other kinds
of interpretation'. But if that really is a problem, just use
@rel="propertysheet".
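
As a rough sketch, the two linking options just described might look like this in a document head. The "propertysheet" link type, the "translator" media type and the file names are all hypothetical; none of them is defined by any current spec:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Property sheet sketch</title>

  <!-- Ordinary presentation, as today. -->
  <link rel="stylesheet" media="screen" href="screen.css">

  <!-- Option 1: keep rel="stylesheet" and rely on the
       (hypothetical) media type alone to separate the sheets. -->
  <link rel="stylesheet" media="translator" href="translation.css">

  <!-- Option 2: a distinct (hypothetical) link type, so the word
       'style' never appears for non-presentational properties. -->
  <link rel="propertysheet" media="translator" href="translation.css">
</head>
<body>
  <p>Content is unchanged under either option.</p>
</body>
</html>
```

An author would pick one of the two options, not both; they are shown together only for comparison.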

Oct 26 '08 #4

Harlan Messinger

Jukka K. Korpela wrote:

Steven Simpson wrote:

>Is this a new trend of user-agent writers (Microformats, and now
>Google) staking claims on the @class namespace?

It surely is, and all the warnings seem to get ignored. The idea of
assigning fixed meanings to class names sounds _so_ cool and useful, and
you don't need anybody's permission or time-wasting discussions!

And it probably looks obvious that "notranslate" won't accidentally be
used for something else by someone else, so it looks safe to define it
as you like. It might be different with shorter and more vague class
names like "date" - does it refer to date notations, or dating, or
something else? You cannot possibly know what the string "date" might
intuitively mean to billions of people speaking hundreds of different
languages. So by declaring, say, "date" as predefined, you would assign
arbitrary meanings to an unknown number of constructs in documents,
meanings that need not have anything to do with the intentions of their
authors.

In fact, "notranslate" is potentially very risky too. It is true that in
any existing document, it probably relates to someone's intentions of
not having something translated. But it might also mean that something
_has not_ been translated. Or it might mean 'do not translate (the
content)' in a very specific and limited technical meaning, _not_ a
universal declaration that the content should not be translated. For
example, in some bilingual site maintenance approach, it might be an
instruction to human translators to leave the content untranslated,
since it shall be the same in both languages - without meaning that it
should be the same in _all_ languages.

The only sensible approach in using class attributes for purposes like
"notranslate" in the Google technique would have been to use a class
name that is syntactically malformed by existing specifications. That
way, no legitimate existing usage of the string as class attribute would
have been affected.

If Google had specified class="google:notranslate" in place of
class="notranslate", despite the lack of any intrinsic significance of
the x: in class names it would have gone a long way toward eliminating
potential conflict.
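
A small sketch of what such a prefixed name would mean for an author who also wants to style the element. The class name "google:notranslate" is only the hypothetical from the paragraph above, and the font-style rule is an arbitrary placeholder. The one thing to watch is that a colon must be escaped in a CSS class selector, because an unescaped colon starts a pseudo-class:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Prefixed class name sketch</title>
  <style>
    /* Escaped colon: matches class="google:notranslate". */
    .google\:notranslate { font-style: italic; }

    /* Same match via an attribute selector, no escape needed. */
    [class~="google:notranslate"] { font-style: italic; }
  </style>
</head>
<body>
  <p>The word
  <q><span lang="fr" class="google:notranslate">chef</span></q>
  is of French origin.</p>
</body>
</html>
```

The two rules are alternatives; either one alone is enough.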

Oct 27 '08 #5
