What is the correct way to mark up, say, a div or p, to indicate
that it is in a different language to the main page? Are there
any potential pitfalls with different browsers associated with
doing this? If it makes any difference, this is in reference to
a page in mixed English and French: http://www.chem.utoronto.ca/IChO.Ontario/index.html
On 2008-10-21, David Stone <no******@domain.invalid> wrote:
What is the correct way to mark up, say, a div or p, to indicate
that it is in a different language to the main page?
You just do <div lang="en"> etc. What browsers actually do with that
lang attribute is not so clear. In most cases probably nothing, although
it may influence choice of font in some.
Most fonts I've seen that are used for English will also contain all the
glyphs needed for French anyway.
I don't know if aural renderers use it to influence choice of speech
synthesizer. I doubt it, but you never know.
Are there any potential pitfalls with different browsers associated
with doing this? If it makes any difference, this is in reference to a
page in mixed English and French: http://www.chem.utoronto.ca/IChO.Ontario/index.html
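A minimal sketch of the markup being discussed, assuming a mostly English page with one French block. The text content is invented for illustration and is not taken from the page in question. The lang attribute on the html element sets the default language, and lang on the div overrides it for that element and its descendants:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>Mixed-language page (illustrative)</title>
</head>
<body>
  <!-- Inherits lang="en" from the html element -->
  <p>Welcome to the competition site.</p>

  <!-- lang applies to this div and everything inside it -->
  <div lang="fr">
    <p>Bienvenue sur le site du concours.</p>
  </div>
</body>
</html>
```

The character encoding is declared separately from the language; lang does not affect it.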
Ben C wrote:
You just do <div lang="en"> etc. What browsers actually do with that
lang attribute is not so clear. In most cases probably nothing, although
it may influence choice of font in some.
Well, some languages display right-to-left so the difference there
should be significant. I believe that there are also spacing issues
around some punctuation, and word-splitting issues as well. Of course,
it's all down to the care with which the browser was coded.
--
Steve Swift http://www.swiftys.org.uk/swifty.html http://www.ringers.org.uk
On 2008-10-22, Swifty <st***********@gmail.com> wrote:
Ben C wrote:
>You just do <div lang="en"> etc. What browsers actually do with that lang attribute is not so clear. In most cases probably nothing, although it may influence choice of font in some.
Well, some languages display right-to-left so the difference there
should be significant.
For that you've got to use dir=rtl or "direction: rtl". lang=ar by
itself won't make any difference.
I believe that there are also spacing issues around some punctuation,
and word-splitting issues as well. Of course, it's all down to the
care with which the browser was coded.
I haven't seen lang making a difference, but perhaps it should. Some
browsers use something based on Unicode Annex 14 for line-breaking, and
language is not involved in the algorithm they describe there.
See also http://www.cs.tut.fi/~jkorpela/unicode/linebr.html
On Wed, 22 Oct 2008, Ben C wrote:
For that you've got to use dir=rtl or "direction: rtl". lang=ar by
itself won't make any difference.
But the use of Arabic script should make a difference without the need of
specifying the writing direction.
I believe that there are also spacing issues around some punctuation,
and word-splitting issues as well. Of course, it's all down to the
care with which the browser was coded.
One example could be the interpretation of a quote symbol like <q>:
<p lang="en">The word <q><span lang="fr">chef</span></q> is of French origin.</p>
should be rendered as
The word ``chef´´ is of French origin.
whereas the (incorrect)
<p lang="en">The word <span lang="fr"><q>chef</q></span> is of French origin.</p>
as
The word « chef » is of French origin.
--
Helmut Richter
Helmut Richter schreef:
On Wed, 22 Oct 2008, Ben C wrote:
>For that you've got to use dir=rtl or "direction: rtl". lang=ar by itself won't make any difference.
But the use of Arabic script should make a difference without the need of
specifying the writing direction.
>>I believe that there are also spacing issues around some punctuation, and word-splitting issues as well. Of course, it's all down to the care with which the browser was coded.
One example could be the interpretation of a quote symbol like <q>:
<p lang="en">The word <q><span lang="fr">chef</span></q> is of French origin.</p>
should be rendered as
The word ``chef´´ is of French origin.
You mean
The word “chef” is of French origin.
:-p
H.
--
Hendrik Maryns http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
On 2008-10-22, Helmut Richter <hh***@web.de> wrote:
On Wed, 22 Oct 2008, Ben C wrote:
>For that you've got to use dir=rtl or "direction: rtl". lang=ar by itself won't make any difference.
But the use of Arabic script should make a difference without the need of
specifying the writing direction.
It will make a difference but it won't be quite right in all
circumstances.
For a simple Arabic string it will be OK (although left-aligned), but if
you've got Roman characters embedded in there, the bidi base direction
will be wrong.
You can see the result of bidi base direction if you try an example like
this:
<div dir="rtl">
ARABIC hello
</div>
which should appear as "hello CIBARA"
<div dir="ltr">
ARABIC hello
</div>
which should appear as "CIBARA hello".
I'm using capitals to mean strongly right-to-left characters-- of course
you'd need real Arabic in the example for it to work.
Unicode Annex 9 defines three bidi base directions: left-to-right,
right-to-left and neutral.
In HTML and CSS specifications you get left-to-right unless you specify
dir or direction respectively to get right-to-left. You can't have
neutral (except perhaps in a textarea or input).
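The same test with real right-to-left text, as a sketch. The Arabic word (meaning "hello") is chosen here for illustration and is not from the thread; the comments restate the behaviour described above rather than a guarantee about any particular browser:

```html
<!-- Base direction right-to-left: the Arabic run sits at the right,
     with "hello" to its left, and the block is right-aligned -->
<div dir="rtl" lang="ar">
  مرحبا hello
</div>

<!-- Base direction left-to-right (the default): the Arabic run comes
     first at the left, followed by "hello"; the Arabic letters
     themselves still run right-to-left within their run -->
<div dir="ltr" lang="ar">
  مرحبا hello
</div>
```

In both cases lang="ar" only labels the language; the base direction comes from dir.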
I believe that there are also spacing issues around some punctuation,
and word-splitting issues as well. Of course, it's all down to the
care with which the browser was coded.
One example could be the interpretation of a quote symbol like <q>:
<p lang="en">The word <q><span lang="fr">chef</span></q> is of French origin.</p>
should be rendered as
The word ``chef´´ is of French origin.
whereas the (incorrect)
<p lang="en">The word <span lang="fr"><q>chef</q></span> is of French origin.</p>
as
The word « chef » is of French origin.
Yes and there is stuff in CSS to do all that-- see the "quotes"
property, content: open-quote, and lang pseudos in CSS 2.1.
Not sure if any of the browsers actually implement all that stuff
though.
I think Korpela recommends just typing the quote characters you want and
not bothering with <q>, but I hope I'm not misquoting him [pause for
groans].
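A sketch of how the CSS 2.1 features mentioned above fit together, assuming a browser that implements the quotes property, generated content and the :lang() pseudo-class. The particular quote characters and the non-breaking spaces inside the guillemets are choices made for this illustration:

```html
<style type="text/css">
  /* Quote marks are picked by the language of the q element itself */
  q:lang(en) { quotes: "\201C" "\201D" "\2018" "\2019"; }  /* curly double, then curly single */
  q:lang(fr) { quotes: "\AB\A0" "\A0\BB" "\201C" "\201D"; } /* guillemets with no-break spaces */
  q:before { content: open-quote; }
  q:after  { content: close-quote; }
</style>

<!-- q is in the English context, so q:lang(en) applies -->
<p lang="en">The word <q><span lang="fr">chef</span></q> is of French origin.</p>

<!-- q is inside the French span, so q:lang(fr) applies -->
<p lang="en">The word <span lang="fr"><q>chef</q></span> is of French origin.</p>
```

Because :lang() matches on the inherited language of the q element, moving q inside or outside the span is what switches the quote style, which is the distinction drawn in the two examples quoted above.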
On Wed, 22 Oct 2008, Stefan Ram wrote:
(However, »chef« as used above actually is the English
word (because it is said that it was of French origin),
and so it should not be marked as French.
The English word »chef« is of French origin.
The French word »chef« is not of French origin, it /is/ French.)
Right. I should have taken a better example.
--
Helmut Richter
On Wed, 22 Oct 2008, Hendrik Maryns wrote:
Helmut Richter schreef:
One example could be the interpretation of a quote symbol like <q>:
<p lang="en">The word <q><span lang="fr">chef</span></q> is of French origin.</p>
should be rendered as
The word ``chef´´ is of French origin.
You mean
The word «chef» is of French origin.
No. I meant what I wrote.
1) When the quotes are in the outer text, they are English. These are also the
correct quotes (at least according to German quote rules where the *outer*
language determines the form of the quotes at least as long as the quoted
text is not a paragraph of its own).
2) Guillemets are used with a space to the enclosed text:
« chef »
In German, they are sometimes used the other way round without spaces
instead of other quotes:
»chef«
--
Helmut Richter
Helmut Richter schreef:
On Wed, 22 Oct 2008, Hendrik Maryns wrote:
>Helmut Richter schreef:
>>One example could be the interpretation of a quote symbol like <q>:
<p lang="en">The word <q><span lang="fr">chef</span></q> is of French origin.</p>
should be rendered as
The word ``chef´´ is of French origin.
You mean
The word «chef» is of French origin.
No. I meant what I wrote.
This is interesting. I did not type «» (i.e. guillemets) at all. I
actually typed “” (i.e. proper curly open and close quotes); it seems
like your newsreader has interpreted them as guillemets anyway. Funny.
I suppose you (or me?) have an encoding problem.
H.
--
Hendrik Maryns http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
Ask smart questions, get good answers: http://www.catb.org/~esr/faqs/smart-questions.html
On Wed, 22 Oct 2008, Hendrik Maryns wrote:
This is interesting. I did not type «» (i.e. guillemets) at all. I
actually typed «» (i.e. proper curly open and close quotes); it seems
like your newsreader has interpreted them as guillemets anyway. Funny.
I suppose you (or me?) have an encoding problem.
It is me who has an encoding problem¹). Had it resulted in illegible
characters (?chef?), I would have checked. But as it looked like a
possibly intended usage of guillemets I did not check. I am sorry for the
oversight.
¹) The newsreader correctly converts UTF-8 to ISO-8859-1 if the character
exists there. Other characters are converted to something the newsreader
considers appropriate. I was not aware that English quotes are converted
to guillemets: it is much too seldom that I receive text with English
quotes.
--
Helmut Richter
Ben C wrote:
You just do <div lang="en"> etc.
That's a right thing to do, though in practical terms, it does not matter
much.
What browsers actually do with that
lang attribute is not so clear. In most cases probably nothing,
although it may influence choice of font in some.
Mostly for East Asian languages, and only when the page does not set font -
and most pages do, no matter what we think about that.
Most fonts I've seen that are used for English will also contain all
the glyphs needed for French anyway.
Well, yes, and I would expect any browser default font to contain all French
characters, anyway.
I don't know if aural renderers use it to influence choice of speech
synthesizer.
Some of them use, at least optionally. But in fact, considering the web as a
whole, good algorithmic language guessing (from the content) generally
produces better results. There are so many non-English pages incorrectly
marked up as English, due to misunderstandings or, most often, due to web
authoring software defaults.
--
Yucca, http://www.cs.tut.fi/~jkorpela/
On Wed, 22 Oct 2008, Jukka K. Korpela wrote:
Some of them use, at least optionally. But in fact, considering the web as a
whole, good algorithmic language guessing (from the content) generally
produces better results.
And in most contexts there are not many languages to choose from. I am now
transferring data to another CMS, and I use the simple algorithm "if there
are twice as many "the" (as single words) than "der", the language is
"en", otherwise "de". It is *much* more reliable than trusting the
language explicitly specified by the authors in the old CMS.
--
Helmut Richter
In article
<Pi******************************@lxhri01.lrz.lrz-muenchen.de>,
Helmut Richter <hh***@web.de> wrote:
On Wed, 22 Oct 2008, Jukka K. Korpela wrote:
Some of them use, at least optionally. But in fact, considering the web as a
whole, good algorithmic language guessing (from the content) generally
produces better results.
And in most contexts there are not many languages to choose from. I am now
transferring data to another CMS, and I use the simple algorithm "if there
are twice as many "the" (as single words) than "der", the language is
"en", otherwise "de". It is *much* more reliable than trusting the
language explicitly specified by the authors in the old CMS.
So what everyone seems to be saying is that there isn't much practical
point in specifying page language, except (i) if it requires a particular
character set (which is specified separately), (ii) if it differs from
left-to-right direction (which is specified separately), and/or (iii) to
be nice?
On 2008-10-23, David Stone <no******@domain.invalid> wrote:
In article <Pi******************************@lxhri01.lrz.lrz-muenchen.de>,
Helmut Richter <hh***@web.de> wrote:
>On Wed, 22 Oct 2008, Jukka K. Korpela wrote:
Some of them use, at least optionally. But in fact, considering the web as a
whole, good algorithmic language guessing (from the content) generally
produces better results.
And in most contexts there are not many languages to choose from. I am now transferring data to another CMS, and I use the simple algorithm "if there are twice as many "the" (as single words) than "der", the language is "en", otherwise "de". It is *much* more reliable than trusting the language explicitly specified by the authors in the old CMS.
So what everyone seems to be saying is that there isn't much practical
point in specifying page language, except (i) if it requires a particular
character set (which is specified separately),
More if it requires a particular font (which is usually set explicitly
or detected separately).
(ii) if it differs from
left-to-right direction (which is specified separately), and/or (iii) to
be nice?
In article
<Pi******************************@s5b004.rrzn.uni-hannover.de>,
Andreas Prilop <pr********@trashmail.net> wrote:
On Thu, 23 Oct 2008, David Stone wrote:
User-Agent: MT-NewsWatcher/3.5.2 (PPC Mac OS X)
When you write about such a subject, you should at least set up
your newsreader properly so that it can post and display non-English
characters correctly:
http://www.smfr.org/mtnw/docs/Mime.h...sage_with_MIME
http://www.smfr.org/mtnw/docs/TextEncoding.html
Euro sign: €
Cent sign: ¢
I've honestly never considered doing so, because I've always
avoided using characters in usenet posts that aren't in the
basic ASCII set. I'd just use "euros" and "cents" (or the
ubiquitous "c") instead.
However, I did find a "Send with MIME" option in the preferences,
so I checked it. Don't know if it will affect this reply, though.
I don't think I've ever needed to do a bilingual post (largely
because I am monolingual); the reason for this particular thread
is because I am currently responsible for a web site that has to
be in English and French. Parlais Frainglais, anyone?
Andreas Prilop wrote:
Google translates <TT> but it does not translate <CODE> -
even without any class=notranslate.
This is another point for semantic markup with CODE
instead of just TT.
There's a logical gap here, though. Computer code may well contain comments,
which are (in theory at least) supposed to be in some human language and
understandable to speakers of that language. If <CODE> implies
non-translation, then there is no way, even with explicit markup, to specify
that comments be translated.
The page http://www.google.com/intl/en/help/faq_translation.html describes
class=notranslate but no attribute for turning translation on (inside an
element that is treated as nontranslatable). Looks like command-oriented tag
design, which even forgot to provide a way to give the opposite command.
--
Yucca, http://www.cs.tut.fi/~jkorpela/
On Sat, 25 Oct 2008, Jukka K. Korpela wrote:
The page http://www.google.com/intl/en/help/faq_translation.html
describes class=notranslate
Another observation: When I have
<table dir="ltr" lang="fr" class="notranslate">
Google will still mess around with it. On translating the page
from English to Arabic or Hebrew, Google changes the direction
of the table to right-to-left and the table is f*cked up.
--
In memoriam Alan J. Flavell http://www.alanflavell.org.uk/charset/