
Any support to layout controls &zwj;, &zwnj;, &lrm;, &rlm;?

The HTML specifications define the entities &zwj;, &zwnj;, &lrm;, &rlm;
as denoting zero-width joiner, zero-width non-joiner, left to right
mark, and right to left mark.
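
For reference, the mapping from the four entity names (&zwj;, &zwnj;, &lrm;, &rlm;) to Unicode characters can be checked with a small Python sketch. It relies only on Python's standard `html` and `unicodedata` modules; the helper name `describe` is made up for illustration, and the output says nothing about how any browser renders these characters.

```python
import html
import unicodedata

# The four layout-control entities discussed in this thread.
ENTITIES = ["&zwj;", "&zwnj;", "&lrm;", "&rlm;"]


def describe(entity):
    """Return (code point, Unicode name) for a single-character entity."""
    ch = html.unescape(entity)  # decode the named entity to one character
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for entity in ENTITIES:
    code_point, name = describe(entity)
    print(entity, code_point, name)
```

All four decode to invisible format characters, so nothing should be drawn for them when they are handled as the Unicode standard describes.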

Is there any evidence of any browser support to the characters so
denoted, in the sense defined in the Unicode standard, chapter 15?
( &zwj;, &zwnj;, &lrm;, &rlm; )
For example, does f&zwj;i ever produce an fi ligature? In my tests, the
best I get is that the characters are ignored - but even worse things
happen, like rendering them as narrow vertical bars.

_Should_ they be treated according to the Unicode standard by browsers?

If not, what was the point of including entities for them? It seems
that people just get confused if they start wondering what they are
when they see them in various lists of entities, or start looking for
such layout controls.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #1
5 Replies


DU
Jukka K. Korpela wrote:
The HTML specifications define the entities &zwj;, &zwnj;, &lrm;, &rlm;
as denoting zero-width joiner, zero-width non-joiner, left to right
mark, and right to left mark.

Is there any evidence of any browser support to the characters so
denoted, in the sense defined in the Unicode standard, chapter 15?
( &zwj;, &zwnj;, &lrm;, &rlm; )
The direction mark characters &lrm; and &rlm; are supported in Opera 7.2+,
Mozilla 1.x, and MSIE 6:
Direction mark characters
http://www.robinlionheart.com/stds/html4/dir.html

&zwj; and &zwnj; are not supported, according to
http://www.robinlionheart.com/stds/html4/spchars.html

Bug 25142: [FEATURE] implement OpenType ligatures
http://bugzilla.mozilla.org/show_bug.cgi?id=25142
For example, does f&zwj;i ever produce an fi ligature?
Not in Opera 7.50 PR1 build 3494 nor in Mozilla 1.6 build 20040113 under
Windows XP Pro SP1.

DU

In my tests, the best I get is that the characters are ignored - but even worse things
happen, like rendering them as narrow vertical bars.

_Should_ they be treated according to the Unicode standard by browsers?

If not, what was the point of including entities for them? It seems
that people just get confused if they start wondering what they are
when they see them in various lists of entities, or start looking for
such layout controls.

Jul 20 '05 #2

"Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
The HTML specifications define the entities &zwj;, &zwnj;, &lrm;, &rlm;
as denoting zero-width joiner, zero-width non-joiner, left to right
mark, and right to left mark.
Is there any evidence of any browser support to the characters so
denoted, in the sense defined in the Unicode standard, chapter 15?
I don't know about the entities &zwj; &zwnj; but * is
supported by Mozilla 1.3 and Internet Explorer 6.0. It is not
recognized by Netscape 7.0, Mozilla 1.0 and Opera 7.23. Test with
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html>
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html6>
In the third column, you should see three different glyphs.

&lrm; and &rlm; are completely pointless. In text/html, you are
better off by using the markup "DIR=RTL" and "DIR=LTR", resp. See
<http://ppewww.ph.gla.ac.uk/~flavell/charset/text-direction.html>
<http://www.unics.uni-hannover.de/nhtcapri/temp/parentheses.html>
For example, does f&zwj;i ever produce an fi ligature?


&zwj; and &zwnj; are essential for the Arabic script; see
<http://students.washington.edu/irina/persianword/zwj.htm>
<http://students.washington.edu/irina/persianword/zwnj.htm>
With the Latin script, they would be purely cosmetic.
Jul 20 '05 #3

Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
I don't know about the entities &zwj; &zwnj; but * is
supported by Mozilla 1.3 and Internet Explorer 6.0.
I would be surprised if they didn't support the entities in the sense
of treating them as identical with the character references.
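
To make "identical with the character references" concrete, here is a minimal Python sketch that decodes a named entity and its decimal and hexadecimal character references and checks that they yield the same character. It only exercises Python's own `html.unescape` decoder, so it is an illustration of the intended equivalence and not evidence about any browser; the helper name `same_character` is hypothetical.

```python
import html


def same_character(*references):
    """True if every reference decodes to the same single string."""
    decoded = {html.unescape(ref) for ref in references}
    return len(decoded) == 1


# Named entity, decimal reference, hexadecimal reference.
print(same_character("&zwj;", "&#8205;", "&#x200D;"))
print(same_character("&zwnj;", "&#8204;", "&#x200C;"))
print(same_character("&lrm;", "&#8206;", "&#x200E;"))
print(same_character("&rlm;", "&#8207;", "&#x200F;"))
```

A conforming HTML parser is expected to behave the same way: the spelling of the reference should make no difference once the document has been parsed.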

But I need to confess that I had not checked the specification well
enough before asking. HTML 4 spec mentions the layout controls (rather
confusingly, but still) at
http://www.w3.org/TR/html4/struct/dirlang.html#h-8.2.5
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html>
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html6>
In the third column, you should see three different glyphs.
I partly do. I'm rather confused now. Does the behavior actually depend
on special processing of Arabic characters in those browsers? Why do I
see a vertical bar when I try f&zwj;i? I'm not saying it should produce
a ligature; but producing f and i with some mess between them is not my
idea of supporting zero-width joiner.

I see _some_ differences there but in most cases the glyphs are
similar.
&lrm; and &rlm; are completely pointless.
I tend to agree.
&zwj; and &zwnj; are essential for the Arabic script; see
<http://students.washington.edu/irina/persianword/zwj.htm>
<http://students.washington.edu/irina/persianword/zwnj.htm>
The pages present the essential examples as images...
With the Latin script, they would be purely cosmetic.


Surely. But maybe nice. :-)

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #4

On Tue, 3 Feb 2004, Jukka K. Korpela wrote:
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html>
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html6>
In the third column, you should see three different glyphs.
I partly do. I'm rather confused now. Does the behavior actually depend
on special processing of Arabic characters in those browsers?


Well, the production of initial, medial, final and isolated forms
for Arabic depends in any case on "special processing"; but that's
nothing specific to the &zwj; character.
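
The four contextual forms can be made visible with a rough Python sketch that looks up the compatibility "presentation form" characters Unicode defines for one Arabic letter (ba', U+0628, is used as the example here). This is only an illustration built on the standard `unicodedata` module: real renderers select glyphs inside the font rather than substituting these code points, and the helper name `presentation_forms` is made up.

```python
import unicodedata

BEH = "\u0628"  # ARABIC LETTER BEH (ba')
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def presentation_forms(base):
    """Map form name (isolated/final/initial/medial) to its compatibility char."""
    target = f"{ord(base):04X}"
    forms = {}
    # Scan the Arabic Presentation Forms-B block.
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Entries look like "<initial> 0628"; skip everything else.
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


for form, ch in presentation_forms(BEH).items():
    print(f"{form:9} U+{ord(ch):04X} {unicodedata.name(ch)}")

# A letter followed by ZWJ asks the renderer for a joining form
# even though no joining letter follows it.
print(ascii(BEH + ZWJ))
```

The last line shows the kind of sequence the test pages rely on: the joiner carries no glyph of its own and only influences which form of the neighbouring letter is chosen.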
Why do I see a vertical bar when I try f&zwj;i?


Actually I'm seeing something resembling a US "railroad grade
crossing" sign: a vertical bar with a diagonal cross on top.

My hunch is that the font is populated with this glyph - well no, I
_know_ (thanks to the ListFont utility) that the font (Arial) is
populated with this glyph - and the browser (Mozilla 1.6 in this case)
doesn't seem to understand that it's supposed to be dealing with it in
a special way, so it just displays what's in the font at that
position. If I try the same test with Opera (7.23), then I get the
same misbehaviour (i.e. I see the little grade-crossing sign) with
Arabic too.

And the font (Arial) seems to have equally cryptic glyphs for U+200C,
E, F, and for 202C, D and E. Looking at other fonts (Arial Unicode
MS, Tahoma etc.), they vary in whether they have such glyphs or not
at these positions.
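
For orientation, the code points mentioned above can be listed with their Unicode names and general category using a short Python sketch. It assumes that "U+200C, E, F" and "202C, D and E" mean U+200C, U+200E, U+200F, U+202C, U+202D and U+202E; the helper name `format_controls` is hypothetical, and the sketch inspects only the Unicode character database, not any font.

```python
import unicodedata

# Assumed reading of the abbreviated list in the post above.
CODE_POINTS = [0x200C, 0x200E, 0x200F, 0x202C, 0x202D, 0x202E]


def format_controls(code_points):
    """Return (code point, name, general category) for each code point."""
    return [
        (f"U+{cp:04X}", unicodedata.name(chr(cp)), unicodedata.category(chr(cp)))
        for cp in code_points
    ]


for code_point, name, category in format_controls(CODE_POINTS):
    print(code_point, category, name)
```

Every entry is in category Cf (format), which is consistent with the view that a visible glyph at these positions is a font or browser artefact and not intended output.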

So the bottom line to your question seems to be "yes", Mozilla _is_
doing some special processing for Arabic in relation to this character
- which it's failing to do for your Latin characters. Time to consult
the Bugzilla?

(And do you feel like submitting an Opera bug report? I've got one on
the go already, as I mentioned in the last day or two...).
Jul 20 '05 #5

"Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
I don't know about the entities &zwj; &zwnj; but * is
supported by Mozilla 1.3 and Internet Explorer 6.0.


I would be surprised if they didn't support the entities in the
sense of treating them as identical with the character references.


It seems that their support is OK in that respect.
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html>
<http://www.unics.uni-hannover.de/nhtcapri/arabic-alphabet.html6>
In the third column, you should see three different glyphs.


I partly do. I'm rather confused now.


OK, after some study (Arabic is really Hebrew to me ;-)) I realize
that I was mistaken - the glyphs look OK. I just hadn't realized that
many of the glyphs of the characters there are meant to be the same,
whereas for ba', for example, they should be different, and they are
when I use Arial Unicode MS. Oddly, the Arabic characters look very
different in different fonts.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #6
