On Tue, 31 Aug 2004, Alan Wood wrote:
> I filed a bug report on this a long time ago:
> <http://bugzilla.mozilla.org/show_bug.cgi?id=95067>
> If you want it fixed, please vote for it.
In what follows, I'm assuming we're referring to the ASCII
hyphen/minus character, as opposed to the more distinctive Unicode
hyphen-like characters which have explicitly defined line-break
semantics.
As you evidently realise yourself, it's not just a matter of "fixing".
The existing non-breaking behaviour might be rated as sub-optimal, but
it's not actually wrong (per the specification), and any quick fix
would only rate to make behaviour noticeably worse in other cases, as
the bugzilla discussion has already explored.
> Reading the comments will show you that this is not a simple issue.
Quite so. It rates to be a whole i18n development, doesn't it?
> It looks as though only the Mozilla family is out of step.
Actually, the HTML specification itself is "out of step", since the
original plan (whether we like it or not) was for the use of &shy; as
a hyphenation hint - but the popular browser developers didn't seem
interested in implementing it. That hint is still documented in the
current HTML version, but - since support for it is not mandatory -
it's basically unusable in practice, in the WWW context.
Beyond that, the HTML specification calls only for line-breaking
behaviour to be appropriate to the language and writing system in
question. That's a big topic.
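To make the soft-hyphen hint mentioned above concrete, here is a minimal Python sketch of how an author might mark permitted break points in a word. The helper names and the syllable split are assumptions made up for this illustration, and whether a given browser honours the hint is a separate question.

```python
SHY = "\u00AD"  # U+00AD SOFT HYPHEN, written &shy; in HTML


def add_hyphenation_hints(syllables):
    """Join syllables with soft hyphens, marking where a break is permitted."""
    return SHY.join(syllables)


def to_html(text):
    """Replace raw soft hyphens with the &shy; entity, for ASCII-safe markup."""
    return text.replace(SHY, "&shy;")


# Hypothetical syllable split, chosen only for the example.
word = add_hyphenation_hints(["hy", "phen", "a", "tion"])
print(to_html(word))  # prints hy&shy;phen&shy;a&shy;tion
```

A browser that supports the hint may break the word at any of the marked points and show a visible hyphen there; one that does not should simply render the word unbroken.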
I wouldn't object to voting for the matter to receive further study;
but - IMHO - I doubt that anything is to be gained by voting for a
quick "fix". Such a "fix" would only bring further bug reports in its
wake, as far as I can see.
With due respect, I'm surprised you dismiss the correct Unicode
solution so readily as:
> These characters are not on any keyboard, and so would need to be
> entered as numeric character references.
I don't have a no-break space on my keyboard either, but that doesn't
stop me from using them liberally. I don't have accented letters on
my keyboard either, but that doesn't prevent me from including them in
text when I need to. Anything that's worth using often enough is
surely worth defining a keyboard method for? For less-needed
characters, there's such a thing as a character picker utility.
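For anyone who does prefer the numeric character references, a short Python sketch can generate them. The helper names `ncr` and `describe` are made up for this example; the character names come from Python's standard `unicodedata` module.

```python
import unicodedata


def ncr(ch):
    """Return the hexadecimal HTML numeric character reference for one character."""
    return f"&#x{ord(ch):04X};"


def describe(ch):
    """Return the code point, Unicode name, and reference for one character."""
    return f"U+{ord(ch):04X}  {unicodedata.name(ch)}  {ncr(ch)}"


# The ASCII hyphen-minus, then the two Unicode hyphens discussed here.
for ch in ["\u002D", "\u2010", "\u2011"]:
    print(describe(ch))
```

Once the reference is known, it can be typed into the markup with any ordinary keyboard, or bound to a keyboard shortcut or editor macro.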
> These characters are not included in the core
> fonts for Windows (Arial, Courier New, Times New Roman)
For interest's sake, I just fired up a shiny new laptop - one on which
I haven't installed any specialised fonts on either of the OSes.
Mozilla under Linux (Scientific Linux 3.02, approx. equivalent to RHEL
- Mozilla version is 1.4.3, i.e. quite old) displayed the characters all
without difficulty, straight out of the box. I admit I didn't test to
see if their line-breaking properties were honoured per specification;
I was only looking at the display of isolated characters.
With XP Pro, I can report that Mozilla is displaying both U+2010 and
U+2011 in both proportional and monospaced fonts. IE6 is displaying
U+2011 (the non-breaking hyphen) in both kinds of font also, but not
U+2010 in either. I tried changing the default font selection, but
things got no better.
However, if I install XP's OS support for Asian languages, then the
U+2010 miraculously appears in IE, along with lots of other nice
Unicode symbols. This is yet another confirmation of the remark on my
browsers-fonts web page, that installing support for other writing
systems can have quite unexpected benefits on IE's display, even if
one doesn't read those other writing systems!
cheers