Firefox doesn't break lines at hyphens

IE does, and I can't remember this used to be a problem in Netscape. I
guess someone in the Mozilla team just came up with a Smart Idea about
the True Semantics of the Hyphen Minus character. :-( How do I make IE
and Firefox agree to make hyphens trigger a linebreak?

Gustaf
Jul 23 '05 #1
On Mon, 30 Aug 2004 08:17:28 +0200, Gustaf Liljegren
<gu*****@algonet.se> wrote:
IE does, and I can't remember this used to be a problem in Netscape. I
guess someone in the Mozilla team just came up with a Smart Idea about
the True Semantics of the Hyphen Minus character. :-( How do I make IE
and Firefox agree to make hyphens trigger a linebreak?


Why would you want to do that?

Also please know that this subject has been discussed before of course.
You may want to use, as one possible search starting point, definition
and references to discussions on the entity &shy; to find out what has
been said so far.

http://groups.google.com/groups?q=%2...authoring.html

Mind you that there is more to this than what meets the eye initially.

--
Rex
Jul 23 '05 #2
Jan Roland Eriksson wrote:
On Mon, 30 Aug 2004 08:17:28 +0200, Gustaf Liljegren
<gu*****@algonet.se> wrote:
IE does, and I can't remember this used to be a problem in
Netscape. I guess someone in the Mozilla team just came up with a
Smart Idea about the True Semantics of the Hyphen Minus character.
:-( How do I make IE and Firefox agree to make hyphens trigger a
linebreak?


Why would you want to do that?


To jump in the conversation, a line-break after a hyphen is normal
usage in books, newspapers etc. So the question would be why would one
*not* want that?

I ran into the same problem recently. I was using pixel values for
table-cell widths, and Firefox would not break state names such as
"Baden-Württemberg", even though Internet Explorer did. Firefox just
ignored the table-cell widths defined in the CSS and let the table
spill out of the layout frame our designers had specified. (Now I
suppose someone will tell me I can't have pixel-perfect layout on the
Web and should rather go for PDF... great. Because line breaks *do*
usually work, and HTML is well suited to this.)

By the way, playing around with putting span trickery etc. here and
there didn't solve the problem. I finally had to insert hard breaks
("<br />") via the ASP that served the pages.

--
Google Blogoscoped
http://blog.outer-court.com
Jul 23 '05 #3
Jan Roland Eriksson wrote:
How do I make IE and Firefox agree to make hyphens trigger a
linebreak?
Why would you want to do that?


My text has a lot of hyphens in it. In fact,
every-space-character-is-replaced-with-a-hyphen, (to illustrate a steady
drone-like phonation without pauses). I don't want the invisible soft
hyphens here, and I don't want the non-breaking hyphen. I want visible
hyphens that may trigger a linebreak, if needed.

I found Jukka's page on hyphens [1] and it says: "A minus sign may not be
separated from following numeric characters or following opening
characters, even if a space character intervenes."

If "following opening characters" means normal letters, then Firefox
does the right thing, but then there seems to be no way to achieve this
effect. Before today, I was under the impression that hyphens are used
to say "this word continues on the next line". Now it seems Unicode
needs a "breaking hyphen" character.

Gustaf

[1]http://www.cs.tut.fi/~jkorpela/dashes.html
Jul 23 '05 #4
On Mon, 30 Aug 2004, Gustaf Liljegren wrote:
Before today, I was under the impression that hyphens are used to
say "this word continues on the next line".
You seem to be confusing an output format with a markup format.

If you see a hyphen at the end of a line *in rendered output*, it
could very likely mean that the word continues on the next line.

But that is -not- an accepted convention for HTML source.
Now it seems Unicode needs a "breaking hyphen" character.


Don't confuse Unicode character conventions (which should be valid in
arbitrary text/plain Unicode data), with HTML markup conventions.

HTML has, in theory, the &shy; character, whose semantics have been
defined since quite early in the life of HTML. For better or for
worse, the major browser developers decided not to implement those
semantics, so &shy; cannot be used reliably in practice.
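
For reference, a small Python sketch (added editorially, not part of the original post) of what the entity denotes: &shy; is the soft hyphen, U+00AD, decimal 173. The helper name `describe` is made up for illustration; `html.unescape` and `unicodedata` come from the standard library.

```python
import html
import unicodedata

def describe(reference: str) -> tuple:
    """Resolve an HTML character reference to (code point, decimal, Unicode name)."""
    ch = html.unescape(reference)
    return (f"U+{ord(ch):04X}", ord(ch), unicodedata.name(ch))

print(describe("&shy;"))   # ('U+00AD', 173, 'SOFT HYPHEN')
```

This only shows which character is meant; it says nothing about whether a given browser honours it as a hyphenation hint.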

Support for ZWJ and ZWNJ for this kind of control is also unreliable.

And some of the U+20xx characters may simply produce an ugly box in
the rendering, on browsers which don't understand them.

In short, there's nothing available for practical use at the present
time in standard HTML for the author to control this aspect reliably.

The browsers implement some non-standard markups. Of course,
respectable HTML authors don't use non-standard markups - do they?
But nevertheless, those work more reliably in practice than the
official methods. Sad, really.

Jul 23 '05 #5
Gustaf Liljegren <gu*****@algonet.se> wrote:
My text has a lot of hyphens in it. In fact,
every-space-character-is-replaced-with-a-hyphen, (to illustrate a
steady drone-like phonation without pauses).
I will assume that the hyphen (or, technically, the hyphen-minus
character, or "Ascii hyphen", as opposed to the real Unicode hyphen)
is the appropriate character here. I don't know the conventions used in
such presentations, and I can't tell how they are best described in terms
of abstract characters.
I don't want the invisible soft
hyphens here, and I don't want the non-breaking hyphen. I want
visible hyphens that may trigger a linebreak, if needed.
Then the way that works most often without causing damage when it doesn't
is the use of nonstandard <wbr> markup:
foo-<wbr>bar
The theoretically more correct
foo-&#8203;bar
fails badly (&#8203; rendered as a small rectangle) in most browsing
situations, I'm afraid. More on this:
http://www.cs.tut.fi/~jkorpela/html/nobr.html#suggest
(Both methods work on Firefox.)
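
As a rough illustration of the two markup options above (a sketch, not anyone's production code), the following Python helper escapes plain text for HTML and then adds a break opportunity after every hyphen-minus. The function name `allow_breaks_after_hyphens` and its `marker` parameter are invented for this example, and it assumes the input is plain text rather than already marked-up HTML.

```python
import html

WBR = "<wbr>"      # nonstandard tag, the pragmatic choice
ZWSP = "&#8203;"   # zero width space (U+200B), the theoretically cleaner choice

def allow_breaks_after_hyphens(text: str, marker: str = WBR) -> str:
    """Escape plain text for HTML and add a break opportunity after each '-'."""
    escaped = html.escape(text)   # the escapes produced here never contain '-'
    return escaped.replace("-", "-" + marker)

print(allow_breaks_after_hyphens("foo-bar"))         # foo-<wbr>bar
print(allow_breaks_after_hyphens("foo-bar", ZWSP))   # foo-&#8203;bar
```

Note that this marks every hyphen, including ones in strings like "-foo" where a break would be unwelcome, so it suits text where the author really does want each hyphen to be breakable.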

By Unicode line breaking rules, the hyphen is automatically breakable, in
the sense that line break after it is allowed. HTML 2.0 (!) mentioned the
possibility that browsers break a line after a hyphen, and IE actually
implemented this. On the other hand, this is regarded as a very bad idea
by many, because there is a huge number of cases where such breaks are
not desirable.
I found Jukka's page on hyphens [1] and it says: "A minus sign may not
be separated from following numeric characters or following opening
characters, even if a space character intervenes."


Yes, but it's about the minus sign, the Unicode character for use as a
mathematical minus symbol specifically.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 23 '05 #6
On Mon, 30 Aug 2004, Alan J. Flavell wrote:
Support for ZWJ and ZWNJ for this kind of control is also unreliable.
I must correct myself on this point. Unicode advocates the use of the
"zero width space" and "zero width no break space" for the kind of
control that we're discussing here. Appropriate use of these
characters, as well as of the zero-width joiners, is discussed in 13.2
of the Unicode spec. But...
And some of the U+20xx characters may simply produce an ugly box in
the rendering, on browsers which don't understand them.


And indeed this is the problem of trying to use them in HTML.

But, my apologies for carelessly citing the zero-width joiners,
instead of the zero-width spaces.
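
To keep the zero-width characters apart, here is a short Python lookup (added for reference; the names come from the standard `unicodedata` module, and the list name `ZERO_WIDTH` is just for this example). The first two are the ones relevant to line breaking; the last two are the joiners cited by mistake.

```python
import unicodedata

ZERO_WIDTH = [
    "\u200B",  # ZERO WIDTH SPACE: allows a line break
    "\uFEFF",  # ZERO WIDTH NO-BREAK SPACE: prevents a line break
    "\u200C",  # ZERO WIDTH NON-JOINER: about glyph joining, not line breaks
    "\u200D",  # ZERO WIDTH JOINER: about glyph joining, not line breaks
]

for ch in ZERO_WIDTH:
    # Print the code point, the decimal character reference, and the Unicode name.
    print(f"U+{ord(ch):04X}  &#{ord(ch)};  {unicodedata.name(ch)}")
```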
Jul 23 '05 #7
On Mon, 30 Aug 2004, Gustaf Liljegren wrote:
My text has a lot of hyphens in it. In fact,
every-space-character-is-replaced-with-a-hyphen, (to illustrate a steady
drone-like phonation without pauses). I don't want the invisible soft
hyphens here, and I don't want the non-breaking hyphen. I want visible
hyphens that may trigger a linebreak, if needed.
I suspect you have used the ASCII hyphen-minus sign - .
The ASCII hyphen is just a compatibility character and might be a
true hyphen or a minus sign. Browsers behave differently at this
ambiguous character.
Now it seems Unicode needs a "breaking hyphen" character.


&#8208; (U+2010) the "real hyphen" (breaking allowed)
&#8209; (U+2011) the non-breaking hyphen
&#8722; (U+2212) the "real minus sign"
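
A minimal sketch of how these characters might be written into HTML as numeric character references, assuming the author decides for a given piece of text whether breaks after hyphens are wanted. The names `ncr` and `explicit_hyphen` are hypothetical, and whether a particular browser and font display the result is a separate matter.

```python
HYPHEN = "\u2010"               # "real" hyphen, break allowed after it
NON_BREAKING_HYPHEN = "\u2011"  # hyphen after which no break is allowed
MINUS_SIGN = "\u2212"           # mathematical minus sign

def ncr(ch: str) -> str:
    """Return the decimal numeric character reference for one character."""
    return f"&#{ord(ch)};"

def explicit_hyphen(text: str, breakable: bool = True) -> str:
    """Replace each ASCII hyphen-minus with an unambiguous hyphen reference."""
    return text.replace("-", ncr(HYPHEN if breakable else NON_BREAKING_HYPHEN))

print(explicit_hyphen("drone-like"))             # drone&#8208;like
print(explicit_hyphen("C--", breakable=False))   # C&#8209;&#8209;
print(ncr(MINUS_SIGN) + "5")                     # &#8722;5
```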

Jul 23 '05 #8
Gustaf Liljegren <gu*****@algonet.se> wrote in message news:<cg*********@green.tninet.se>...
IE does, and I can't remember this used to be a problem in Netscape. I
guess someone in the Mozilla team just came up with a Smart Idea about
the True Semantics of the Hyphen Minus character. :-( How do I make IE
and Firefox agree to make hyphens trigger a linebreak?


I filed a bug report on this a long time ago:

<http://bugzilla.mozilla.org/show_bug.cgi?id=95067>

If you want it fixed, please vote for it. Reading the comments will
show you that this is not a simple issue.

The same problem exists in all versions of Netscape. The problem
exists in earlier versions of Opera and Safari, but has been fixed in
recent versions.

It looks as though only the Mozilla family is out of step.
Jul 23 '05 #9
al*******@context.co.uk (Alan Wood) wrote:
I filed a bug report on this a long time ago:

<http://bugzilla.mozilla.org/show_bug.cgi?id=95067>

If you want it fixed, please vote for it.
I didn't find a way to vote _against_ it, so I will just say here that
a) it isn't a bug (it does not violate any relevant specification)
b) "fixing" it by making Mozilla treat every hyphen (except perhaps
when followed by a digit) as allowable word breaking point would
cause far too many wrong divisions (such as for "vitamin-A", "C--",
"foo--bar" [when using "--" as a surrogate for dash] and even
"-foo" or "-s").
It is not acceptable to allow completely wrong word splits just to make
line lengths a bit more equal.
Reading the comments will
show you that this is not a simple issue.


And they have told just part of the story. We would have to consider
hundreds of languages and an unknown number of special notations that may
use "-" for some purpose or another.

If presented as a suggested enhancement - that is, qualitative
improvement, not as "bug fix" - the idea of allowing line breaks under
_some_ conditions would deserve due attention. It might be reasonably
safe to allow a break if it leaves at least three characters on each side
of the hyphen and the hyphen appears between letters (for some suitable
definition of "letter").
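
The cautious rule just described could be sketched in Python roughly as follows. This is one possible reading, not a specification: it counts the three characters within the whole whitespace-delimited word, uses `str.isalpha` as the "suitable definition of letter", and assumes plain-text input. The function name `add_cautious_breaks` and its parameters are invented for the example.

```python
import re

def add_cautious_breaks(text: str, min_side: int = 3, marker: str = "<wbr>") -> str:
    """Insert `marker` after hyphens that sit between letters with enough text on both sides."""

    def fix_word(match):
        word = match.group(0)
        out = []
        for i, ch in enumerate(word):
            out.append(ch)
            before = i                   # characters of the word before the hyphen
            after = len(word) - i - 1    # characters of the word after the hyphen
            if (
                ch == "-"
                and 0 < i < len(word) - 1
                and before >= min_side
                and after >= min_side
                and word[i - 1].isalpha()
                and word[i + 1].isalpha()
            ):
                out.append(marker)
        return "".join(out)

    # A "word" here is a maximal run of non-whitespace characters.
    return re.sub(r"\S+", fix_word, text)

print(add_cautious_breaks("steady drone-like phonation, vitamin-A, C--, -foo"))
# steady drone-<wbr>like phonation, vitamin-A, C--, -foo
```

With these assumptions the problem cases listed above ("vitamin-A", "C--", "foo--bar", "-foo", "-s") are left alone, while ordinary compounds get a break opportunity.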

Other breaks should be explicitly allowed by using the nice <wbr> tag
(for pragmatists) or the correct Unicode character defined for such
purposes (for theorists, who will surely find the magic number).

We don't need to break things at all places where the Unicode line
breaking rules permit a break. And it's a Web tradition that "words" are
not broken unless you ask for it, with "word" defined the usual techie
way (a maximal string of non-whitespace characters).

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 23 '05 #10

"Jukka K. Korpela" <jk******@cs.tut.fi> wrote in message
news:Xn*****************************@193.229.0.31. ..
al*******@context.co.uk (Alan Wood) wrote:
I filed a bug report on this a long time ago:

<http://bugzilla.mozilla.org/show_bug.cgi?id=95067>

If you want it fixed, please vote for it.
I didn't find a way to vote _against_ it, so I will just say here that
a) it isn't a bug (it does not violate any relevant specification)
b) "fixing" it by making Mozilla treat every hyphen (except perhaps
when followed by a digit) as allowable word breaking point would
cause far too many wrong divisions (such as for "vitamin-A", "C--",
"foo--bar" [when using "--" as a surrogate for dash] and even
"-foo" or "-s").


[snip]
We don't need to break things at all places where the Unicode line
breaking rules permit a break. And it's a Web tradition that "words" are
not broken unless you ask for it, with "word" defined the usual techie
way (a maximal string of non-whitespace characters).


But we do need post-hyphen breakability, for reasons that have been given
here and on the Bugzilla page. As for uses of the hyphen where a break
shouldn't be permitted (such as C--), isn't that what the non-breaking
hyphen (2011/8209) is for?

Jul 23 '05 #11
On Tue, 31 Aug 2004, Alan Wood wrote:
I filed a bug report on this a long time ago:

<http://bugzilla.mozilla.org/show_bug.cgi?id=95067>

If you want it fixed, please vote for it.
In what follows, I'm assuming we're referring to the ASCII
hyphen/minus character, as opposed to the more distinctive unicode
hyphen-like characters which have explicitly-defined line-break
semantics.

As you evidently realise yourself, it's not just a matter of "fixing".
The existing non-breaking behaviour might be rated as sub-optimal, but
it's not actually wrong (per the specification), and any quick fix
would only rate to make behaviour noticeably worse in other cases, as
the bugzilla discussion has already explored.
Reading the comments will
show you that this is not a simple issue.
Quite so. It rates to be a whole i18n development, doesn't it?
It looks as though only the Mozilla family is out of step.


Actually, the HTML specification itself is "out of step", since the
original plan (whether we like it or not) was for the use of &shy; as
a hyphenation hint - but the popular browser developers didn't seem
interested in implementing it. That hint is still documented in the
current HTML version, but - since support for it is not mandatory -
it's basically unusable in practice, in the WWW context.

Beyond that, the HTML specification calls only for line-breaking
behaviour to be appropriate to the language and writing system in
question. That's a big topic.

I wouldn't object to voting for the matter to receive further study;
but - IMHO - I doubt that anything is to be gained by voting for a
quick "fix". Such a "fix" would only bring further bug reports in its
wake, as far as I can see.

With due respect, I'm surprised you dismiss the correct Unicode
solution so readily as:

These characters are not on any keyboard, and so would need to be
entered as numeric character references.

I don't have a no-break space on my keyboard either, but that doesn't
stop me from using them liberally. I don't have accented letters on
my keyboard either, but that doesn't prevent me from including them in
text when I need to. Anything that's worth using often enough is
surely worth defining a keyboard method for? For less-needed
characters, there's such a thing as a character picker utility.

These characters are not included in the core
fonts for Windows (Arial, Courier New, Times New Roman)

For interest's sake, I just fired up a shiny new laptop - one on which
I haven't installed any specialised fonts on either of the OSes.

Mozilla under Linux (Scientific Linux 3.02, approx equivalent to RHEL
- Mozilla version is 1.4.3, i.e. quite old) displayed the characters all
without difficulty, straight out of the box. I admit I didn't test to
see if their line-breaking properties were honoured per specification,
I was only looking at the display of isolated characters.

With XP Pro, I can report that Mozilla is displaying both U+2010 and
U+2011 in both proportional and monospaced fonts. IE6 is displaying
U+2011 (the non-breaking hyphen) in both kinds of font also, but not
U+2010 in either. I tried changing the default font selection, but
things got no better.

However, if I install XP's OS support for Asian languages, then the
U+2010 miraculously appears in IE, along with lots of other nice
Unicode symbols. This is yet another confirmation of the remark on my
browsers-fonts web page, that installing support for other writing
systems can have quite unexpected benefits on IE's display, even if
one doesn't read those other writing systems!

cheers
Jul 23 '05 #12

"Alan Wood" <al*******@context.co.uk> wrote in message
news:93**************************@posting.google.c om...
Gustaf Liljegren <gu*****@algonet.se> wrote in message news:<cg*********@green.tninet.se>...
IE does, and I can't remember this used to be a problem in Netscape. I
guess someone in the Mozilla team just came up with a Smart Idea about
the True Semantics of the Hyphen Minus character. :-( How do I make IE
and Firefox agree to make hyphens trigger a linebreak?


I filed a bug report on this a long time ago:

<http://bugzilla.mozilla.org/show_bug.cgi?id=95067>

If you want it fixed, please vote for it. Reading the comments will
show you that this is not a simple issue.


But not a new issue. Hasn't it been handled sufficiently by typesetting and
then word processing software over the decades, so that by now it should be
a matter of adapting that which already exists?

The same problem exists in all versions of Netscape. The problem
exists in earlier versions of Opera and Safari, but has been fixed in
recent versions.

It looks as though only the Mozilla family is out of step.


Jul 23 '05 #13
On Tue, 31 Aug 2004, Harlan Messinger wrote:
Hasn't it been handled sufficiently by typesetting and
then word processing software over the decades, so that by now it should be
a matter of adapting that which already exists?


No. Word processing and page layout software on Mac and Windows
has (over the years) only dealt with character 45 = x2D,
the "ASCII hyphen/minus". Character 45 = x2D in Unix fonts is
usually a minus sign, wider than a hyphen.
<http://www.google.com/search?q=IsoLatin1Encoding+minus>
<http://www.google.com/search?q=%22plus+comma+minus+period%22>

The situation is different with Unicode and HTML 4.
We have (at least) four characters:
45 = x2D ASCII hyphen/minus, ambiguous *forever*
8208 = x2010 hyphen
8209 = x2011 non-breaking hyphen
8722 = x2212 minus sign

Character 45 itself is ambiguous; so the browser behaviour
will also remain ambiguous.
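
The four characters listed above can be checked against the Unicode character database. A short Python sketch, added for reference (the name `HYPHEN_LIKE` is just for this example), prints each one's name and general category:

```python
import unicodedata

HYPHEN_LIKE = [45, 8208, 8209, 8722]   # decimal code points from the list above

for code in HYPHEN_LIKE:
    ch = chr(code)
    # Pd = dash punctuation, Sm = math symbol
    print(f"{code:>4} = x{code:04X}  {unicodedata.category(ch)}  {unicodedata.name(ch)}")
```

The first three are classified as dash punctuation, while U+2212 is a math symbol, which matches the point that only character 45 has to serve both roles.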

Jul 23 '05 #14

"Andreas Prilop" <nh******@rrzn-user.uni-hannover.de> wrote in message
news:Pine.GSO.4.44.0408311737450.6904-100000@s5b003...
On Tue, 31 Aug 2004, Harlan Messinger wrote:
Hasn't it been handled sufficiently by typesetting and
then word processing software over the decades, so that by now it should be a matter of adapting that which already exists?
No. Word processing and page layout software on Mac and Windows
has (over the years) only dealt with character 45 = x2D,
the "ASCII hyphen/minus". Character 45 = x2D in Unix fonts is
usually a minus sign, wider than a hyphen.
<http://www.google.com/search?q=IsoLatin1Encoding+minus>
<http://www.google.com/search?q=%22plus+comma+minus+period%22>

The situation is different with Unicode and HTML 4.
We have (at least) four characters:
45 = x2D ASCII hyphen/minus, ambiguous *forever*
8208 = x2010 hyphen
8209 = x2011 non-breaking hyphen


= Microsoft Word nonbreaking hyphen Ctrl + _
8722 = x2212 minus sign
Word also has a soft hyphen ("optional hyphen"), Ctrl + - .

I'm pretty sure I recall WordPerfect 5.1 (back when I was using it) having
hard and soft hyphens too.

Anyway, I wasn't referring to whether the symbols existed, but whether their
ramifications in different languages had been gathered at some point.
Character 45 itself is ambiguous; so the browser behaviour
will also remain ambiguous.


Jul 23 '05 #15
On Tue, 31 Aug 2004, Harlan Messinger wrote:
The situation is different with Unicode and HTML 4.
We have (at least) four characters:
45 = x2D ASCII hyphen/minus, ambiguous *forever*
8208 = x2010 hyphen
8209 = x2011 non-breaking hyphen


= Microsoft Word nonbreaking hyphen Ctrl + _
Word also has a soft hyphen ("optional hyphen"), Ctrl + - .


IIRC, Microsoft Word used ASCII control characters 30 and 31
for hard hyphen and soft hyphen, resp. Of course, there are no
glyphs at these positions in fonts. The chosen glyph was
always that from position 45.

You could check this by inserting chars 30 and 31 into an
RTF file.

Jul 23 '05 #16
"Harlan Messinger" <h.*********@comcast.net> wrote:
But we do need post-hyphen breakability, for reasons that have been
given here and on the Bugzilla page.
We "need" lots of breakability, especially in languages that have
very long words, but that's a different story.

And we have to deal with the issue with various hacks like <wbr> when we
think the need is big enough.
As for uses of the hyphen where a
break shouldn't be permitted (such as C--), isn't that what the
non-breaking hyphen (2011/8209) is for?


No, that would make things unnecessarily complex - and would not work
well on the Web, where such fancy characters aren't supported in most
browsing situations.

And in any case, there is a literally huge amount of pages written using
hyphen-minus "normally", i.e. in the way people normally use it when
typing text on computers. People had little reason to worry about line
breaks after hyphens before some browsers got strange ideas about
hyphens, and most of us still don't even know the problem.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 23 '05 #17
"Harlan Messinger" <h.*********@comcast.net> wrote:
Word also has a soft hyphen ("optional hyphen"), Ctrl + - .


No, in Word, Ctrl + - produces some internal code that acts as an
invisible hyphenation hint. ObHTML: you'll see this if you ask Word to
save a document in HTML format - it does not emit a soft hyphen character
(or an entity or reference for it). Word does _not_ support the soft
hyphen character. You can see this if you manually insert a soft hyphen
in Word, e.g. by typing Alt+0173.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 23 '05 #18
Andreas Prilop <nh******@rrzn-user.uni-hannover.de> wrote:
On Tue, 31 Aug 2004, Harlan Messinger wrote:
Hasn't it been handled sufficiently by typesetting and
then word processing software over the decades, so that by now it
should be a matter of adapting that which already exists?


No. Word processing and page layout software on Mac and Windows
has (over the years) only dealt with character 45 = x2D,
the "ASCII hyphen/minus".


Besides, they often do it wrong. MS Word takes the liberty of breaking a
string like "-foo" just as IE does. With default settings, it also
converts a word-initial hyphen into a dash. Should Web browsers imitate
that too? :-) (Surely it is possible to present "heuristic" reasons for
rendering a word-initial hyphen in a manner that resembles a minus sign,
since most often such a hyphen is actually a surrogate for the minus
sign. But just as with the "break after any hyphen" strategy, the
"heuristics" produce far too many wrong renderings.)

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 23 '05 #19
