
breaking hyphen?

P: n/a
If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant, but it is not what I want. Is there a way to
denote a hyphen in HTML, that the line can be broken after?

I've read some stuff about soft hyphens and non-breaking hyphens, but
those seem like the opposite of what I'm looking for. I want a normal
hyphen, that always appears, and I want the line to be breakable after
it in Firefox as well as IE.

Does anyone know if this is possible?

Oct 24 '05 #1
22 Replies


P: n/a
st*********@hotmail.com wrote:
If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant,
So is the IE behavior. The specifications are obscurely silent on the
matter. Technically, HTML 4.01 specification says that "the plain hyphen
should be interpreted by a user agent as just another character",
http://www.w3.org/TR/REC-html40/stru...t.html#h-9.3.3
but I don't think this is meant to disallow breaking after a hyphen as
was suggested in HTML 2.0 and as the Unicode standard defines.
Is there a way to
denote a hyphen in HTML, that the line can be broken after?


Using <wbr> after a hyphen is the practical way, e.g. as in
non-<wbr>breaking
The long answer is at
http://www.cs.tut.fi/~jkorpela/html/nobr.html#suggest
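
The <wbr> suggestion above can be applied mechanically. What follows is only a minimal sketch, assuming the input is plain text with no markup in it. The helper name add_wbr_after_hyphens is hypothetical, and so is the choice to skip a hyphen that is followed by a digit or that starts a word; neither comes from the post.

```python
import re

# Hypothetical helper, not from the thread. It matches a hyphen-minus that is
# preceded by a letter or digit and followed by a letter, so "-1" and
# "Latin-1" are left alone (an assumption made for this sketch).
_HYPHEN_IN_WORD = re.compile(r"(?<=[^\W_])-(?=[^\W\d_])")


def add_wbr_after_hyphens(text: str) -> str:
    """Return text with <wbr> inserted after each in-word hyphen."""
    return _HYPHEN_IN_WORD.sub("-<wbr>", text)


print(add_wbr_after_hyphens("a non-breaking hyphen in Latin-1, at -1"))
```

Running this over text before it is written into the page saves typing the markup by hand, but it does nothing for browsers that ignore <wbr>.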
Oct 24 '05 #2

P: n/a
"Jukka K. Korpela" wrote:

st*********@hotmail.com wrote:
If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant,


So is the IE behavior. The specifications are obscurely silent on the
matter. Technically, HTML 4.01 specification says that "the plain hyphen
should be interpreted by a user agent as just another character",
http://www.w3.org/TR/REC-html40/stru...t.html#h-9.3.3
but I don't think this is meant to disallow breaking after a hyphen as
was suggested in HTML 2.0 and as the Unicode standard defines.
Is there a way to
denote a hyphen in HTML, that the line can be broken after?


Using <wbr> after a hyphen is the practical way, e.g. as in
non-<wbr>breaking
The long answer is at
http://www.cs.tut.fi/~jkorpela/html/nobr.html#suggest


<wbr> is non-standard and not recognized by all browsers. It does
not exist in the HTML 4.01 specification.

Breaking and wrapping styles are indicated for CSS3. However, the
specification is still in flux with comments on the draft for the
affected text module being accepted until some time next year. I
don't know of any browsers that implement the proposed draft of the
CSS3 text module.

--

David E. Ross
<URL:http://www.rossde.com/>

I use Mozilla as my Web browser because I want a browser that
complies with Web standards. See <URL:http://www.mozilla.org/>.
Oct 24 '05 #3

P: n/a
David Ross <no****@nowhere.not> writes:
"Jukka K. Korpela" wrote:
Let's add some emphasis here
Using <wbr> after a hyphen is the practical way, e.g. as in
                                  ^^^^^^^^^
non-<wbr>breaking
The long answer is at
http://www.cs.tut.fi/~jkorpela/html/nobr.html#suggest


<wbr> is non-standard


What-*standard*?

IIRC, just recently the use of the width and height attribute for the
IMG element type were suggested here for practical reasons. They are
non-*standard* as well and nobody appeared to complain, as far as I can
recollect.
and not recognized by all browsers.
It's recognized by most browsers that care about such details in the
first place, I dare say. OTOH, lots of recommended or even
standardized element-types are either not recognized as well or --
worse, actually -- not processed properly (e.g. NOSCRIPT).
It does
not exist in the HTML 4.01 specification.


If it helps, the HTML 4.01 recommendation itself is non-standard.

Oct 25 '05 #4

P: n/a
Eric B. Bednarz <be*****@fahr-zur-hoelle.org> writes:
OTOH, lots of recommended or even
standardized element-types are either not recognized as well or --
worse, actually -- not processed properly (e.g. NOSCRIPT).


My bad, SCRIPT and consequently NOSCRIPT are not part of ISO/IEC
15445:2000 either; make that e.g. 'e.g. Q' then.

Oct 25 '05 #5

P: n/a

st*********@hotmail.com wrote:
If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant, but it is not what I want. Is there a way to
denote a hyphen in HTML, that the line can be broken after?


This is a known bug in the Mozilla family:

<https://bugzilla.mozilla.org/show_bug.cgi?id=95067>

I reported it ages ago, and it was worked on for a while, but fixing it
is not currently assigned to anyone.

I don't understand why it is not being fixed. I reported the same
problem in Opera and Safari, and it was fixed quite quickly.

--
Alan Wood
http://www.alanwood.net (Unicode, special characters, pesticide names)

Oct 25 '05 #6

P: n/a
Alan Wood wrote:
I don't understand why it is not being fixed. I reported the same
problem in Opera and Safari, and it was fixed quite quickly.


I don't think it was an improvement that Opera started dividing "-1"
into "-" at the end of a line and "1" at the start of the next line.

(Or consider the string "Latin-1". Do you really want "1" to go to the
next line?)

Granted, we can list a dozen ways to prevent that. The problem is
that the average author knows none of them, does not even know the
problem exists except casually, and besides, all the dozen or more ways
have serious drawbacks. It's frustrating to decide between poor ways of
solving a problem that didn't exist before some software started to
break lines blindly by some rules.

A browser should _not_ break a string after a hyphen-minus character or
other special character _unless_ it applies reasonable constraints and a
working way to prevent the breaks is available. Nonbreaking hyphen won't
count, for several years. Unfortunately, the leading browser decided to
break lines without reasonable constraints, so the more reasonable
behavior of Mozilla does not help authors. We need to learn the moral:

A word might be broken after a hyphen, or not be broken.

If this really matters, you probably have some boring work ahead. You
need to scatter extra markup around, perhaps quite a lot.

Yucca
Oct 25 '05 #7

P: n/a

Jukka K. Korpela wrote:
I don't think it was an improvement that Opera started dividing "-1"
into "-" at the end of a line and "1" at the start of the next line.

(Or consider the string "Latin-1". Do you really want "1" to go to the
next line?)


I agree that a simple "allow break after a hyphen" rule is not an ideal
solution.

One possible improvement would be to add a condition that a break
should not be allowed if it would leave a string of 3 or fewer
characters at the end or start of a line. Another possible improvement
would be not to allow a break between a hyphen and a number, so that
negative numbers are not broken.
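
The two conditions above could be sketched roughly as follows. This is an illustration under stated assumptions, not any browser's algorithm: the function name hyphen_break_positions is hypothetical, the hyphen is counted as part of the fragment left at the end of the line, and fragments are measured against the whole word rather than against an earlier break in the same word.

```python
from typing import List


def hyphen_break_positions(word: str, min_fragment: int = 4) -> List[int]:
    """Return each index i where a break between word[:i] and word[i:] is allowed."""
    positions = []
    for i, ch in enumerate(word):
        if ch != "-":
            continue
        cut = i + 1  # the break falls just after the hyphen
        head, tail = word[:cut], word[cut:]
        if not tail:
            continue  # nothing would move to the next line
        if tail[0].isdigit():
            continue  # second condition: keep a hyphen and a number together
        if len(head) < min_fragment or len(tail) < min_fragment:
            continue  # first condition: no fragment of 3 or fewer characters
        positions.append(cut)
    return positions


for sample in ("non-breaking", "Latin-1", "-1", "hot-dog"):
    print(sample, hyphen_break_positions(sample))
```

With the default threshold, "hot-dog" gets no break opportunity because "dog" has only three characters; passing min_fragment=3 would allow it.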

Opera is working on a new rendering engine for version 9, and a preview
is available:
- Windows: <http://snapshot.opera.com/windows/w90p1.html>
- UNIX: <http://snapshot.opera.com/unix/u90p1.html>
- Mac: <http://snapshot.opera.com/mac/m90p1.html>

This would be a good time to test its breaking algorithm and provide
feedback. Opera do take notice of feedback.

--
Alan Wood
http://www.alanwood.net (Unicode, special characters, pesticide names)

Oct 26 '05 #8

P: n/a
On Mon, 24 Oct 2005, Jukka K. Korpela wrote:
If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant,
So is the IE behavior. The specifications are obscurely silent on the
matter.


http://www.unicode.org/reports/tr14/ is "obscurely silent"?
Technically, HTML 4.01 specification says [...]


The document character set is Unicode. Therefore it is not necessary
to repeat in the HTML specification what is already said in the
Unicode standard.

--
Netscape 3.04 does everything I need, and it's utterly reliable.
Why should I switch? Peter T. Daniels in <news:sci.lang>

Oct 26 '05 #9

P: n/a
On Tue, 25 Oct 2005, Jukka K. Korpela wrote:
(Or consider the string "Latin-1". Do you really want "1" to go to the
next line?)
We have the non-breaking hyphen for this purpose.
The problem is
that the average author knows none of them, does not even know the
problem exists except casually,


The "average author" can't tell an acute accent (´) from
an apostrophe (').
http://www.helsinki.fi/images/harmaa_ranska.gif

--
Netscape 3.04 does everything I need, and it's utterly reliable.
Why should I switch? Peter T. Daniels in <news:sci.lang>

Oct 26 '05 #10

P: n/a
On Wed, 26 Oct 2005, Andreas Prilop wrote:
On Mon, 24 Oct 2005, Jukka K. Korpela wrote:
If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant,
So is the IE behavior. The specifications are obscurely silent on the
matter.


http://www.unicode.org/reports/tr14/ is "obscurely silent"?
Technically, HTML 4.01 specification says [...]


The document character set is Unicode.


If we're going to be pedantic: the document character set is
iso-10646. If you look at the relevant part of the HTML4.01 spec
(section 5), its specific reference to "Unicode" is in relation to the
bidirection text algorithm.
Therefore it is not necessary to repeat in the HTML specification
what is already said in the Unicode standard.


I don't think there was an intention to incorporate Unicode semantics,
in general, into HTML4. Those semantics which it intends to include,
it specifies.

IIRC, there have been some discussion documents about the extent to
which the other parts of Unicode semantics are meaningful in HTML (and
XHTML), but they aren't part of the HTML4 spec, which still refers
primarily to iso-10646 for its character model - rather than to
Unicode for its semantics.

Unfortunate, but that's the way it seems to be.
Oct 26 '05 #11

P: n/a
Andreas Prilop wrote:
http://www.unicode.org/reports/tr14/ is "obscurely silent"?


No, it's rather clear and definite, though there are many things in it
that could be heavily criticized on practical grounds. But what it says,
it says expressly, though not very readably.

But it's not part of or normatively referenced by any HTML specification
or any other applicable specification. (Besides, it says that a line
break is permitted after a hyphen-minus, unless forbidden by
higher-priority rules. It does not say that an application _must_ break
a line after a hyphen-minus.)
Technically, HTML 4.01 specification says [...]


The document character set is Unicode. Therefore it is not necessary
to repeat in the HTML specification what is already said in the
Unicode standard.


As Alan wrote, HTML does _not_ adopt Unicode semantics. Even if it
referred to Unicode and not to ISO 10646 for the document character set,
such a reference would per se just specify the interpretation of
character references (i.e., how "n" in "&#n;" is to be interpreted).
Even a wider reference to Unicode would not mean that Unicode semantics
is imported.
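
The narrow role of the document character set can be shown with a small sketch. It uses Python's standard html module purely as an illustration (no HTML specification mentions it): the number in "&#n;" only selects a character, and nothing about line breaking comes with it.

```python
import html

# A numeric character reference names a code point in the document
# character set; resolving it yields a character and nothing more.
for ref in ("&#45;", "&#8209;", "&#x2011;"):
    ch = html.unescape(ref)
    print(ref, "->", f"U+{ord(ch):04X}")
```

Whether a renderer then breaks a line after the resulting character is a separate decision, which is the distinction the paragraph above draws.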

The definition of a markup language _could_ specify that software
interpreting the language shall conform to the Unicode standard, and
this would imply quite a lot. But neither HTML nor SGML or XML makes
such a statement. They simply _use_ Unicode as a character code, instead
of specifying conformance to the Unicode standard.
Oct 26 '05 #12

P: n/a
Andreas Prilop wrote:
(Or consider the string "Latin-1". Do you really want "1" to go to the
next line?)


We have the non-breaking hyphen for this purpose.


We do, but fonts usually don't; see
http://www.fileformat.info/info/unic...ontsupport.htm
(Note the lack of Times New Roman, Arial, and Verdana, for example.)

Using a non-breaking hyphen from a font other than the current one, when
needed, as many browsers (except IE) do and as suggested in CSS specs,
is often risky. It's particularly risky for hyphen-like characters. When
you take one from another font, it might be as wide as your normal
font's en dash. And that's _bad_.

Besides, we wouldn't need the non-breaking hyphen in most cases if
browsers didn't apply line breaking algorithms foolishly. There's a huge
amount of existing web pages with hyphen-minus as the only hyphen, and
breaking "-1" or "Latin-1" is... mad (no matter what you and I think
about using the minus sign in the former and the non-breaking hyphen in
the latter case).

It's more or less as mad as it would be to start breaking after "/" even
within an otherwise alphabetic string mechanically. Oops... Microsoft
actually did that, and...
Oct 26 '05 #13

P: n/a
JRS: In article <dj**********@phys-news1.kolumbus.fi>, dated Tue, 25
Oct 2005 15:20:21, seen in news:comp.infosystems.www.authoring.html,
Jukka K. Korpela <jk******@cs.tut.fi> posted :
Alan Wood wrote:
I don't understand why it is not being fixed. I reported the same
problem in Opera and Safari, and it was fixed quite quickly.


I don't think it was an improvement that Opera started dividing "-1"
into "-" at the end of a line and "1" at the start of the next line.


Indeed. At least for proportionally-spaced text, there should be no
word-break after a hyphen which starts a "word"; perhaps no break which
generates a fragment of fewer than three characters. Instead, the whole
line should be squashed up a bit, or the "word" slipped into the next
line.

Probably you'll think of a few exceptional cases ...

--
John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
The Big-8 newsgroup management is attempting to legitimise its questionable
practices while retaining its elitist hegemony. Read <URL:news:news.groups>.
Oct 26 '05 #14

P: n/a
Andreas Prilop wrote:

The "average author" can't tell an acute accent (´) from
an apostrophe (').


Nor can they tell prime (′) from right single quote (’). I see 2
reversed prime characters used in place of a left double quote often
enough. Besides being just plain wrong, it always looks awful.

--
Reply email address is a bottomless spam bucket.
Please reply to the group so everyone can share.
Oct 26 '05 #15

P: n/a
st*********@hotmail.com wrote:

If a word has a hyphen in it, IE will permit a line break at the
hyphen, but Firefox/Mozilla won't. Apparently the Firefox behavior is
standards-compliant, but it is not what I want. Is there a way to
denote a hyphen in HTML, that the line can be broken after?

I've read some stuff about soft hyphens and non-breaking hyphens, but
those seem like the opposite of what I'm looking for. I want a normal
hyphen, that always appears, and I want the line to be breakable after
it in Firefox as well as IE.

Does anyone know if this is possible?


The following Mozilla bug reports discuss the issue of breaking,
non-breaking, and soft hyphens:

9101 (open) soft hyphens
<URL:https://bugzilla.mozilla.org/show_bug.cgi?id=9101>

61803 (fixed) non-breaking hyphens
<URL:https://bugzilla.mozilla.org/show_bug.cgi?id=61803>

80068 (closed, "works for me") long hyphenated string breaking at
wrong place
<URL:https://bugzilla.mozilla.org/show_bug.cgi?id=80068>

95067 (open) lines should break after hyphen unless number follows
hyphen <URL:https://bugzilla.mozilla.org/show_bug.cgi?id=95067>

253317 (open) need hyphenation dictionary
<URL:https://bugzilla.mozilla.org/show_bug.cgi?id=253317>

312063 (open) soft hyphens in non-Latin text
<URL:https://bugzilla.mozilla.org/show_bug.cgi?id=312063>

The conclusion is that the overall issue of hyphenation is NOT
simple to resolve. Even for manually composed English text, the
rules are complicated and not totally comprehensive. When you add
the need for browsers to render other languages, the complications
become magnified.

--

David E. Ross
<URL:http://www.rossde.com/>

I use Mozilla as my Web browser because I want a browser that
complies with Web standards. See <URL:http://www.mozilla.org/>.
Oct 26 '05 #16

P: n/a
Eric B. Bednarz wrote:
IIRC, just recently the use of the width and height attribute for the
IMG element type were suggested here for practical reasons. They are
non-*standard* as well and nobody appeared to complain, as far as I can
recollect.


<http://www.w3.org/TR/html4/struct/objects.html#h-13.2>
Oct 27 '05 #17

P: n/a
JRS: In article <11*********************@g43g2000cwa.googlegroups.com>,
dated Wed, 26 Oct 2005 01:35:48, seen in
news:comp.infosystems.www.authoring.html, Alan Wood <al*******@justis.com> posted :

One possible improvement would be to add a condition that a break
should not be allowed if it would leave a string of 3 or fewer
characters at the end or start of a line.
ISTR that 3 should be acceptable; breaking "hot-dog" seems OK,
especially if there are not many characters per line.
Another possible improvement
would be not to allow a break between a hyphen and a number, so that
negative numbers are not broken.


Latter is not so good if the text happens to contain Part
#Q235-987-577-875-755-G

Maybe allow break before hyphen-minus if it has digits on both sides?

I presume hyphen-minus is \u002D, hyphen and minus are above \u007F,
hyphen always allows break and minus never does.
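
For reference, the characters being distinguished can be looked up with Python's standard unicodedata module, and the "digits on both sides" idea can be sketched as a simple test. The function name hyphen_minus_between_digits is hypothetical, and the sketch only locates candidate hyphens; it takes no position on whether a break belongs before or after them.

```python
import unicodedata
from typing import List

# Print the code point and Unicode name of each hyphen-like character.
for ch in ("\u002D", "\u2010", "\u2011", "\u2212"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")


def hyphen_minus_between_digits(text: str) -> List[int]:
    """Return the index of each hyphen-minus that has a digit on both sides."""
    return [
        i
        for i in range(1, len(text) - 1)
        if text[i] == "-" and text[i - 1].isdigit() and text[i + 1].isdigit()
    ]


print(hyphen_minus_between_digits("#Q235-987-577-875-755-G"))
```

In the part number, the last hyphen is followed by a letter, so it is not reported, while "-1" and "Latin-1" yield no candidates at all.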

--
John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4
<URL:http://www.jibbering.com/faq/> JL/RC: FAQ of news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
Oct 27 '05 #18

P: n/a
Leif K-Brooks <eu*****@ecritters.biz> writes:
Eric B. Bednarz wrote:
IIRC, just recently the use of the width and height attribute for the
IMG element type were suggested here for practical reasons. They are
non-*standard*
    ^        ^
(emphasis)
as well and nobody appeared to complain, as far as I can
recollect.


<http://www.w3.org/TR/html4/struct/objects.html#h-13.2>


<https://www.cs.tcd.ie/15445/UG.html#OMITTED>
(FWIW, HTML 4.01 provides no mechanism to match the intrinsic
dimensions of an image either, nudge-nudge-wink-wink)
--
hexadecimal EBB
decimal 3771
octal 7273
binary 111010111011
Oct 27 '05 #19

P: n/a
On Thu, 27 Oct 2005, Eric B. Bednarz wrote:
(FWIW, HTML 4.01 provides no mechanism to match the intrinsic
dimensions of an image either, nudge-nudge-wink-wink)


Yeah, they confuddle the image height and width attributes in HTML
4.01 by cross-referencing the definition of the px unit in the CSS
specification, which, as we know, is meant to be scaled according to
the display situation. Heaven knows what they thought they intended
by doing that!

Or did you have some other discrepancy in mind?
Oct 27 '05 #20

P: n/a
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> writes:
Yeah, they confuddle the image height and width attributes in HTML
4.01 by cross-referencing the definition of the px unit in the CSS
specification, [...] Or did you have some other discrepancy in mind?


No no, nail on head :)
--
hexadecimal EBB
decimal 3771
octal 7273
binary 111010111011
Oct 27 '05 #21

P: n/a
On Thu, 27 Oct 2005 19:20:36 +0100, "Alan J. Flavell"
<fl*****@ph.gla.ac.uk> wrote:
On Thu, 27 Oct 2005, Eric B. Bednarz wrote:
(FWIW, HTML 4.01 provides no mechanism to match the intrinsic
dimensions of an image either, nudge-nudge-wink-wink)


Yeah, they confuddle the image height and width attributes in HTML
4.01 by cross-referencing the definition of the px unit in the CSS
specification, which, as we know, is meant to be scaled according to
the display situation. Heaven knows what they thought they intended
by doing that!


Well, I expect they thought they were trying to address how images were
handled on devices such as printers. Or did you think that HTML image
dimensions should refer to physical picture elements when using a
1200 dpi printer?

What would be a better way of handling this situation?

--
Stephen Poley

http://www.xs4all.nl/~sbpoley/webmatters/
Oct 28 '05 #22

P: n/a
On Fri, 28 Oct 2005, Stephen Poley wrote:
On Thu, 27 Oct 2005 19:20:36 +0100, "Alan J. Flavell"
<fl*****@ph.gla.ac.uk> wrote:
On Thu, 27 Oct 2005, Eric B. Bednarz wrote:
(FWIW, HTML 4.01 provides no mechanism to match the intrinsic
dimensions of an image either, nudge-nudge-wink-wink)
Yeah, they confuddle the image height and width attributes in HTML
4.01 by cross-referencing the definition of the px unit in the CSS
specification, which, as we know, is meant to be scaled according to
the display situation. Heaven knows what they thought they intended
by doing that!


[...] What would be a better way of handling this situation?


Pixel-based image formats contain their intrinsic pixel dimensions.
I *thought* the purpose of the HTML height and width attributes was
to pre-declare what these pixel dimensions were. On that basis, their
values *should* be whatever the image itself says they are. With CSS
being used, if desired, to re-scale the image to suit the presentation
situation.

The use of HTML attributes to re-scale an image *ought* to be
deprecated: it's a left-over from the old bad way of HTML/3.2.

In that sense, the px units in HTML would be real px units, whereas px
units in CSS would be what CSS says they are.

But that isn't what the HTML4 specification says.
Oct 28 '05 #23
