Bytes IT Community

linebreak ignored inside <pre>

There's an HTML oddity: a line break is ignored inside
<pre> if the adjacent lines are tags.

see the source code and description here:

http://xahlee.org/js/linebreak_after_tag.html

If anyone knows the spec for the XML case, please let me know. Thanks.

Xah
xa*@xahlee.org
http://xahlee.org/

Jul 31 '06 #1
10 Replies


Xah Lee <xa*@xahlee.org> wrote:
> There's an HTML oddity: a line break is ignored inside
> <pre> if the adjacent lines are tags.
The problem is that line breaks are treated in a special way inside a
PRE block. Outside a PRE block, line breaks count as normal space
characters, so they should be ignored before or after start and end
tags, because that is usually where one adds line breaks to the source
code to make it more readable. Also, any sequence of space characters,
tabs and line breaks (the so-called "white space") is treated as a
single space character.

Inside a PRE block, line break characters and all other "white space"
characters are treated differently. They do not collapse into a single
space character, and line breaks must be displayed where they occur
rather than being converted into a space character. So I think that
Firefox and Safari don't display your test page correctly.

The problem seems to be the "display:table" for the PRE tag, which
is confusing Firefox and Safari. This is probably a bug in these
browsers.

I assume that you want to use display:table to make the box as small as
needed. Without the modified display property the box would span the
available width completely. You could make the box a float instead and
"clear" it afterwards to get the same effect. Then all browsers
would display it right.
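
A rough sketch of the float idea, in case it helps. The class names and the border are made up for illustration; only the float/clear pairing comes from the suggestion above.

```html
<style>
  /* Hypothetical class name. A floated box shrinks to fit its content,
     which is the effect display:table was being used for. */
  pre.lyrics { float: left; border: 1px solid gray; padding: 0.5em; }
  /* Clear the float so that following content starts below the box. */
  .after-lyrics { clear: left; }
</style>

<pre class="lyrics">
first line
second line
</pre>
<p class="after-lyrics">Text following the poem.</p>
```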

--
Alexander
Jul 31 '06 #2

Xah Lee wrote:
> There's an HTML oddity: a line break is ignored inside
> <pre> if the adjacent lines are tags.
>
> see the source code and description here:
>
> http://xahlee.org/js/linebreak_after_tag.html
>
> If anyone knows the spec for the XML case, please let me know. Thanks.
>
> Xah
> xa*@xahlee.org
> http://xahlee.org/
Why this construct? I was surprised to see that most browsers seem to
handle it. It seems somewhat odd to me to apply such styling as display to
pre. My experience is that practically the only styling appropriate to pre
is to the font.

(BTW, IE does not recognise the table value for display.)

To keep the pre element's space restrained, you might take the styling off
pre and declare it on a div (or other element):

<div class="lyrics">
<pre>
line1
line2
</pre>
</div>
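
One way to read that suggestion, as a rough sketch. The rules below are assumptions, since the original page's stylesheet is not quoted in this thread; the box styling is simply moved from pre onto the wrapping div.

```html
<style>
  /* Assumed styling: the thread only mentions display:table on the pre. */
  div.lyrics { display: table; border: 1px solid gray; padding: 0.5em; }
  /* pre itself is left alone apart from margin and font. */
  div.lyrics pre { margin: 0; font-family: monospace; }
</style>

<div class="lyrics">
<pre>
line1
line2
</pre>
</div>
```

A browser that does not recognise the table value should ignore that declaration and leave the div as an ordinary full-width block.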

Louise

Aug 1 '06 #3

Xah Lee <xa*@xahlee.org> scripsit:
> There's an HTML oddity: a line break is ignored inside
> <pre> if the adjacent lines are tags.
On your page that explains the problem, you quote the text "a line break
immediately following a start tag must be ignored, as must a line break
immediately before an end tag" from the HTML specification. The cited
requirement has, however, generally been violated by web browsers, so don't
count on it - but don't be surprised either if some browser actually
complies with it in some situation.
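
To make the quoted rule concrete, here is a small sketch with made-up content. Under that rule, the line break right after the <pre> start tag and the one right before the </pre> end tag are to be ignored, so the two blocks below should come out the same. Actual browsers may not agree.

```html
<!-- Form 1: line breaks directly after the start tag and before the end tag. -->
<pre>
first line
second line
</pre>

<!-- Form 2: no line breaks next to the tags. -->
<pre>first line
second line</pre>
```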

The original problem appears to be how to present poetry. <pre> isn't
really the best tool for it. Do you want your poem to appear in a monospace
font when your style sheet is ignored?

A better approach is to use

<div class="stanza">
<div>first line</div>
<div>second line</div>
...
</div>

together with a suitable piece of CSS that tries to prevent line breaks
(except between the <div> elements) and may add colors or whatever you want.
Using this approach, each line is a stylable element, and you may add class
attributes for finer tuning. Moreover, you can add, say,

  div.stanza div { margin-left: 1em; text-indent: -1em; }

so that if a line is divided, the continuation line appears with a little
indentation, so that the reader can still see the structure.
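
Put together, the markup and the rule might look like the sketch below. The stanza spacing and the commented-out nowrap rule are illustrative assumptions, not part of the suggestion itself.

```html
<style>
  /* Space between stanzas; the value is an arbitrary choice. */
  div.stanza { margin: 1em 0; }
  /* Hanging indent: if a verse line wraps, its continuation sits 1em in. */
  div.stanza div { margin-left: 1em; text-indent: -1em; }
  /* Assumed alternative: forbid wrapping inside a verse line altogether. */
  /* div.stanza div { white-space: nowrap; } */
</style>

<div class="stanza">
  <div>first line</div>
  <div>second line, long enough that a narrow window may have to wrap it</div>
</div>
```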
Followups trimmed.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Aug 1 '06 #4

boclair wrote:
> Why this construct? I was surprised to see that most browsers seem to
> handle it. It seems somewhat odd to me to apply such styling as display to
> pre. My experience is that practically the only styling appropriate to pre
> is to the font.

Overflow styling on <pre> is both commonplace and useful.
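
As a sketch of what overflow styling on pre commonly looks like (the rule is an illustrative assumption, not taken from any page in this thread):

```html
<style>
  /* Give the block a scrollbar when a preformatted line is wider than its container. */
  pre { overflow: auto; }
</style>

<pre>
a preformatted line that is deliberately much longer than a narrow window can show without scrolling
</pre>
```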

--
Jack.
http://www.jackpot.uk.net/
Aug 1 '06 #5

Jack wrote:
> boclair wrote:
>> Why this construct? I was surprised to see that most browsers seem to
>> handle it. It seems somewhat odd to me to apply such styling as display
>> to pre. My experience is that practically the only styling appropriate to
>> pre is to the font.
>
> Overflow styling on <pre> is both commonplace and useful.
I didn't know it was commonplace.

The basic reason I feel that styling of pre should be restricted to
font properties is that it is somewhat unique in being "preformatted".

Louise

Aug 1 '06 #6

In <aW************@reader1.news.jippii.net> on Tue, 1 Aug 2006
09:56:12 +0300, "Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
> A better approach is to use
>
> <div class="stanza">
> <div>first line</div>
> <div>second line</div>
I'm not a poet but I am a pro songwriter and many of the same
principles apply. This construction wouldn't really accurately
reflect the semantics of verse (ob.disclaimer : *as I actually
write*); i.e., this markup would imply a structure which would
differ from the meaning and intent of what I've written. In fact,
I would argue that it is actually compromising the meaning of the
content in an effort to shoehorn it into an inappropriate
construction.

The method which best conveys my intent as writer is to treat each
verse as being a self-contained block and each line as an inline
element contained within that block. While I'm very willing to be
convinced that there could be a more accurate semantic
representation, none of the alternatives I've ever seen has so far
persuaded me.

And yes, that does mean using line-breaks as this is completely
consistent with the meaning of the content. I'm well aware of the
arguments about the use of line-breaks but - as your suggestion of
divs here does indeed implicitly recognise - in the context of
verse, the break is not presentational, it is actually an
essential structural element for maintaining the intelligibility
of the content as written and should be clearly marked up as such
within the HTML. I would certainly accept that *most* use of
line-breaks constitutes abuse but I would also argue that this is one
of the very rare occasions where avoiding them causes more
harm than using them.

Every alternative I've ever seen proposed to using line-breaks in
verse has resulted either in
a) distortion of the semantics of the content
or
b) dependence on presentation, which distorts the content when the
styling is not being used.

Certainly, using divs as you suggest above will avoid doing the
latter, but it does fall foul of the former. I strongly disagree
that it can be correct for an individual line of a verse of a song
(or poem) to be treated as a self-contained element in this way
any more than it would be to do it with a subclause in a sentence
of prose. Semantically, it doesn't make sense as most lines in a
verse are dependent for their intelligibility upon the preceding
or following lines and are therefore interdependent elements
within the one semantic unit. I'll happily accept that there's a
good argument as to whether divs or paragraphs are best used to
mark up each verse, but you'd have to come up with a really,
really good argument - better than I've ever seen presented to
date - to persuade me that it would be correct to use that markup
for individual lines.

Until HTML includes a solution for accurately representing the
reality of verse as it is actually structured rather than forcing
semantically-distorting workarounds in an effort to avoid the use
of line breaks, I will continue to use them as the sane and
sensible solution. Hell, I recently even came across one
(apparently serious) proposal that a verse should be regarded as
an ordered list with each line marked <li></li>! :) I kid you not.

So to me, an example of the accurate representation of the
semantics of a song would be the one which also happens to be the
simplest and most obvious (with the acceptance that div might in
fact be more correct than p) and is the one I use, have always
used and will continue to use until someone convinces me that
there exists a more correct alternative :

<p class="verse">
first line<br>
<span class="refrain">repeated refrain line, if present, where
indentation, italic or other styling might be useful</span>

<!-- etc -->
</p>

<p class="chorus">
first line<br>
next line ... etc
</p>

<p class="bridge">
(8 bars instrumental passage)
</p>

<p class="middle8">
first line<br>
next line ... etc
</p>
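
As a sketch of the styling those class names could carry (every rule here is an assumption for illustration; no stylesheet is given in this post):

```html
<style>
  /* Each verse, chorus, bridge, etc. is one block; the spacing is arbitrary. */
  p.verse, p.chorus, p.bridge, p.middle8 { margin: 1em 0; }
  /* The refrain stays inline within its verse; italic is one of the options mentioned. */
  p.verse span.refrain { font-style: italic; }
  /* Set the chorus in from the verses. */
  p.chorus { margin-left: 2em; }
</style>
```

With the stylesheet off, the <br> breaks still keep each segment on its own line, so nothing here depends on presentation to stay readable.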

Admittedly, this view reflects the general underlying philosophy
amongst songwriters that a verse, while being part of a larger
dialogue, should be able to stand alone as a self-contained
semantic unit. I don't think I've ever written a verse which did
not end with a full stop and I would probably regard it as
extremely sloppy writing if I ever did.

--
DG
Aug 2 '06 #7

Dick Gaughan wrote:
> I'm not a poet but I am a pro songwriter and many of the same
> principles apply. This construction wouldn't really accurately
> reflect the semantics of verse (ob.disclaimer : *as I actually
> write*); i.e., this markup would imply a structure which would differ
> from the meaning and intent of what I've written. In fact, I would
> argue that it is actually compromising the meaning of the content in
> an effort to shoehorn it into an inappropriate construction.

[snip pertinent and interesting observations]

Indeed. Most music is conceptually built from 'lines'. The relationship
between the lines of a song (as written) might in fact be even closer
than that one line follows the other; they might actually be the same
line - for example, where the notes of a tune are set out below the
words of the corresponding song.

There are also some kinds of poetry that rely heavily on layout for
their effect; I am looking at 'The Mouse's Tail' from Alice In
Wonderland, which would become meaningless without close control over
the way it is divided into lines (and indented). It also relies on
control of font-size for its effect.

This use of words to create a picture is shared with some kinds of
advertising (as the author of my Annotated Alice notes).

It seems fairly clear that The Mouse's Tail is unsuited for HTML
presentation, which must surely fail to render the material correctly in
some common situations. The reason is that HTML doesn't set out to
provide the kind of detailed typographic control on which Lewis Carroll
was depending.

I suppose that some material is still best read in a book; and some
other material should be viewed on an advertising hoarding, rather than
a computer screen.

Is it worth noting that any attempt to make The Mouse's Tail
'accessible' is doomed?

--
Jack.
http://www.jackpot.uk.net/
Aug 2 '06 #8

In <ea*******************@news.demon.co.uk> on Wed, 02 Aug 2006
09:47:03 +0100, Jack <mr*********@nospam.jackpot.uk.net> wrote:

<all snippages purely for space>

> The relationship
> between the lines of a song (as written) might in fact be even closer
> than that one line follows the other; they might actually be the same
> line - for example, where the notes of a tune are set out below the
> words of the corresponding song.
Indeed. This is a definite place where the visual presentation is
part of the intelligibility of the content and cannot be clearly
separated. I don't know if you're familiar with the ABC system of
musical notation in ASCII but I long ago gave up trying to get my
head round some sensible way of formatting it for web pages when
it is used in conjunction with lyrics. As you say, the vertical
alignment of the symbols with each individual syllable is
absolutely crucial to understanding but it is simply not possible
to do this without specifying exact fonts and sizes and padding it
out with multiple &nbsp; entities; there are simply too many variables.
And I cannot think of any way of making it scalable. The only kludge I
can think of would be to treat it as <pre>, and there are scaling
problems involved there too.

So now, I just convert the notation into a b/w gif (although I'm
converting to .png now) as being the only acceptable way of doing
it for general purposes. Blind people would not normally expect
stave notation to be accessible anyway so there is no direct loss
to them in accessibility any more than there is related to any
purely visual form, such as a painting. The way I try to get round
it is that wherever I provide stave notation in graphic form, I
try to also present a MIDI of the notation where possible as being
the best aural representation of the notation. For those blind
people who have learned ABC, providing notation in that form would
certainly resolve the accessibility issue but I can think offhand
of several major headaches (of nightmare proportions!) in trying
to make that work in HTML. Another lifetime, perhaps.

<snip again for space>

> I suppose that some material is still best read in a book; and some
> other material should be viewed on an advertising hoarding, rather than
> a computer screen.
>
> Is it worth noting that any attempt to make The Mouse's Tail
> 'accessible' is doomed?
I think this is a crucial point. There are definitely cases, such
as your example, and that of music/lyrics, where it is impossible
to divorce presentation from content and still have the content
make sense. I would hate to even think about rendering something
like The Mouse's Tail accurately in HTML (although it's many years
since I read Alice in Wonderland and I'll have to go and look it
up now to see exactly what the problems would be now my curiosity
is aroused - good excuse not to do any work. Thanks :)

Feels a bit like heresy actually saying it, but I think it is
unrealistic to expect that *everything* can be acceptably
converted to standards-compliant HTML. It works wonderfully for
prose text but there are some problems it certainly cannot solve.
Or at least, not yet.

--
DG
Aug 2 '06 #9

Dick Gaughan <dg@dickgaughan.co.uk> scripsit:
> In <aW************@reader1.news.jippii.net> on Tue, 1 Aug 2006
> 09:56:12 +0300, "Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
>> A better approach is to use
>>
>> <div class="stanza">
>> <div>first line</div>
>> <div>second line</div>
>
> I'm not a poet but I am a pro songwriter and many of the same
> principles apply. This construction wouldn't really accurately
> reflect the semantics of verse (ob.disclaimer : *as I actually
> write*);
Why do you say so? A poem logically consists of components, and since we
have no specific markup for them, we have to work with the markup we can
use. What I suggest seems to be the best approximation. It's trivial, but
at least it has _some_ structure, at least syntactically, as opposed to
scattering command-like <br> tags around.

If you use <br>, you are only saying "line break". It is meaningful in
visual rendering only. By HTML specifications, "line" means just a printed
line, not any logical unit - like a "verse line" really is. Poetry is much
older than any written language, and poetry makes perfect sense as
unwritten, too - and then it carries a structure that I've expressed with
<div> markup, but no line breaks, really.
> i.e., this markup would imply a structure which would
> differ from the meaning and intent of what I've written.
The markup really implies just a formal structure where a stanza consists of
block elements. What's wrong with that?
> The method which best conveys my intent as writer is to treat each
> verse as being a self-contained block and each line as an inline
> element contained within that block.
An inline element that always needs to have a line break before and after it
in visual rendering sounds suspiciously like a block element, doesn't it? A
line as inline element doesn't even _sound_ logical.
> I strongly disagree
> that it can be correct for an individual line of a verse of a song
> (or poem) to be treated as a self-contained element in this way
> any more than it would be to do it with a subclause in a sentence
> of prose.
Making something an element does not mean making it "self-contained", which
is really something outside the scope of markup. A list item is surely not
"self-contained" - it seldom makes sense when taken in isolation, but it's
really an element. Incidentally, a poem _could_ be marked up as a list, <ul>
element. The main reason against this idea is that the default rendering is
bulleted.
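
For illustration only, a sketch of what the list variant might look like; the class name and the rule are made up. The stylesheet removes the bullets, but with styles off they would come back, which is the drawback just mentioned.

```html
<style>
  /* Hypothetical class; strips the default bullets and list indentation. */
  ul.stanza { list-style: none; margin: 1em 0; padding: 0; }
</style>

<ul class="stanza">
  <li>first line</li>
  <li>second line</li>
</ul>
```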

A sentence or subclause of sentence of prose _could_ be made an element, but
it would have to be an inline element simply because its normal visual
rendering is not a block. (It would have to be <span>, in practice. This
would make sense if you would like to have styling that puts two spaces
after a sentence terminator, for example - but this does not seem worth
doing in practice.)
> Hell, I recently even came across one
> (apparently serious) proposal that a verse should be regarded as
> an ordered list with each line marked <li></li>! :) I kid you not.
The main logical flaw with the idea is the misconception of <ol> as an
ordered list as opposed to <ul> as an unordered list. The names are
misleading and the history is confusing, but basically <ol> and <ul> are the
same thing, a list, just with a different default rendering (numbered vs.
bulleted). It makes no sense to treat <ul> as an unordered list; it surely
has an order, and it would surely be wrong if a browser displayed the items
of <ul> in a randomized order - it just hasn't got the order "spelled out"
and emphasized with numbers as <ol> has, so in practice <ol> is often more
suitable for lists of steps to be taken, lists in priority order, etc.

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Aug 2 '06 #10

In <nH****************@reader1.news.jippii.net> on Thu, 3 Aug 2006
00:21:38 +0300, "Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
> Dick Gaughan <dg@dickgaughan.co.uk> scripsit:
>> In <aW************@reader1.news.jippii.net> on Tue, 1 Aug 2006
>> 09:56:12 +0300, "Jukka K. Korpela" <jk******@cs.tut.fi> wrote:
>>> A better approach is to use
>>>
>>> <div class="stanza">
>>> <div>first line</div>
>>> <div>second line</div>
>>
>> I'm not a poet but I am a pro songwriter and many of the same
>> principles apply. This construction wouldn't really accurately
>> reflect the semantics of verse (ob.disclaimer : *as I actually
>> write*);
>
> Why do you say so?
Because it's what I meant. What I say is usually what I mean.
(Allowing for chronologically induced entropy in the brain.)
> A poem logically consists of components, and since we
> have no specific markup for them, we have to work with the markup we can
> use. What I suggest seems to be the best approximation. It's trivial, but
> at least it has _some_ structure, at least syntactically, as opposed to
> scattering command-like <br> tags around.
Talk of "scattering tags around" doesn't really advance the
discussion much, particularly when we're comparing

first line<br>
next line

with

<div>first line</div>
<div>next line</div>

I know which of those two I regard as the clearer approximation of
the intent.

Perhaps the waters are muddied by the habitual but in fact
completely inaccurate classification of the segments of a verse as
"lines" which does carry the implication that they may be regarded
as individual entities. That does not in any way reflect the way
songwriters and poets perceive or conceive them, or the way they
are sung, read or spoken.

If I perceived the "lines" of a verse as being separate individual
entities, as in the prose sense of "lines", then I would certainly
regard the use of line breaks as semantically incorrect. But I
don't.
>If you use <br>, you are only saying "line break".
Yes, that's precisely what I'm saying. It's precisely what I
intend. That's why I use the tag the sole purpose of which is to
say "line break". I'm not seeing any error in the logic of that.
> It is meaningful in
> visual rendering only.
To suggest that the breaks which reflect the fundamental structure
of verse exist only in visual presentation is to completely
dismiss the essentials of metre and rhythm which are the defining
features of verse. The whole point of breaking the line is that
the pre-existing structure demands it. The breaks are not there to
confer structure, they are there to reflect the structure that
metre and rhythm have already created.

So the primary requirement in the visual presentation of verse is
not to create a structure, it is to accurately deliver the
structure that already exists. Horse, cart, correct order. Without
that pre-existing structure it is not verse, it's prose, and this
discussion is redundant.
> By HTML specifications, "line" means just a printed
> line, not any logical unit - like a "verse line" really is. Poetry is much
> older than any written language, and poetry makes perfect sense as
> unwritten, too -
In the case I was discussing, song, the major intent is that it be
sung. The reasons for presenting the lyric on a web page are,
first, to make them easily available to those who wish to sing
them, and second, to provide them for those few people who might
wish to study them either together with or independently from
performance. The fundamental aim in both these scenarios is to
present the meaning and structure of the text as faithfully as
possible to the writer's intent.
> and then it carries a structure that I've expressed with
> <div> markup, but no line breaks, really.
The fact that marking individual lines as <div> delivers a
structure is self-evidently true - but it does not follow that
that structure is an accurate representation of the structure that
already exists in the actual content.
>> i.e., this markup would imply a structure which would
>> differ from the meaning and intent of what I've written.
>
> The markup really implies just a formal structure where a stanza consists of
> block elements. What's wrong with that?
Well, if we accept that they can be regarded as block elements
(which I don't, for the reasons I gave above) I wouldn't describe
the markup as being "wrong". So I wouldn't say to you that you
shouldn't do it that way, I'm saying only that marking up the
segments ("lines") of a verse as blocks is not something I would
do as I would regard it as imposing a structure which would
frequently be at odds with my perception of the actual meaning and
intent of the content. As your perception of that intent seems to
be different from mine then naturally your approach to marking it
up will be different. Your markup is completely consistent with
your interpretation of the intent, so there's nothing that could
be called "wrong" with it. My disagreement would be with your
analysis of the structure.
>> The method which best conveys my intent as writer is to treat each
>> verse as being a self-contained block and each line as an inline
>> element contained within that block.
>
> An inline element that always needs to have a line break before and after it
> in visual rendering sounds suspiciously like a block element, doesn't it? A
> line as inline element doesn't even _sound_ logical.
Again, those breaks do not exist simply in visual presentation but
are part of the essential structure, dictated by the rhythmic and
metrical demands of the verse, but I suspect we may have to settle
for disagreeing about that.

I think our different views are perhaps that you seem to regard
the "lines" of the verse as being separate individual entities
whereas I see them as inseparable parts of one entity with the
break representing a hiatus in delivery rather than as actual
separation. In my view, there is a definite logical distinction
between verses but not between the elements which make up each
verse.

If we accept that the use of the word "line" is a convenient but
actually inaccurate description of the segments of a verse, then
regarding each segment as an inline element is perfectly logical.
I do not write songs in lines, I write them in verses. The
smallest self-contained semantic element in a series of verses,
whether song or poem, should be the verse. Which is why they are
usually organised in verses.

So perhaps our different views as to how they should best be
marked up can be summed up as different interpretations of the
word "lines" in the context of verse.
> Incidentally, a poem _could_ be marked up as a list, <ul>
> element. The main reason against this idea is that the default rendering is
> bulleted.
No, the main reason against this idea is that it's daft. You might
as well paint stripes on a donkey and call it a zebra.

Anything could be marked up as a list and a rationalisation
provided for it. That doesn't mean that there's the slightest
sense in doing it. Your suggestion of using divs has very sensible
logic to it, even if I disagree with the basis of that logic, but
calling it a list has none.

The only honest argument I can see for describing the parts of a
verse as a list is simply as a formalistic way of avoiding the use
of <br>. Doing something purely as a way of avoiding doing
something else does not of itself confer correctness, nor even any
point.

Incidentally, I have no ideological position on all this and I'm
not interested in defending any particular construct. I am simply
interested in what the most effective and accurate method might be
in the unique circumstances of marking up verse. The sole reason I
use the method I use is because I believe it to be the best
available, the most logical, which degrades most gracefully,
validates without error, and which more accurately reflects the
structure of the content than all the alternatives I've seen. So
far. When I see a method which better delivers on those criteria,
I'll happily adopt it.

--
DG
Aug 3 '06 #11
