How does a browser parse an html document?

Hello everyone,

Me again. Trying to learn some more :>) I hope I got the terminology
right.

How does a browser parse (correct term?) an HTML document? I'm sure
that every browser does it a little differently. Do they simply just
read a document top-to-bottom and left-to-right and just display
elements in the order in which they encounter them? Or, do they give
priority to certain types of content? For instance, would a browser
display text first, then all the images, then all Javascripts, etc?
Say, for example, I had a large image (let's say 1MB). The image is
floated left of my paragraph, so this is my code:

<img src="image.jpg" style="float: left">
<p> blah blah blah, yadda yadda yadda </p>

Would a browser just load these in order, so I would have to wait for
the 1MB image to load, then I'd see the text?

I ask because there are times I just can't tell. Of course, a local copy
of a website is always fast - it's on your hard drive. But there are
times I will load the same page over and over again (emptying the cache
in between reloads) and it seems to load in a different order each
time. Sometimes I see text first, sometimes I see images first.

Knowing this would also help me optimize pages to load faster.

Thanks to all who reply. I'm really having fun with this and am
enjoying learning. I may even go to a local university to see if there
are any classes available :>)

Viken K.

Jan 25 '06 #1
10 Replies


Viken Karaguesian wrote:
Thanks to all who reply. I'm really having fun with this and am
enjoying learning. I may even go to a local university to see if
there are any classes available :>)


If you missed the thread on alt.www.webmaster about Uni classes, have a
read of "Web Design Teachers" from a few days ago:

<http://groups.google.com/group/alt.www.webmaster/browse_frm/thread/9260ab7b04c2a812/2c3e5a604de2f532?tvc=1&q=web+design+teachers+alt.www.webmaster&hl=en#2c3e5a604de2f532>

Tread carefully with the school. You're likely to learn much more useful
stuff by lurking and hanging around these newsgroups. Remember to ask
intelligent questions. <g>

--
-bts
-Warning: I brake for lawn deer
Jan 25 '06 #2

Viken Karaguesian wrote:
How does a browser parse (correct term?) an HTML document? I'm sure
that every browser does it a little differently. Do they simply just
read a document top-to-bottom and left-to-right and just display
elements in the order in which they encounter them? Or, do they give
priority to certain types of content? For instance, would a browser
display text first, then all the images, then all Javascripts, etc?
Say, for example, I had a large image (let's say 1MB). The image is
floated left of my paragraph, so this is my code:

<img src="image.jpg" style="float: left">
<p> blah blah blah, yadda yadda yadda </p>

Would a browser just load these in order, so I would have to wait for
the 1MB image to load, then I'd see the text?

The image will be downloaded in the background while the rest of the
document is loaded and parsed (the same happens for other external files
such as stylesheets and applets, but not JavaScript files AFAIK). If the
image has WIDTH and HEIGHT attributes, the browser can reserve
the right amount of space on the page and the following text can flow
around it, even before the image is displayed. Without WIDTH/HEIGHT the
browser will sometimes render the following text as if there were no image
and then recalculate the layout once it knows the dimensions of the image.

Knowing this would also help me optimize pages to load faster.


If you specify WIDTH/HEIGHT in your images, the other stuff on your page
will not 'jump' around when the browser recalculates the layout. Speed
improvement might be minimal, but it just looks much smoother.
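
As a rough illustration of the WIDTH/HEIGHT point, here is the markup from the original question with dimensions added. The values 400 and 300 are made up for the example and would have to match the real image; the empty alt text is likewise an assumption.

<img src="image.jpg" width="400" height="300" alt="" style="float: left">
<p> blah blah blah, yadda yadda yadda </p>

With the dimensions in the markup, the browser can set aside a 400 by 300 pixel box for the float before any image data arrives, so the paragraph should not need to be laid out a second time when the download finishes.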

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://www.odahoda.de/
Jan 25 '06 #3

> If you missed the thread on alt.www.webmaster about Uni classes, have a
> read of "Web Design Teachers" from a few days ago:
>
> <http://groups.google.com/group/alt.www.webmaster/browse_frm/thread/9260ab7b04c2a812/2c3e5a604de2f532?tvc=1&q=web+design+teachers+alt.www.webmaster&hl=en#2c3e5a604de2f532>


Yikes! Wow...I'm at a loss for words.

Viken K.

Jan 25 '06 #4

On Wed, 25 Jan 2006, Beauregard T. Shagnasty wrote:
If you missed the thread on alt.www.webmaster about Uni classes,
have a read of "Web Design Teachers" from a few days ago:


That's horrible... There was an analogous discussion on
uk.net.web.authoring a couple of weeks past, from someone who was
required to teach a web design module to a bizarre syllabus,
apparently written by someone who didn't really understand the WWW.

(An additional air of unreality was that they weren't allowed to use a
web server nor connect to the Internet. All the design and browsing
had to be done to the local filesystem.)

TimBL predicted, ages back, that hand-coding would rapidly go out of
fashion and be replaced by high-level design tools. Surely he
couldn't, in his worst nightmares, have imagined the kind of
preposterous HTML+CSS that would be extruded by (as far as I can see)
all of the currently widespread commercial tools - except when they
are in the hands of someone expert enough to keep them in check -
which, basically, means knowing /how/ to hand-code, even when not
actually /doing/ it.
Jan 25 '06 #5

In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
Alan J. Flavell <fl*****@ph.gla.ac.uk> wrote:
TimBL predicted, ages back, that hand-coding would rapidly go out of
fashion and be replaced by high-level design tools. Surely he
couldn't, in his worst nightmares, have imagined the kind of
preposterous HTML+CSS that would be extruded by (as far as I can see)
all of the currently widespread commercial tools - except when they
are in the hands of someone expert enough to keep them in check -
which, basically, means knowing /how/ to hand-code, even when not
actually /doing/ it.


I'm happy to say that all my web design work has been hand coded.
It's hard to avoid hand-coding when most of the pages are generated
dynamically by compiled C++ CGI programs or php scripts that perform
database manipulations.

The only times I've ever used a web page authoring tool were to try
something interesting and see what the code looked like; then I would
use my own cleaned-up version of the code in my own pages.

-A
Jan 25 '06 #6

In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:
TimBL predicted, ages back, that hand-coding would rapidly go out of
fashion and be replaced by high-level design tools. Surely he
couldn't, in his worst nightmares, have imagined the kind of
preposterous HTML+CSS that would be extruded by (as far as I can see)
all of the currently widespread commercial tools - except when they
are in the hands of someone expert enough to keep them in check -
which, basically, means knowing /how/ to hand-code, even when not
actually /doing/ it.


Although there are many horrible examples of tools (and I still can't
find any tools I like except for StyleMaster), I do have the impression
that there is some gradual improvement. More important, at least some
of the people writing the tools now seem seriously interested in only
producing valid (X)HTML, and styling it with CSS. Since I don't want to
write HTML and CSS myself (and I especially don't want to write my own
CMS like I am doing at the moment), I looked through around 30 tools,
hoping to find something that gave results I liked.

On Macintosh, both Sandvox and Rapid Weaver (themes based, drag and
drop) apparently produce valid code, and it looks clean. Sandvox is
still beta, but I have great hopes for it in a few years. Rapid Weaver
has a longer history. Both are produced by one person or a small team.

More important, in OS X, the Cocoa HTML generator (and command line
textutil) that is used by default by most programs to produce HTML seems
to me to actually be improving from release to release. For instance,
you can now write a RTF document (the default) with links, lists and
tables in TextEdit (the default editor) and get clean and valid HTML
4.01 Strict output styled with CSS. It does a reasonable job with
title, keywords, description and other meta in the head. It is not
semantic, has no concept of h1, h2, etc or document outline, and a lot
of stuff is inline styles. You can't include images, voice or movies in
your HTML output (they get saved as a web archive instead of HTML). But
each version has added facilities. I can see how you could add a
post-processing step to clean and fix most of that up, and use it with
existing CSS style sheets. So I hope for continued improvement.

Mind you, I don't know why iWeb output seems such a bloated disaster ...
but even that is using CSS and will validate, which is a whole heap
better than a lot of past tools.

If the web is ever to be full of clean, lean, valid code, it will have
to come from tool makers being persuaded that that is the way their
tools need to work. Converting individuals like me (and others who
chance upon this group) is largely (alas) a waste of effort.

--
http://www.ericlindsay.com
Jan 25 '06 #7

Viken Karaguesian wrote:
How does a browser parse (correct term?) an HTML document? I'm sure
that every browser does it a little differently.

Well, um... That's like asking the ultimate question of life, the
universe and everything. Nobody really knows for sure, although there
are some theories on the subject [1] with some relation to the
Heisenberg uncertainty principle. It varies significantly from browser
to browser, and varies even more when you consider quirks mode;
though one thing we can be sure of is that no browser follows any
formally defined parsing rules.

Do they simply just read a document top-to-bottom and left-to-right
and just display elements in the order in which they encounter them?

They read through the source code from start to finish and attempt to
build a DOM along the way. As it is being built, each element in the
DOM is rendered on the screen according to the rules of CSS (in most
cases). For images and other objects, if the height and width are known
before the object has loaded, screen real estate is reserved for it. If
they are unknown, the page will reflow when the intrinsic dimensions are
found.

Or, do they give priority to certain types of content?

No, they generally load it as quickly as possible as soon as they
receive it.

For instance, would a browser display text first, then all the images,
then all Javascripts, etc?


JavaScript in a script element is executed as soon as it is parsed
(except when the defer attribute is used)
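
Putting the points above together, a small hypothetical page can be annotated with what this thread says happens as the browser works through it from start to finish. The file names (style.css, early.js, late.js), the title and the image dimensions are placeholders invented for the example, and the comments describe typical behaviour rather than a guarantee for any particular browser.

<html>
<head>
<title>Parse order example</title>
<!-- external stylesheet: fetched as a separate file -->
<link rel="stylesheet" type="text/css" href="style.css">
<!-- no defer: executed as soon as it is parsed -->
<script type="text/javascript" src="early.js"></script>
<!-- defer: a hint that execution may be postponed -->
<script type="text/javascript" src="late.js" defer></script>
</head>
<body>
<!-- width and height given: space can be reserved before the image data arrives -->
<img src="image.jpg" width="400" height="300" alt="" style="float: left">
<!-- can be rendered while image.jpg is still downloading -->
<p> blah blah blah, yadda yadda yadda </p>
</body>
</html>

In HTML 4.01 the defer attribute is only a hint to the browser, so whether and how it is honoured may differ from one browser to the next.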

[1] http://ln.hixie.ch/?start=1137740632&count=1
http://ln.hixie.ch/?start=1138169545&count=1
http://ln.hixie.ch/?start=1037910467&count=1
--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Jan 25 '06 #8

On Wed, 25 Jan 2006 17:01:16 +0000, "Alan J. Flavell"
<fl*****@ph.gla.ac.uk> wrote:
TimBL predicted, ages back, that hand-coding would rapidly go out of
fashion and be replaced by high-level design tools.


AFAIR that quote only applied to HTML. He was thinking of
"off-the-shelf" CSS schemes a la ZenGarden that were hand-coded by
skilled designers, then massively re-used. The discrepancy seems to be
that the emergent web voted overwhelmingly for originality in design
rather than quality.
Jan 26 '06 #9

Lachlan,

Thanks for the great answer. It was exactly what I was looking for.
Thanks to everyone for replying.

--
Viken K.
http://home.comcast.net/~vikenk
Jan 26 '06 #10

Tim
On Wed, 25 Jan 2006 17:01:16 +0000, Alan J. Flavell sent:
TimBL predicted, ages back, that hand-coding would rapidly go out of
fashion and be replaced by high-level design tools. Surely he couldn't,
in his worst nightmares, have imagined the kind of preposterous HTML+CSS
that would be extruded by (as far as I can see) all of the currently
widespread commercial tools - except when they are in the hands of someone
expert enough to keep them in check - which, basically, means knowing
/how/ to hand-code, even when not actually /doing/ it.


I seem to recall reading that it was the *intention* that HTML *would* be
machine generated (like most other document formats), but still human
readable (unlike most document formats).

The idea that you have to know how to build something by hand before you
can get a machine to do it for you is nothing new. You've only got to
look at how you train engineers, for example. Same for other skills, like
woodwork. I think the basic problem is that unskilled people think that
they can do skilled tasks just as well as an expert(*). God help us if
they take up an interest in first aid...

(*) I mean "expert," not "professional." A professional is merely someone who
gets paid to do something, as opposed to an "amateur" who doesn't. An
expert knows what they're doing.

--
If you insist on e-mailing me, use the reply-to address (it's real but
temporary). But please reply to the group, like you're supposed to.

This message was sent without a virus, please destroy some files yourself.

Jan 26 '06 #11
