How does a browser parse an html document?

Hello everyone,

Me again. Trying to learn some more :>) I hope I got the terminology
right.

How does a browser parse (correct term?) an HTML document? I'm sure
that every browser does it a little differently. Do they simply
read a document top-to-bottom and left-to-right and display
elements in the order in which they encounter them? Or do they give
priority to certain types of content? For instance, would a browser
display text first, then all the images, then all JavaScript, etc.?
Say, for example, I had a large image (let's say 1MB). The image is
floated left of my paragraph, so this is my code:

<img src="image.jpg" style="float: left">
<p> blah blah blah, yadda yadda yadda </p>

Would a browser just load these in order, so I would have to wait for
the 1MB image to load, then I'd see the text?

I ask because there are times I just can't tell. Of course, a local copy
of a website is always fast - it's on your hard drive. But there are
times I will load the same page over and over again (emptying the cache
in between reloads) and it seems to load in a different order each
time. Sometimes I see text first, sometimes I see images first.

Knowing this would also help me optimize pages to load faster.

Thanks to all who reply. I'm really having fun with this and am
enjoying learning. I may even go to a local university to see if there
are any classes available :>)

Viken K.

Jan 25 '06 #1
Viken Karaguesian wrote:
> Thanks to all who reply. I'm really having fun with this and am
> enjoying learning. I may even go to a local university to see if
> there are any classes available :>)


If you missed the thread on alt.www.webmaster about Uni classes, have a
read of "Web Design Teachers" from a few days ago:

<http://groups.google.com/group/alt.www.webmaster/browse_frm/thread/9260ab7b04c2a812/2c3e5a604de2f532?tvc=1&q=web+design+teachers+alt.www.webmaster&hl=en#2c3e5a604de2f532>

Tread carefully with the school. You're likely to learn much more useful
stuff by lurking and hanging around these newsgroups. Remember to ask
intelligent questions. <g>

--
-bts
-Warning: I brake for lawn deer
Jan 25 '06 #2
Viken Karaguesian wrote:
> How does a browser parse (correct term?) an HTML document. I'm sure
> that every browser does it a little differently. Do they simply just
> read a document top-to-bottom and left-to-right and just display
> elements in the order in which they encounter them? Or, do they give
> priority to certain types of content? For instance, would a browser
> display text first, then all the images, then all Javascripts, etc?
> Say, for example, I had a large image (let's say 1MB). The image is
> floated left of my paragraph, so this is my code:
>
> <img src="image.jpg" style="float: left">
> <p> blah blah blah, yadda yadda yadda </p>
>
> Would a browser just load these in order, so I would have to wait for
> the 1MB image to load, then I'd see the text?

The image will be downloaded in the background while the rest of the
document is loaded and parsed (the same happens for other external files
such as stylesheets and applets, but not for JavaScript files AFAIK). If
the image has WIDTH and HEIGHT attributes, the browser can reserve
the right amount of space on the page and the following text can flow
around it, even before the image is displayed. Without WIDTH/HEIGHT the
browser will sometimes render the following text as if there were no image
and then recalculate the layout once it knows the dimensions of the image.

> Also, knowing this would also help me optimize pages to load faster.


If you specify WIDTH/HEIGHT for your images, the other stuff on your page
will not 'jump' around, because the browser does not have to recalculate
the layout once the image arrives. The speed improvement might be minimal,
but it just looks much smoother.
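
As a rough sketch of that advice, here is the snippet from the question with explicit dimensions added. The values 400 and 300 are placeholders made up for illustration; they should be the real pixel size of image.jpg. The empty alt attribute is also an addition, not part of the original code.

```html
<!-- width and height are assumed values; use the image's actual pixel dimensions -->
<img src="image.jpg" width="400" height="300" alt="" style="float: left">
<p> blah blah blah, yadda yadda yadda </p>
```

With the dimensions in the markup, the browser can lay out the paragraph around a 400x300 box before any of the image data has arrived.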

--
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://www.odahoda.de/
Jan 25 '06 #3
> If you missed the thread on alt.www.webmaster about Uni classes, have a
> read of "Web Design Teachers" from a few days ago:
>
> <http://groups.google.com/group/alt.www.webmaster/browse_frm/thread/9260ab7b04c2a812/2c3e5a604de2f532?tvc=1&q=web+design+teachers+alt.www.webmaster&hl=en#2c3e5a604de2f532>


Yikes! Wow...I'm at a loss for words.

Viken K.

Jan 25 '06 #4
On Wed, 25 Jan 2006, Beauregard T. Shagnasty wrote:
> If you missed the thread on alt.www.webmaster about Uni classes,
> have a read of "Web Design Teachers" from a few days ago:


That's horrible... There was an analogous discussion on
uk.net.web.authoring a couple of weeks past, from someone who was
required to teach a web design module to a bizarre syllabus,
apparently written by someone who didn't really understand the WWW.

(An additional air of unreality was that they weren't allowed to use a
web server nor connect to the Internet. All the design and browsing
had to be done to the local filesystem.)

TimBL predicted, ages back, that hand-coding would rapidly go out of
fashion and be replaced by high-level design tools. Surely he
couldn't, in his worst nightmares, have imagined the kind of
preposterous HTML+CSS that would be extruded by (as far as I can see)
all of the currently widespread commercial tools - except when they
are in the hands of someone expert enough to keep them in check -
which, basically, means knowing /how/ to hand-code, even when not
actually /doing/ it.
Jan 25 '06 #5
In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
Alan J. Flavell <fl*****@ph.gla.ac.uk> wrote:
> TimBL predicted, ages back, that hand-coding would rapidly go out of
> fashion and be replaced by high-level design tools. Surely he
> couldn't, in his worst nightmares, have imagined the kind of
> preposterous HTML+CSS that would be extruded by (as far as I can see)
> all of the currently widespread commercial tools - except when they
> are in the hands of someone expert enough to keep them in check -
> which, basically, means knowing /how/ to hand-code, even when not
> actually /doing/ it.


I'm happy to say that all my web design work has been hand coded.
It's hard to avoid hand-coding when most of the pages are generated
dynamically by compiled C++ CGI programs or php scripts that perform
database manipulations.

The only times I've ever used a web page authoring tool were to try
something interesting and see what the code looked like; then I would
use my own cleaned-up version of the code in my own pages.

-A
Jan 25 '06 #6
In article <Pi*******************************@ppepc56.ph.gla.ac.uk>,
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote:
> TimBL predicted, ages back, that hand-coding would rapidly go out of
> fashion and be replaced by high-level design tools. Surely he
> couldn't, in his worst nightmares, have imagined the kind of
> preposterous HTML+CSS that would be extruded by (as far as I can see)
> all of the currently widespread commercial tools - except when they
> are in the hands of someone expert enough to keep them in check -
> which, basically, means knowing /how/ to hand-code, even when not
> actually /doing/ it.


Although there are many horrible examples of tools (and I still can't
find any tools I like except for StyleMaster), I do have the impression
that there is some gradual improvement. More important, at least some
of the people writing the tools now seem seriously interested in only
producing valid (X)HTML, and styling it with CSS. Since I don't want to
write HTML and CSS myself (and I especially don't want to write my own
CMS like I am doing at the moment), I looked through around 30 tools,
hoping to find something that gave results I liked.

On Macintosh, both Sandvox and Rapid Weaver (theme-based, drag and
drop) apparently produce valid code, and it looks clean. Sandvox is
still in beta, but I have great hopes for it in a few years. Rapid Weaver
has a longer history. Both are produced by either one person or a few people.

More important, in OS X, the Cocoa HTML generator (and command line
textutil) that is used by default by most programs to produce HTML seems
to me to actually be improving from release to release. For instance,
you can now write a RTF document (the default) with links, lists and
tables in TextEdit (the default editor) and get clean and valid HTML
4.01 Strict output styled with CSS. It does a reasonable job with
title, keywords, description and other meta in the head. It is not
semantic, has no concept of h1, h2, etc or document outline, and a lot
of stuff is inline styles. You can't include images, voice or movies in
your HTML output (they get saved as a web archive instead of HTML). But
each version has added facilities. I can see how you could add a
post-processing step to clean and fix most of that up, and use it with
existing CSS style sheets. So I hope for continued improvement.

Mind you, I don't know why iWeb output seems such a bloated disaster ...
but even that is using CSS and will validate, which is a whole heap
better than a lot of past tools.

If the web is ever to be full of clean, lean, valid code, it will have
to come from tool makers being persuaded that that is the way their
tools need to work. Converting individuals like me (and others who
chance upon this group) is largely (alas) a waste of effort.

--
http://www.ericlindsay.com
Jan 25 '06 #7
Viken Karaguesian wrote:
> How does a browser parse (correct term?) an HTML document. I'm sure
> that every browser does it a little differently.

Well, um... That's like asking the ultimate question of life, the
universe and everything. Nobody really knows for sure, although there
are some theories on the subject [1] with some relation to the
Heisenberg uncertainty principle. It varies significantly from browser
to browser, and then varies even more when you consider quirks mode;
though one thing we can be sure of is that no browser follows any
formally defined parsing rules.

> Do they simply just read a document top-to-bottom and left-to-right
> and just display elements in the order in which they encounter them?

They read through the source code from start to finish and attempt to
build a DOM along the way. As it is being built, each element in the
DOM is rendered on the screen according to the rules of CSS (in most
cases). For images and other objects, if the height and width are known
before the object is loaded, the screen real estate is reserved for it. If
they are unknown, the page will reflow when the intrinsic dimensions are found.

> Or, do they give priority to certain types of content?

No, they generally load it as quickly as possible as soon as they
receive it.

> For instance, would a browser display text first, then all the images,
> then all Javascripts, etc?


JavaScript in a script element is executed as soon as it is parsed
(except when the defer attribute is used).
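
A minimal sketch of the difference, assuming two hypothetical external files, first.js and later.js (neither name comes from this thread). How defer is treated depends on the browser, so take this as an illustration rather than a guarantee:

```html
<!-- Fetched and run as soon as the parser reaches this point;
     the markup below is not parsed until first.js has finished. -->
<script type="text/javascript" src="first.js"></script>

<!-- defer tells the browser this script does not write into the document,
     so a browser that honours it may keep parsing and run later.js afterwards. -->
<script type="text/javascript" src="later.js" defer></script>

<p>This paragraph has to wait for first.js, but not necessarily for later.js.</p>
```

Because a deferred script may run after the rest of the page has been parsed, it should not rely on document.write to add content.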

[1] http://ln.hixie.ch/?start=1137740632&count=1
http://ln.hixie.ch/?start=1138169545&count=1
http://ln.hixie.ch/?start=1037910467&count=1
--
Lachlan Hunt
http://lachy.id.au/
http://GetFirefox.com/ Rediscover the Web
http://GetThunderbird.com/ Reclaim your Inbox
Jan 25 '06 #8
On Wed, 25 Jan 2006 17:01:16 +0000, "Alan J. Flavell"
<fl*****@ph.gla.ac.uk> wrote:
> TimBL predicted, ages back, that hand-coding would rapidly go out of
> fashion and be replaced by high-level design tools.


AFAIR that quote only applied to HTML. He was thinking of
"off-the-shelf" CSS schemes a la ZenGarden that were hand-coded by
skilled designers, then massively re-used. The discrepancy seems to be
that the emergent web voted overwhelmingly for originality in design
rather than quality.
Jan 26 '06 #9
Lachlan,

Thanks for the great answer. It was exactly what I was looking for.
Thanks to everyone for replying.

--
Viken K.
http://home.comcast.net/~vikenk
Jan 26 '06 #10
Tim
On Wed, 25 Jan 2006 17:01:16 +0000, Alan J. Flavell sent:
> TimBL predicted, ages back, that hand-coding would rapidly go out of
> fashion and be replaced by high-level design tools. Surely he couldn't,
> in his worst nightmares, have imagined the kind of preposterous HTML+CSS
> that would be extruded by (as far as I can see) all of the currently
> widespread commercial tools - except when they are in the hands of someone
> expert enough to keep them in check - which, basically, means knowing
> /how/ to hand-code, even when not actually /doing/ it.


I seem to recall reading that it was the *intention* that HTML *would* be
machine generated (like most other document formats), but still human
readable (unlike most document formats).

The idea that you have to know how to build something by hand before you
can get a machine to do it for you is nothing new. You've only got to
look at how you train engineers, for example. Same for other skills, like
woodwork. I think the basic problem is that unskilled people think that
they can do skilled tasks just as well as an expert(*). God help us if
they take up an interest in first aid...

(*) I mean "expert," not "professional." A professional is merely someone who
gets paid to do something, as opposed to an "amateur" who doesn't. An
expert knows what they're doing.

--
If you insist on e-mailing me, use the reply-to address (it's real but
temporary). But please reply to the group, like you're supposed to.

This message was sent without a virus, please destroy some files yourself.

Jan 26 '06 #11
