
DOM Model - HTML Parser

HI ALL

I would like to know if anyone knows how to traverse document.body (or use any other method) to get the contents of the current page. It must be general enough that I can traverse any HTML page using the same algorithm, because this will be included in a Firefox extension.

THANKS A LOT!
Apr 27 '06 #1
If by "traverse" you mean visit each node, you can use "node.childNodes" to get the collection of child nodes for that node, or step through them with "node.firstChild" and "node.nextSibling". If you only need the elements, you can use "document.getElementsByTagName("*")". That returns every element on the page in document order, but it does not include text or comment nodes.
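A minimal sketch of the childNodes approach, assuming a recursive depth-first walk is acceptable for the page sizes involved. The names `walk` and `visit` are made up for illustration; only `childNodes` and `nodeName` come from the DOM.

```javascript
// Visit every node under `root` in document order (depth-first, pre-order).
// `walk` and `visit` are hypothetical names; `visit` receives each node
// together with its depth below the starting node.
function walk(root, visit, depth = 0) {
  visit(root, depth);
  // childNodes includes text and comment nodes, not just elements.
  for (const child of Array.from(root.childNodes)) {
    walk(child, visit, depth + 1);
  }
}

// Possible use in a page or extension script (left commented out because
// it needs a real `document`):
// walk(document.body, (node, depth) => {
//   console.log(" ".repeat(depth) + node.nodeName);
// });
```

Because it only relies on `childNodes`, the same function should work on any HTML page without knowing its structure in advance.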

If you just want to see the source, then once you have the HTMLHtmlElement object (document.documentElement) you can use "node.innerHTML". This gives you all the markup between "<html>" and "</html>", serialized from the current DOM. The best part is that it reflects any changes made by JavaScript. The browser's view source option only shows the document before JavaScript has its way.
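As a small sketch of that idea, assuming `doc` is a DOM Document such as `document` in the page (the function name `liveMarkup` is made up for illustration):

```javascript
// Return the markup between <html> and </html> as the DOM currently stands.
// `liveMarkup` is a hypothetical helper name; `doc` is assumed to be a
// DOM Document.
function liveMarkup(doc) {
  const root = doc.documentElement; // the HTMLHtmlElement
  return root.innerHTML;
}

// Possible use in a page or extension script (needs a real `document`):
// console.log(liveMarkup(document));
```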
Nov 9 '06 #2
Hi,
I am also looking to parse an HTML DOM. By parse I mean I want to extract all the tags that the DOM contains (div, span, body, etc.) along with their classes and ids, and then store them in a hash table. Has anyone done this before, or does anyone have a clue how it's done?
Any help is greatly appreciated.
Thanks
Dec 9 '06 #3
mltsy
Are you saying you would like to use JavaScript on a page to examine all the contents of the page after the DOM has been loaded? The childNodes collection would be the way to do that. If you mean you want to use some other language to parse a string or file containing an HTML document, that's a different story. That's actually what I'm looking to do, but I'm not finding much help so far.
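For the first case (JavaScript running in the page), a rough sketch of collecting tags, classes, and ids into a hash table could look like the following. The function name `collectTags` and the shape of each table entry are assumptions made for illustration, and a `Map` stands in for the hash table.

```javascript
// Build a table keyed by lowercase tag name. Each entry records how many
// times the tag occurs plus the distinct ids and classes seen on it.
// `collectTags` and the entry layout are hypothetical, not a standard API.
function collectTags(root) {
  const table = new Map();
  const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE

  (function visit(node) {
    if (node.nodeType === ELEMENT_NODE) {
      const tag = node.nodeName.toLowerCase();
      let entry = table.get(tag);
      if (!entry) {
        entry = { count: 0, ids: new Set(), classes: new Set() };
        table.set(tag, entry);
      }
      entry.count += 1;
      if (node.id) entry.ids.add(node.id);
      // className is a space-separated string on HTML elements; on SVG
      // elements it is an object, so anything that is not a string is skipped.
      const cls = typeof node.className === "string" ? node.className : "";
      for (const name of cls.split(/\s+/).filter(Boolean)) {
        entry.classes.add(name);
      }
    }
    for (const child of Array.from(node.childNodes || [])) visit(child);
  })(root);

  return table;
}

// Possible use in a page or extension script (needs a real `document`):
// const table = collectTags(document.documentElement);
// console.log(table.get("div"));
```

Text and comment nodes are walked but not recorded, since they have no tag, class, or id.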
Aug 23 '07 #4
