HtmlDocument

Hi, I want to write a program. Its input is the HTML source code of a web
page and its output is a TreeView representation of the page's structure.
I want to write it with HtmlDocument in .NET Framework 2.0. How do I write it?
Sep 26 '06 #1
Le Minh,

Well, the HtmlDocument is the root. It contains references to all the
elements in the tree. It's just a matter of enumerating those elements and
performing whatever operation you want on them.

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Sep 26 '06 #2
Hi,

In order to populate an HtmlDocument you need to load your HTML into a WebBrowser control. Then, using the WebBrowser.Document
property, which returns an HtmlDocument, you can iterate over the nodes as Nicholas suggested and populate a TreeView control.
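
Since the original question starts from HTML source rather than a URL, here is a minimal sketch of loading a string into the control. It assumes the source is already in memory; the class name HtmlSourceForm and the sample markup are made up for illustration. WebBrowser.DocumentText, WebBrowser.Document and the DocumentCompleted event are the real WinForms members. The handler only reports an element count; building the tree is covered later in the thread.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form used only for this sketch.
class HtmlSourceForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();
    private readonly TreeView tree = new TreeView();
    private readonly string htmlSource;

    public HtmlSourceForm(string htmlSource)
    {
        this.htmlSource = htmlSource;

        tree.Dock = DockStyle.Left;
        browser.Dock = DockStyle.Fill;
        Controls.Add(browser);
        Controls.Add(tree);

        // The document is only available once loading has finished.
        browser.DocumentCompleted += OnDocumentCompleted;
    }

    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        // Load the HTML from a string instead of navigating to a URL.
        browser.DocumentText = htmlSource;
    }

    private void OnDocumentCompleted(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        HtmlDocument document = browser.Document;

        // Clear first: the event can be raised more than once.
        tree.Nodes.Clear();
        tree.Nodes.Add("Elements in document: " + document.All.Count);
    }

    [STAThread]
    static void Main()
    {
        // Sample input, an assumption for the sketch.
        Application.Run(new HtmlSourceForm(
            "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"));
    }
}
```

Note that the control parses the markup the way Internet Explorer does, so the resulting element tree may differ from the literal source (for example, missing tags may be added).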

--
Dave Sexton

Sep 26 '06 #3
Can you show me how? I'm confused about this. You could write a small app,
couldn't you?

Sep 26 '06 #4
Hi Le Minh,

I'm not sure where to begin. Do you have Visual Studio 2005? Are you familiar with WinForms applications and Controls? Are
you familiar with the TreeView or WebBrowser controls? Are you familiar with handling events? If you don't understand some or any
of the above questions then I'm probably not going to be able to help you short of writing the entire application, which I won't do,
but I'll try to give you some guidance.

That said, here's the general idea if you're using Visual Studio 2005 (any edition, including Express):

1. Create a new Windows application project.
2. Add a WebBrowser control from the toolbox onto Form1. (You can position it however you'd like)
3. Add a TreeView control from the toolbox onto Form1. (You can position it however you'd like)
4. Set the WebBrowser.Url property to the URL of your HTML document (this is the easiest way to load the document).
5. Create an event handler for the WebBrowser.DocumentCompleted event. It should look like the following:

    private void webBrowser1_DocumentCompleted(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        // make sure that the TreeView is cleared in case the browser navigates to another url
        treeView1.Nodes.Clear();

        // create the root node for the TreeView, which will contain all other nodes
        TreeNode rootNode = treeView1.Nodes.Add("Root");

        // Fill the TreeView, recursively, starting from the root node
        FillTreeViewRecursively(rootNode, webBrowser1.Document.All);
    }

6. Create the FillTreeViewRecursively method:

    private void FillTreeViewRecursively(TreeNode currentNode, HtmlElementCollection elements)
    {
        // loop through the specified collection of elements and create their respective nodes
        foreach (HtmlElement element in elements)
        {
            // create a new node for the current element and add it under the currentNode
            TreeNode node = currentNode.Nodes.Add(element.TagName);

            // optional: store a reference to the element that this node represents
            node.Tag = element;

            // create the nodes under this node for the elements contained by the current element
            FillTreeViewRecursively(node, element.All);
        }
    }

Please be aware that I didn't try to build this code and so it will probably need some modifications.

--
Dave Sexton

Sep 26 '06 #6
Thanks for your help. Let me try!
Sep 26 '06 #7
I tried it, but something is not right. It seems that values are
duplicated in the tree.
I think it may be the recursive function. Why do we need a recursive
function here? Is there any other way?
Sep 26 '06 #8
Hi Le Minh,

As I mentioned, I didn't test that code. You might have to make some modifications.

The recursive method is used because you can't predict how deep the tree structure is before it is parsed. Without recursion you
would need either an explicit stack or a hard-coded number of nested loops, and the latter only handles a fixed depth. Not very dynamic.
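
For the "is there any other way?" question, the walk can be sketched without recursion by keeping the pending elements on an explicit stack. This is an untested sketch, not code from the thread: the helper class HtmlTreeFiller and its Fill method are hypothetical names, and it walks HtmlElement.Children (direct children only) starting from a single root element, which is a deliberate difference from the method posted earlier.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper, named for this sketch only.
static class HtmlTreeFiller
{
    public static void Fill(TreeNode rootNode, HtmlElement rootElement)
    {
        // Each entry pairs a TreeNode with the element whose children
        // still have to be added beneath it.
        Stack<KeyValuePair<TreeNode, HtmlElement>> pending =
            new Stack<KeyValuePair<TreeNode, HtmlElement>>();

        TreeNode first = rootNode.Nodes.Add(rootElement.TagName);
        first.Tag = rootElement;
        pending.Push(new KeyValuePair<TreeNode, HtmlElement>(first, rootElement));

        while (pending.Count > 0)
        {
            KeyValuePair<TreeNode, HtmlElement> current = pending.Pop();

            // Nodes are created here, in document order, so the order in
            // which the stack is later processed does not reorder siblings.
            foreach (HtmlElement child in current.Value.Children)
            {
                TreeNode childNode = current.Key.Nodes.Add(child.TagName);
                childNode.Tag = child;
                pending.Push(new KeyValuePair<TreeNode, HtmlElement>(childNode, child));
            }
        }
    }
}
```

From the DocumentCompleted handler it could be called as HtmlTreeFiller.Fill(rootNode, webBrowser1.Document.GetElementsByTagName("HTML")[0]), assuming the loaded page has an HTML element. The stack grows with the size of the page rather than with call depth, which is the main practical difference from the recursive version.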

What value is being duplicated in the tree?
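
A likely cause of the duplication, inferred from the documented behaviour of the properties rather than confirmed in the thread: HtmlDocument.All and HtmlElement.All return every element underneath the document or element, at any depth, not just the direct children. Recursing over All therefore adds a deeply nested element once for each of its ancestors. The sketch below models this without WinForms; Node is a hypothetical stand-in for HtmlElement, and the sample tree is made up.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for HtmlElement: a tag name plus child nodes.
class Node
{
    public readonly string Tag;
    public readonly List<Node> Children = new List<Node>();

    public Node(string tag, params Node[] children)
    {
        Tag = tag;
        Children.AddRange(children);
    }

    // Models "All": every descendant at every depth, in document order.
    public List<Node> All
    {
        get
        {
            List<Node> result = new List<Node>();
            foreach (Node child in Children)
            {
                result.Add(child);
                result.AddRange(child.All);
            }
            return result;
        }
    }
}

static class Demo
{
    // Counts the tree nodes that recursing over All would create.
    public static int CountViaAll(List<Node> elements)
    {
        int count = 0;
        foreach (Node element in elements)
            count += 1 + CountViaAll(element.All);
        return count;
    }

    // Counts the tree nodes that recursing over Children would create.
    public static int CountViaChildren(List<Node> elements)
    {
        int count = 0;
        foreach (Node element in elements)
            count += 1 + CountViaChildren(element.Children);
        return count;
    }

    // Sample document with five elements.
    public static Node BuildSample()
    {
        return new Node("HTML",
            new Node("HEAD", new Node("TITLE")),
            new Node("BODY", new Node("P")));
    }

    static void Main()
    {
        Node html = BuildSample();

        // Models Document.All: the root element plus all of its descendants.
        List<Node> documentAll = new List<Node>();
        documentAll.Add(html);
        documentAll.AddRange(html.All);

        List<Node> rootOnly = new List<Node>();
        rootOnly.Add(html);

        Console.WriteLine(CountViaAll(documentAll));    // prints 13
        Console.WriteLine(CountViaChildren(rootOnly));  // prints 5
    }
}
```

If that is what is happening, passing element.Children instead of element.All in the recursive call, and starting from the root element's children rather than from Document.All, should give one tree node per element.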

--
Dave Sexton

Sep 26 '06 #9
Le Minh,

XML was created (invented) to overcome the problems with the unstructured
format of HTML.

It was, and is, impossible to use HTML code in a structured way. Therefore I
am a little curious about what you want to do. The way you ask this,
there can, in my view, be no answer.

Cor

Sep 27 '06 #10
