
HtmlDocument

Hi, I want to write a program. Its input is the HTML source code of a web
page and its output is a TreeView representation of the page's structure.
I want to write it with HtmlDocument in .NET Framework 2.0. How do I write it?
Sep 26 '06 #1
9 Replies


Le Minh,

Well, the HtmlDocument is the root. It contains references to all the
elements in the tree. It's just a matter of enumerating those elements and
performing whatever operation you want on them.

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Sep 26 '06 #2

Hi,

In order to populate an HtmlDocument you need to load your HTML into a WebBrowser control. Then, using the WebBrowser.Document
property, which returns an HtmlDocument, you can iterate over the nodes as Nicholas suggested and populate a TreeView control.
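
As a rough, untested sketch of that approach: the form below assigns the HTML source to WebBrowser.DocumentText, waits for the DocumentCompleted event, and then lists every element from HtmlDocument.All as a flat list of tag names. The class name HtmlListForm and the sample markup are made up for illustration, and the flat list only demonstrates the enumeration step, not the nesting.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form for illustration; not part of the original posts.
public class HtmlListForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser();
    private readonly TreeView treeView1 = new TreeView();
    private readonly string html;

    public HtmlListForm(string html)
    {
        this.html = html;

        // the Fill-docked control is added first so the Left-docked tree gets its space
        webBrowser1.Dock = DockStyle.Fill;
        treeView1.Dock = DockStyle.Left;
        Controls.Add(webBrowser1);
        Controls.Add(treeView1);

        webBrowser1.DocumentCompleted += webBrowser1_DocumentCompleted;
    }

    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        // hand the raw HTML source to the browser; parsing finishes asynchronously
        webBrowser1.DocumentText = html;
    }

    private void webBrowser1_DocumentCompleted(object sender,
        WebBrowserDocumentCompletedEventArgs e)
    {
        // clear first, so the list stays correct if the event fires more than once
        treeView1.Nodes.Clear();

        HtmlDocument document = webBrowser1.Document;
        if (document == null)
        {
            return;
        }

        // HtmlDocument.All is a flat collection of every element in the page
        foreach (HtmlElement element in document.All)
        {
            treeView1.Nodes.Add(element.TagName);
        }
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new HtmlListForm(
            "<html><body><p>Hello <b>world</b></p></body></html>"));
    }
}
```

Setting DocumentText suits the case where the input is already a string of HTML; setting WebBrowser.Url is the alternative when the page lives at an address.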

--
Dave Sexton

Sep 26 '06 #3

Can you show me the way? I'm confused about this. You can write a small app,
can't you?

Sep 26 '06 #4

Hi Le Minh,

I'm not sure where to begin. Do you have Visual Studio.NET 2005? Are you familiar with WinForms applications and Controls? Are
you familiar with the TreeView or WebBrowser controls? Are you familiar with handling events? If you don't understand some or any
of the above questions then I'm probably not going to be able to help you short of writing the entire application, which I won't do,
but I'll try to give you some guidance.

That said, here's the general idea if you're using Visual Studio.NET (any edition, including Express):

1. Create a new Windows application project.
2. Add a WebBrowser control from the toolbox onto Form1. (You can position it however you'd like)
3. Add a TreeView control from the toolbox onto Form1. (You can position it however you'd like)
4. Set the WebBrowser.Url property to the url of your html document (this is the easiest way to load the document).
5. Create an event handler for the WebBrowser.DocumentCompleted event. It should look like the following:

private void webBrowser1_DocumentCompleted(object sender,
    WebBrowserDocumentCompletedEventArgs e)
{
    // make sure that the TreeView is cleared in case the browser navigates to another url
    treeView1.Nodes.Clear();

    // create the root node for the TreeView, which will contain all other nodes
    TreeNode rootNode = treeView1.Nodes.Add("Root");

    // Fill the TreeView, recursively, starting from the root node
    FillTreeViewRecursively(rootNode, webBrowser1.Document.All);
}

6. Create the FillTreeViewRecursively method:

private void FillTreeViewRecursively(TreeNode currentNode, HtmlElementCollection elements)
{
    // loop through the specified collection of elements and create their respective nodes
    foreach (HtmlElement element in elements)
    {
        // create a new node for the current element and add it under the currentNode
        TreeNode node = currentNode.Nodes.Add(element.TagName);

        // optional: store a reference to the element that this node represents
        node.Tag = element;

        // create the nodes under this node for the elements contained by the current element
        FillTreeViewRecursively(node, element.All);
    }
}

Please be aware that I didn't try to build this code and so it will probably need some modifications.

--
Dave Sexton

Sep 26 '06 #6

thanks for your help. Let me try!
Sep 26 '06 #7

I tried it, but something is not right. It seems that values are
duplicated in the tree.
I think it may be the recursive function. Why do we need to use a recursive
function here? Is there any other way?
Sep 26 '06 #8

Hi Le Minh,

As I mentioned, I didn't test that code. You might have to make some modifications.

The recursive method is necessary because you can't predict how deep the tree structure is before it needs to be parsed. Your other
option is to hard-code a fixed number of nested loops, which allows only a certain depth to be parsed. Not very dynamic, though.
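
For completeness, a traversal of unknown depth can also be written without recursion by keeping the pending elements on an explicit stack. The following is only an untested sketch: the HtmlTreeBuilder class and its Pending struct are hypothetical names, and it walks HtmlElement.Children (direct children only) rather than All.

```csharp
using System.Collections.Generic;
using System.Windows.Forms;

// Hypothetical helper for illustration; not part of the original posts.
public static class HtmlTreeBuilder
{
    // pairs an HTML element with the tree node that will receive its children
    private struct Pending
    {
        public readonly HtmlElement Element;
        public readonly TreeNode Node;

        public Pending(HtmlElement element, TreeNode node)
        {
            Element = element;
            Node = node;
        }
    }

    public static void Fill(TreeNode rootNode, HtmlElement rootElement)
    {
        Stack<Pending> pending = new Stack<Pending>();

        // create the node for the starting element and queue it for processing
        TreeNode firstNode = rootNode.Nodes.Add(rootElement.TagName);
        firstNode.Tag = rootElement;
        pending.Push(new Pending(rootElement, firstNode));

        while (pending.Count > 0)
        {
            Pending current = pending.Pop();

            // nodes are added in document order; the stack only decides
            // which element has its children expanded next
            foreach (HtmlElement child in current.Element.Children)
            {
                TreeNode childNode = current.Node.Nodes.Add(child.TagName);
                childNode.Tag = child;
                pending.Push(new Pending(child, childNode));
            }
        }
    }
}
```

It could be called from the DocumentCompleted handler as HtmlTreeBuilder.Fill(rootNode, webBrowser1.Document.Body), assuming the document has a body. The stack grows with the size of the page instead of the call stack growing with its depth, which is the only real difference from the recursive version.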

What value is being duplicated in the tree?
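
If the duplicated values are tag names that reappear at deeper levels, one likely cause is the use of the All property: HtmlDocument.All and HtmlElement.All return every element underneath the current one, not just its direct children, so each descendant is added once per ancestor. An untested sketch of a variant that recurses over HtmlElement.Children instead is shown here. The HtmlTreeFiller class and its method names are hypothetical, and starting from Body.Parent to reach the html element is an assumption about how the document is shaped.

```csharp
using System.Windows.Forms;

// Hypothetical helper for illustration; not part of the original posts.
public static class HtmlTreeFiller
{
    public static void FillFromDocument(TreeView treeView, HtmlDocument document)
    {
        // clear the tree in case the browser has navigated to another page
        treeView.Nodes.Clear();

        if (document == null || document.Body == null)
        {
            return;
        }

        TreeNode rootNode = treeView.Nodes.Add("Root");

        // start at the parent of <body> (normally <html>), falling back to <body>
        HtmlElement top = document.Body.Parent ?? document.Body;
        AddElement(rootNode, top);
    }

    private static void AddElement(TreeNode parentNode, HtmlElement element)
    {
        // one node per element, placed under the node of its parent element
        TreeNode node = parentNode.Nodes.Add(element.TagName);
        node.Tag = element;

        // Children holds only the direct children, so each element is visited once
        foreach (HtmlElement child in element.Children)
        {
            AddElement(node, child);
        }
    }
}
```

From the DocumentCompleted handler this would be called as HtmlTreeFiller.FillFromDocument(treeView1, webBrowser1.Document).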

--
Dave Sexton

Sep 26 '06 #9

Le Minh,

XML was created to overcome the problems with the unstructured
format of HTML.

It was, and is, impossible to use HTML code in a structured way. Therefore I
am a little curious about what you want to do. The way you ask this,
there can, in my opinion, be no answer.

Cor

Sep 27 '06 #10
