Find text within HTML file

Hi

I have a keyword and need to search an HTML file for it, ignoring all
the tags and checking only the plain text.
Is there an easy way to do this in C#?

Thanks
PK

Sep 8 '07 #1
Hi Piotrekk,

For example, System.IO.File.ReadAllText(@"C:\text.txt").Contains("something")

Regards, Alex
[TechBlog] http://devkids.blogspot.com

Sep 8 '07 #2
This will not do what I asked for.
That method only opens the file and reads its text, markup included. I need
to find text between the HTML tags, i.e. the text visible to a user opening
the page.

Sep 8 '07 #3
Hi Alex,

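Hmm, yeah, sorry. The simplest way is to match a Regex like "search_string(?=[^>]*<)",
which only accepts a match if the next angle bracket after it is "<" rather than ">".
Anything beyond that depends on the properties of the HTML (whether it is valid,
which tags should be ignored, and so on).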

Regards, Alex
[TechBlog] http://devkids.blogspot.com

Sep 8 '07 #4
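
For anyone trying the lookahead pattern from reply #4, here is a minimal C# sketch. The names VisibleTextSearch and ContainsOutsideTags are made up for illustration, and the Regex.Escape call is an addition that is not in the original reply; it is there so that keywords containing regex metacharacters are matched literally.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class VisibleTextSearch
{
    // Returns true if keyword occurs at a position where, scanning to the
    // right, a '<' is reached before any '>'. Inside a tag the next angle
    // bracket is '>', so such occurrences are rejected.
    public static bool ContainsOutsideTags(string html, string keyword)
    {
        // Regex.Escape is an assumption added here so the keyword is literal.
        string pattern = Regex.Escape(keyword) + "(?=[^>]*<)";
        return Regex.IsMatch(html, pattern);
    }
}
```

Called as VisibleTextSearch.ContainsOutsideTags(html, "note"), this skips a "note" inside an attribute such as class="note" but finds it in the text of a paragraph. Be aware of the limits: text after the last tag is never matched because no "<" follows it, and a ">" inside an attribute value, a comment, or a script block can fool the pattern.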
You could use Regex.Replace with a suitable pattern to strip all the HTML
tags from the page's text, but that might not even be necessary, since it is
unlikely your keyword will be found in HTML tag names or attributes.
Have you tried just:
int foundPosition = myHtmlString.IndexOf(keyWord);
This returns the position of the first occurrence of the keyword, or -1 if
it is not found.
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com

Sep 8 '07 #5
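
The Regex.Replace idea from reply #5 could look roughly like the sketch below. The names HtmlTextSearch, StripTags and IndexOfInText are hypothetical. Dropping script/style blocks, decoding entities with WebUtility.HtmlDecode, and the case-insensitive comparison are all assumptions added for illustration, not something the reply specified.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HtmlTextSearch
{
    // Crude tag stripping with regular expressions; fine for simple pages,
    // not a substitute for a real HTML parser.
    public static string StripTags(string html)
    {
        // Assumption: script and style contents are not visible text, so
        // remove those elements together with their bodies.
        string noScripts = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space.
        string noTags = Regex.Replace(noScripts, @"<[^>]+>", " ");

        // Turn entities such as &amp; back into the characters a user sees.
        return WebUtility.HtmlDecode(noTags);
    }

    // Position of keyword in the stripped text, or -1 if it is not found.
    // The position refers to the stripped text, not to the original HTML.
    public static int IndexOfInText(string html, string keyword)
    {
        return StripTags(html).IndexOf(keyword, StringComparison.OrdinalIgnoreCase);
    }
}
```

To search a file, pass System.IO.File.ReadAllText(path) as the html argument. Tags are replaced with a space rather than removed so that words from adjacent blocks do not run together; the trade-off is that a keyword split by an inline tag, as in foo<b>bar</b>, will not be found as "foobar".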
On Sat, 08 Sep 2007 03:01:15 -0700, Piotrekk
<Pi*************@gmail.com> wrote:
>Hi
>
>Having a keyword i need to search HTML file for keyword dismissing all
>the tags, and checking only plain text.
>Is there an easy way to do it in C#?
>
>Thanks
>PK

Have a look at
http://www.codeplex.com/Wiki/View.as...tmlagilitypack
It is a lovely bit of freeware that will parse just about any HTML,
including markup that is not well-formed XML.
It is great for wandering around inside an HTML page, plucking out and
inserting field values.
hth
Bob
Sep 9 '07 #6
