Bytes IT Community

Find text within HTML file

Hi

Given a keyword, I need to search an HTML file for it, ignoring all
the tags and checking only the plain text.
Is there an easy way to do this in C#?

Thanks
PK

Sep 8 '07 #1
5 Replies


Hi Piotrekk,

For example, System.IO.File.ReadAllText(@"C:\text.txt").Contains("something")

Regards, Alex
[TechBlog] http://devkids.blogspot.com

Sep 8 '07 #2

This will not do what I asked for.
This method only opens the file and reads its text. I need to find text
between the HTML tags, that is, the text visible to the user opening the page.

Sep 8 '07 #3

Hi Alex,

Hmm, yeah, sorry. The simplest way is to match a regex like "search_string(?=[^>]*<)".
Anything beyond that depends on the properties of the HTML (whether it is valid,
which tags should be ignored, and so on).
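
A minimal sketch of that lookahead idea, in case it helps. The class and method names (LookaheadSearch, IsInText) are made up for illustration. The pattern only accepts a hit when the next angle bracket after it is "<", so a keyword sitting inside a tag is rejected. It assumes reasonably clean markup: text after the last tag is never matched, and script or style contents still count as text.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class LookaheadSearch
{
    // Returns true if 'keyword' occurs somewhere outside a tag,
    // judged only by the rule "the next angle bracket is '<'".
    public static bool IsInText(string html, string keyword)
    {
        // Regex.Escape keeps characters such as '.' or '(' in the keyword literal.
        string pattern = Regex.Escape(keyword) + "(?=[^>]*<)";
        return Regex.IsMatch(html, pattern);
    }
}
```

For the input <p class="title">Welcome to the title page</p>, IsInText(html, "class") is false, because "class" only appears inside the tag, while IsInText(html, "Welcome") is true.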

Regards, Alex
[TechBlog] http://devkids.blogspot.com

Sep 8 '07 #4

You could use Regex.Replace with a suitable regular expression to
"clean" all the HTML tags from the text of the HTML page, but that
might not even be necessary, since it is unlikely your keyword will be found
in HTML tag names or attributes.
Have you tried just:
int foundPosition = myHtmlString.IndexOf(keyWord);
This returns the position of the first occurrence of the keyword, or -1 if it is not found.
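
A rough sketch of the "clean the tags first, then search" variant. The names TagStripper, StripTags and IndexOfVisibleText are invented for this example, and the tag pattern is a simple assumption that will be fooled by comments containing ">" and will not drop script or style contents. Each tag is replaced by a space so that words on either side of a tag do not run together, and WebUtility.HtmlDecode turns entities such as &amp;amp; back into plain characters. The case-insensitive comparison is a choice made here, not something the question asked for.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class TagStripper
{
    // Replaces every <...> run with a space and decodes HTML entities.
    public static string StripTags(string html)
    {
        string noTags = Regex.Replace(html, @"<[^>]+>", " ");
        return WebUtility.HtmlDecode(noTags);
    }

    // Position of the keyword within the stripped text, or -1 if absent.
    public static int IndexOfVisibleText(string html, string keyword)
    {
        return StripTags(html).IndexOf(keyword, StringComparison.OrdinalIgnoreCase);
    }
}
```

With <a href="keyword.html">Click here</a> as input, a plain IndexOf("keyword") on the raw string finds the attribute value, while IndexOfVisibleText(html, "keyword") returns -1. Note that the returned position refers to the stripped text, not to the original HTML.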
-- Peter
Recursion: see Recursion
site: http://www.eggheadcafe.com
unBlog: http://petesbloggerama.blogspot.com
BlogMetaFinder: http://www.blogmetafinder.com


Sep 8 '07 #5

On Sat, 08 Sep 2007 03:01:15 -0700, Piotrekk
<Pi*************@gmail.com> wrote:
>Hi
>
>Having a keyword i need to search HTML file for keyword dismissing all
>the tags, and checking only plain text.
>Is there an easy way to do it in C#?
>
>Thanks
>PK

Have a look at
http://www.codeplex.com/Wiki/View.as...tmlagilitypack
It is a lovely bit of freeware that will parse just about any XML
structure.
It is great for wandering around inside an HTML page, plucking out and
inserting field values.
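
A short sketch of how the search might look with that library, assuming the HtmlAgilityPack package is referenced in the project (it is third-party, not part of .NET). The class and method names AgilitySearch and ContainsVisibleText are invented here. Dropping script and style elements and comparing case-insensitively are choices made for this example, and exactly what InnerText includes (comments, for instance) may vary between library versions.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // third-party package, assumed to be installed

// Hypothetical helper, not part of the library.
public static class AgilitySearch
{
    public static bool ContainsVisibleText(string html, string keyword)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null when nothing matches, so guard before iterating.
        var hidden = doc.DocumentNode.SelectNodes("//script|//style");
        if (hidden != null)
        {
            // Copy to a list first so removing nodes cannot disturb the iteration.
            foreach (var node in hidden.ToList())
            {
                node.Remove();
            }
        }

        // InnerText concatenates the remaining text nodes; DeEntitize decodes entities.
        string text = HtmlEntity.DeEntitize(doc.DocumentNode.InnerText);
        return text.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}
```

To search a file rather than a string, the contents can be read with System.IO.File.ReadAllText and passed in as the html argument.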
hth
Bob
Sep 9 '07 #6
