Bytes IT Community

Reading Content Of Web Pages Using VB

P: 6
Hi all,

I need some code that can read the content of web pages using VB, that is, the text without the tags.

Alternatively, code that can remove all the tags from the page source after getting the source into a text file or a variable.

Please let me know if this is possible.

Mail me if you can at (<Removed by Moderator>)

Thanks in advance.
Apr 24 '07 #1
7 Replies


100+
P: 149
You can use the VB 6.0 Inet control to read the contents of the web page and then operate on them. Just do a Google search for "vb 6.0 Inet Control" and you will find information on this.
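
As a rough sketch of that suggestion: this assumes a form that already has an Inet control (Microsoft Internet Transfer Control) left at its default name Inet1. GetPageSource and its Url parameter are names made up here for illustration.

```vb
' Assumption: this code lives in a form that contains an Inet control named Inet1.
' GetPageSource is a hypothetical helper name, not part of the control.
Private Function GetPageSource(ByVal Url As String) As String
    ' OpenURL waits for the download to finish; with icString it
    ' returns the response body as text (tags included).
    GetPageSource = Inet1.OpenURL(Url, icString)
End Function
```

The returned string is the raw page source, so the tags still have to be removed afterwards.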

-ansuman sahu
Apr 24 '07 #2

P: 6
I am using the same control, but the problem is that it returns the page's source code, and I am having difficulty retrieving the main content from it.

For example:

<td width="99%" class="Small"> Also include <span class="SmallBold"> Resume Summary </span></td>

I want "Resume Summary".
This is just an example.

Thanks for the reply.
Apr 24 '07 #3

Robbie
100+
P: 180
If you simply want to remove all text between '<' and '>', I'll make a function for that. It shouldn't be very hard. ;)
After making the function: Okay, it was a little harder than I expected. ~_~;

Public Function StripHTMLTags(ByVal OriginalHTMLCode As String, Optional ByVal TagReplaceText As String = "") As String
'
'OriginalHTMLCode - HTML code to strip tags from
'TagReplaceText - What this function will put in place of each
'tag (by default, nothing - an empty string)
'
'Gives back the HTML code with tags replaced by TagReplaceText
'
    Dim StartTagPos As Long
    Dim TempTagPos As Long
    Dim CopyFromPos As Long
    Dim Depth As Long
    Dim TempChar As String
    Dim Result As String

    CopyFromPos = 1
    StartTagPos = InStr(1, OriginalHTMLCode, "<")

    While StartTagPos > 0
        'Keep the text that comes before this tag
        Result = Result & Mid$(OriginalHTMLCode, CopyFromPos, StartTagPos - CopyFromPos)

        'Keep searching until same number of open tags and close tags
        'have been found (i.e. until nested tags finish >_<)
        Depth = 1
        TempTagPos = StartTagPos + 1
        While Depth > 0 And TempTagPos <= Len(OriginalHTMLCode)
            TempChar = Mid$(OriginalHTMLCode, TempTagPos, 1)
            If TempChar = "<" Then Depth = Depth + 1
            If TempChar = ">" Then Depth = Depth - 1
            TempTagPos = TempTagPos + 1
        Wend

        'TempTagPos now points just past the end of the tag
        Result = Result & TagReplaceText
        CopyFromPos = TempTagPos
        StartTagPos = InStr(CopyFromPos, OriginalHTMLCode, "<")
    Wend

    'Keep whatever follows the last tag
    StripHTMLTags = Result & Mid$(OriginalHTMLCode, CopyFromPos)
End Function

Here's an example of how to use it and what it does.
Text1.Text is:
<html>
<b>Hi!!</b>
Here's <i>more</i>.
</html>
Yep.

Execute this:
Text2.Text = StripHTMLTags(Text1.Text)

Text2.Text is now:

Hi!!
Here's more.

Yep.

Hope it's what you needed. :)
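
If only the text inside one particular tag is wanted, as in the "Resume Summary" example earlier in the thread, a small InStr-based helper may be enough. This is only a sketch: TextBetween and Demo are hypothetical names, and it assumes the opening tag text is known exactly and that the first match is the one you want.

```vb
' Hypothetical helper: returns the trimmed text between the first
' StartMarker and the next EndMarker, or "" if either one is missing.
Public Function TextBetween(ByVal Source As String, ByVal StartMarker As String, ByVal EndMarker As String) As String
    Dim StartPos As Long
    Dim EndPos As Long

    ' Find the opening marker (case-insensitive) and skip past it
    StartPos = InStr(1, Source, StartMarker, vbTextCompare)
    If StartPos = 0 Then Exit Function
    StartPos = StartPos + Len(StartMarker)

    ' Find the closing marker that follows it
    EndPos = InStr(StartPos, Source, EndMarker, vbTextCompare)
    If EndPos = 0 Then Exit Function

    TextBetween = Trim$(Mid$(Source, StartPos, EndPos - StartPos))
End Function

Public Sub Demo()
    Dim Cell As String
    Cell = "<td width=""99%"" class=""Small""> Also include <span class=""SmallBold""> Resume Summary </span></td>"
    Debug.Print TextBetween(Cell, "<span class=""SmallBold"">", "</span>")   ' prints: Resume Summary
End Sub
```

Matching on fixed strings like this is fragile if the page's markup changes, so treat it as a quick fix rather than a real HTML parser.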
Apr 25 '07 #4

Expert 5K+
P: 8,434
...
Mail if can to (<Removed by Moderator>)
Hi.

Just a note to let you know I've removed your e-mail address from the post. See the posting guidelines.
Apr 25 '07 #5

Robbie
100+
P: 180
Err, Killer, it's still in the second post by ansumansahu. ;)
Apr 25 '07 #6

Expert 5K+
P: 8,434
No it isn't. :p
Apr 25 '07 #7

P: 6
OK, I have solved the problem of removing the tags. I did it by saving the source to a text file and then removing the tags.
But what I need now is to store the source code in a String variable using the Inet or WebBrowser control. That is possible, but I think the variable has some limit on the number of characters.

So is there any other way to store the source code in a variable?
Thanks,
Nilesh Patil
Apr 26 '07 #8
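
For what it's worth, a VB6 variable-length String can hold roughly 2 billion characters, so the String itself is unlikely to be the limit. The truncation may instead come from how the page is downloaded or displayed; that cause is an assumption, not something confirmed in this thread. A sketch that reads the response in chunks, assuming a form with an Inet control named Inet1 and a hypothetical function name GetWholePage:

```vb
' Assumption: this code lives in a form that contains an Inet control named Inet1.
' GetWholePage is a hypothetical helper name.
Private Function GetWholePage(ByVal Url As String) As String
    Dim Chunk As String
    Dim Result As String

    ' Start the request and wait until the control has finished it
    Inet1.Execute Url, "GET"
    Do While Inet1.StillExecuting
        DoEvents
    Loop

    ' Pull the buffered response out 1024 characters at a time
    ' until GetChunk returns an empty string
    Do
        Chunk = Inet1.GetChunk(1024, icString)
        Result = Result & Chunk
    Loop While Len(Chunk) > 0

    GetWholePage = Result
End Function
```

The result can then be passed straight to a tag-stripping routine without going through a text file.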
