473,382 Members | 1,648 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,382 software developers and data experts.

parsing HTML programmatically

Is there an easy way to convert HTML that comes from database as a
string into text and display it on winform...

Thanks.

Nov 21 '05 #1
1 963
A year ago I wrote a spider that crawls around http://www.wtng.info and
extracts all the country codes, country names, and telephone numbering
information. I haven't looked at it since then, but I think this is the
utility function I made that strips all HTML tags from a string leaving only
the text:

Use it like this:

'...get your data first
Dim htmlText As String = 'wherever you got your html text

' This will keep extra empty lines
Dim textOnly As String = stripHTML(htmlText, True)

' This will remove [most] extra empty lines
Dim textOnly As String = stripHTML(htmlText, False)

For this function to compile, you must Import
System.Text.RegularExpressions.

Private Function stripHTML( _
ByVal html As String, _
ByVal keepCRLF As Boolean _
) As String

' isolates a value between HTML tags and other control chars

Dim rxDrop As New Regex("(\<[^\>]+)\>(\r\n)*")
Dim rxKeep As New Regex("(\<[^\>]+)\>")

If keepCRLF Then
Return rxKeep.Replace(html, "").Trim()
Else
Return rxDrop.Replace(html, "").Trim()
End If

End Function
--
Peace & happy computing,

Mike Labosh, MCSD
"Musha ring dum a doo dum a da!" -- James Hetfield
<un*****@gmail.com> wrote in message
news:11**********************@g14g2000cwa.googlegr oups.com...
Is there an easy way to convert HTML that comes from database as a
string into text and display it on winform...

Thanks.

Nov 21 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
by: Gerrit Holl | last post by:
Posted with permission from the author. I have some comments on this PEP, see the (coming) followup to this message. PEP: 321 Title: Date/Time Parsing and Formatting Version: $Revision: 1.3 $...
14
by: Viktor Rosenfeld | last post by:
Hi, I need to create a parser for a Python project, and I'd like to use process kinda like lex/yacc. I've looked at various parsing packages online, but didn't find anything useful for me: -...
16
by: Terry | last post by:
Hi, This is a newbie's question. I want to preload 4 images and only when all 4 images has been loaded into browser's cache, I want to start a slideshow() function. If images are not completed...
4
by: ALESSANDRO Baraldi | last post by:
Good evening. Few day ago i ask about this Object. A very kindly replay of Sthepen Lebans give me the solution to use ClipBoard to synthesise the Conversion, but i've some truble. My scene: ...
18
by: pkassianidis | last post by:
Hello everybody, I am in the process of writing my very first web application in Python, and I need a way to generate dynamic HTML pages with data from a database. I have to say I am...
1
by: yonido | last post by:
hello, my goal is to get patterns out of email files - say "message forwarding" patterns (message forwarded from: xx to: yy subject: zz) now lets say there are tons of these patterns (by gmail,...
4
by: Rick Walsh | last post by:
I have an HTML table in the following format: <table> <tr><td>Header 1</td><td>Header 2</td></tr> <tr><td>1</td><td>2</td></tr> <tr><td>3</td><td>4</td></tr> <tr><td>5</td><td>6</td></tr>...
9
by: ankitdesai | last post by:
I would like to parse a couple of tables within an individual player's SHTML page. For example, I would like to get the "Actual Pitching Statistics" and the "Translated Pitching Statistics"...
1
by: avpkills2002 | last post by:
I seem to be getting this weird problem in Internet explorer. I have written a code for parsing a XML file and displaying the output. The code works perfectly fine with ffx(Firefox).However is not...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.