473,738 Members | 5,934 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

parsing HTML programmaticall y

Is there an easy way to convert HTML that comes from database as a
string into text and display it on winform...

Thanks.

Nov 21 '05 #1
1 989
A year ago I wrote a spider that crawls around http://www.wtng.info and
extracts all the country codes, country names, and telephone numbering
information. I haven't looked at it since then, but I think this is the
utility function I made that strips all HTML tags from a string leaving only
the text:

Use it like this:

'...get your data first
Dim htmlText As String = 'wherever you got your html text

' This will keep extra empty lines
Dim textOnly As String = stripHTML(htmlT ext, True)

' This will remove [most] extra empty lines
Dim textOnly As String = stripHTML(htmlT ext, False)

For this function to compile, you must Import
System.Text.Reg ularExpressions .

Private Function stripHTML( _
ByVal html As String, _
ByVal keepCRLF As Boolean _
) As String

' isolates a value between HTML tags and other control chars

Dim rxDrop As New Regex("(\<[^\>]+)\>(\r\n)*")
Dim rxKeep As New Regex("(\<[^\>]+)\>")

If keepCRLF Then
Return rxKeep.Replace( html, "").Trim()
Else
Return rxDrop.Replace( html, "").Trim()
End If

End Function
--
Peace & happy computing,

Mike Labosh, MCSD
"Musha ring dum a doo dum a da!" -- James Hetfield
<un*****@gmail. com> wrote in message
news:11******** **************@ g14g2000cwa.goo glegroups.com.. .
Is there an easy way to convert HTML that comes from database as a
string into text and display it on winform...

Thanks.

Nov 21 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
9442
by: Gerrit Holl | last post by:
Posted with permission from the author. I have some comments on this PEP, see the (coming) followup to this message. PEP: 321 Title: Date/Time Parsing and Formatting Version: $Revision: 1.3 $ Last-Modified: $Date: 2003/10/28 19:48:44 $ Author: A.M. Kuchling <amk@amk.ca> Status: Draft Type: Standards Track
14
2637
by: Viktor Rosenfeld | last post by:
Hi, I need to create a parser for a Python project, and I'd like to use process kinda like lex/yacc. I've looked at various parsing packages online, but didn't find anything useful for me: - PyLR seems promising but is for Python 1.5 - Yappy seems promising, but I couldn't get it to work. It doesn't even compile the main example in it's documentation - mxTexttools is way complicated. I'd like something that I can give a BNF
16
2906
by: Terry | last post by:
Hi, This is a newbie's question. I want to preload 4 images and only when all 4 images has been loaded into browser's cache, I want to start a slideshow() function. If images are not completed loaded into cache, the slideshow doesn't look very nice. I am not sure how/when to call the slideshow() function to make sure it starts after the preload has been completed.
4
8952
by: ALESSANDRO Baraldi | last post by:
Good evening. Few day ago i ask about this Object. A very kindly replay of Sthepen Lebans give me the solution to use ClipBoard to synthesise the Conversion, but i've some truble. My scene: I use a RichTextBox to write the Formatted Text, and i need to send the content by Mail with Winsok.
18
4971
by: pkassianidis | last post by:
Hello everybody, I am in the process of writing my very first web application in Python, and I need a way to generate dynamic HTML pages with data from a database. I have to say I am overwhelmed by the plethora of different frameworks, templating engines, HTML generation tools etc that exist. After some thought I decided to leave the various frameworks aside for the
1
2427
by: yonido | last post by:
hello, my goal is to get patterns out of email files - say "message forwarding" patterns (message forwarded from: xx to: yy subject: zz) now lets say there are tons of these patterns (by gmail, outlook, etc) - and i want to create some rules of how to get them out of the mail's html body. so at first i tried using regular expressions: for example - "any pattern that starts with a <p> and contains "from:"..." etc.
4
4864
by: Rick Walsh | last post by:
I have an HTML table in the following format: <table> <tr><td>Header 1</td><td>Header 2</td></tr> <tr><td>1</td><td>2</td></tr> <tr><td>3</td><td>4</td></tr> <tr><td>5</td><td>6</td></tr> </table> With an XSLT styles sheet, I can use for-each to grab the values in
9
4061
by: ankitdesai | last post by:
I would like to parse a couple of tables within an individual player's SHTML page. For example, I would like to get the "Actual Pitching Statistics" and the "Translated Pitching Statistics" portions of Babe Ruth page (http://www.baseballprospectus.com/dt/ruthba01.shtml) and store that info in a CSV file. Also, I would like to do this for numerous players whose IDs I have stored in a text file (e.g.: cobbty01, ruthba01, speaktr01, etc.)....
1
4848
by: avpkills2002 | last post by:
I seem to be getting this weird problem in Internet explorer. I have written a code for parsing a XML file and displaying the output. The code works perfectly fine with ffx(Firefox).However is not working in Internet Explorer.(I m using Internet Explorer 6.0). The code is as follows: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http:// www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html...
0
9476
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
9335
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
9263
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
9208
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
6751
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6053
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4570
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
3279
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
2745
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.