Bytes | Software Development & Data Engineering Community
Parsing HTML document, how?

This is what my program should do: you give it the URL of a page and a
template file, it downloads that page, and then, using the template file,
it returns some information.

The way I thought of doing it was to have the template file contain a
regex, and then in my program I just do re.search(template, htmlpage).
This would work, but the HTML document has characters like ? and * that I
would need to escape in the template, so this solution doesn't work. What
is a better way to accomplish what I want? Does Python have any standard
library for this?
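
One way to keep the regex idea is to treat the template as literal text and escape it with re.escape, leaving only a few marked fields as capture groups. The sketch below assumes a hypothetical {{name}} placeholder syntax and hypothetical helper names (template_to_regex, extract); none of these come from the original post, and each placeholder name is assumed to appear only once per template.

```python
import re

# Hypothetical placeholder syntax: {{name}} marks a field to capture.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def template_to_regex(template):
    """Turn a literal template with {{name}} placeholders into a compiled regex."""
    parts = []
    pos = 0
    for match in PLACEHOLDER.finditer(template):
        # Escape the literal text so ?, * and similar characters match themselves.
        parts.append(re.escape(template[pos:match.start()]))
        # A non-greedy named group captures the field value.
        parts.append("(?P<%s>.*?)" % match.group(1))
        pos = match.end()
    parts.append(re.escape(template[pos:]))
    # DOTALL lets a captured field span line breaks in the page.
    return re.compile("".join(parts), re.DOTALL)


def extract(template, html_page):
    """Return a dict of captured fields, or None if the template does not match."""
    match = template_to_regex(template).search(html_page)
    return match.groupdict() if match else None
```

Because the groups are non-greedy, a placeholder at the very end of a template captures an empty string, so each field should be followed by some literal text. The match is also sensitive to exact whitespace and attribute order in the page.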

The parsing has to be dynamic, driven by the template file; the URLs are
not fixed.
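
On the standard library question: Python does ship an HTML parser (html.parser in Python 3), plus urllib.request for downloading. The following is only a rough sketch of a template-driven extractor built on those two modules. The template format (one "field tag class" rule per line) and the names FieldExtractor, parse_template, extract_fields and fetch are assumptions made up for illustration, not anything from the original post.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class FieldExtractor(HTMLParser):
    """Collect the text of the first element matching each (tag, class) rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules        # {field name: (tag, class attribute value)}
        self.results = {}
        self._active = None       # field currently being captured, if any
        self._active_tag = None

    def handle_starttag(self, tag, attrs):
        if self._active is not None:
            return                # already inside a captured element
        attr_map = dict(attrs)
        for field, (rule_tag, rule_class) in self.rules.items():
            if (field not in self.results and tag == rule_tag
                    and attr_map.get("class") == rule_class):
                self._active, self._active_tag = field, tag
                self.results[field] = ""
                break

    def handle_data(self, data):
        if self._active is not None:
            self.results[self._active] += data

    def handle_endtag(self, tag):
        # Simplification: capture stops at the first closing tag of the same name.
        if self._active is not None and tag == self._active_tag:
            self.results[self._active] = self.results[self._active].strip()
            self._active = None


def parse_template(text):
    """Parse the assumed template format: 'field tag class' on each line."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        field, tag, css_class = line.split()
        rules[field] = (tag, css_class)
    return rules


def extract_fields(template_text, html_page):
    """Apply the template rules to an HTML string and return the captured fields."""
    parser = FieldExtractor(parse_template(template_text))
    parser.feed(html_page)
    parser.close()
    return parser.results


def fetch(url):
    """Download a page and decode it, falling back to UTF-8 if no charset is given."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Usage would be something like extract_fields(template_text, fetch(url)), so the URL and the rules both stay outside the program. The class match here is exact, so an element with several classes would need a looser comparison.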

Jul 18 '05 #1
