Question: processing HTML, re-write default processing action of many tags

I know you can define
def start_a(self, attrs):
    ...

def end_a(self):
    ...

to process the <a href=...> anchor </a> tags, but is there a
default method for processing ALL tags? If I just want to change
some parts of the hyperlinks and keep the rest of the HTML as it is,
could I just print the other tags out unchanged? There should be such
a method, but I can't find it...

Thank you.
Jul 18 '05 #1
Hubert Hung-Hsien Chang <hu****@cs.nyu.edu> wrote:
> I know you could use the
> def start_a
> ....
>
> def end_a
> ....
>
> to process the <a href=...> anchor </a> tags, but is there a
> default method for processing ALL tags? If I just want change
> some parts of the hyperlink and want to keep other parts of the HTML
> could I just print them out? There should be such a method.
> Can't find it...


You could subclass HTMLParser.HTMLParser and override handle_starttag
and handle_endtag (also, if needed, handle_charref, handle_entityref,
and last but not least handle_data -- that's assuming that while you
only talk about processing _tags_ you may in fact also want to process
references and text nodes... possibly handle_comment, too, btw).
Alex
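
A minimal sketch of the subclassing approach described above, not a definitive implementation. It assumes Python 3, where the HTMLParser module mentioned in the reply is available as html.parser. The names LinkRewriter, rewrite_links and the rewrite_href callback are hypothetical and chosen only for illustration. Every tag, text node, reference and comment is echoed back, and only the href values of <a> tags are changed.

```python
import html
from html.parser import HTMLParser


class LinkRewriter(HTMLParser):
    """Echo the input as-is, except for href values on <a> tags."""

    def __init__(self, rewrite_href):
        # convert_charrefs=False keeps &amp; and &#38; as separate events,
        # so handle_entityref / handle_charref below are actually called.
        super().__init__(convert_charrefs=False)
        self.rewrite_href = rewrite_href  # hypothetical callback: old URL -> new URL
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.out.append(self._rebuild_anchor(attrs))
        else:
            # Every other tag: emit the original source text untouched.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/>; without this override the base
        # class would call handle_starttag and handle_endtag in turn.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def _rebuild_anchor(self, attrs):
        parts = ["a"]
        for name, value in attrs:
            if value is None:  # attribute written without a value
                parts.append(name)
                continue
            if name == "href":
                value = self.rewrite_href(value)
            # The parser unescapes attribute values, so escape them again.
            parts.append('%s="%s"' % (name, html.escape(value, quote=True)))
        return "<" + " ".join(parts) + ">"


def rewrite_links(text, rewrite_href):
    """Return text with every <a href> passed through rewrite_href."""
    parser = LinkRewriter(rewrite_href)
    parser.feed(text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


if __name__ == "__main__":
    page = '<p>See <a href="http://old.example/page">this &amp; that</a></p>'
    print(rewrite_links(page, lambda url: url.replace("old.example", "new.example")))
```

Echoing get_starttag_text() for the untouched tags keeps their original spelling and quoting. End tags are rebuilt from the lowercased tag name, so their case may differ from the input.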
Jul 18 '05 #2
hu****@cs.nyu.edu (Hubert Hung-Hsien Chang) wrote in message news:<98**************************@posting.google.com>...
> I know you could use the
> def start_a
> ....
>
> def end_a
> ....
>
> to process the <a href=...> anchor </a> tags, but is there a
> default method for processing ALL tags? If I just want change
> some parts of the hyperlink and want to keep other parts of the HTML
> could I just print them out? There should be such a method.
> Can't find it...
>
> Thank you.


If you are modifying the contents of tags, I've written a simple HTML
parser class called Scraper that does this. Unlike the HTMLParser in
the standard library, it doesn't choke so much on badly formed HTML.

It's part of approx.py, my cgiproxy:
http://www.voidspace.org.uk/atlantib....html#cgiproxy

HTH

Regards,

Fuzzy
Jul 18 '05 #3
