Hubert Hung-Hsien Chang <hu****@cs.nyu.edu> wrote:
I know you could use the
def start_a
....
def end_a
....
to process the <a href=...> anchor </a> tags, but is there a
default method for processing ALL tags? If I just want to change
some parts of the hyperlink and keep the other parts of the HTML,
could I just print them out? There should be such a method.
Can't find it...
You could subclass HTMLParser.HTMLParser and override handle_starttag
and handle_endtag. You only talk about processing _tags_, but you may
in fact also want to process references and text nodes; if so, also
override handle_charref, handle_entityref and, last but not least,
handle_data (and possibly handle_comment, too, btw).
Alex
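
A minimal sketch of the approach suggested above, not code from the
original reply. It assumes Python 3, where the class lives in
html.parser (the HTMLParser.HTMLParser spelling in the post is the
Python 2 module name). The class name LinkRewriter, the rewrite_href
callback, the helper _build_tag and the example URLs are all
hypothetical. Every handler echoes what it receives, and only href
values on <a> tags are passed through the callback:

```python
from html import escape
from html.parser import HTMLParser  # Python 2: from HTMLParser import HTMLParser


class LinkRewriter(HTMLParser):
    """Echo the input HTML, changing only href values on <a> tags."""

    def __init__(self, rewrite_href):
        # convert_charrefs=False keeps handle_charref/handle_entityref
        # in play; otherwise references are folded into handle_data.
        HTMLParser.__init__(self, convert_charrefs=False)
        self.rewrite_href = rewrite_href  # hypothetical callback: str -> str
        self.out = []

    def _build_tag(self, tag, attrs):
        # Rebuild a start tag; attribute values arrive unescaped,
        # so they are escaped again on the way out.
        parts = [tag]
        for name, value in attrs:
            if value is None:
                parts.append(name)
            else:
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        return "<%s>" % " ".join(parts)

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attrs = [(k, self.rewrite_href(v) if k == "href" and v is not None else v)
                     for k, v in attrs]
            self.out.append(self._build_tag(tag, attrs))
        else:
            # All other start tags are printed exactly as they appeared.
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are echoed verbatim.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        # Not mentioned in the post; added here so a DOCTYPE survives.
        self.out.append("<!%s>" % decl)


src = ('<p>See <a href="http://old.example/x?a=1&amp;b=2" class="ext">'
       'this &amp; that</a><br/><!-- note -->&#169;</p>')
parser = LinkRewriter(lambda url: url.replace("old.example", "new.example"))
parser.feed(src)
parser.close()
result = "".join(parser.out)
print(result)
```

Collecting the pieces in a list rather than printing from each handler
is just a convenience choice; printing directly inside the handlers,
as the question suggests, would work the same way. Note that end tags
come back lowercased, so the output is not guaranteed to be
byte-for-byte identical to the input outside the rewritten links.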