-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
"Kris Wempa" <calmincents(NO_SPAM)@yahoo.com> wrote in
news:bm*********@kcweb01.netnews.att.com:
"Gunnar Hjalmarsson" <no*****@gunnar.cc> wrote in message
news:KD********************@newsc.telia.net...
jjliu wrote:
> Could someone tell me how to remove all html tags (and anything
> inside tags) by perl.
Sure.
s/.*//s;
That will remove ALL characters.
Gunnar knows that. :-)
He really needs something along the lines of:
s/\<[^\<]+\>//;
Why all the backslashes?
Also, I suspect you meant the second < to be a >.
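For what it's worth, neither < nor > is special in a Perl regex, so nothing needs escaping. A minimal sketch of what I assume was intended, with /g added so that every tag is removed rather than only the first. The sample string is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample input, not taken from the original question.
my $html = '<p>Hello, <b>world</b>!</p>';

# < and > are ordinary characters in a regex, so no backslashes are needed.
# [^>]+ means "one or more characters that are not >".
(my $text = $html) =~ s/<[^>]+>//g;

print "$text\n";
```

The copy-then-substitute idiom `(my $text = $html) =~ s/...//` leaves the original string untouched.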
This only works if the entire TAG is within the same string. If the
tag spans multiple lines, the lines will need to be concatenated into
one string.
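Getting everything into one string is simple enough: read the whole input into a single scalar first. Since a negated character class like [^>] also matches a newline, the substitution then copes with tags broken across lines without extra modifiers. A sketch, using an in-memory filehandle with made-up contents in place of a real file:

```perl
use strict;
use warnings;

# Made-up input with a tag split over two lines.
my $source = "<a\n   href=\"page.html\">link</a> text\n";

# An in-memory filehandle stands in for a real file here.
open my $fh, '<', \$source or die "Cannot open in-memory file: $!";

# Undefine the input record separator locally to read everything at once.
my $html = do { local $/; <$fh> };
close $fh;

# [^>] matches newlines too, so the /s modifier is not required.
(my $text = $html) =~ s/<[^>]+>//g;

print $text;
```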
It also doesn't work if anything within the tag or its attributes contains
a > symbol. Example:
<img src="mathexpression.gif" alt="5 is > 4" />
<input type="submit" onclick="if (count > 1) true else false" />
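To see the failure concretely, and one way to get a bit further with a regex: treat each quoted attribute value as a unit, so that a > inside quotes is skipped. This is only a sketch under that assumption. It still knows nothing about comments, CDATA sections, or the contents of script elements, which is why a real parser such as HTML::Parser from CPAN is usually the better choice. The variable names are mine.

```perl
use strict;
use warnings;

my $html = 'before <img src="mathexpression.gif" alt="5 is > 4" /> after';

# Naive version: stops at the first >, even inside a quoted attribute,
# so the tail of the tag is left behind in the output.
(my $naive = $html) =~ s/<[^>]+>//g;

# Quote-aware version: inside a tag, consume either a complete quoted
# string or a single character that is not >, " or '.
(my $better = $html) =~ s/<(?:"[^"]*"|'[^']*'|[^>"'])*>//g;

print "naive:  $naive\n";
print "better: $better\n";
```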
- --
Eric
$_ = reverse sort $ /. r , qw p ekca lre uJ reh
ts p , map $ _. $ " , qw e p h tona e and print
-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 7.0.3 for non-commercial use <http://www.pgp.com>
iQA/AwUBP4ftJGPeouIeTNHoEQJxpACghIOdjOo5xr7rh9N5zQ6d9E F3KvIAmwdA
R0qdv3U33ZyBzW4L7u8Vq6jf
=sIdz
-----END PGP SIGNATURE-----