
Beautiful Soup Question: Filtering Images based on their width and height attributes

Hello,

I want to extract some image links from different HTML pages. In
particular, I want to extract those image tags whose height values are
greater than 200. Is there an elegant way in BeautifulSoup to do this?

Nov 30 '06 #1
3 Replies


On 30 Nov 2006 12:43:45 -0800, PicURLPy <fb********@gmail.com> wrote:
> Hello,
>
> I want to extract some image links from different html pages, in
> particular i want extract those image tags which height values are
> greater than 200. Is there an elegant way in BeautifulSoup to do this?

Most image tags "in the wild" don't have height attributes, you have
to download the image to see what size it is.
Nov 30 '06 #2

Chris Mellon wrote:
>> I want to extract some image links from different html pages, in
>> particular i want extract those image tags which height values are
>> greater than 200. Is there an elegant way in BeautifulSoup to do this?
>
> Most image tags "in the wild" don't have height attributes, you have
> to download the image to see what size it is.

or at least a small portion of it; see the example at the bottom of this
page for one way to get the size without downloading more than 1k or so:

http://effbot.org/zone/pil-image-size.htm

</F>

Dec 1 '06 #3
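
To illustrate the idea of reading only the start of a file, here is a minimal sketch. It is not the code from the linked page: it uses only the standard library, understands only PNG and GIF headers (JPEG needs more work because its size marker is not at a fixed offset), and the names size_from_header and fetch_image_size are made up for this example.

```python
import struct
from urllib.request import urlopen


def size_from_header(data):
    """Return (width, height) from the leading bytes of a PNG or GIF, else None."""
    # PNG: 8-byte signature, 4-byte chunk length, b"IHDR", then
    # big-endian 32-bit width and height.
    if len(data) >= 24 and data[:8] == b"\x89PNG\r\n\x1a\n" and data[12:16] == b"IHDR":
        return struct.unpack(">II", data[16:24])
    # GIF: 6-byte signature, then little-endian 16-bit width and height.
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<HH", data[6:10])
    return None


def fetch_image_size(url, limit=1024):
    """Read at most `limit` bytes from `url` and try to parse the image size."""
    with urlopen(url) as response:
        return size_from_header(response.read(limit))
```

fetch_image_size returns None for any format the header parser does not recognise, so a caller can fall back to a full download and an imaging library in that case.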

> Hello,
>
> I want to extract some image links from different html pages, in
> particular i want extract those image tags which height values are
> greater than 200. Is there an elegant way in BeautifulSoup to do this?

Yes.

soup.findAll(lambda tag: tag.name == "img" and tag.has_key("height")
             and int(tag["height"]) > 200)
Dec 4 '06 #4
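
For readers on the current bs4 package, where findAll is spelled find_all and has_key is no longer available, a rough equivalent of the one-liner above might look like the sketch below. It assumes beautifulsoup4 is installed, and the helper names (pixel_height, is_tall_image, tall_image_urls) are invented for this example. Unlike the one-liner, it skips values such as "50%" instead of raising ValueError in int().

```python
import re

# Matches "250" or "250px", but not percentages or other units.
_PIXELS = re.compile(r"\s*(\d+)\s*(?:px)?\s*$", re.IGNORECASE)


def pixel_height(value):
    """Return a height attribute as an int pixel count, or None if unusable."""
    if value is None:
        return None
    match = _PIXELS.match(value)
    return int(match.group(1)) if match else None


def is_tall_image(tag, min_height=200):
    """True for <img> tags whose height attribute is strictly above min_height."""
    if tag.name != "img":
        return False
    height = pixel_height(tag.get("height"))
    return height is not None and height > min_height


def tall_image_urls(html, min_height=200):
    """Return the src of every <img> in `html` taller than min_height."""
    # Imported here so the helpers above remain usable without bs4 installed.
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    tags = soup.find_all(lambda tag: is_tall_image(tag, min_height))
    return [tag.get("src") for tag in tags]
```

Images without a height attribute are silently dropped, so this only helps on pages that actually declare sizes in the markup.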
