Beautiful Soup Question: Filtering Images based on their width and height attributes

Hello,

I want to extract some image links from different HTML pages; in
particular, I want to extract those image tags whose height values are
greater than 200. Is there an elegant way to do this in BeautifulSoup?

Nov 30 '06 #1
On 30 Nov 2006 12:43:45 -0800, PicURLPy <fb********@gmail.com> wrote:
> Hello,
>
> I want to extract some image links from different HTML pages; in
> particular, I want to extract those image tags whose height values are
> greater than 200. Is there an elegant way to do this in BeautifulSoup?

Most image tags "in the wild" don't have height attributes; you have
to download the image to see what size it is.
Nov 30 '06 #2
Chris Mellon wrote:
>> I want to extract some image links from different HTML pages; in
>> particular, I want to extract those image tags whose height values are
>> greater than 200. Is there an elegant way to do this in BeautifulSoup?
>
> Most image tags "in the wild" don't have height attributes; you have
> to download the image to see what size it is.

or at least a small portion of it; see the example at the bottom of this
page for one way to get the size without downloading more than 1k or so:

http://effbot.org/zone/pil-image-size.htm

</F>

Dec 1 '06 #3
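
To illustrate the point about needing only a small portion of the file: many image formats store their dimensions near the start. The sketch below is not the code from the linked page (which is about PIL); it is a standard-library-only stand-in that handles PNG alone, reading the width and height from the first 24 bytes. The names png_size and sample_header are made up for this example.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header):
    """Return (width, height) from the first 24 bytes of a PNG, or None.

    Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type
    ("IHDR"), then width and height as big-endian unsigned 32-bit ints.
    """
    if len(header) < 24:
        return None
    if header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        return None
    width, height = struct.unpack(">II", header[16:24])
    return width, height


# A hand-built header standing in for the first bytes of a real download.
sample_header = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)      # IHDR data length is always 13
    + b"IHDR"
    + struct.pack(">II", 640, 480)
)

print(png_size(sample_header))  # (640, 480)
```

In real use the bytes would come from reading only the start of the response, for example urllib.request.urlopen(url).read(24). Other formats (JPEG, GIF) keep their dimensions elsewhere, which is where an image library earns its keep.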
> Hello,
>
> I want to extract some image links from different HTML pages; in
> particular, I want to extract those image tags whose height values are
> greater than 200. Is there an elegant way to do this in BeautifulSoup?

Yes.

soup.findAll(lambda tag: tag.name == "img" and tag.has_key("height")
             and int(tag["height"]) > 200)
Dec 4 '06 #4
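
The one-liner in the last reply is written against the BeautifulSoup API of 2006 (findAll, has_key), and int() raises on values such as "50%" or "250px". Here is a hedged sketch of the same filter for the current bs4 package, which prefers find_all and get/has_attr. The helper names pixel_height and tall_images and the sample markup are invented for illustration, and treating "250px" as 250 while skipping percentages is an assumption about what the poster would want.

```python
import re

from bs4 import BeautifulSoup


def pixel_height(tag):
    """Return the tag's height attribute as an int, or None.

    None covers both a missing attribute and a value that is not a
    plain pixel count (for example "50%").
    """
    value = tag.get("height")
    if value is None:
        return None
    match = re.fullmatch(r"\s*(\d+)\s*(?:px)?\s*", value)
    return int(match.group(1)) if match else None


def tall_images(html, min_height=200):
    """Return the <img> tags whose height attribute exceeds min_height."""
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for img in soup.find_all("img"):
        height = pixel_height(img)
        if height is not None and height > min_height:
            result.append(img)
    return result


SAMPLE_HTML = """
<img src="a.png" height="300">
<img src="b.png" height="150">
<img src="c.png">
<img src="d.png" height="50%">
<img src="e.png" height="250px">
"""

print([img["src"] for img in tall_images(SAMPLE_HTML)])  # ['a.png', 'e.png']
```

Images with no height attribute (c.png here) are silently skipped, which is the limitation raised earlier in the thread: for those, the size has to come from the image data itself.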
