Bytes | Software Development & Data Engineering Community

Is it possible to get image size before/without downloading?

Hi there: a bit of a left-field question, I think.
I'm writing a program that analyses image files downloaded with a basic
crawler, and it's slow, mainly because I only want to analyse files
within a certain size range, and I'm having to download all the files
on the page, open them, get their size, and then only analyse the ones
that are in that size range.
Is there a way (in python, of course!) to get the size of images before
or without downloading them? I've checked around, and I can't seem to
find anything promising...

Anybody got any clues?

Cheers, Al.

Jul 22 '06 #1
In the headers of an HTTP response, most servers specify a
Content-Length, which is the number of bytes in the body of the response.
Normally, with the GET method, the headers are returned with the
body following. It is possible to make a HEAD request to the server
instead, which returns only the header information and will hopefully
tell you the file size.

If you want to know the actual dimensions of the image, I don't know of
anything in HTTP that will tell you. You will probably just have to
download the image to find that out. Relevant HTTP specs below if you
care.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

The above is true regardless of language. In Python there appears to be an
httplib module. I would call request() with the HEAD method.

http://docs.python.org/lib/httpconnection-objects.html

Jul 22 '06 #2
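A minimal sketch of the HEAD-request approach described above, written for Python 3, where httplib is named http.client. The function names and the choice to treat a missing or malformed Content-Length as "out of range" are assumptions made for illustration; servers are not required to send the header at all.

```python
import http.client
from urllib.parse import urlsplit


def in_size_range(content_length, min_bytes, max_bytes):
    """Return True if a Content-Length header value falls within the range.

    content_length is the raw header value (a string), or None if the
    server did not send one. Unknown sizes are treated as out of range.
    """
    if content_length is None:
        return False
    try:
        size = int(content_length)
    except ValueError:
        return False
    return min_bytes <= size <= max_bytes


def head_content_length(url, timeout=10):
    """Send a HEAD request and return the Content-Length header, or None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # HEAD asks for the headers only; no response body is transferred.
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.getheader("Content-Length")
    finally:
        conn.close()
```

A crawler could call in_size_range(head_content_length(url), 10000, 500000) for each image URL and download only the ones that pass. The byte limits here are placeholders, and redirects are not followed in this sketch.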
Thanks Josiah

I thought as much... Still, it'll help me immensely to cut the
downloads from a page to only those that are within a file-size range,
even if this gets me some images that are out-of-spec dimensionally.

Cheers, Al.

(Oh, and if anyone still has a bright idea about how to get image
dimensions without downloading, it'd be great to hear!)

Jul 22 '06 #3
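PIL can determine the size of an image from some "large enough" chunk at
the beginning of the image, e.g.:

from io import BytesIO
from urllib.request import urlopen

from PIL import Image  # provided by the Pillow package

# Read only the first 512 bytes; this chunk has to be large enough to
# contain the format's header, otherwise Image.open() may fail.
f = urlopen("http://www.python.org/images/success/nasa.jpg")
s = BytesIO(f.read(512))
print(Image.open(s).size)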

Peter
Jul 22 '06 #4
aldonnelley wrote:
(Oh, and if anyone still has a bright idea about how to get image
dimensions without downloading, it'd be great to hear!)

Most image formats have some sort of header that records the dimensions,
so it's enough to download this header. How much of the file has to be
read, and how the information is encoded, depends on the image format.

Ciao,
Marc 'BlackJack' Rintsch
Jul 22 '06 #5
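As a rough illustration of reading dimensions straight from the header, here is a sketch that handles only PNG and GIF, whose dimensions sit at fixed offsets near the start of the file. The function names and the 32-byte default are assumptions for this example. JPEG is deliberately left out: its dimensions live in a start-of-frame segment that has to be found by walking the segment markers, so a fixed offset does not work there.

```python
import struct
from urllib.request import Request, urlopen

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def dimensions_from_header(data):
    """Return (width, height) from the leading bytes of a PNG or GIF file.

    Returns None if the format is not recognised or too few bytes were given.
    """
    # PNG: 8-byte signature, then the IHDR chunk (4-byte length, b"IHDR",
    # then width and height as big-endian unsigned 32-bit integers).
    if len(data) >= 24 and data[:8] == PNG_SIGNATURE and data[12:16] == b"IHDR":
        return struct.unpack(">II", data[16:24])
    # GIF: 6-byte signature, then width and height as little-endian
    # unsigned 16-bit integers.
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<HH", data[6:10])
    return None


def fetch_prefix(url, num_bytes=32, timeout=10):
    """Fetch roughly the first num_bytes of a URL.

    A Range header asks the server for a partial response; servers may
    ignore it, so the read is capped at num_bytes as well.
    """
    request = Request(url, headers={"Range": "bytes=0-%d" % (num_bytes - 1)})
    with urlopen(request, timeout=timeout) as response:
        return response.read(num_bytes)
```

With these two pieces, dimensions_from_header(fetch_prefix(url)) gives the size of a PNG or GIF after transferring only a few dozen bytes, and returns None for anything else, at which point falling back to a larger chunk and an imaging library is one option.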
