Is it possible to get image size before/without downloading?

Hi there: a bit of a left-field question, I think.
I'm writing a program that analyses image files downloaded with a basic
crawler, and it's slow, mainly because I only want to analyse files
within a certain size range, and I'm having to download all the files
on the page, open them, get their size, and then only analyse the ones
that are in that size range.
Is there a way (in python, of course!) to get the size of images before
or without downloading them? I've checked around, and I can't seem to
find anything promising...

Anybody got any clues?

Cheers, Al.

Jul 22 '06 #1
In the head of an HTTP response, most servers will specify a
Content-Length that is the number of bytes in the body of the response.
Normally, when using the GET method, the header is returned with the
body following. It is possible to make a HEAD request to the server
that will only return header information that will hopefully tell you
the file size.

If you want to know the actual dimensions of the image, I don't know of
anything in HTTP that will tell you. You will probably just have to
download the image to find that out. Relevant HTTP specs below if you
care.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html

The above is true regardless of language. In Python there appears to be an
httplib module. I would call its request method with the HTTP method "HEAD".

http://docs.python.org/lib/httpconnection-objects.html
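
For anyone reading this later, here is a minimal sketch of the HEAD approach in current Python 3, where httplib has become http.client and urllib.request wraps it. The function names remote_size and in_size_range are made up for illustration, and the sketch assumes the server answers HEAD requests and sends a Content-Length header, which not every server does.

```python
import urllib.request


def remote_size(url, timeout=10):
    """Return the Content-Length reported for url via a HEAD request, or None."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    # Some servers omit the header (for example with chunked responses).
    return int(length) if length is not None else None


def in_size_range(url, min_bytes, max_bytes):
    """True only when the server reports a size inside [min_bytes, max_bytes]."""
    size = remote_size(url)
    return size is not None and min_bytes <= size <= max_bytes
```

A crawler could call in_size_range(url, low, high) on each image URL and download only the ones that pass. Treating a missing header as "not in range" is a design choice; you may prefer to download those files anyway.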

Jul 22 '06 #2
Thanks Josiah

I thought as much... Still, it'll help me immensely to cut the
downloads from a page to only those that are within a file-size range,
even if this gets me some images that are out-of-spec dimensionally.

Cheers, Al.

(Oh, and if anyone still has a bright idea about how to get image
dimensions without downloading, it'd be great to hear!)

Josiah Manson wrote:
In the head of an HTTP response, most servers will specify a
Content-Length that is the number of bytes in the body of the response.
Normally, when using the GET method, the header is returned with the
body following. It is possible to make a HEAD request to the server
that will only return header information that will hopefully tell you
the file size.

Jul 22 '06 #3
PIL can determine the size of an image from a "large enough" chunk at
the beginning of the file, e.g.:

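from io import BytesIO
from urllib.request import urlopen

from PIL import Image  # Pillow, the maintained fork of PIL

# Read only the first 512 bytes; that is enough for many files, not all.
f = urlopen("http://www.python.org/images/success/nasa.jpg")
s = BytesIO(f.read(512))
print(Image.open(s).size)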
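
A possible refinement, sketched with the standard library only: ask the server for just the leading bytes with an HTTP Range header, so it does not start sending the whole file. The name fetch_prefix is hypothetical, and the sketch assumes nothing about the server beyond plain HTTP; servers that ignore Range simply send the full body, which is why the read is capped as well.

```python
import urllib.request


def fetch_prefix(url, num_bytes=512, timeout=10):
    """Return at most num_bytes from the start of the resource at url."""
    headers = {"Range": "bytes=0-%d" % (num_bytes - 1)}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # A server that ignores Range replies 200 with the full body,
        # so cap the read instead of trusting the response length.
        return response.read(num_bytes)
```

The returned bytes can be wrapped in io.BytesIO and handed to Image.open as in the snippet above. If the image library cannot identify the file from that chunk, retrying with a larger num_bytes is one option.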

Peter
Jul 22 '06 #4
In <11*********************@i42g2000cwa.googlegroups.com>, aldonnelley
wrote:
(Oh, and if anyone still has a bright idea about how to get image
dimensions without downloading, it'd be great to hear!)
Most image formats have some sort of header containing the dimension
information, so it's enough to download this header. How much of the
file has to be read, and how the information is encoded, depends on the
image format.
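
To illustrate how the encoding differs per format, here is a rough sketch that reads the dimensions from the leading bytes of a PNG or GIF using only the struct module. The function name image_size_from_header is invented for this example. JPEG is deliberately left out: its dimensions sit in a frame marker that can appear well after the start of the file, so it needs a marker scan rather than a fixed offset.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def image_size_from_header(data):
    """Return (width, height) from the first bytes of a PNG or GIF, else None."""
    # PNG: 8-byte signature, 4-byte chunk length, b"IHDR", then
    # big-endian 32-bit width and height (bytes 16 to 23).
    if len(data) >= 24 and data[:8] == PNG_SIGNATURE and data[12:16] == b"IHDR":
        return struct.unpack(">II", data[16:24])
    # GIF: 6-byte signature, then little-endian 16-bit width and height.
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<HH", data[6:10])
    # Unknown format, or not enough bytes to decide.
    return None
```

For these two formats the first 24 bytes of the file are enough, which is far less than a full download.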

Ciao,
Marc 'BlackJack' Rintsch
Jul 22 '06 #5
