Find similar images using python

How can I use Python to find images that look quite similar? I thought
I'd scale the images down to 32x32, convert them to a standard
palette of 256 colors, and then compare the results pixel for pixel, but
it seems this would take a very long time when processing
lots of images.

Any hint/clue on this subject would be appreciated.

Best regards,
Thomas

Mar 29 '06 #1
Use PIL, of course.

Sudharshan S

Mar 29 '06 #2
Thomas W wrote:
How can I use python to find images that looks quite similar? Thought
I'd scale the images down to 32x32 and convert it to use a standard
palette of 256 colors then compare the result pixel for pixel etc, but
it seems as if this would take a very long time to do when processing
lots of images.

Any hint/clue on this subject would be appreciated.


This question is immensely non-trivial unless you can form a precise
definition of "images that look quite similar." It is one of those
deceptive problems that seem straightforward, but become less and less
well-defined the more you study it. If you can solve this you can
get a PhD, get rich, get famous, or a combination of all three.

The US Supreme Court gave up on identifying pornography, because
the best definition anyone ever came up with was "I know it when I
see it," a judgment quite reasonable in a legal system based on
trusted authorities, but not a good one in a society "ruled by laws
and not men."

--Scott David Daniels
sc***********@acm.org
Mar 29 '06 #3
Thomas W wrote:
How can I use python to find images that looks quite similar? Thought
I'd scale the images down to 32x32 and convert it to use a standard
palette of 256 colors then compare the result pixel for pixel etc, but
it seems as if this would take a very long time to do when processing
lots of images.

Any hint/clue on this subject would be appreciated.


You are aware that this is one of the most sophisticated research areas in
CS in general? Your approach cannot cope with even the slightest
differences between images; after all, you're only reducing the
resolution. That doesn't account for, e.g., different lighting conditions:
you wouldn't be able to match still photographs of the same house taken by a
mounted camera at dusk and at dawn. And so on. So unless you have
a _very_ homogeneous image source, this is a far more complicated task, if
not an infeasible one.

Diez
Mar 29 '06 #4
I don't get it. Can't the matching be done efficiently with PIL? You only
need a condition: if the mismatch exceeds a certain
threshold, the images are not similar.

http://gumuz.looze.net/wordpress/ind...ion-detection/

Check the above link. The only difference is that, instead of files as in your
case, the code there compares pixels of consecutive frames for
changes.

Sudharshan S

Mar 29 '06 #5
su******@gmail.com wrote:
I dont get it..cant the matching take place efficiently with PIL, only
that you need to have a condition i.e if the mismatch exceeds a certain
threshold, they are not similar,

http://gumuz.looze.net/wordpress/ind...ion-detection/
Check the above link, only diiference is that instead of files as in ur
case, the code here compares two pixels of consecutive frames for
changes..


No, the difference is fundamental: two consecutive frames from a still-mounted
camera are, apart from noise and changing lighting conditions, the same.
Detecting a difference caused by motion is easy.

But similarity between two images is a totally different beast. I would say
that an image of my grandma with me on her knee and another one with my
brother are very similar. But your approach would certainly fail to say
so...

diez
Mar 29 '06 #6

[Thomas]
How can I use python to find images that looks quite similar?


Have you looked at http://www.imgseek.net/ ? It's an Open Source Python photo
collection manager that does exactly what you're asking for.

--
Richie Hindle
ri****@entrian.com
Mar 29 '06 #7
I did this once for a motion detection algorithm, using luminance
calculations. I basically broke the image into a
grid of nine (3x3) areas and calculated the luminance of each
section; if it changed significantly, there had been
motion. The more sections, the more precise the detection will be.
This was written in Visual Basic 4 and managed about 10 frames
per second, so you can get decent performance.

I got the luminance calculation from a classic math/computer science
equation book. I should know the title but I'm blanking on it.
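
For anyone who wants to try the grid idea in Python, here is a rough sketch rather than Andy's original code. The Rec. 601 luminance weights, the function names and the threshold of 10 are all assumptions made for illustration; the pixels are expected as a flat, row-major list of (r, g, b) tuples.

```python
def luminance(rgb):
    """Approximate brightness of an (r, g, b) pixel (Rec. 601 weights, an assumption)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def grid_luminance(pixels, width, height, cells=3):
    """Mean luminance of each cell in a cells x cells grid.

    `pixels` is a flat, row-major sequence of (r, g, b) tuples.
    """
    totals = [0.0] * (cells * cells)
    counts = [0] * (cells * cells)
    for y in range(height):
        row = y * cells // height          # grid row this scanline falls into
        for x in range(width):
            col = x * cells // width       # grid column this pixel falls into
            idx = row * cells + col
            totals[idx] += luminance(pixels[y * width + x])
            counts[idx] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def changed_cells(before, after, threshold=10.0):
    """Indices of grid cells whose mean luminance moved by more than `threshold`."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if abs(a - b) > threshold]
```

Calling `grid_luminance` on two frames and passing both results to `changed_cells` gives the list of sections that changed; a larger `cells` value gives the finer detection described above.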

Andy

Mar 29 '06 #8
> How can I use python to find images that looks quite similar? Thought
> I'd scale the images down to 32x32 and convert it to use a standard
> palette of 256 colors then compare the result pixel for pixel etc, but
> it seems as if this would take a very long time to do when processing
> lots of images.
>
> Any hint/clue on this subject would be appreciated.


A company I used to work for has been doing research in this area
(finding differences between images) for years, and the results are
still hardly generalizable, so don't expect to get perfect results after
a weekend ;-)

I'm not sure what you mean by "similar": I assume for the moment that
you want to detect if you really have the same photo, but scanned with
a different resolution, or with a different scanner or with a digital
camera that's slightly out of focus. This is still hard enough!

There are many approaches to this problem. Downsampling the image might
work (use supersampling!), but it doesn't cover rotations, different
borders, or blur, so you'll have to put some additional effort into
the comparison algorithm. Also, converting the images to a paletted
format is almost definitely the wrong way to go: convert them to greyscale,
or work on 24-bit data (RGB or HSV).

Another approach that you might try is comparing the image histograms:
they are largely unaffected by geometric transformations, and should still
contain some information about the original image. Even if they aren't
sufficient, they might help you to narrow down your search, so you have
more processing time for advanced algorithms.

If you have performance problems, NumPy and Psyco might both be worth a
look.
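
A minimal sketch of the histogram idea, for illustration only. The bin count, the use of histogram intersection as the score, and the helper names are assumptions, not something prescribed above. `load_gray` needs Pillow (the maintained PIL fork) and imports it lazily, so the two comparison functions also work on any plain list of 0-255 greyscale values.

```python
def gray_histogram(pixels, bins=16):
    """Normalized histogram of 0-255 greyscale values."""
    if not pixels:
        raise ValueError("no pixels given")
    hist = [0] * bins
    for value in pixels:
        hist[value * bins // 256] += 1
    total = len(pixels)
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms: 1.0 means identical, 0.0 disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def load_gray(path, size=(64, 64)):
    """Load an image file as a flat list of greyscale values (requires Pillow)."""
    from PIL import Image  # lazy import: the functions above do not need it
    with Image.open(path) as img:
        return list(img.convert("L").resize(size).tobytes())
```

Because the histogram ignores where pixels sit, a mirrored or rotated copy scores the same as the original. That makes it useful as a cheap first filter: only pairs with a high intersection need to go on to a slower, position-aware comparison.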

Mar 29 '06 #9
Richie Hindle <ri****@entrian.com> writes:
[Thomas]
How can I use python to find images that looks quite similar?


Have you looked at http://www.imgseek.net/ ? It's an Open Source Python photo
collection manager that does exactly what you're asking for.


Maybe... I don't recall if it had a duplicate search feature. What I
remember is the GUI in which you scribbled a picture, and asked it to
pull up images that looked like that. Amusing, though didn't seem to
work terribly well. No bad reflection on the author: it's a hard
problem of course.
John

Mar 31 '06 #10
John J. Lee schreef:
Richie Hindle <ri****@entrian.com> writes:
[Thomas]
How can I use python to find images that looks quite similar?

Have you looked at http://www.imgseek.net/ ? It's an Open Source Python photo
collection manager that does exactly what you're asking for.


Maybe... I don't recall if it had a duplicate search feature. What I
remember is the GUI in which you scribbled a picture, and asked it to
pull up images that looked like that. Amusing, though didn't seem to
work terribly well. No bad reflection on the author: it's a hard
problem of course.


It does have a duplicate search feature (see screenshot
http://www.imgseek.net/sshot/e1c93fe...a22921c49a.png for
example), though I don't know how well it works.

--
If I have been able to see further, it was only because I stood
on the shoulders of giants. -- Isaac Newton

Roel Schroeven
Mar 31 '06 #11
Thomas W wrote:
How can I use python to find images that looks quite similar? Thought
I'd scale the images down to 32x32 and convert it to use a standard
palette of 256 colors then compare the result pixel for pixel etc, but
it seems as if this would take a very long time to do when processing
lots of images.

Any hint/clue on this subject would be appreciated.

This really depends on what is meant by "quite similar".

If you mean "to the human eye, the two pictures are identical",
as in the case of a tool to get rid of trivially different duplicates,
then you can use the technique you propose. I don't imagine that
you can save any time over that process. You'd use something
like PIL to do the comparisons, of course. I suspect you want to
do something like:

1) resize both
2) quantize the colors
3) subtract the two images
4) resize to 1x1
5) threshold the result (i.e. we've used PIL to average the differences)

Strictly speaking, it might be more mathematically sound to take
the sum of the squares of the pixel differences (a measure in the
spirit of chi-square). Doing the work inside PIL also avoids the
painfully slow process of comparing pixel by pixel in a Python loop.

This is conceptually equivalent to using an "epsilon" to test "equality"
of floating point numbers.
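
The steps above can be sketched roughly as follows. This is one possible reading, not a definitive implementation: it converts to greyscale instead of quantizing to a palette, uses `ImageStat` to average the difference instead of resizing to 1x1, and the function names and the epsilon of 100 are made up for the example. `pil_mean_difference` needs Pillow and imports it lazily; the other two functions work on plain lists of greyscale values.

```python
def mean_squared_difference(a, b):
    """Mean of squared differences between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def looks_same(a, b, epsilon=100.0):
    """Treat two images as duplicates when their difference is within epsilon."""
    return mean_squared_difference(a, b) <= epsilon


def pil_mean_difference(path_a, path_b, size=(32, 32)):
    """Resize, subtract and average inside Pillow, with no Python pixel loop."""
    from PIL import Image, ImageChops, ImageStat  # lazy import, requires Pillow
    with Image.open(path_a) as img_a, Image.open(path_b) as img_b:
        gray_a = img_a.convert("L").resize(size)
        gray_b = img_b.convert("L").resize(size)
    diff = ImageChops.difference(gray_a, gray_b)  # per-pixel absolute difference
    return ImageStat.Stat(diff).mean[0]           # average over the single band
```

The value returned by `pil_mean_difference` is a mean absolute difference between 0 and 255, so it needs its own threshold; a suitable epsilon for either measure has to be found by trying it on real pairs of images.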

The more general case of matching images with similar content (but
which would be recognizably different to the human eye) is a much
more challenging, cutting-edge AI problem, as has already been
mentioned. I was going to mention imgSeek myself, but I see someone has
already given you the link.

Mar 31 '06 #12
On 29 Mar 2006 05:06:10 -0800, rumours say that "Thomas W"
<th***********@gmail.com> might have written:
How can I use python to find images that looks quite similar? Thought
I'd scale the images down to 32x32 and convert it to use a standard
palette of 256 colors then compare the result pixel for pixel etc, but
it seems as if this would take a very long time to do when processing
lots of images.


I see someone suggested imgseek. This uses a Haar transform to compare
images (check on it). I did make a module based on imgseek, and together
with PIL, I manage my archive of email attachments (it's incredible how many
different versions of the same picture people send you: gif, jpg in
different sizes etc) and it works fairly well.
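
To give an idea of what a Haar-based comparison can look like, here is a small pure-Python sketch. It is not imgSeek's code and not the module mentioned above. The scheme of keeping only the positions and signs of the largest coefficients, the `keep` parameter and all function names are assumptions for illustration. The input is a square grid of greyscale values whose side is a power of two.

```python
def haar_1d(values):
    """Full unnormalized 1-D Haar decomposition (length must be a power of two)."""
    out = list(values)
    length = len(out)
    if length & (length - 1):
        raise ValueError("length must be a power of two")
    while length > 1:
        half = length // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:length] = avgs + diffs   # averages first, detail coefficients after
        length = half
    return out


def haar_2d(grid):
    """Standard 2-D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in grid]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(grid, keep=8):
    """Set of (position, is_positive) for the `keep` largest non-DC coefficients."""
    coeffs = haar_2d(grid)
    width = len(coeffs[0])
    ranked = sorted(
        ((abs(c), r * width + col, c)
         for r, row in enumerate(coeffs)
         for col, c in enumerate(row)
         if (r, col) != (0, 0)),       # skip the overall average
        reverse=True,
    )
    return {(idx, c > 0) for mag, idx, c in ranked[:keep] if mag > 0}


def similarity(sig_a, sig_b):
    """Number of coefficients that agree in both position and sign."""
    return len(sig_a & sig_b)
```

For example, `haar_1d([9, 7, 3, 5])` gives `[6.0, 2.0, 1.0, -1.0]`: the overall average followed by progressively finer differences. Since a signature is just a small set, comparing one image against a large archive is much cheaper than comparing full pixel data.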

E-mail me if you want the module, I don't think I have it currently online
anywhere.
--
TZOTZIOY, I speak England very best.
"Dear Paul,
please stop spamming us."
The Corinthians
Mar 31 '06 #13
Christos Georgiou wrote:
.... I did make a module based on imgseek, and together with PIL,
I manage my archive of email attachments (it's incredible how many
different versions of the same picture people send you: gif, jpg
in different sizes etc) and it works fairly well.

E-mail me if you want the module, I don't think I have it currently online
anywhere.


This sounds like a great recipe for the cookbook:
http://aspn.activestate.com/ASPN/Cookbook/Python

--
-Scott David Daniels
sc***********@acm.org
Mar 31 '06 #14
Finding similar images is not at all a trivial task. Entire PhD
dissertations have been devoted to it, and the solutions are still quite
unreliable. If you want to find out more, you can read the
research coming out of the ongoing ImageCLEF track. I worked with them
briefly a couple of years ago in the context of medical images.

http://ir.shef.ac.uk/imageclef/

Apr 1 '06 #15
On Fri, 31 Mar 2006 15:10:11 -0800, rumours say that Scott David Daniels
<sc***********@acm.org> might have written:
Christos Georgiou wrote:
.... I did make a module based on imgseek, and together with PIL,
I manage my archive of email attachments (it's incredible how many
different versions of the same picture people send you: gif, jpg
in different sizes etc) and it works fairly well.

E-mail me if you want the module, I don't think I have it currently online
anywhere.
This sounds like a great recipe for the cookbook:
http://aspn.activestate.com/ASPN/Cookbook/Python


Actually, it should go to the CheeseShop, since it is a Python module that
is a bridge between PIL and the C module (I don't believe multi-file modules
are appropriate for the cookbook, but ICBW); however, my web space has been out of
reach for some months now (it is on a web server at a previous company I worked
for), and I'm in the process of fixing that :)
--
TZOTZIOY, I speak England very best.
"Dear Paul,
please stop spamming us."
The Corinthians
Apr 4 '06 #16
