473,399 Members | 2,278 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,399 software developers and data experts.

list conversion question

Howdy,

I am using PIL, and would like to
make a list of the pixel values
in an image, sorted by quantity of
pixels with that value.

im.histogram() returns a list which
is a pixel count for each brightness value.

So, I want to sort the histogram, and have
the resulting list contain
the indexes instead of the counts.

This seems like it'd be a fairly common
task, but I don't know what to call
it to look for guidance.

Any help is appreciated.

Thanks,
Kent

Jul 18 '05 #1
11 1662
"Kent Tenney" <ke**@springfed.com> wrote in message
news:ma**************************************@pyth on.org...
Howdy,

I am using PIL, and would like to
make a list of the pixel values
in an image, sorted by quantity of
pixels with that value.

im.histogram() returns a list which
is a pixel count for each brightness value.

So, I want to sort the histogram, and have
the resulting list contain
the indexes instead of the counts.

This seems like it'd be a fairly common
task, but I don't know what to call
it to look for guidance.

Any help is appreciated.

Thanks,
Kent

Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]

-- Paul
Jul 18 '05 #2
Paul McGuire wrote:
Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]


or tweaked a bit for speed (a sort with a lambda is expensive)
and for clarity, IMO,

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]

In Python2.4 this is allowed
hist = [ 0, 1, 0, 5, 43 ]
[pair[0] for pair in sorted(enumerate(hist), .... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4]


Not yet sure that that's a good thing.

Andrew
da***@dalkescientific.com
Jul 18 '05 #3
"Andrew Dalke" <ad****@mindspring.com> wrote in message
news:iv*****************@newsread1.news.pas.earthl ink.net...
Paul McGuire wrote:
Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]


or tweaked a bit for speed (a sort with a lambda is expensive)
and for clarity, IMO,

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]

In Python2.4 this is allowed
>>> hist = [ 0, 1, 0, 5, 43 ]
>>> [pair[0] for pair in sorted(enumerate(hist), ... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4] >>>


Not yet sure that that's a good thing.

Andrew
da***@dalkescientific.com


I assumed that the resulting list should be sorted in descending frequency
order. Probably the fastest way to do this (since we are tweaking for
speed) would be to just change the pairs list comp to "pairs = [(-value,
offset) for (offset, value) in enumerate(hist)]", or your Py 2.4 key clause
from "key=lambda pair: pair[1]" to "key=lambda pair: -pair[1]".

-- Paul
Jul 18 '05 #4
Paul McGuire wrote:
Probably the fastest way to do this (since we are tweaking for
speed) would be to just change the pairs list comp to "pairs = [(-value,
offset) for (offset, value) in enumerate(hist)]", or your Py 2.4 key clause
from "key=lambda pair: pair[1]" to "key=lambda pair: -pair[1]".


Ahh, right. The new Python2.4 'sorted' also has a
"reversed" flag, so
hist = [ 0, 1, 0, 5, 43 ]
[pair[0] for pair in sorted(enumerate(hist), .... key=lambda pair: pair[1],
.... reverse=True)]
[4, 3, 1, 0, 2]


This isn't the same as as reverse() after the sort(),

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]
indexes.reverse()

or

list(reversed([pair[0] for pair in sorted(enumerate(hist),
key=lambda pair: pair[1])]))

Both of these give

[4, 3, 1, 2, 0]

The difference is that sort is now a guaranteed
stable sort. Using 'reverse=True' means the index
to the first 0 (at index 0) appears before the index
to the seond 0 (at index 2). Using reverse() or
reversed() flips that direction.

Andrew
da***@dalkescientific.com

which would give
[0, 2, 1, 3, 4]

Andrew
Jul 18 '05 #5
Andrew Dalke wrote:
Paul McGuire wrote:
Assuming im.histogram() returns a list like [ 0, 1, 0, 5, 43, etc. ] how
about:

hist = [ 0, 1, 0, 5, 43 ]
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
indexes = [ a for a,b in values ]

or tweaked a bit for speed (a sort with a lambda is expensive)
and for clarity, IMO,

pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
indexes = [offset for (value, offset) in pairs]

In Python2.4 this is allowed
>>> hist = [ 0, 1, 0, 5, 43 ]
>>> [pair[0] for pair in sorted(enumerate(hist), ... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4] >>>
Even faster, though perhaps not as clear:
import operator
hist = [ 0, 1, 0, 5, 43 ]
map(operator.itemgetter(0), sorted(enumerate(hist),

... key=operator.itemgetter(1)))
[0, 2, 1, 3, 4]

Jp
Jul 18 '05 #6
Jp Calderone wrote:
Even faster, though perhaps not as clear:
>>> import operator
>>> hist = [ 0, 1, 0, 5, 43 ]
>>> map(operator.itemgetter(0), sorted(enumerate(hist), ... key=operator.itemgetter(1)))
[0, 2, 1, 3, 4]


Yet another option, not benchmarked, perhaps clearer:
hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices

[0, 2, 1, 3, 4]

Peter

Jul 18 '05 #7
Peter Otten wrote:
Yet another option, not benchmarked, perhaps clearer:

hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices


[0, 2, 1, 3, 4]


Looks the fastest to me. Here's my results, with
the benchmark below. In this list:
sort0 = Paul McGuire's solution with a lambda
sort1 = my solution with the pair flipped so the
default sort works
sort2 = my solution using Python 2.4's "keys"
sort3 = Jp Calderone's variant of that with itemgetter
sort4 = your (Petter Otten's) solution

list of size 0
sort0 0.0932559967041
sort1 0.0773079395294
sort2 0.176142930984
sort3 0.23094701767
sort4 0.089989900589
list of size 10
sort0 0.866734981537
sort1 0.524667978287
sort2 0.553129911423
sort3 0.529242992401
sort4 0.265748023987
list of size 100
sort0 18.9233298302
sort1 5.04968500137
sort2 5.11950612068
sort3 4.75884985924
sort4 3.0580329895
list of size 1000
sort0 262.217221022
sort1 70.5779349804
sort2 69.9501719475
sort3 54.6528921127
sort4 39.5281660557

Here's my benchmark code
import random, operator

def make_random_list(n):
# Ensure there will be duplicates
choose_from = range(n//3+1)
return [random.choice(choose_from) for i in range(n)]

def do_sort0(hist):
values = [ i for i in enumerate(hist)]
values.sort(lambda a,b: cmp(b[1],a[1]))
return [ a for a,b in values ]

def do_sort1(hist):
pairs = [(value, offset) for (offset, value) in enumerate(hist)]
pairs.sort()
return [offset for (value, offset) in pairs]

def do_sort2(hist):
return [pair[0] for pair in sorted(enumerate(hist),
key=lambda pair: pair[1])]

def do_sort3(hist):
return map(operator.itemgetter(0), sorted(enumerate(hist),
key=operator.itemgetter(1)))

def do_sort4(hist):
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
return indices

data = make_random_list(100)
assert (do_sort0(data) == do_sort1(data) == do_sort2(data) ==
do_sort3(data) == do_sort4(data))

import timeit

for size in [0, 10, 100, 1000]:
print " list of size", size
for i in range(5):
name = "sort%d" % i
t = timeit.Timer(
setup= ("import __main__\nhist=__main__.make_random_list(%d)" %
size),
stmt = "__main__.do_%s(hist)" % name)
print name, t.timeit(10000)

Andrew
da***@dalkescientific.com
Jul 18 '05 #8
[Peter Otten]
Even faster, though perhaps not as clear:
>>> import operator
>>> hist = [ 0, 1, 0, 5, 43 ]
>>> map(operator.itemgetter(0), sorted(enumerate(hist),

... key=operator.itemgetter(1)))
[0, 2, 1, 3, 4]


Yet another option, not benchmarked, perhaps clearer:
hist = [0, 1, 0, 5, 43]
indices = range(len(hist))
indices.sort(key=hist.__getitem__)
indices

[0, 2, 1, 3, 4]


These are both fast and clean. Consider posting them to the ASPN
cookbook so they won't get lost.
Raymond Hettinger
Jul 18 '05 #9
[Peter Otten]
Yet another option, not benchmarked, perhaps clearer:
>hist = [0, 1, 0, 5, 43]
>indices = range(len(hist))
>indices.sort(key=hist.__getitem__)
>indices


[0, 2, 1, 3, 4]


[Andrew Dalke] Looks the fastest to me. Here's my results, with
the benchmark below.

. . .

These results are good. Consider posting them in the ASPN Cookbook so they
won't get lost.
Raymond Hettinger
Jul 18 '05 #10
Andrew Dalke <ad****@mindspring.com> wrote:
...
In Python2.4 this is allowed
>>> hist = [ 0, 1, 0, 5, 43 ]
>>> [pair[0] for pair in sorted(enumerate(hist), ... key=lambda pair: pair[1])]
[0, 2, 1, 3, 4] >>>


Not yet sure that that's a good thing.


it makes the fast (DSU) way of sorting easier, and can be sped up
further as:

[ p[0] for p in sorted(enumerate(hist), operator.itemgetter(1)) ]

The new itemgetter and attrgetter HOFs in module operator are quite good
for this kind of use (which is exactly why they got introduced).
Alex

Jul 18 '05 #11
Paul McGuire <pt***@austin.rr._bogus_.com> wrote:
...
values = [ i for i in enumerate(hist)]


values = list(enumerate(hist))

is IMHO a better way to make a list in such cases.
Alex
Jul 18 '05 #12

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
by: Stefan Seefeld | last post by:
hi there, I'm trying to convert a tuple to a list, and get a 'TypeError: list objects are unhashable'. Can anybody enlighten me as to the possible causes for this ? Where does hashing come...
2
by: Alexander Malkis | last post by:
//Consider: class A { /*...*/ }; template<class T> class list {/*... */ }; void f(const list<const A*> lst) { /*...doesn't change the arg...*/ } void g(list<A*> lst) { f(lst);...
8
by: JustSomeGuy | last post by:
I need to write an new class derived from the list class. This class stores data in the list to the disk if an object that is added to the list is over 1K in size. What methods of the std stl...
31
by: Bjørn Augestad | last post by:
Below is a program which converts a double to an integer in two different ways, giving me two different values for the int. The basic expression is 1.0 / (1.0 * 365.0) which should be 365, but one...
90
by: Christoph Zwerschke | last post by:
Ok, the answer is easy: For historical reasons - built-in sets exist only since Python 2.4. Anyway, I was thinking about whether it would be possible and desirable to change the old behavior in...
65
by: Steven Watanabe | last post by:
I know that the standard idioms for clearing a list are: (1) mylist = (2) del mylist I guess I'm not in the "slicing frame of mind", as someone put it, but can someone explain what the...
2
by: k1ckthem1dget | last post by:
I need to display the unsorted list of names and display the sorted list of names. My program is getting a bunch of errors though, and i dont know why. I am getting the following errors. 28:...
27
by: comp.lang.tcl | last post by:
My TCL proc, XML_GET_ALL_ELEMENT_ATTRS, is supposed to convert an XML file into a TCL list as follows: attr1 {val1} attr2 {val2} ... attrN {valN} This is the TCL code that does this: set...
10
by: Angel Tsankov | last post by:
Hello! Is the following code illformed or does it yield undefined behaviour: class a {}; class b {
21
by: REH | last post by:
It it permissible to use the constructor style cast with primitives such as "unsigned long"? One of my compilers accepts this syntax, the other does not. The failing one chokes on the fact that the...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.