473,395 Members | 1,502 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,395 software developers and data experts.

heapq question

Hi,
this is related to what I'm trying to implement here:
http://groups.google.com/group/comp....96724c1daf1e1#

My question is the following: is it safe to avoid to re-heapify() a
heap when I remove or move an element which is not the first one?
Example:
>>from heapq import *
heap = [2,4,6,7,1,2,3]
heapify(heap)
del heap[4]
# Am I forced to heapify() the heap here?
Thanks in advance.
--- Giampaolo
http://code.google.com/p/pyftpdlib/

Jul 12 '08 #1
7 4121
"Giampaolo Rodola'" <gn****@gmail.comwrote:
>
My question is the following: is it safe to avoid to re-heapify() a
heap when I remove or move an element which is not the first one?
Example:
>>>from heapq import *
heap = [2,4,6,7,1,2,3]
heapify(heap)
del heap[4]
# Am I forced to heapify() the heap here?

Thanks in advance.
No, it is not safe to avoid re-heapifying.

After you call heapify the list is ordered such that for any 'n' in the
range 1..len(heap)//2 it is the case that heap[n-1] <= heap[2*n-1] and when
heap[2*n] exists heap[n-1] <= heap[2*n].

So:
>>heap = [0, 100, 1, 101, 102, 2, 3]
heapify(heap)
heap
[0, 100, 1, 101, 102, 2, 3]
>>del(heap[4])
heap
[0, 100, 1, 101, 2, 3]
>>heapify(heap)
heap
[0, 2, 1, 101, 100, 3]

after deleting heap[4] it is no longer the case that heap[1] >= heap[4] as
the old heap[5] was a child of heap[2] not of heap[1].
Jul 12 '08 #2
On 12 Lug, 20:15, Duncan Booth <duncan.bo...@invalid.invalidwrote:
"Giampaolo Rodola'" <gne...@gmail.comwrote:
My question is the following: is it safe to avoid to re-heapify() a
heap when I remove or move an element which is not the first one?
Example:
>>from heapq import *
heap = [2,4,6,7,1,2,3]
heapify(heap)
del heap[4]
# Am I forced to heapify() the heap here?
Thanks in advance.

No, it is not safe to avoid re-heapifying.

After you call heapify the list is ordered such that for any 'n' in the
range 1..len(heap)//2 it is the case that heap[n-1] <= heap[2*n-1] and when
heap[2*n] exists heap[n-1] <= heap[2*n].

So:
>heap = [0, 100, 1, 101, 102, 2, 3]
heapify(heap)
heap

[0, 100, 1, 101, 102, 2, 3]>>del(heap[4])
>heap

[0, 100, 1, 101, 2, 3]>>heapify(heap)
>heap

[0, 2, 1, 101, 100, 3]

after deleting heap[4] it is no longer the case that heap[1] >= heap[4] as
the old heap[5] was a child of heap[2] not of heap[1].
Even if I avoid to re-heapify() it seems that the first element
returned by heappop() is always the smaller one.
>>heap = [0, 100, 1, 101, 102, 2, 3]
heapify(heap)
heap
[0, 100, 1, 101, 102, 2, 3]
>>del heap[4]
heap
[0, 100, 1, 101, 2, 3]
>>heappop(heap)
0
>>heappop(heap)
1
>>heappop(heap)
2
>>heappop(heap)
3
>>heappop(heap)
100
>>heappop(heap)
101
>>heappop(heap)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index out of range
>>>

That would be ok for me, as long as heappop() always returns the
smallest item.
Having said that I'd like to understand if there are cases where
deleting or moving an element of the heap, causes heappop() to return
an element which is not the smallest one.
--- Giampaolo
http://code.google.com/p/pyftpdlib/
Jul 12 '08 #3
Giampaolo Rodola':
Even if I avoid to re-heapify() it seems that the first element
returned by heappop() is always the smaller one.
Yes, the heappop() function keeps the heap invariant, so it will keep
giving you the smallest item.

I'd like to understand if there are cases where
deleting or moving an element of the heap, causes heappop() to return
an element which is not the smallest one.
To be an heap it must keep its heap invariant true. The heap functions
keep that invariant. Generally any time you move/delete an item
yourself, you break the invariant, so you don't have an heap anymore
and you need to re-heapify to restore the heap invariant :-)

Bye,
bearophile
Jul 12 '08 #4
I understand that heapq is not that efficient to implement timeouts as
I thought at first.
It would have been perfect if there were functions to remove arbitrary
elements withouth needing to re-heapify() the heap every time.
It is efficient for that - you just need to use it correctly.

To remove the nth element from a heap, replace it with the last element,
and then _siftup that element:

def remove_at_index_n(heap, n):
if n == len(heap)-1:
return heap.pop()
result = heap[n]
heap[n] = heap.pop()
heapq._siftup(heap, n)
return result

The time complexity for that operation is O(log(len(heap))).

HTH,
Martin
Jul 13 '08 #5
On 13 Lug, 19:31, "Martin v. Löwis" <mar...@v.loewis.dewrote:
I understand that heapq is not that efficient to implement timeouts as
I thought at first.
It would have been perfect if there were functions to remove arbitrary
elements withouth needing to re-heapify() the heap every time.

It is efficient for that - you just need to use it correctly.

To remove the nth element from a heap, replace it with the last element,
and then _siftup that element:

def remove_at_index_n(heap, n):
* * if n == len(heap)-1:
* * * * return heap.pop()
* * result = heap[n]
* * heap[n] = heap.pop()
* * heapq._siftup(heap, n)
* * return result

The time complexity for that operation is O(log(len(heap))).

HTH,
Martin

Thanks, by doing some quick benchamrks it seems 20% faster than using
heapify().
And if instead of removing an element I'd want to change its value?
E.g.:
>>heap = [0, 2, 1, 4, 5, 6, 7, 8, 9]
heapify(heap)
heap
[0, 2, 1, 4, 5, 6, 7, 8, 9]
>>heap[4] = 12


--- Giampaolo
http://code.google.com/p/pyftpdlib/
Jul 13 '08 #6
"Giampaolo Rodola'" <gn****@gmail.comwrote:
Thanks, that's what I wanted to know.
I understand that heapq is not that efficient to implement timeouts as
I thought at first.
It would have been perfect if there were functions to remove arbitrary
elements withouth needing to re-heapify() the heap every time.
Thanks for your help anyway.

There could be suitable functions, but there aren't any. I *think* this
would work (untested):

def popitem(heap, pos):
if pos == 0:
return heappop(heap)
if pos == len(heap)-1:
return heap.pop(pos)
res = heap[pos]
heap[pos] = heap.pop()
heapq._siftup(heap, pos)
return res

the catch is that _siftup is written in Python whereas the publicly exposed
heapq functions are written in C, so although in theory this is 'more
efficient' than calling heapify() on the entire heap it may actually be
slower.

Bottom line though is that heaps aren't really suitable for timeouts.
Jul 13 '08 #7
On Sun, Jul 13, 2008 at 4:16 PM, Duncan Booth
<du**********@invalid.invalidwrote:
"Giampaolo Rodola'" <gn****@gmail.comwrote:
>I understand that heapq is not that efficient to implement timeouts as
I thought at first.
It would have been perfect if there were functions to remove arbitrary
elements withouth needing to re-heapify() the heap every time.
There could be suitable functions, but there aren't any.
...
Bottom line though is that heaps aren't really suitable for timeouts.
I would argue that heaps in general are ideally suited for timeouts;
it's just that the Python heapq implementation is lacking if you ever
need to cancel a timeout before it expires.

-Miles
Jul 13 '08 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Eric | last post by:
I'm missing something critical about how heapq works. I assumed I could iterate through the heap, but I get partial iteration: >>> listB >>> heapify(listB) >>> for h in listB:...
0
by: Stefan Behnel | last post by:
Hi! I filed a patch for a Heap class to be integrated into the heapq module (in addition the the currently available functions). The main features are support for iteration and for the standard...
3
by: Ekqvist Marko | last post by:
Hi, I have one Access database table including questions and answers. Now I need to give answer id automatically to questionID column. But I don't know how it is best (fastest) to do? table...
18
by: bearophileHUGS | last post by:
In few minutes I have just written this quite raw class, it lacks doctring (the same of the functions of the heapq module), it may contain bugs still, I haven't tested it much. It's just a simple...
2
by: jm.suresh | last post by:
I wanted to have a heap of custom objects, and in different heaps I wanted to have the weights for my elements differently. So, I modified the heapq module to accept key arguments also. The...
1
by: Davy | last post by:
Hi all, I have a dictionary with n elements, and I want to get the m(m<=n) keys with the largest values. For example, I have dic that includes n=4 elements, I want m=2 keys have the largest...
19
by: dave | last post by:
Hi All, I wrote a program that takes a string sequence and finds all the words inside a text file (one word per line) and prints them: def anagfind(letters): #find anagrams of these letters...
4
by: George Sakkis | last post by:
I spent several hours debugging some bogus data results that turned out to be caused by the fact that heapq.nlargest doesn't respect rich comparisons: import heapq import random class...
6
semanticnotion
by: semanticnotion | last post by:
Hi sir i want to transform the data of one table into another through foreign key but the following error come to my browser Here is my code and data base structure. CREATE TABLE IF NOT...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.