Bytes | Software Development & Data Engineering Community

Queues - Is Infinity all right?

The standard Python Queue module allows you to create queues that
have no size limit, by passing a size argument <= 0.

q = Queue(0)

In a multithreaded application, these queues could be useful
when you have many threads using the queue for data access
and synchronization.
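A minimal sketch of that usage (shown here with the Python 3 spelling of the module; the sentinel-based shutdown is just one common illustrative pattern, not the only one):

```python
import queue
import threading

q = queue.Queue(0)  # maxsize <= 0 means unbounded

def producer():
    for i in range(5):
        q.put(i)       # never blocks on an unbounded queue
    q.put(None)        # sentinel: tell the consumer to stop

def consumer(results):
    while True:
        item = q.get()  # blocks until an item is available
        if item is None:
            break
        results.append(item)

results = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # [0, 1, 2, 3, 4]
```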

Is there a performance hit when one uses an 'infinite' queue
when compared to a queue of fixed size?

Of course, it should depend on the complexity of the Queue data
structure: how the time of a get/put access varies with the number
of items in it.

I would like some light shed on this. Is it constant time, O(n), or
of higher order?

Thanks

-Anand
Jul 18 '05 #1
Anand Pillai wrote:

The standard Python Queue module allows you to create queues that
have no size limit, by passing a size argument <= 0.

q = Queue(0)

In a multithreaded application, these queues could be useful
when you have many threads using the queue for data access
and synchronization.

Is there a performance hit when one uses an 'infinite' queue
when compared to a queue of fixed size?

Of course, it should depend on the complexity of the Queue data
structure: how the time of a get/put access varies with the number
of items in it.

I would like some light shed on this. Is it constant time, O(n), or
of higher order?


Checking the source, Queue is implemented as a sophisticated wrapper
around a standard Python list []. That means adding an item is
amortized O(1), while removing one is O(n).

That said, I'd consider using a fixed queue only when I was greatly
concerned about whether my consumer thread was going to be able to
keep up with my producer thread(s), and was therefore concerned about
overflowing memory with unconsumed input.

In other words, I'd worry more about the robustness of the app
than about optimizing it before it worked...

-Peter
Jul 18 '05 #2
Peter Hansen <pe***@engcorp.com> writes:
Checking the source, Queue is implemented as a sophisticated wrapper
around a standard Python list []. That means adding an item is
amortized O(1), while removing one is O(n).


Why is removing one O(n)? That should be easy to fix, if true.
Jul 18 '05 #3
Paul Rubin <http://ph****@NOSPAM.invalid> wrote in message news:<7x************@ruckus.brouhaha.com>...
Peter Hansen <pe***@engcorp.com> writes:
Checking the source, Queue is implemented as a sophisticated wrapper
around a standard Python list []. That means adding an item is
amortized O(1), while removing one is O(n).


Why is removing one O(n)? That should be easy to fix, if true.


The queue implementation he's referring to works somewhat like this:

class queue(list):
    def enqueue(self, item):
        self.append(item)
    def dequeue(self):
        return self.pop(0)

It's the self.pop(0) that's the transgressor here. All the elements
after it must be moved back one index, so it's O(n) in the size of the
queue.
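To make the cost concrete, here is a small demonstration: pop(0) must shift every remaining pointer one slot toward the front, so the per-pop cost grows with the length of the list (the sizes and repetition counts below are only illustrative):

```python
import timeit

# pop(0) removes the head, but every remaining element shifts left one slot.
items = list(range(5))
head = items.pop(0)
print(head)   # 0
print(items)  # [1, 2, 3, 4]

# A rough timing comparison: 500 pops from a short list vs. a long one.
small = timeit.timeit('q.pop(0)', setup='q = list(range(10**3))', number=500)
large = timeit.timeit('q.pop(0)', setup='q = list(range(10**5))', number=500)
# "large" is typically much slower than "small", consistent with O(n) per pop.
```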

Sure, it's a simple fix, and there are several ways to make O(1)
queues, but that implementation, aside from being simple, also has a
*very* low constant. Having tested several competing O(1) queues
against this implementation, none were faster until the number of
items in the queue increased beyond several hundred, sometimes even
into the (low) thousands.
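For reference, the standard library's collections.deque (added in Python 2.4) is one such O(1) alternative: it supports constant-time appends and pops at both ends, so popleft() avoids the shifting that list.pop(0) incurs. A quick sketch:

```python
from collections import deque

q = deque()
q.append('a')        # enqueue at the right end: O(1)
q.append('b')
q.append('c')
first = q.popleft()  # dequeue from the left end: O(1), unlike list.pop(0)
print(first)         # 'a'
print(list(q))       # ['b', 'c']
```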

As a side note, the fastest O(1) queue algorithm in my testing was the
following:

class queue(object):
    __slots__ = ('back', 'front')
    def __init__(self, iterable):
        self.front = []
        self.back = []
        for item in iterable:
            self.back.append(item)
    def enqueue(self, item):
        self.back.append(item)
    def dequeue(self):
        if not self.front:
            self.back.reverse()
            self.front = self.back
            self.back = []
        return self.front.pop()

(Its efficiency obviously depends on how enqueues and dequeues are
intermixed; while still amortized O(1), it's not especially efficient
when the queue's average length is 1.)

Jeremy
Jul 18 '05 #4
Paul Rubin <http://ph****@NOSPAM.invalid> wrote in message news:<7x************@ruckus.brouhaha.com>...
Peter Hansen <pe***@engcorp.com> writes:
Checking the source, Queue is implemented as a sophisticated wrapper
around a standard Python list []. That means adding an item is
amortized O(1), while removing one is O(n).


Why is removing one O(n)? That should be easy to fix, if true.


A list is an array of pointers to PyObjects, not a linked list. So
removing element x from a list of size y requires something like:

    for z in range(x, y - 1):
        l[z] = l[z + 1]
    del l[y - 1]

But since you're only copying the pointers and not the underlying
objects, it's a pretty quick O(n).
Jul 18 '05 #5
Peter Hansen wrote:
...

That said, I'd consider using a fixed queue only when I was greatly
concerned about whether my consumer thread was going to be able to
keep up with my producer thread(s), and was therefore concerned about
overflowing memory with unconsumed input.

In other words, I'd worry more about the robustness of the app
than about optimizing it before it worked...


I've been bitten a few times by unbounded queues when the consumer
thread could not keep up: the queue grows to memory limits and
no robustness is left.

So I changed my default choice for queue implementations to bounded
queues. They are much more predictable under unexpected loads,
as they eventually force the producer to wait for the consumer.

The next question is of course what maximum queue size to use.
You can get good answers from queuing theory, but 10 to 20
works well in many cases.
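A minimal sketch of that back-pressure effect (Python 3 module spelling; the sizes are illustrative): with a bounded queue, put() simply blocks whenever the consumer falls behind, so memory stays bounded no matter how fast the producer runs.

```python
import queue
import threading

q = queue.Queue(maxsize=10)  # bounded: put() blocks once 10 items are waiting

def producer():
    for i in range(100):
        q.put(i)   # blocks here whenever the consumer falls behind

def consumer(results):
    for _ in range(100):
        results.append(q.get())

results = []
threads = [threading.Thread(target=producer),
           threading.Thread(target=consumer, args=(results,))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 100 -- everything got through, in FIFO order
```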

Kind regards,
Ype
--
email at xs4all.nl
Jul 18 '05 #6
yk*****@accessforall.nl wrote:

Peter Hansen wrote:

...

That said, I'd consider using a fixed queue only when I was greatly
concerned about whether my consumer thread was going to be able to
keep up with my producer thread(s), and was therefore concerned about
overflowing memory with unconsumed input.

In other words, I'd worry more about the robustness of the app
than about optimizing it before it worked...


I've been bitten a few times by unbounded queues when the consumer
thread could not keep up: the queue grows to memory limits and
no robustness is left.

So I changed my default choice for queue implementations to bounded
queues. They are much more predictable under unexpected loads,
as they eventually force the producer to wait for the consumer.

The next question is of course what maximum queue size to use.
You can get good answers from queuing theory, but 10 to 20
works well in many cases.


Reading your reply and my original response, I'm having second
thoughts about my above-quoted philosophy. Certainly, as even I said
above, robustness should be the concern, not performance. But since
this is primarily a robustness issue, a predefined limit, set well
above what is expected to be required, would be a good way to help
ensure robustness.

-Peter
Jul 18 '05 #7
Anand Pillai wrote:
> This was exactly what my problem was. The producer thread was
> writing a lot of data to the queue which the consumer could not
> keep up with.
>
> Since I am using Python threads, it often happens that the queue
> fills to huge sizes before the consumer thread gets a context
> switch, causing memory problems.
>
> I have not gone through the code of Queue.py, but I think that
> the queue.get() method is an O(n) operation, considering the
> ratio of 'put's to 'get's.

As Peter already remarked, the source of your problem is not
the get() operation.

> A related question here is... How do you avoid the producer thread
> deadlocking when the consumer queue is full?

By using a separate thread for the consumer(s), as you indicate
below.

> If the question is not clear, my queue is of size 'n'. It is already
> full with 'n' pieces of data inside it. All of my 'n' producer threads
> are trying to push data into it at the *same* time and there is no
> consumer thread popping the data off (in my program, the consumers
> are the producers themselves, a bad design I agree...).

That's probably the source of your problem.

> This is clearly a deadlock situation, since no thread will ever get
> into the queue. I solved it by creating a new consumer thread to pop
> off some data when deadlock occurs. But sometimes the deadlock
> occurs so often that many extra threads get created this way.
>
> The other approach is to separate out the consumer and producer thread
> roles so that there is always one consumer thread somewhere which is
> popping data off the queue, which is what I have done now.
>
> Anyway, how do you make sure that deadlock situations do not occur?

When there is a cycle between the producers and the consumers, i.e.
back from the consumers to the producers, you may need one queue
(eg. the backwards queue) of infinite size to prevent deadlock.
However, you should make sure that no thread accesses a queue
in the wrong direction, because that might still cause deadlock.

In case you have a fixed ratio between the amounts of things
on the various queues, you might consider not using queues at all
for these fixed ratios.

> Can you write some Event() based mechanisms to make sure that the
> threads always access the queue at different times? Since Python offers
> no control over thread priority or mechanisms to suspend a thread, how
> else could this be done?

A queue already provides synchronized access, so you don't need
anything else.

You can make reasonably sure that work is going on everywhere by
keeping the queues small enough. This forces more thread switching,
though.

> That brings me to a favorite topic, to implement a thread wrapper
> over the existing Python threading API to allow suspending of threads
> and modifying thread priority. Has this been considered as a PEP
> anytime before?

Thread suspension is (very) difficult to implement correctly.

Thread priorities are good eg. when you have background work going
on and you want some short interactive pieces of work done at normal
speed. Also, if you have a series of threads and queues, it can
be helpful to give the first thread a higher priority when it also has
to wait for external events.

I don't know whether there was a PEP for any of this.

Kind regards,
Ype
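As a concrete illustration of the producer-side options for the full-queue scenario discussed above (Python 3 module spelling; the maxsize and timeout values are illustrative): Queue.put accepts a block flag and a timeout, so a producer can detect a full queue and back off instead of blocking forever.

```python
import queue

q = queue.Queue(maxsize=2)
q.put('a')
q.put('b')             # the queue is now full

dropped = False
# A producer that must not hang forever can pass a timeout (or block=False)
# and handle queue.Full itself, instead of blocking indefinitely in put():
try:
    q.put('c', timeout=0.1)
except queue.Full:
    dropped = True     # back off, drop the item, or route it elsewhere
print(dropped)         # True
```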

Jul 18 '05 #8