When calling Enqueue, the internal array may need to be reallocated. My
question is by how much? In the old MFC array classes, you could tell MFC
how many additional elements to add to the array when it needed to
reallocate, which greatly boosted performance relative to adding 1 element
at a time.
Thanks,
Mike Ober. 7 2437
Michael D. Ober wrote:
When calling Enqueue, the internal array may need to be reallocated. My
question is by how much? In the old MFC array classes, you could
tell MFC how many additional elements to add to the array when it
needed to reallocate, which greatly boosted performance relative to
adding 1 element at a time.
Unfortunately, the .NET documentation fails to make any mention of the
reallocation strategy for any of the array-based generic containers. I'd
suggest doing some experimentation to try to measure the insert performance
as the queue gets larger, or use .NET Reflector to examine the IL behind the
queue class to determine how it grows.
Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's
been shown to give the best overall performance for a dynamically sized
array.
-cd
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospam >
wrote in message news:uF**************@TK2MSFTNGP03.phx.gbl...
Michael D. Ober wrote:
>When calling Enqueue, the internal array may need to be reallocated. My question is by how much? In the old MFC array classes, you could tell MFC how many additional elements to add to the array when it needed to reallocate, which greatly boosted performance relative to adding 1 element at a time.
Unfortunately, the .NET documentation fails to make any mention of the
reallocation strategy for any of the array-based generic containers. I'd
suggest doing some experimentation to try to measure the insert
performance as the queue gets larger, or use .NET Reflector to examine the
IL behind the queue class to determine how it grows.
Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's
been shown to give the best overall performance for a dynamically sized
array.
Taking a quick look with reflector, it's clear that Queue<Tgrows
linearly - whenever the internal array is full, it's increased in size by 4
elements. That gives O(n^2) performance on average if nothing is ever
removed from the queue. Since you typically wouldn't expect a queue to have
a large number of elements, linear growth is probably a reasonable choice
for most applications.
List<T>, on the other hand, uses exponential growth (factor of 2), so it
will have amortized constant insert time as it grows.
-cd
Thanks - I'll switch to List<T>. My application needs the quicker
reallocation.
Mike.
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospam >
wrote in message news:e8**************@TK2MSFTNGP04.phx.gbl...
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospam >
wrote in message news:uF**************@TK2MSFTNGP03.phx.gbl...
>Michael D. Ober wrote:
>>When calling Enqueue, the internal array may need to be reallocated. My question is by how much? In the old MFC array classes, you could tell MFC how many additional elements to add to the array when it needed to reallocate, which greatly boosted performance relative to adding 1 element at a time.
Unfortunately, the .NET documentation fails to make any mention of the reallocation strategy for any of the array-based generic containers. I'd suggest doing some experimentation to try to measure the insert performance as the queue gets larger, or use .NET Reflector to examine the IL behind the queue class to determine how it grows.
Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's been shown to give the best overall performance for a dynamically sized array.
Taking a quick look with reflector, it's clear that Queue<Tgrows
linearly - whenever the internal array is full, it's increased in size by
4 elements. That gives O(n^2) performance on average if nothing is ever
removed from the queue. Since you typically wouldn't expect a queue to
have a large number of elements, linear growth is probably a reasonable
choice for most applications.
List<T>, on the other hand, uses exponential growth (factor of 2), so it
will have amortized constant insert time as it grows.
-cd
Actually, I ended up switching to LinkedList<Tto emulate the
Enqueue/Dequeue methods.. My application generates queues of a few items to
thousands of items. It appears that the slight overhead of Linked Lists is
worth the benefit of never reallocating an array.
Mike.
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message
news:e7**************@TK2MSFTNGP02.phx.gbl...
Thanks - I'll switch to List<T>. My application needs the quicker
reallocation.
Mike.
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospam >
wrote in message news:e8**************@TK2MSFTNGP04.phx.gbl...
>"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospamw rote in message news:uF**************@TK2MSFTNGP03.phx.gbl...
>>Michael D. Ober wrote: When calling Enqueue, the internal array may need to be reallocated. My question is by how much? In the old MFC array classes, you could tell MFC how many additional elements to add to the array when it needed to reallocate, which greatly boosted performance relative to adding 1 element at a time.
Unfortunately, the .NET documentation fails to make any mention of the reallocation strategy for any of the array-based generic containers. I'd suggest doing some experimentation to try to measure the insert performance as the queue gets larger, or use .NET Reflector to examine the IL behind the queue class to determine how it grows.
Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's been shown to give the best overall performance for a dynamically sized array.
Taking a quick look with reflector, it's clear that Queue<Tgrows linearly - whenever the internal array is full, it's increased in size by 4 elements. That gives O(n^2) performance on average if nothing is ever removed from the queue. Since you typically wouldn't expect a queue to have a large number of elements, linear growth is probably a reasonable choice for most applications.
List<T>, on the other hand, uses exponential growth (factor of 2), so it will have amortized constant insert time as it grows.
-cd
Michael,
If you know the maximum/typical size of the queue you can use its
constructor that sets the initial capacity: http://msdn2.microsoft.com/en-us/library/35bz1ab7.aspx
I would recommend setting the initial capacity first no matter which data
type you use Queue(Of T) or List(Of T). Of course LinkedList(Of T) doesn't
have a capacity ;-)
I would also try Queue(Of T) with varying capacities before reverting to an
alternate data type.
--
Hope this helps
Jay B. Harlow
..NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message
news:eW**************@TK2MSFTNGP03.phx.gbl...
Actually, I ended up switching to LinkedList<Tto emulate the
Enqueue/Dequeue methods.. My application generates queues of a few items
to thousands of items. It appears that the slight overhead of Linked
Lists is worth the benefit of never reallocating an array.
Mike.
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message
news:e7**************@TK2MSFTNGP02.phx.gbl...
>Thanks - I'll switch to List<T>. My application needs the quicker reallocation.
Mike.
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospamw rote in message news:e8**************@TK2MSFTNGP04.phx.gbl...
>>"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospam wrote in message news:uF**************@TK2MSFTNGP03.phx.gbl... Michael D. Ober wrote: When calling Enqueue, the internal array may need to be reallocated. My question is by how much? In the old MFC array classes, you could tell MFC how many additional elements to add to the array when it needed to reallocate, which greatly boosted performance relative to adding 1 element at a time.
Unfortunately, the .NET documentation fails to make any mention of the reallocation strategy for any of the array-based generic containers. I'd suggest doing some experimentation to try to measure the insert performance as the queue gets larger, or use .NET Reflector to examine the IL behind the queue class to determine how it grows.
Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's been shown to give the best overall performance for a dynamically sized array.
Taking a quick look with reflector, it's clear that Queue<Tgrows linearly - whenever the internal array is full, it's increased in size by 4 elements. That gives O(n^2) performance on average if nothing is ever removed from the queue. Since you typically wouldn't expect a queue to have a large number of elements, linear growth is probably a reasonable choice for most applications.
List<T>, on the other hand, uses exponential growth (factor of 2), so it will have amortized constant insert time as it grows.
-cd
Jay,
The maximum size of a typical queue for this application ranges from 0 to
60000 with most of the queues (note that there are several) running from
4000 - 6000, but I have absolutely no guarantees of this as the source queue
max sizes aren't under my control. I also have a time constraint on the
input portion so the inputs must be added as quickly as possible. Combine
this with the fact that I load the queues about 10 times faster than they
can be emptied and I was seeing long pauses (on the order of several
minutes) while the internal arrays were being reallocated and the GC was
running with both Queue<Tand List<T>. Creating a queue or list with an
initial size caused the application to stall when a queue or list was first
being created.
It turns out that emulating a queue using LinkedList<Twas very quick and
the O(1) insertion to the end and removal from the front made a huge
difference in the application performance. I did notice last night that I
had a ~25 minute pause for a full GC pass, which by the way dropped the
memory footprint of this application from 90Mb to 20Mb. This GC pass
occurred well after the queues had been loaded so the time constraint on the
input portion of my application was no longer applicable.
By the way, by rewriting to use multiple queues and threads, I took the
runtime of this application, which is a one-way DFS between two Samba
servers using a Windows Server as the controller, from over 24 hours
(attempting to read input the entire time) to about 7 hours (with only the
first 90 minutes reading input). I expect further performance gains now
that I finally have a complete pass and I don't have to copy gigabytes of
files over a T1.
Mike.
"Jay B. Harlow" <Ja************@tsbradley.netwrote in message
news:B6**********************************@microsof t.com...
Michael,
If you know the maximum/typical size of the queue you can use its
constructor that sets the initial capacity:
http://msdn2.microsoft.com/en-us/library/35bz1ab7.aspx
I would recommend setting the initial capacity first no matter which data
type you use Queue(Of T) or List(Of T). Of course LinkedList(Of T) doesn't
have a capacity ;-)
I would also try Queue(Of T) with varying capacities before reverting to
an alternate data type.
--
Hope this helps
Jay B. Harlow
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message
news:eW**************@TK2MSFTNGP03.phx.gbl...
>Actually, I ended up switching to LinkedList<Tto emulate the Enqueue/Dequeue methods.. My application generates queues of a few items to thousands of items. It appears that the slight overhead of Linked Lists is worth the benefit of never reallocating an array.
Mike.
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message news:e7**************@TK2MSFTNGP02.phx.gbl...
>>Thanks - I'll switch to List<T>. My application needs the quicker reallocation.
Mike.
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospam wrote in message news:e8**************@TK2MSFTNGP04.phx.gbl... "Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospa mwrote in message news:uF**************@TK2MSFTNGP03.phx.gbl... Michael D. Ober wrote: >When calling Enqueue, the internal array may need to be reallocated. >My question is by how much? In the old MFC array classes, you could >tell MFC how many additional elements to add to the array when it >needed to reallocate, which greatly boosted performance relative to >adding 1 element at a time. > Unfortunately, the .NET documentation fails to make any mention of the reallocation strategy for any of the array-based generic containers. I'd suggest doing some experimentation to try to measure the insert performance as the queue gets larger, or use .NET Reflector to examine the IL behind the queue class to determine how it grows. > Hopefully, it grows exponentially (by a factor of 1.5 - 2) because that's been shown to give the best overall performance for a dynamically sized array.
Taking a quick look with reflector, it's clear that Queue<Tgrows linearly - whenever the internal array is full, it's increased in size by 4 elements. That gives O(n^2) performance on average if nothing is ever removed from the queue. Since you typically wouldn't expect a queue to have a large number of elements, linear growth is probably a reasonable choice for most applications.
List<T>, on the other hand, uses exponential growth (factor of 2), so it will have amortized constant insert time as it grows.
-cd
Final thoughts and followup:
Average run time has stabilized around 4.5 hours with the time critical
portion taking 2 and a quarter hours. Memory usage has stabilized around
32Mb (per Task Manager) and I have no long pauses for the GC. Queue<Tand
List<Tboth had far larger memory footprints (~90Mb) and long pauses for
the GC. The reallocation of the internal arrays would be my guess for the
culprit for the poor memory performance of Queue<Tand List<T>.
Once again, choosing the correct data structure made a huge difference. In
this case LinkedList<Tand wrapping it in a derived Queue class was the key
to performance. The LinkedList<Tclass has O(1) removal from the front and
addition to the end methods as well as a O(1) count method. It never needs
to be bulk copied to another memory location, which means the GC doesn't
have to work nearly as hard, even if individual LinkItems must be moved in
memory.
Jay, thanks for your assistance and feedback from the reflector. Do you
have any links to how to use the reflector in VS 2005 so I can do my own
research when performance (or interest) dictates?
Mike.
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message
news:eH***************@TK2MSFTNGP03.phx.gbl...
Jay,
The maximum size of a typical queue for this application ranges from 0 to
60000 with most of the queues (note that there are several) running from
4000 - 6000, but I have absolutely no guarantees of this as the source
queue max sizes aren't under my control. I also have a time constraint on
the input portion so the inputs must be added as quickly as possible.
Combine this with the fact that I load the queues about 10 times faster
than they can be emptied and I was seeing long pauses (on the order of
several minutes) while the internal arrays were being reallocated and the
GC was running with both Queue<Tand List<T>. Creating a queue or list
with an initial size caused the application to stall when a queue or list
was first being created.
It turns out that emulating a queue using LinkedList<Twas very quick and
the O(1) insertion to the end and removal from the front made a huge
difference in the application performance. I did notice last night that I
had a ~25 minute pause for a full GC pass, which by the way dropped the
memory footprint of this application from 90Mb to 20Mb. This GC pass
occurred well after the queues had been loaded so the time constraint on
the input portion of my application was no longer applicable.
By the way, by rewriting to use multiple queues and threads, I took the
runtime of this application, which is a one-way DFS between two Samba
servers using a Windows Server as the controller, from over 24 hours
(attempting to read input the entire time) to about 7 hours (with only the
first 90 minutes reading input). I expect further performance gains now
that I finally have a complete pass and I don't have to copy gigabytes of
files over a T1.
Mike.
"Jay B. Harlow" <Ja************@tsbradley.netwrote in message
news:B6**********************************@microsof t.com...
>Michael, If you know the maximum/typical size of the queue you can use its constructor that sets the initial capacity:
http://msdn2.microsoft.com/en-us/library/35bz1ab7.aspx
I would recommend setting the initial capacity first no matter which data type you use Queue(Of T) or List(Of T). Of course LinkedList(Of T) doesn't have a capacity ;-)
I would also try Queue(Of T) with varying capacities before reverting to an alternate data type.
-- Hope this helps Jay B. Harlow .NET Application Architect, Enthusiast, & Evangelist T.S. Bradley - http://www.tsbradley.net
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message news:eW**************@TK2MSFTNGP03.phx.gbl...
>>Actually, I ended up switching to LinkedList<Tto emulate the Enqueue/Dequeue methods.. My application generates queues of a few items to thousands of items. It appears that the slight overhead of Linked Lists is worth the benefit of never reallocating an array.
Mike.
"Michael D. Ober" <obermd.@.alum.mit.edu.nospamwrote in message news:e7**************@TK2MSFTNGP02.phx.gbl... Thanks - I'll switch to List<T>. My application needs the quicker reallocation.
Mike.
"Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nospa mwrote in message news:e8**************@TK2MSFTNGP04.phx.gbl... "Carl Daniel [VC++ MVP]" <cp*****************************@mvps.org.nosp amwrote in message news:uF**************@TK2MSFTNGP03.phx.gbl.. . >Michael D. Ober wrote: >>When calling Enqueue, the internal array may need to be reallocated. >>My question is by how much? In the old MFC array classes, you could >>tell MFC how many additional elements to add to the array when it >>needed to reallocate, which greatly boosted performance relative to >>adding 1 element at a time. >> >Unfortunately, the .NET documentation fails to make any mention of >the reallocation strategy for any of the array-based generic >containers. I'd suggest doing some experimentation to try to measure >the insert performance as the queue gets larger, or use .NET >Reflector to examine the IL behind the queue class to determine how >it grows. >> >Hopefully, it grows exponentially (by a factor of 1.5 - 2) because >that's been shown to give the best overall performance for a >dynamically sized array. > Taking a quick look with reflector, it's clear that Queue<Tgrows linearly - whenever the internal array is full, it's increased in size by 4 elements. That gives O(n^2) performance on average if nothing is ever removed from the queue. Since you typically wouldn't expect a queue to have a large number of elements, linear growth is probably a reasonable choice for most applications. > List<T>, on the other hand, uses exponential growth (factor of 2), so it will have amortized constant insert time as it grows. > -cd > This discussion thread is closed Replies have been disabled for this discussion. Similar topics
25 posts
views
Thread by Brian Patterson |
last post: by
|
12 posts
views
Thread by Fred |
last post: by
|
12 posts
views
Thread by serge |
last post: by
|
6 posts
views
Thread by teedilo |
last post: by
|
5 posts
views
Thread by Scott |
last post: by
|
115 posts
views
Thread by Mark Shelor |
last post: by
|
13 posts
views
Thread by bjarne |
last post: by
|
13 posts
views
Thread by Bern McCarty |
last post: by
|
1 post
views
Thread by jvn |
last post: by
| | | | | | | | | | |