On Fri, 17 Oct 2008 08:48:13 -0700, Trecius
<Tr*****@discussions.microsoft.com> wrote:
[...]
> So which way should I use? Should I dump it into a list every time, or
> should I resize the array every time? Is there another way you would
> recommend? Thank you all for your help and suggestions.
The two approaches you're asking about are basically equivalent. The
List<T> class uses an array internally, and will do effectively the same
operation as Array.Resize(). The only real difference between the two is
that List<T> always doubles the size of the storage, so that you need to
resize fewer and fewer times as the data gets larger. Of course, you
could always use that strategy when using Array.Resize() as well, if that
was important.
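
As a rough illustration of that doubling strategy applied to a plain byte[], here is a minimal sketch. The GrowableBuffer class and its Append method are hypothetical names made up for this example, and the code ignores integer overflow for very large buffers:

```csharp
using System;

static class GrowableBuffer
{
    // Appends 'count' bytes from 'source' to 'buffer'. 'used' is the number
    // of valid bytes already stored. When there is not enough room, the
    // array is resized to at least double its current length, so resizes
    // become rarer as the data grows.
    public static void Append(ref byte[] buffer, ref int used, byte[] source, int count)
    {
        int required = used + count;
        if (required > buffer.Length)
        {
            // Double the capacity, or grow straight to 'required' if
            // doubling would still be too small.
            int newSize = Math.Max(buffer.Length * 2, required);
            Array.Resize(ref buffer, newSize);
        }

        Buffer.BlockCopy(source, 0, buffer, used, count);
        used = required;
    }
}
```

Note that the array is usually longer than the data it holds, so the caller has to track 'used' separately and trim at the end if an exact-length array is needed.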
Personally, I wouldn't use either. I would make every effort to process
the bytes as they are read, so that they never have to be all in memory
at once. That's the ideal solution, as it avoids the whole business of
having to buffer an arbitrarily large amount of data altogether.
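
To make "process the bytes as they are read" concrete, here is a small sketch. It assumes the data arrives through a Stream, and it uses a byte sum as a stand-in for whatever the real processing is; the StreamingExample class, the SumBytes method and the 4096-byte chunk size are all invented for the example:

```csharp
using System;
using System.IO;

static class StreamingExample
{
    // Reads the stream through one small, fixed-size buffer and processes
    // each chunk immediately, so memory use stays constant no matter how
    // much data arrives. The "processing" here is just summing the bytes.
    public static long SumBytes(Stream input)
    {
        byte[] chunk = new byte[4096];
        long total = 0;
        int read;

        // Stream.Read returns 0 once the end of the stream is reached.
        while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                total += chunk[i];
            }
        }

        return total;
    }
}
```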
If you can't process the bytes as they are read, but instead need to store
them all up first, I would use a MemoryStream, and write to the
MemoryStream as the bytes come in. Then when you're done, you can use the
MemoryStream.ToArray() method to get the byte array representing the data.
I believe that MemoryStream uses the same double-and-copy algorithm as
List<T>, so if that wound up being a performance liability, I would switch
to allocating individual buffers and storing them in a List<byte[]>. That
is, rather than resizing a single byte[] over and over, just allocate a
new byte[] when you've run out of room in your current byte[], storing a
reference to each byte[] in the List<byte[]>.
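
A sketch of that List<byte[]> idea might look something like the following. The ChunkedReader class, the ReadAll method and the chunkSize parameter are hypothetical, the input is assumed to be a Stream, and the final step that combines everything into one array is optional (you may be able to work with the individual chunks directly):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkedReader
{
    // Fills fixed-size buffers one after another, keeping each full buffer
    // in a list instead of resizing a single array over and over.
    public static byte[] ReadAll(Stream input, int chunkSize)
    {
        var chunks = new List<byte[]>();
        byte[] current = new byte[chunkSize];
        int usedInCurrent = 0;
        int total = 0;
        int read;

        while ((read = input.Read(current, usedInCurrent, current.Length - usedInCurrent)) > 0)
        {
            usedInCurrent += read;
            total += read;

            if (usedInCurrent == current.Length)
            {
                // Out of room: keep this buffer and start a fresh one.
                chunks.Add(current);
                current = new byte[chunkSize];
                usedInCurrent = 0;
            }
        }

        // Optional: copy every chunk exactly once into a single array.
        byte[] result = new byte[total];
        int offset = 0;
        foreach (byte[] chunk in chunks)
        {
            Buffer.BlockCopy(chunk, 0, result, offset, chunk.Length);
            offset += chunk.Length;
        }
        Buffer.BlockCopy(current, 0, result, offset, usedInCurrent);
        return result;
    }
}
```

With this layout each byte is copied at most once at the end, rather than every time the storage grows.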
One more alternative would be to have the i/o code use individual byte[]
instances only, and hand those off to a different thread that deals with
writing them to a MemoryStream. In terms of performance, this would
probably be somewhere in between using a List<byte[]> to store individual
buffers and just always writing to a MemoryStream.
With this alternative, you could either use a double- or triple-buffering
scheme where you have two or three such buffers that are used in rotation,
or you could just allocate a new buffer as needed, letting the used ones
be garbage collected after they've been copied to the MemoryStream. The
former has the advantage of not causing a lot of repeated allocations and
collections, at the cost of added complexity and the possibility that the
i/o thread has to wait for a buffer to become available.
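
Here is a rough sketch of the simpler variant, where a new buffer is allocated for each read and handed to a second thread. It leans on BlockingCollection<T> from System.Collections.Concurrent for the hand-off, which is my choice for the illustration and not the only way to do it. The HandOffExample class, the bufferSize parameter and the queue bound of 3 are assumptions, and error handling is kept to a minimum:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

static class HandOffExample
{
    public static byte[] ReadAll(Stream input, int bufferSize)
    {
        // Bounded queue: the reading thread blocks in Add() if the
        // consumer falls more than three buffers behind.
        using (var queue = new BlockingCollection<ArraySegment<byte>>(3))
        using (var result = new MemoryStream())
        {
            // Consumer thread: copies each filled buffer into the MemoryStream.
            var consumer = new Thread(() =>
            {
                foreach (ArraySegment<byte> block in queue.GetConsumingEnumerable())
                {
                    result.Write(block.Array, block.Offset, block.Count);
                }
            });
            consumer.Start();

            try
            {
                while (true)
                {
                    // A fresh buffer per read; used ones are garbage
                    // collected once the consumer has copied them.
                    byte[] buffer = new byte[bufferSize];
                    int read = input.Read(buffer, 0, buffer.Length);
                    if (read <= 0) break;
                    queue.Add(new ArraySegment<byte>(buffer, 0, read));
                }
            }
            finally
            {
                // Lets the consumer's foreach loop finish.
                queue.CompleteAdding();
            }

            consumer.Join();
            return result.ToArray();
        }
    }
}
```

A double- or triple-buffering version would recycle the byte[] instances through a second queue going back to the i/o code instead of allocating a new one on each pass.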
Personally, if you have to buffer all the data, I would start with writing
to a MemoryStream. It is by far the simplest approach, and may well
perform adequately for your needs. Only if I ran into some specific
performance issue would I then start exploring some of these other
options. They are reasonably straightforward to code, but would certainly
obfuscate the core purpose of the code, and any complication of the code
should be avoided unless absolutely necessary.
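
For reference, the plain MemoryStream approach can be sketched in a few lines. This assumes the incoming data is available as a Stream; the MemoryStreamExample class, the ReadAll method and the 4096-byte read buffer are made up for the example:

```csharp
using System.IO;

static class MemoryStreamExample
{
    // Copies everything from 'input' into a MemoryStream, which grows its
    // internal buffer as needed, then returns the data as one byte[].
    public static byte[] ReadAll(Stream input)
    {
        byte[] chunk = new byte[4096];

        using (var buffered = new MemoryStream())
        {
            int read;
            while ((read = input.Read(chunk, 0, chunk.Length)) > 0)
            {
                buffered.Write(chunk, 0, read);
            }

            // ToArray() returns a copy containing only the bytes written.
            return buffered.ToArray();
        }
    }
}
```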
Pete