Array.Resize or List<> or some other data structure

Hello, Newsgroupians:

I have a quick optimization question for you all. I am reading bytes from
a stream. At times the stream contains only a small number of bytes, such
as 50 or so, and at other times it contains as many as 10,000,000 bytes.
In reality, I do not know the maximum number of bytes.

In my function, I am going to Read() the byte stream using a buffer. Now,
is it better to read it into a buffer and dump the buffer into a List<byte>,
maybe using AddRange(), or should I Array.Resize the buffer to grow it by a
specific size every time?

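Code for List<byte>:

List<byte> lstBytes = new List<byte>();
byte[] buffer = new byte[2048];
int bytesRead;

// Stream.Read returns 0 (not -1) at end of stream.
while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
{
    // Add only the bytes actually read, not the whole buffer.
    for (int i = 0; i < bytesRead; i++)
        lstBytes.Add(buffer[i]);
}
return lstBytes.ToArray();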

Code for resizing array:

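byte[] buffer = new byte[2048];
int total = 0, bytesRead;
// Stream.Read returns 0 (not -1) at end of stream.
while ((bytesRead = stream.Read(buffer, total, buffer.Length - total)) > 0)
{
    total += bytesRead;
    if (total == buffer.Length) // full: grow by a fixed 2048 bytes
        Array.Resize(ref buffer, buffer.Length + 2048);
}
Array.Resize(ref buffer, total); // trim unused capacity
return buffer;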

So which way should I use? Should I dump it into a list every time, or
should I resize the array every time? Is there another way you would
recommend? Thank you all for your help and suggestions.
Trecius
Oct 17 '08 #1
Are you aware there is:

byte [] bytes = File.ReadAllBytes("file.bin");

?
Oct 17 '08 #2
My stream isn't a file. :(

"Family Tree Mike" wrote:
Are you aware there is:

byte [] bytes = File.ReadAllBytes("file.bin");

?
Oct 17 '08 #3
Then will Stream.Length work to initially size the array?

"Trecius" wrote:
My stream isn't a file. :(
Oct 17 '08 #4
Family Tree Mike wrote:
> Then will Stream.Length work to initially size the array?

If his stream reads bytes from, say, a port, I guess Length is not
known before all bytes are read.

--
Rudy Velthuis http://rvelthuis.de

"The study of non-linear physics is like the study of non-elephant
biology." -- Unknown
Oct 17 '08 #5
In fact, it is a port. :)

"Rudy Velthuis" wrote:
Family Tree Mike wrote:
Then will Stream.Length work to initially size the array?

If his stream reads bytes from, say, a port, I guess Length is not
known before all bytes are read.

Oct 17 '08 #6
On Fri, 17 Oct 2008 11:20:05 -0700, Trecius
<Tr*****@discussions.microsoft.com> wrote:
> In fact, it is a port. :)

If by "port", you mean a NetworkStream retrieved from a Socket instance,
then Rudy is correct...the Length property cannot be determined and in
fact will always throw a NotSupportedException.
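
A minimal sketch of guarding against that exception, assuming a hypothetical helper named TryGetLength (not part of the framework). Stream.CanSeek reports whether Length is supported, so checking it first avoids the throw:

```csharp
using System;
using System.IO;

static class StreamLengthSketch
{
    // Hypothetical helper: yields the length only for seekable streams.
    // Non-seekable streams such as NetworkStream report CanSeek == false,
    // and reading Length on them throws NotSupportedException.
    public static bool TryGetLength(Stream stream, out long length)
    {
        if (stream.CanSeek)
        {
            length = stream.Length;
            return true;
        }
        length = -1; // unknown until the stream has been read to the end
        return false;
    }
}
```

When this returns false, the caller has to fall back to one of the grow-as-you-read strategies discussed in this thread.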
Oct 17 '08 #7
On Fri, 17 Oct 2008 08:48:13 -0700, Trecius
<Tr*****@discussions.microsoft.com> wrote:
> [...]
> So which way should I use? Should I dump it into a list every time, or
> should I resize the array every time? Is there another way you would
> recommend? Thank you all for your help and suggestions.

The two approaches you're asking about are basically equivalent. The
List<T> class uses an array internally, and will do effectively the same
operation as Array.Resize(). The only real difference between the two is
that List<T> always doubles the size of the storage, so that you need to
resize fewer and fewer times as the data gets larger. Of course, you
could always use that strategy when using Array.Resize() as well, if that
was important.
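
As an illustration of the doubling strategy applied to Array.Resize(), here is a minimal sketch. The class name, method name, and the 2048-byte starting size are assumptions chosen for the example, and it ignores overflow for inputs approaching the maximum array size:

```csharp
using System;
using System.IO;

static class DoublingReadSketch
{
    // Reads a stream to the end, doubling the buffer whenever it fills up.
    public static byte[] ReadAll(Stream stream)
    {
        byte[] buffer = new byte[2048];
        int total = 0;
        int bytesRead;

        // Stream.Read returns 0 at end of stream.
        while ((bytesRead = stream.Read(buffer, total, buffer.Length - total)) > 0)
        {
            total += bytesRead;
            if (total == buffer.Length)
            {
                // Doubling keeps the number of resizes logarithmic in the data size.
                Array.Resize(ref buffer, buffer.Length * 2);
            }
        }

        // Trim the unused tail so the result holds exactly the bytes read.
        Array.Resize(ref buffer, total);
        return buffer;
    }
}
```

Compared with growing by a fixed 2048 bytes, this copies the data far fewer times on large inputs, at the cost of up to roughly 2x temporary over-allocation.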

Personally, I wouldn't use either. I would make every effort to process
the bytes as they are read, so that they never all have to be in memory
at once. That's the ideal solution, as it avoids having to buffer an
arbitrarily large amount of data altogether.
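
A minimal sketch of that streaming style, assuming a purely hypothetical form of "processing" (summing the byte values) as a stand-in for whatever the real work is. Only one fixed-size buffer is ever allocated:

```csharp
using System;
using System.IO;

static class StreamingSketch
{
    // Hypothetical processing step: sums all byte values in the stream.
    // Memory use stays constant no matter how long the stream is.
    public static long SumBytes(Stream stream)
    {
        byte[] buffer = new byte[2048];
        long sum = 0;
        int bytesRead;

        // Stream.Read returns 0 at end of stream.
        while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Handle only the bytes actually read in this pass.
            for (int i = 0; i < bytesRead; i++)
                sum += buffer[i];
        }
        return sum;
    }
}
```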

If you can't process the bytes as they are read, but instead need to store
them all up first, I would use a MemoryStream, and write to the
MemoryStream as the bytes come in. Then when you're done, you can use the
MemoryStream.ToArray() method to get the byte array representing the data.
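
A minimal sketch of the MemoryStream approach. The class and method names are assumptions for illustration, and the 2048-byte buffer size is carried over from the original question:

```csharp
using System;
using System.IO;

static class MemoryStreamSketch
{
    // Buffers the whole stream in a MemoryStream and returns it as one array.
    public static byte[] ReadAll(Stream stream)
    {
        byte[] buffer = new byte[2048];
        using (MemoryStream ms = new MemoryStream())
        {
            int bytesRead;
            // Stream.Read returns 0 at end of stream.
            while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the bytes actually read.
                ms.Write(buffer, 0, bytesRead);
            }
            return ms.ToArray();
        }
    }
}
```

On .NET Framework 4 and later, Stream.CopyTo can replace the explicit loop.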

I believe that MemoryStream uses the same double-and-copy algorithm as
List<T>, so if that wound up being a performance liability, I would switch
to allocating individual buffers and storing them in a List<byte[]>. That
is, rather than resizing a single byte[] over and over, just allocate a
new byte[] when you've run out of room in your current byte[], storing a
reference to each byte[] in the List<byte[]>.
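
A minimal sketch of the List<byte[]> idea, assuming fixed 2048-byte chunks and a single final copy into a contiguous array (the names here are hypothetical). If the caller can consume the chunks directly, that final copy can be skipped as well:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class ChunkListSketch
{
    const int ChunkSize = 2048; // assumed chunk size

    public static byte[] ReadAll(Stream stream)
    {
        List<byte[]> chunks = new List<byte[]>(); // completely filled chunks
        byte[] current = new byte[ChunkSize];
        int used = 0;   // bytes filled in 'current'
        int total = 0;  // bytes read overall
        int bytesRead;

        while ((bytesRead = stream.Read(current, used, current.Length - used)) > 0)
        {
            used += bytesRead;
            total += bytesRead;
            if (used == current.Length)
            {
                // Out of room: keep the full chunk and start a new one.
                // Existing data is never copied while reading.
                chunks.Add(current);
                current = new byte[ChunkSize];
                used = 0;
            }
        }

        // One copy at the end to produce a contiguous result.
        byte[] result = new byte[total];
        int offset = 0;
        foreach (byte[] chunk in chunks)
        {
            Buffer.BlockCopy(chunk, 0, result, offset, chunk.Length);
            offset += chunk.Length;
        }
        Buffer.BlockCopy(current, 0, result, offset, used);
        return result;
    }
}
```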

One more alternative would be to have the i/o code use individual byte[]
instances only, and hand those off to a different thread that deals with
writing them to a MemoryStream. In terms of performance, this would
probably be somewhere in between using a List<byte[]> to store individual
buffers and just always writing to a MemoryStream.

With this alternative, you could either use a double- or triple-buffering
scheme where you have two or three such buffers that are used in rotation,
or you could just allocate a new buffer as needed, letting the used ones
be garbage collected after they've been copied to the MemoryStream. The
former has the advantage of not causing a lot of repeated allocations and
collections, at the cost of complexity and the possibility of having the
i/o thread having to wait for a buffer to become available.
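
A rough sketch of the simpler variant (a new buffer per read, handed to a writer thread). It assumes .NET Framework 4 or later for BlockingCollection<T>, which did not exist when this thread was written, and the names are hypothetical. The bounded capacity of 3 only approximates the triple-buffering behaviour: the i/o thread blocks while three buffers are waiting to be written.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

static class HandOffSketch
{
    public static byte[] ReadAll(Stream stream)
    {
        // At most 3 filled buffers may be queued; Add blocks beyond that.
        using (var queue = new BlockingCollection<ArraySegment<byte>>(3))
        using (var ms = new MemoryStream())
        {
            // Writer thread: copies each handed-off buffer into the MemoryStream.
            Thread writer = new Thread(() =>
            {
                foreach (ArraySegment<byte> seg in queue.GetConsumingEnumerable())
                    ms.Write(seg.Array, seg.Offset, seg.Count);
            });
            writer.Start();

            try
            {
                while (true)
                {
                    // Fresh buffer per read; it is garbage collected after the copy.
                    byte[] buffer = new byte[2048];
                    int bytesRead = stream.Read(buffer, 0, buffer.Length);
                    if (bytesRead == 0) break; // end of stream
                    queue.Add(new ArraySegment<byte>(buffer, 0, bytesRead));
                }
            }
            finally
            {
                // Always let the writer drain and exit, even if Read throws.
                queue.CompleteAdding();
                writer.Join();
            }
            return ms.ToArray();
        }
    }
}
```

A true rotation scheme would instead return each buffer to the i/o thread after it has been written, trading the repeated allocations for extra bookkeeping.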

Personally, if you have to buffer all the data, I would start with writing
to a MemoryStream. It is by far the simplest approach, and may well
perform adequately for your needs. Only if I ran into some specific
performance issue would I then start exploring some of these other
options. They are reasonably straightforward to code, but would certainly
obfuscate the core purpose of the code, and any complication of the code
should be avoided unless absolutely necessary.

Pete
Oct 17 '08 #8
Peter Duniho wrote:
> On Fri, 17 Oct 2008 11:20:05 -0700, Trecius
> <Tr*****@discussions.microsoft.com> wrote:
>> In fact, it is a port. :)
>
> If by "port", you mean a NetworkStream retrieved from a Socket
> instance, then Rudy is correct...the Length property cannot be
> determined and in fact will always throw a NotSupportedException.

I actually meant a physical port, like a USB port with some kind of
lab device attached, but the kind of port you meant has the same
problems. You simply can't know the amount of data to expect.

After all, data can be read from so many sources. <g>

--
Rudy Velthuis http://rvelthuis.de

"1001 words say more than one picture" -- Chinese proverb
Oct 17 '08 #9
Thank you, Mr. Duniho. I will use your suggestion. It seems like it will
work perfectly for my needs. Thank you again.

Trecius
Oct 17 '08 #10
