Re: Array.Resize or List<> or some other data structure



On Fri, 17 Oct 2008 08:48:13 -0700, Trecius <Trecius@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

> [...]
> So which way should I use? Should I dump it into a list every time, or
> should I resize the array every time? Is there another way you would
> recommend? Thank you all for your help and suggestions.

The two approaches you're asking about are basically equivalent. The List<T> class uses an array internally, and when it runs out of room it does effectively the same thing as Array.Resize(): it allocates a larger array and copies the existing elements into it. The only real difference between the two is that List<T> doubles its capacity each time it grows, so that you need to resize less and less often as the data gets larger. Of course, you could always use that strategy with Array.Resize() as well, if that were important.
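To illustrate the doubling strategy applied to Array.Resize(), here is a minimal sketch. The GrowableBuffer class and its Append method are names made up for this example, not anything from the framework; only Array.Resize(), Math.Max() and Buffer.BlockCopy() are real calls.

```csharp
using System;

static class GrowableBuffer
{
    // Hypothetical helper: appends 'count' bytes from 'chunk' to 'buffer'.
    // 'used' tracks how many bytes of 'buffer' currently hold real data.
    public static void Append(ref byte[] buffer, ref int used, byte[] chunk, int count)
    {
        int required = used + count;
        if (required > buffer.Length)
        {
            // Double the capacity (or grow to exactly what is needed, if
            // doubling is not enough) instead of growing by 'count' each time.
            int newSize = Math.Max(buffer.Length * 2, required);
            Array.Resize(ref buffer, newSize);   // allocates a new array and copies
        }
        Buffer.BlockCopy(chunk, 0, buffer, used, count);
        used = required;
    }
}
```

Note that the array is usually larger than the data it holds, so you have to carry the 'used' count around with it. That bookkeeping is what List<T> does for you.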

Personally, I wouldn't use either. I would make every effort to process the bytes as they are read, so that they never have to be all in memory at once. That's the ideal solution, as it avoids the whole business of buffering an arbitrarily large amount of data.
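As a rough sketch of what processing the bytes as they are read might look like, assuming the data arrives through a Stream: the StreamingExample class, the ProcessAll method and the 4096-byte chunk size are all assumptions for illustration, and the 'process' callback stands in for whatever your code actually does with the data.

```csharp
using System;
using System.IO;

static class StreamingExample
{
    // Hypothetical helper: reads 'source' in fixed-size chunks and hands each
    // chunk to 'process'. Only one chunk is ever held in memory.
    // Returns the total number of bytes read.
    public static long ProcessAll(Stream source, Action<byte[], int> process)
    {
        byte[] chunk = new byte[4096];   // arbitrary chunk size
        long total = 0;
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
        {
            // Only the first 'read' bytes of 'chunk' are valid on this pass.
            process(chunk, read);
            total += read;
        }
        return total;
    }
}
```

The callback must finish with the chunk before it returns, since the same array is reused for the next read.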

If you can't process the bytes as they are read, but instead need to store them all up first, I would use a MemoryStream, and write to the MemoryStream as the bytes come in. Then when you're done, you can use the MemoryStream.ToArray() method to get the byte array representing the data.
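A minimal sketch of the MemoryStream approach, again assuming the bytes come from a Stream. BufferAll and ReadToEnd are hypothetical names, and the chunk size is arbitrary.

```csharp
using System.IO;

static class BufferAll
{
    // Hypothetical helper: copies everything from 'source' into a
    // MemoryStream and returns the accumulated bytes as a single array.
    public static byte[] ReadToEnd(Stream source)
    {
        byte[] chunk = new byte[4096];   // arbitrary chunk size
        using (MemoryStream ms = new MemoryStream())
        {
            int read;
            while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            {
                ms.Write(chunk, 0, read);   // MemoryStream grows as needed
            }
            return ms.ToArray();            // copy sized exactly to the data
        }
    }
}
```

Keep in mind that ToArray() makes one more copy of the data. If that matters, you can read from the MemoryStream directly instead.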

I believe that MemoryStream uses the same double-and-copy algorithm as List<T>, so if that wound up being a performance liability, I would switch to allocating individual buffers and storing them in a List<byte[]>. That is, rather than resizing a single byte[] over and over, just allocate a new byte[] when you've run out of room in your current byte[], storing a reference to each byte[] in the List<byte[]>.
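Here is one way the List<byte[]> idea could be sketched. The ChunkedBuffer class, its members and the fixed chunk size are my own assumptions for the example; the point is only that existing data is never copied when more room is needed.

```csharp
using System;
using System.Collections.Generic;

sealed class ChunkedBuffer
{
    private readonly List<byte[]> _chunks = new List<byte[]>();
    private readonly int _chunkSize;
    private int _usedInLast;   // bytes used in the most recent chunk

    public ChunkedBuffer(int chunkSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        _chunkSize = chunkSize;
        _usedInLast = chunkSize;   // forces allocation of the first chunk on first write
    }

    public long Length
    {
        get { return _chunks.Count == 0 ? 0 : (long)(_chunks.Count - 1) * _chunkSize + _usedInLast; }
    }

    public void Write(byte[] data, int offset, int count)
    {
        while (count > 0)
        {
            if (_usedInLast == _chunkSize)
            {
                // Out of room: allocate a new chunk rather than resizing.
                _chunks.Add(new byte[_chunkSize]);
                _usedInLast = 0;
            }
            int n = Math.Min(count, _chunkSize - _usedInLast);
            Buffer.BlockCopy(data, offset, _chunks[_chunks.Count - 1], _usedInLast, n);
            _usedInLast += n;
            offset += n;
            count -= n;
        }
    }

    // Only needed if a single contiguous array is required at the end.
    public byte[] ToArray()
    {
        byte[] result = new byte[Length];
        int pos = 0;
        for (int i = 0; i < _chunks.Count; i++)
        {
            int n = (i == _chunks.Count - 1) ? _usedInLast : _chunkSize;
            Buffer.BlockCopy(_chunks[i], 0, result, pos, n);
            pos += n;
        }
        return result;
    }
}
```

If the consumer can work chunk by chunk, you can skip ToArray() altogether and avoid the final copy as well.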

One more alternative would be to have the i/o code use individual byte[] instances only, and hand those off to a different thread that deals with writing them to a MemoryStream. In terms of performance, this would probably be somewhere in between using a List<byte[]> to store individual buffers and just always writing to a MemoryStream.

With this alternative, you could either use a double- or triple-buffering scheme, where you have two or three such buffers that are used in rotation, or you could just allocate a new buffer as needed, letting the used ones be garbage collected after they've been copied to the MemoryStream. The former has the advantage of not causing a lot of repeated allocations and collections, at the cost of added complexity and the possibility that the i/o thread will have to wait for a buffer to become available.
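A sketch of the simpler variant (a new buffer for each read, handed to a second thread) might look like the following. It leans on BlockingCollection<T>, which only arrived in .NET 4; on earlier versions you would need your own synchronized queue. HandOffExample, ReadWithHandOff and the queue bound of 3 are assumptions for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

static class HandOffExample
{
    // Hypothetical helper: the calling thread only reads into fresh buffers;
    // a second thread copies the filled buffers into a MemoryStream.
    public static byte[] ReadWithHandOff(Stream source, int bufferSize)
    {
        // Bounded queue: the reader blocks in Add() if the writer thread
        // falls more than 3 buffers behind.
        using (var queue = new BlockingCollection<ArraySegment<byte>>(3))
        using (var ms = new MemoryStream())
        {
            var writer = new Thread(() =>
            {
                foreach (ArraySegment<byte> seg in queue.GetConsumingEnumerable())
                {
                    ms.Write(seg.Array, seg.Offset, seg.Count);
                }
            });
            writer.Start();

            try
            {
                int read;
                do
                {
                    // Fresh buffer per read; used ones become garbage once
                    // the writer thread has copied them.
                    byte[] buffer = new byte[bufferSize];
                    read = source.Read(buffer, 0, buffer.Length);
                    if (read > 0)
                    {
                        queue.Add(new ArraySegment<byte>(buffer, 0, read));
                    }
                } while (read > 0);
            }
            finally
            {
                queue.CompleteAdding();   // lets the writer's loop end
                writer.Join();            // wait until everything is written
            }
            return ms.ToArray();
        }
    }
}
```

The rotating-buffer version would add a second queue that carries emptied buffers back to the reading thread, which is where the extra complexity and the possible waiting come from.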

Personally, if you have to buffer all the data, I would start with writing to a MemoryStream. It is by far the simplest approach, and may well perform adequately for your needs. Only if I ran into some specific performance issue would I then start exploring some of these other options. They are reasonably straightforward to code, but they would certainly obscure the core purpose of the code, and any complication of the code should be avoided unless absolutely necessary.

Pete