Re: what is the best datatype for..

Tech Tip: Click here to run a free scan for Windows Errors and optimize PC performance



On 2007-11-20 15:14:02 -0800, "Hilton" <nospam@xxxxxxxxxx> said:

You're combining two of my comments into one. When I refer to 'range size'
I mean are all the values 0-255 or 0-65535 etc. So a huge network stream
will only have a range size of 0-255 if you look at the individual bytes.
Same goes with calculating a histogram of an 8-bit indexed image; e.g. GIF
or PNG8 which is a very common operation - look at (almost) any image
application. You think they use an array or a dictionary? If I have a huge
input stream and was only looking at bytes (i.e. range of 0-255), would I
"definitely favour the array of ints?" - absolutely. Would using a
dictionary be faster than "x[y]++" where x is an array of ints? And which
would be clearer to read? Jon, Peter, take that as my throwing down a
challenge. :)

Challenge? Looks more like a straw man to me.

Jon's point appears to me to be that you wrote "with large amounts of data I would definitely favor the array of ints", but there's really nothing about the amount of data that predisposes it to being better solved by an array of ints when that amount is large.

He's not combining two comments. The single statement is by itself flawed. Even if you have a large amount of data, if the range is large and the population within that range is sparse, an array of ints is not likely to be a good solution.

I have already agreed that in many situations, I would prefer an array of ints for its simplicity. Raising that as an arguing point doesn't make any sense; we're already in agreement there.

The point of contention, which you seem to keep overlooking, is that your original claim was that an array of ints is efficient while a dictionary is NOT efficient. The claim is wrong twice: first, because the dictionary isn't inefficient at all even in cases where it's slower than the array of ints, and second because there are situations in which a dictionary solution is actually more efficient than the array of ints.

As far as readability goes, which is more readable depends on context. While I think that the array of ints is likely to be the simpler, more clear implementation in most cases, that doesn't mean it would be in all cases. For example, using a SortedDictionary produces very simple code IMHO because you never even have to have a separate section of code to sort the data. Even a Dictionary may be found to be simpler because of the ability to do compile-time type checking, allowing for it to be used in a simpler way elsewhere.

Context is everything, and there's not enough context to claim a priori that the array of ints is in fact the clearer implementation.

But besides, your contention has been that the array solution is superior due to _performance_, not readability. And on that basis, your claim has no firm foundation.

Pete

.



Relevant Pages

  • Re: question about pointer define
    ... p is a pointer to an array of 3 ints. ... Note that we considered this conversion rule in two different ...
    (comp.lang.c)
  • Re: "Parallel.For GC problems" and a solution.
    ... I think Dictionary(of dictionary(of small array of ints)) ... with the dictionaries holding arrays of keys, ... do not pass it a large array of things to process. ... With no breaks, memory goes crazy. ...
    (microsoft.public.dotnet.languages.vb)
  • Re: what is the best datatype for..
    ... I would definitely favor the array of ints, ... I mean are all the values 0-255 or 0-65535 etc. So a huge network stream ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: nonhomogenous structs (was: lisp performance questions and observations)
    ... ints ... naturally represented as an array of structs, ... So actually, I don't care about the mixture between bytes, longs, ... machine words for each struct before I even get to the "data" for the ...
    (comp.lang.lisp)
  • Re: contiguity of arrays
    ... Couldn't arrays of ints have stricter alignment ... you can get around that by converting to a character pointer ... > That's because there is only one such array for a given effective type. ... If an object is declared as a struct, ...
    (comp.lang.c)