Re: Comment on how to uniquely track your objects in C# / hash table / get hash code

Tech-Archive recommends: Fix windows errors by optimizing your registry



On Aug 5, 2:25 am, raylopez99 <raylope...@xxxxxxxxx> wrote:
A quick sanity check, and I think I am correct, but just to make
sure:  if you have a bunch of objects that are very much like one
another you can uniquely track them simply by using an ArrayList or
Array, correct?

You can uniquely track an instance of any reference type, no matter
what data is inside, because it has identity, which can be compared
against by using Object.ReferenceEquals(). It is this identity that
serves as a basis for default implementations of GetHashCode() and
Equals(), and more often than not it is good enough.

But, as pointed out in the links below, hash functions can be non-
unique, or give collisions with two different objects having the same
hash (rare, but happens),

This is largely irrelevant to the issue of performance, since hash
tables work correctly even with collisions - just slower. You're still
guaranteed to have O(1) on average on random input, though
pathological cases can become O(n).

I've seen examples on overriding GetHashCode and understand how it
works (kind of, at least it made sense), but my question /comment is
that if you're not concerned with performance, then a simple index
based retrieval scheme is best.

Maintaining a hash table also has some additional overhead, both on
insertion and on retrieval (computing hashes, balancing buckets etc).
In practice, particularly for small collections (roughly ~100 items),
where both insertions and lookups happen frequently, a plain list with
linear search can be more efficient than a hashtable both in terms of
speed and memory. For larger stuff, hashtable is worth considering -
though even then, if inserts are frequent and lookups are very rare, a
list is probably still better (but it is not a usual case).

On the other hand, if you fill the collection once, and then just do
lookups, never modifying it, then the most efficient implementation is
pre-sorting the collection and doing binary search on it
(Array.BinarySearch or List<T>.BinarySearch, as needed - but if a
collection is immutable, might as well make it an array for the slight
performance increase that gives).
.



Relevant Pages

  • Re: Comment on how to uniquely track your objects in C# / hash table / get hash code
    ... Array, correct? ... This is largely irrelevant to the issue of performance, since hash ... where both insertions and lookups happen frequently, ... about fast lookups for balanced red/black binary trees. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: hash tables, non-random keys
    ... I have a hash table that will be containing a non-uniform distribution ... The keys will be inserted sequentially, ... would just use an array. ... something I also sometimes do is to use a small hash to optimize lookups ...
    (comp.programming)