Re: Hashing: "blair" == "brainlessness" !! !!
From: Jon Skeet [C# MVP] (skeet_at_pobox.com)
Date: 04/29/04
- Next message: Dirk Reske: "performance counter problem"
- Previous message: Miha Markic [MVP C#]: "Re: datatable"
- In reply to: Francesc Benavent: "Re: Hashing: "blair" == "brainlessness" !! !!"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 29 Apr 2004 15:24:56 +0100
Francesc Benavent <fbenavent@cogitoergosum.com> wrote:
> >It would be *nice* if they didn't occur, but everything is
> > designed around them happening, just not terribly often.
>
> Yes, I'm starting to know about it... :)
:)
> >what was the greatest number of string which ever produced
> >the same hashcode?
>
> All of the 64 collision were just double. No hash code were shared by
> 3 or more strings.
Great.
> > just means that two objects need to have Equals called on them when
> > doing a lookup, rather than one.
>
> > How expensive do you think it would be to call Equals on all of them,
> > bearing in mind that unless they're the same length, Equals doesn't even
> > need to look at the data itself?
>
> I'm not sure to understand those paragrpahs, sorry. Can you explain it
> a litle more?
Sure. Hashtables generally work roughly in this way:
The table has a number of buckets. Each bucket has a portion of the
hashspace. (Simple example: you might have two buckets, with one bucket
having all negative hashcodes, another having all the non-negative
ones.) This is done so you can easily find a list of items with
"roughly" the right hashcode.
In each bucket are entries. The entry contains the actual hashcode and
a reference to the key object.
When looking for something, you first find the right bucket. You then
look down the list of all the entries (which are usually ordered) and
find the entries with the right hashcode. For each such entry, you call
Equals on the key you're trying to find, passing in the the reference
to the potentially matching key (or the other way round - it shouldn't
matter).
So, having hash collisions slows things down a bit, because you need to
call Equals more times in order to find the right key. It doesn't slow
it down a *lot* though, because you still only have to call Equals on a
few items (at most two in your case, according to the bit you wrote
earlier).
> > Nothing - it's just not a problem.
>
> Same to you ;-), please read my next message with description of
> specific problem.
I haven't seen that message yet... I'll comment when I do.
-- Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet If replying to the group, please do not mail me too
- Next message: Dirk Reske: "performance counter problem"
- Previous message: Miha Markic [MVP C#]: "Re: datatable"
- In reply to: Francesc Benavent: "Re: Hashing: "blair" == "brainlessness" !! !!"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|