Re: fastest sorted list type?



"Peter Duniho" <NpOeStPeAdM@xxxxxxxxxxxxxxxx> wrote in message
news:op.t4uik9nk8jd0ej@xxxxxxxxxxxxxxxxxxxxxxx
On Sat, 12 Jan 2008 17:39:14 -0800, colin <colin.rowe1@xxxxxxxxxxxxxxxxxx>
wrote:

[...]
yes its confusing the BinarySearch needs a instance compared to the
List<t>.Sort
I just have to use a dummy struct is all.

I don't know why BinarySearch() doesn't offer an overload that uses
Comparison<T>. Maybe they figured that since BinarySearch() only makes
sense if you repeatedly use the same comparison, that anyone using it
would already have created a single IComparer implementation that would be
used.

Note that if you were sorting class instances directly, you could just
implement IComparable for that class and use the default BinarySearch()
method, without passing any sort of comparison to it.

well its an int so I cant do that as that would just compare the values
directly not via a index into an array.

aha ok thanks, but im unsure why the binarysearch and add method is
slower
than sort,
as it calls my comparer far fewer times, yet it seemed to never end with
the
bigger lists ...

Why would you say that using BinarySearch() and Add() would call your
comparer "far fewer times"? The order of the algorithm is basically the
same: O(log n) for each element, for an overall order of O(n log n).

well its strange then,
for a particular run of numerous lists list I counted it called compare
23261
times using Sort, and 12885 using binarysearch with add
thats a ratio of 1.8:1

The big problem is that not only does using BinarySearch() require doing
just as many comparisons, every time you insert an element into the array
at a position based on the binary search, the portion of the array after
that position needs to be moved. This adds an O(n) operation to your
algorithm for each element you add, which changes the algorithm from O(n
log n) into one that's O(n^2). Which is basically awful. You might as
well be doing an insertion sort with a linked list at that point.

ah i hadnt thought of that.

If you want to maintain the data in sorted order as it's being added, and
you know that the data will _not_ be added to the collection in an order
that's already sorted (that is, the input data is basically random), then
you may prefer to use the non-generic SortedList collection. It uses a
binary tree, which has the search performance of a binary search, but with
a constant-order cost for adding elements instead of O(n).

no i dont need to read the data till its all sorted.
as I need to process the shortest first so I cant do that untill the last
one ahs been sorted.

[...]
Im not moving the structs at all, im only moving the index,
so its much like a class reference only with an extra level of
indirection.

I don't understand the point of that. If I understand correctly, you have
an array of ints that are indices into an array of structs? Since an int
is the same size as a reference, I definitely don't see what you hope to
save by doing it that way. If you can allocate your actual data all at
once, then the class instances will all be in the same place in the heap
anyway, just like an array.

What are you hoping to gain by basically reimplementing the reference
logic with integers?

well im going to change it all to class reference now and see,
if I find alloc is taking a lot of time il just put the class references
into an array and re use them ...
il keep you posted.

[...]
it takes about 2 minutes to process many lists, only a few of wich are of
that size.
im considering if a binary tree would do better, but its not a trivial
thing
to eveluate,
and its been a while since I went into this sort of thing in depth.

A binary tree isn't going to be faster in terms of providing a final
sorted list. But like I said, if you have a need for accessing the data
in sorted order for some reason as you are adding things to the
collection, the binary tree will be better there.

Implementation is trivial: just use the non-generic SortedList collection
class instead of the generic List<> class.

thanks for the help, seems like the add and then sort I had to start with
might be best after all.
class reference might speed things up a bit by losing the level of
indirection,
so il give that a spin.

thanks.
Colin =^.^=


.



Relevant Pages

  • Re: Binary search class: small problem retrieving the last element in the ordered array
    ... the code as I provided it does loop forever if there is no match. ... I coded my own algo because I don't want BinarySearch to order by array ... The array is already ordered. ... if the searched string is lexicographically bigger than any ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: Binary search class: small problem retrieving the last element in the ordered array
    ... I coded my own algo because I don't want BinarySearch to order by array every single time, ... For example, if the searched string is lexicographically bigger than any other string in the array, lower never gets to be equal to upper and the algo goes on in a loop until a until a stack overflow exception is rised. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: fastest sorted list type?
    ... I don't know why BinarySearch() doesn't offer an overload that uses Comparison. ... bigger lists ... ... The big problem is that not only does using BinarySearchrequire doing just as many comparisons, every time you insert an element into the array at a position based on the binary search, the portion of the array after that position needs to be moved. ... you have an array of ints that are indices into an array of structs? ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: matching a condition in a list
    ... > consider the linear search a real performance problem. ... search with runtime Thetais better than sorting in O) ... > Sorting the array. ... > sorting the array once, and using BinarySearch would be a benefit, at the ...
    (microsoft.public.dotnet.languages.vb)
  • Re: search in a array
    ... Dim FileIn() As String = ... ... 'BinarySearch' will only make sense for a sorted array. ... Binary search makes sense if you are searching several ...
    (microsoft.public.dotnet.languages.vb)

Loading