Re: ArrayList BinarySearch vs Contains

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance



"Dave Sexton" <dave@jwa[remove.this]online.com> wrote in message
news:uxbyI3cBHHA.2140@xxxxxxxxxxxxxxxxxxxxxxx
Hi Jon,

So it seems that choosing an appropriate algorithm depends not only on
what the O notation says about performance, but other factors that the
notation doesn't account for such as scale, depth and distribution of the
data, perhaps, which also may affect the performance of the algorithm.

Yup. Welcome to the world of optimization. :) The "order" of an algorithm
is an important analytical aspect, but as you've noted one should not
consider it to the be the last word in which algorithm is "best". There are
so many other factors that come into play, the order by itself just doesn't
tell you everything you need to know.

I guess the best way to choose an appropriate algorithm for an application
when dealing with complex data structures, sorts and searches, is to pick
one and see if it performs better than any other algorithm that I decide
to test.

Well, yes and no. I mean, it's useful to use the order as a starting point.
After all, you can pretty much rule out even testing a bubble sort when
you've got a Quicksort implementation to use. You don't literally need to
test all possible solutions, thank goodness.

Also note that testing your potential solutions does not necessarily tell
you worst-case performance. It depends on how you pick your data for the
tests, and different algorithms may have different data that results in the
worst-case performance. So, testing can be useful, but it's not necessarily
an improvement over just looking at the order. To improve upon the order
analysis, you have to do more analysis. Testing alone won't cut it.

I can choose the algorithms to be initially tested based on all of the
information that is available, ahead of time, about the data and its
structure, the noted performance of the algorithms and the acceptable
criteria for performance and memory consumption.

I'll be outsourcing any project in the future where this is required ;)

That works. :)

That said, I'll note that much of the time, absolutely *optimal* performance
is not required, while at the same time the very worst cases are not
necessarily that hard to avoid. For example, even in the average case,
Quicksort is dependent on the distribution of the data and how you pick the
pivot element for each iteration. Even when the magnitude is reasonably
proportional to n log(n), there can be wide variations in real-time
performance.

This can be addressed either by playing some tricks when picking the pivot,
or by simply ignoring the problem and accepting that occasionally you'll get
data organized in a way that hinders the sorting performance. The very
worst case comes when the data is already sorted, and this condition can be
detected fairly easily (and of course, if you detect it, you need not do any
more work :) ).

I'm just using Quicksort as an example. The same ideas apply to pretty much
any algorithm. Knowing the order is useful, but it's not the end-all,
be-all for picking an algorithm or implementation. You can use the order to
make some broad generalizations, and then refine the implementation to
address specifics that apply in one's particular situation. And in the end,
most coders simply accept that sometimes the universe will throw some really
hard data at the algorithm, and it won't get the very best performance it
might have gotten. :)

Pete


.



Relevant Pages

  • Re: Building Unification Table - tranforming prolog like notation into lisp notation
    ... If I don't pass my AI intro with Lisp course I won't be able to play ... notation problem comes  from tranlating prolog notation into lisp ... basic algorithm is a heavy user of resources. ... A slow growing ...
    (comp.lang.lisp)
  • Re: merits of Lisp vs Python
    ... any algorithm that deals with a fixed-number of elements ... That isn't how Onotation is defined. ... performance parameter), that still expresses an upper bound of some ... regardless of how large N gets: the upper bound is a function of the ...
    (comp.lang.lisp)
  • Re: Big Oh Notation
    ... Notation, but what puzzles me is a question in Sedgewick's algorithm ... Some motorcycles are the traditional ones, ... which get roughly a constant miles/gallon. ...
    (comp.programming)
  • Re: which is better, string concatentation or substitution?
    ... here's a quick tutorial to "big-oh" notation. ... For any algorithm, if you process n things (in the case of the strings ... but it doesn't really matter for small values of n. ... of time no matter how long the string is. ...
    (comp.lang.python)
  • Re: ArrayList BinarySearch vs Contains
    ... So it seems that choosing an appropriate algorithm depends not only on what ... the O notation says about performance, but other factors that the notation ... QuickSort has an average O ... Then the difficulty comes in comparing two algorithms that have asymmetric ...
    (microsoft.public.dotnet.languages.csharp)