Re: improving the performance of a large switch block



Mycroft Holmes wrote:

that was exactly my hope: the result was slower, but I don't know if that's due to the compiler not producing a jump table or to the slowness of binary search.

I wouldn't be surprised to learn that the compiler generates a better binary search for your switch() than std::lower_bound. The latter one is a general-purpose algorithm, which inherently has a loop internally, and it can't be optimal for any sepcific case. The compiler can unroll the loop for your switch.

If you think you can do a better job than the compiler, try to hand-optimize the binary search. You have 70 items, so for a binary search tree you only need 7 hand-crafted ifs, which will be faster than lower_bound.

This solution has the added advantage of being able to optimize the search tree. Here I mean optimizing the order of comparisons, not optimizing the source code itself. In easy-to-understand terms, you can test the most frequently occuring IDs first. The optimal search tree is most likely an unbalanced tree. That will be far from optimal for a truely random sequence, but if you know the distribution of your IDs, it will perform the best. For example, if your factory creates a particular object 90% of the time, you can arrange the ifs so that its ID is compared first. This will greatly increase your performance.

There are existing and well-known algorithms for producing optimal search trees, given the distribution of your elements. The Huffman compression comes to my mind, which does the same thing, even though for a different purpose.

You can bet that the compiler can't generate an optimal search tree for your switch(), it can only generate a balanced one, which is the best thing to do without knowing the frequency of each case. If you know the actual frequency, you can outperform the compiler easily.

See
http://en.wikipedia.org/wiki/Binary_search_tree
and look at "optimal binary search tree".

Look at
http://en.wikipedia.org/wiki/Huffman_tree
for an algorithm to generate optimized search trees.

Once you have the optimal search tree for your sample, try to code it without using loops, just hard-coding all the nested ifs. The disadvantage of this is that every time you need to modify your factory, you need to recode the tree. Also, the factory won't be reusable if the statistics of your elements change. You have to be careful, because an optimal search tree becomes severely sub-optimal if your distribution changes.

Tom
.