Re: Question about function templates and performance
- From: "colin" <colin.rowe1@xxxxxxxxxxxxxxxxxx>
- Date: Thu, 10 Jan 2008 21:19:18 GMT
"Peter Duniho" <NpOeStPeAdM@xxxxxxxxxxxxxxxx> wrote in message
news:op.t4qe99bu8jd0ej@xxxxxxxxxxxxxxxxxxxxxxx
On Thu, 10 Jan 2008 12:49:40 -0800, colin <colin.rowe1@xxxxxxxxxxxxxxxxxx>
wrote:
[...]
if only there was a way to force inlining 'sigh'
I wouldn't worry about that if I were you. In general, I would expect the
JIT compiler to do a good job identifying scenarios where inlining will
actually help performance. If it's helpful to inline, it's already doing
it for you.
You need to keep in mind that the cost of a function call is not the only
thing that affects performance. Inlining code makes the code bigger,
causing cache issues, and it also creates duplicates of the same code,
making it harder for the CPU to take advantage of pipelining and branch
prediction. Since the CPU (x86 especially) is highly dependent on these
features to get best performance, it's very important to not defeat those
optimizations in the pursuit of a different one.
In other words, think twice before you try to second-guess the compiler.
And if after thinking twice, you still want to do it, think again.
Wouldn't that be third-guessing?
The folks who wrote the JIT compiler aren't as dumb as some people seem to
think they are. Even with the best compilers, _someone_ can find an
optimization the compiler gets wrong. But the people in the world who
are capable of finding that optimization are few and far between. I
don't know you well enough to know if you're one of those
people, but from a purely statistical point of view, chances are
practically nil that you are. :)
Yeah, I appreciate that. I find compilers are a heck of a lot more robust
than they were 1 or 2 decades ago. I used to break them a lot way back
then, but can't remember the last time I did. I have, though, found less
than optimal code on more than a few occasions, particularly on
microcontrollers with no hardware divide, such as the compiler not always
recognizing that dividing by a constant 2 is a simple shift to the right.
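One possible reason a compiler does not always turn a divide by 2 into a bare shift (an assumption on my part, not something established in this thread) is that the two only match exactly for unsigned values. The sketch below is in C rather than C# so it stands alone. The function names are made up for illustration, and the signed version assumes the compiler implements right shift of a negative value as an arithmetic shift, which C leaves implementation-defined.

```c
#include <stdint.h>

/* Unsigned: x / 2 and x >> 1 give the same result for every input. */
uint32_t half_div(uint32_t x)   { return x / 2u; }
uint32_t half_shift(uint32_t x) { return x >> 1; }

/* Signed: division truncates toward zero (-3 / 2 == -1), while an
 * arithmetic shift rounds toward negative infinity (-3 >> 1 == -2),
 * so a bare shift is not a drop-in replacement. */
int32_t signed_half_div(int32_t x) { return x / 2; }

/* Hypothetical helper: add 1 to negative inputs before shifting so the
 * result matches truncating division. Assumes an arithmetic right shift
 * for negative values (implementation-defined in C). */
int32_t signed_half_fixup(int32_t x)
{
    int32_t bias = (int32_t)((uint32_t)x >> 31); /* 1 if x < 0, else 0 */
    return (x + bias) >> 1;
}
```

If the value being divided is known to be non-negative, declaring it unsigned is one way to let the compiler use the plain shift.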
The problem here comes when a function is called in excess of 100 million
times, yet those calls may come from only a handful of places in the code.
One such function compares two 3D points to test whether they are closer
together than rounding error. This comes about by having to compare every
object with every other object. Obviously there are ways to segregate them
so fewer comparisons need to be done, but this itself is quite difficult,
and I think I have pursued that as far as possible.
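A minimal sketch of that kind of comparison, in C rather than C# so it stands alone. The type Point3, the function names, and the tolerance parameter eps are all assumptions for illustration, not the actual code under discussion. It compares squared distance against a squared tolerance to avoid a square root, and the second function shows the every-object-against-every-other loop, which makes n*(n-1)/2 calls. By that count, 100 million calls corresponds to something on the order of 14,000 objects.

```c
#include <stddef.h>

/* Hypothetical 3D point type. */
typedef struct { double x, y, z; } Point3;

/* Returns 1 if a and b are within eps of each other, else 0.
 * Compares squared distance with eps squared, so no sqrt is needed. */
int nearly_equal(const Point3 *a, const Point3 *b, double eps)
{
    double dx = a->x - b->x;
    double dy = a->y - b->y;
    double dz = a->z - b->z;
    return dx * dx + dy * dy + dz * dz <= eps * eps;
}

/* Brute-force pass over every unordered pair: n*(n-1)/2 comparisons.
 * Returns how many pairs are within eps of each other. */
size_t count_coincident_pairs(const Point3 *pts, size_t n, double eps)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (nearly_equal(&pts[i], &pts[j], eps)) {
                ++count;
            }
        }
    }
    return count;
}
```

With only one call site inside the inner loop, this is the shape where a small comparison function is a natural inlining candidate, and where it is worth measuring whether the call itself shows up in a profile.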
It's just a question of being able to determine the places that give a good
performance gain for the effort spent optimising.
Colin =^.^=