Re: strange performance of nested virtual methods

From: Kismet (Kismet_at_discussions.microsoft.com)
Date: 03/18/05


Date: Fri, 18 Mar 2005 15:13:03 -0800

Sorry, you are correct.
I have done some further investigation.
If you disassemble the generated code and count machine cycles,
there is very little overhead for the virtual calls compared to the cycles
required to execute the SetValue method in the first place.
I am not sure what the point of your exercise is, but if you are worried
about the overhead of calling a virtual method, don't.

As a side note, you do realize that running your code produces essentially no
valid result, right? Since you call baseService.SetValue(), only the original
instance ("Service service") ever has its val assigned. The ServiceDecorator
instances inherit the val member but never set it. As a result, there is no
way to access your final computed value. To validate that I was actually
calling the methods as intended, I added a virtual GetVal() method to the
Service class. I also experimented with several variations, including a
delegate/event model, with slightly slower but similar results.
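
A minimal sketch of the situation described above, written in Python rather than the original .NET code (which is not shown in this thread). The class names Service and ServiceDecorator and the members val, SetValue, and GetVal come from the discussion; everything else, including the constructor shape and the snake_case method names, is an assumption made for illustration.

```python
class Service:
    """Innermost service: the only object whose val is ever assigned."""

    def __init__(self):
        self.val = 0

    def set_value(self, value):
        self.val = value

    def get_val(self):
        return self.val


class ServiceDecorator(Service):
    """Wraps another service and forwards every call to it."""

    def __init__(self, base_service):
        super().__init__()  # the decorator inherits its own val, which stays 0
        self.base_service = base_service

    def set_value(self, value):
        # Only the wrapped object is updated; self.val is never touched.
        self.base_service.set_value(value)

    def get_val(self):
        # Forwarding the read makes the innermost value reachable again.
        return self.base_service.get_val()


service = Service()
chain = ServiceDecorator(ServiceDecorator(service))
chain.set_value(42)

print(chain.val)        # 0: the decorator's inherited field was never set
print(chain.get_val())  # 42: the value stored on the innermost Service
```

Reading val directly on the outermost decorator yields the untouched inherited field, while a forwarded accessor reaches the value that was actually stored.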

I was a little surprised that the code, small as it is, didn't get inlined,
although that is probably because you can't inline virtual functions.

My best guess as to why the timings are as they are is as follows:
the CPU's level 1 and level 2 caches probably churn with 1 and 2 virtual
calls, while after that the cache holds the code required to execute the
virtual methods.
This could be verified by disabling the cache and rerunning the tests. (I
am unable to disable the cache on my laptop, which is where I am running the
code.)

"Markus Vogel" wrote:

> > I ran your code under the February Tech Preview of .NET 2005 beta, and got
> > different results. It worked as I expected.
> >
> > Timings are as follows:
> > 1547
> > 2953
> > 8859
> > 10313
> > 11843
> > 13219
> > 14750
> >
> > I can't say why you didn't get results similar to mine (I assume you are
> > using .NET 2003).
>
> But I would consider my result similar to yours.
> My results are:
>
> 521
> 1051 (1051-521 = 530)
> 2444 (2444-1051 = 1393)
> 4787 (4787-2444 = 2303)
> 5217 (5217-4787 = 430)
> 5718 (5718-5217 = 501)
> 6239 (6239-5718 = 521)
>
> My point of interest is the differences between the call times.
> The second and third ones are quite irregular (like your second one).
>
> 1547
> 2953 => 1406
> 8859 => 5906
> 10313 => 1454
> 11843 => 1530
> 13219 => 1376
> 14750 => 1531
>
> My question is: What is the reason for this irregular result?
>
>
> > Some things to keep in mind:
> > 1) my system is fairly clean. The code you posted is cpu intensive which
> > means that any other thread/application doing processing on your system
> > during the test WILL skew the results.
> I ran several tests. Apart from a few milliseconds, they were all the
> same.
>
> > 2) I hope the tests were executed from the command line, not run inside the
> > IDE.
> Yes, the tests were executed from the command line.
>
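
The per-step differences quoted above can be recomputed with a short Python sketch. The two timing lists are copied from the quoted messages; the function name deltas and the variable names are hypothetical. Note that the third difference in Markus's series works out to 2343 from the listed timings, whereas the quoted figure is 2303.

```python
def deltas(timings):
    """Return the increase from each timing to the next one."""
    return [later - earlier for earlier, later in zip(timings, timings[1:])]


# Cumulative timings as quoted in the thread, one entry per nesting level.
kismet = [1547, 2953, 8859, 10313, 11843, 13219, 14750]
markus = [521, 1051, 2444, 4787, 5217, 5718, 6239]

print(deltas(kismet))  # [1406, 5906, 1454, 1530, 1376, 1531]
print(deltas(markus))  # [530, 1393, 2343, 430, 501, 521]
```

In both series the large steps are confined to the early nesting levels, after which the per-level cost settles to a roughly constant value.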


