Re: Is there a faster hi resolution timer for diy profiling



"colin" <colin.rowe1@xxxxxxxxxxxxxxxxxx> wrote in message news:p0sjj.31043$ov2.4855@xxxxxxxxxxxxxxxxxxxxxxx
"Willy Denoyette [MVP]" <willy.denoyette@xxxxxxxxxx> wrote in message news:OrKMsJGWIHA.3556@xxxxxxxxxxxxxxxxxxxxxxx
"colin" <colin.rowe1@xxxxxxxxxxxxxxxxxx> wrote in message news:CNljj.348$tr1.211@xxxxxxxxxxxxxxxxxxxxxxx

"Willy Denoyette [MVP]" <willy.denoyette@xxxxxxxxxx> wrote in message news:ecqnZQBWIHA.4904@xxxxxxxxxxxxxxxxxxxxxxx
"colin" <colin.rowe1@xxxxxxxxxxxxxxxxxx> wrote in message news:4Jcjj.301$tr1.214@xxxxxxxxxxxxxxxxxxxxxxx
"Willy Denoyette [MVP]" <willy.denoyette@xxxxxxxxxx> wrote in message news:%23pkHBg8VIHA.4740@xxxxxxxxxxxxxxxxxxxxxxx
"Willy Denoyette [MVP]" <willy.denoyette@xxxxxxxxxx> wrote in message

// C function that uses an intrinsic to retrieve the Processor's cycle counter register
// Beware your CPU may not support this!
// Compile from the command line: cl /EHsc /O2 <thisfile.cpp>
#include <intrin.h>
#pragma intrinsic(__rdtsc)
extern "C"
__declspec(dllexport) unsigned __int64 __stdcall rdtsc()
{
    return __rdtsc();
}


// C#
[DllImport("thisfile.dll"),SuppressUnmanagedCodeSecurity]
static extern ulong rdtsc();

...
ulong cycles = rdtsc();

Willy.

thanks, that worked, however it still takes 9000 clock cycles,
which I think is roughly about the same time ...
is that PInvoke really so time-consuming?
what does it have to do?

thanks
Colin =^.^=

What exactly takes 9000 cycles?

int cnt = 100000;
ulong start = timer.Value;
for (int x = 0; x < cnt; x++)
{
    ulong cycles = timer.Value;
}
ulong timerOverhead = (timer.Value - start) / (ulong)cnt;

timer.Value -> get { return rdtsc(); }

timerOverhead comes out as about 9000;

Colin =^.^=


The following code:

using System;
using System.Runtime.InteropServices;
using System.Security;

class Program
{
    [DllImport("rtc"), SuppressUnmanagedCodeSecurity]
    static extern ulong rdtsc();

    static void Main()
    {
        int cnt = 100000;
        ulong cycles = 0;
        ulong start = rdtsc();
        for (int x = 0; x < cnt; x++)
        {
            cycles = rdtsc();
        }
        // average number of cycles per loop iteration
        ulong timerOverhead = (cycles - start) / (ulong)cnt;
        Console.WriteLine(timerOverhead);
    }
}

comes out as 13 on my box.
That means 13 cycles per iteration. The total number of instructions executed per iteration is ~35, of which ~30 are interop call overhead; the remainder is the increment of the loop counter and its compare against cnt.
The called function itself is only two instructions:

10001000 0f31 rdtsc
10001002 c3 ret
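
A side note on the method: the loop above estimates the per-call cost by reading the counter many times and dividing the elapsed count by the number of iterations. Below is a minimal portable sketch of the same idea in C++. It assumes std::chrono::steady_clock as a stand-in for rdtsc, so the result is in nanoseconds rather than cycles, and the function name average_read_overhead_ns is hypothetical, not something from the posted DLL.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the thread): average cost of one clock read,
// in nanoseconds, measured with the same "loop and divide" approach.
// Assumption: steady_clock stands in for the rdtsc-based timer.
double average_read_overhead_ns(int iterations)
{
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    const clock::time_point start = clock::now();
    clock::time_point last = start;
    for (int i = 0; i < iterations; ++i)
    {
        last = clock::now();  // the call whose cost is being measured
    }

    // Total elapsed time across all reads, divided by the number of reads.
    const std::chrono::duration<double, std::nano> elapsed = last - start;
    return elapsed.count() / iterations;
}
```

Calling average_read_overhead_ns(100000) in an optimized build, outside the debugger, should give a rough per-read figure. As with the C# loop, the number also includes the loop's own increment and compare.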

Seems like your processor does not support the rdtsc feature; what CPU do you run this on?

You can call the following C function before you try to use the Time Stamp Counter; it returns true if the TSC is supported, else false.

extern "C"
__declspec(dllexport) bool __stdcall tsc()
{
    bool tscFeature = false;
    int CPUInfo[4] = {0};
    // standard feature leaf 1: the feature flags are returned in EDX (CPUInfo[3])
    __cpuid(CPUInfo, 1);
    int nFeatureInfo = CPUInfo[3];
    // check TSC feature bit (bit 4, mask 0x10) in CPUInfo[3]
    if (nFeatureInfo & 0x10)
        tscFeature = true;
    return tscFeature;
}


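// C++ bool is a single byte, so marshal the return value as I1
[DllImport("yourdll"), SuppressUnmanagedCodeSecurity]
[return: MarshalAs(UnmanagedType.I1)]
static extern bool tsc();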
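
Feature-bit masks are easy to get off by one: the mask for bit n is 1 << n, so bit 4 is 0x10 and bit 27 is 0x8000000. For reference, the processor manuals list the TSC flag as bit 4 of EDX for CPUID leaf 1, and the RDTSCP flag as bit 27 of EDX for leaf 0x80000001. A small sketch of a mask helper follows; the names bit_mask, has_feature, kTscBit and kRdtscpBit are hypothetical and only illustrate the arithmetic. They take an already-fetched register value, so no CPUID call is made here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, not part of the posted DLL.

// Mask with only the given bit set, e.g. bit_mask(4) == 0x10.
constexpr std::uint32_t bit_mask(unsigned bit)
{
    return std::uint32_t{1} << bit;
}

// True if the given bit is set in a register value returned by CPUID.
inline bool has_feature(std::uint32_t reg, unsigned bit)
{
    assert(bit < 32);
    return (reg & bit_mask(bit)) != 0;
}

// Bit positions in EDX as documented for CPUID.
constexpr unsigned kTscBit = 4;      // leaf 1: time stamp counter (rdtsc)
constexpr unsigned kRdtscpBit = 27;  // leaf 0x80000001: rdtscp instruction
```

With the EDX value from leaf 1 in a variable edx, has_feature(edx, kTscBit) reports whether rdtsc is available.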



thanks for trying it on your machine :D

hmm, it seems running from the IDE debugger is making it incredibly slow.
I just made completely new files and a new C# console project,
managed to get 12 too by using Ctrl-F5,
but when I run with just F5 it's 9000!


Let me see, you are running a debug version in the IDE debugger and it returns 9000, right? Well, it will be slower, but not that slow; there is clearly something wrong (this 9000 seems like a magic number, weird).


is there a way to avoid the overhead that the attached debugger seems to impose?


No, there is no possibility (other than running outside the debugger) to avoid the overhead imposed by the fact that you are 1) running non-optimized code 2) inside the IDE debugger.

I only use the timer in debug mode to do profiling anyway,
but I guess I could detect if the debugger is not attached and dump it to a file

You should not instrument debug builds, but if you do, you should accept the overhead imposed by the environment (debugger, non-optimized JIT code).

Willy.

