Re: ATI vsync CPU usage
- From: "Jan Bruns" <testzugang_janbruns@xxxxxxxx>
- Date: Sun, 18 Jun 2006 09:07:02 +0200
I don't understand this workaround.
First of all, IDirect3DDevice9::Present can't be called
with the D3DPRESENT_DONOTWAIT flag, since it only takes
4 parameters (2 rects, 1 window handle, and 1 dirty region).
The SDK lists the D3DPRESENTFLAGs used on device creation,
and that list doesn't include a D3DPRESENT_DONOTWAIT either.
So, do I understand your description correctly, that the workaround
looks like this?
if ( IDirect3DSwapChain9::Present(...,D3DPRESENT_DONOTWAIT)
     = D3DERR_WASSTILLDRAWING ) then
begin
  repeated_yielding_try_to_lock_last_bb;
  IDirect3DSwapChain9::Present(...,0)
end;
This would make sense if the lock attempt refers to the
surface we last rendered to, so that
IDirect3DSwapChain9::Present(...,0) shouldn't have to wait
for any GPU-side rendering to complete.
But Present (or some other method) also has to wait for some
vsync-related event.
If this wait for a vsync event were implemented as some form
of pipeline blocking, it would significantly reduce the available
GPU power for many applications.
Furthermore, Scott's results show significant CPU waste even when
the GPU has no time-consuming work to do.
Let's hope these cards don't do both, limiting the GPU and wasting CPU.
I'm not sure, but I assume that would most likely mean that this
workaround doesn't work at all.
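To make the question concrete, here is the control flow I mean as a small C++ simulation. It is only a sketch: FakeSwapChain, Result and presentWithYield are made-up names, nothing here calls Direct3D, and the fake simply reports "still drawing" for a fixed number of polls. In real code the three methods would correspond to IDirect3DSwapChain9::Present with D3DPRESENT_DONOTWAIT, a LockRect/UnlockRect pair with D3DLOCK_DONOTWAIT, and a blocking Present.

```cpp
#include <chrono>
#include <thread>

// Stand-ins for the D3D9 result codes; the real ones come from d3d9.h.
enum class Result { Ok, WasStillDrawing };

// Hypothetical fake of the driver side: it reports "still drawing"
// for the next `pendingPolls` lock attempts.
struct FakeSwapChain {
    int pendingPolls;

    // Models IDirect3DSwapChain9::Present(..., D3DPRESENT_DONOTWAIT).
    Result presentDoNotWait() const {
        return pendingPolls > 0 ? Result::WasStillDrawing : Result::Ok;
    }

    // Models LockRect(..., D3DLOCK_DONOTWAIT) on the last back buffer.
    Result tryLockLastBackBuffer() {
        if (pendingPolls > 0) {
            --pendingPolls;
            return Result::WasStillDrawing;
        }
        return Result::Ok;
    }

    // Models the blocking Present(..., 0).
    Result presentBlocking() {
        pendingPolls = 0;
        return Result::Ok;
    }
};

// Returns how often the caller slept before the frame was presented.
int presentWithYield(FakeSwapChain& sc) {
    int yields = 0;
    if (sc.presentDoNotWait() == Result::WasStillDrawing) {
        // Poll the last back buffer, sleeping instead of spinning.
        while (sc.tryLockLastBackBuffer() == Result::WasStillDrawing) {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
            ++yields;
        }
        // The GPU is done with the surface, so this should not block on rendering.
        sc.presentBlocking();
    }
    return yields;
}
```

Note that the sketch says nothing about the vsync wait itself, which is exactly the part I am unsure about.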
Regards
Jan Bruns
"Wyck" <Wyck@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message news:1F4E9952-08C0-449B-A199-EF6B3A9279ED@xxxxxxxxxxxxxxxx
Not sure of the relevance of this to ATI in particular. I haven't even
verified if it's true yet, but it's worth posting here because you can try
the workaround technique and see if you get different results.
The Microsoft DirectX runtime may be responsible for increased CPU usage
during a call to IDirect3DDevice9::Present, and the driver manufacturer may
not be able to do anything about it.
In a technical discussion, [someone] revealed to me more about the inner
workings of the DirectX runtime and how it uses the display driver. [they]
claim that a call to IDirect3DDevice9::Present with the D3DPRESENT_DONOTWAIT
flag results in a call to the flip call in their driver. If the driver
returns the "was still drawing" return code, the DirectX runtime loops and
calls the driver again. This results in a spin loop that consumes 100% CPU.
[They] suggest a workaround for this behaviour. The workaround is to lock
the last backbuffer in the swap chain after presenting, but before other
drawing commands are issued.
Apparently, the spin loop caused by the DirectX runtime does not happen when
calling IDirect3DSurface9::LockRect function with the D3DLOCK_DONOTWAIT flag.
In this case, the driver returns the "was still drawing" return code, and the
IDirect3DSurface9::LockRect function returns D3DERR_WASSTILLDRAWING to the
caller. The caller may then sleep for a nominal period (such as one
millisecond) to try to yield the processor to other threads.
This of course depends on creating your backbuffers (at device creation
time) with the lockable flag set.
// Here's the meat of the workaround.
hr = pDevice->GetBackBuffer( 0, back_buffer_index, D3DBACKBUFFER_TYPE_MONO,
                             &pLastBackBuffer );
if( FAILED(hr) ) {
    return hr;
}
D3DLOCKED_RECT lr;
hr = pLastBackBuffer->LockRect( &lr, NULL, D3DLOCK_DONOTWAIT );
while( FAILED(hr) ) {
    if( hr == D3DERR_WASSTILLDRAWING ) {
        // The GPU is still using this surface: yield the CPU and try again.
        YieldCPU();
        hr = pLastBackBuffer->LockRect( &lr, NULL, D3DLOCK_DONOTWAIT );
    } else {
        // Some other unexpected error occurred.
        pLastBackBuffer->Release();
        return E_FAIL;
    }
}
pLastBackBuffer->UnlockRect();
// GetBackBuffer added a reference to the surface, so release it when done.
pLastBackBuffer->Release();
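YieldCPU() is not an SDK function, so here is one possible definition. This is only a sketch under my own assumptions: it uses the standard library so that it is portable, and on Windows a plain Sleep(1) should do the same job.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: block this thread for about one millisecond so that
// other threads can run. On Windows, Sleep(1) is the direct equivalent.
void YieldCPU() {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
}
```

I chose a real sleep over std::this_thread::yield() (or Sleep(0)) because a bare yield returns immediately when no other thread is ready, and the loop would then spin at full CPU usage again. The actual sleep time depends on the system timer resolution and may be noticeably longer than one millisecond.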
"Jan Bruns" wrote:
In another thread ("multithreading for pathfinding?") Scott
described a vsync-related problem of significant CPU waste
when using ATI cards in fullscreen.
Scott's results make a convincing case that this problem is caused
by the hardware or the driver.
I read about this problem years ago, but expected it to
be fixed soon.
So first of all, can someone reproduce Scott's results?
If so, does anyone know what workaround ATI developer support
suggests for this problem?
Scott wrote in message ( news:e565a902pav@xxxxxxxxxxxxxxxxx )...
> Jan Bruns:
>> On my computer (amd64-3000, gf6800, winXP) I don't see any CPU usage for
>> a simple self-made sample program (Present Interval One), no matter whether
>> it runs in fullscreen or not, and I expected this problem to be something
>> of the past (pre-GF4 hardware).
>> Would you please rerun your test using one of the SDK samples?
>> Maybe you forgot to call ValidateRect() inside the message loop?
> Here are some numbers using the SDK samples. My configuration is a dual-core
> machine with a Radeon X800 PCI Express graphics adapter. Utilizations
> reported are the total of both cores (i.e. 50% indicates one core fully
> utilized).
> Using the "blobs" sample:
>   pres mode = one
>     windowed = 10-20% utilization
>     fullscreen = 54% utilization
>   pres mode = immediate
>     windowed = 54% utilization
>     fullscreen = 54% utilization
> Using the "antialias" sample:
>   pres mode = one (~ 60 fps)
>     windowed = 7% util
>     fullscreen = 53% util
>   pres mode = immediate (~ 1000 fps)
>     windowed = 54% util
>     fullscreen = 54% util
On my computer (single CPU, Nvidia GPU), the "blobs" sample doesn't show
any significant CPU usage when vsync is enabled, regardless of whether it
runs fullscreen or not.
For the "antialias" sample, I see a slightly increased CPU usage
of 20% in fullscreen mode. This could be caused by "Cool'n'Quiet"
or by sloppy design of the application.
Scott also described tests using CPU-stressing methods, so CPU power saving
doesn't seem to be the cause of his results.
Regards
Jan Bruns