Re: Persuading PCI memory writes to write as a burst
- From: "Calvin Guan" <hguan@xxxxxxxxxxxxxxxxxxx>
- Date: Thu, 5 Apr 2007 13:26:55 -0700
This question has been asked and answered many times.
With current x86, you cannot reliably control how consecutive writes/reads
are issued on the bus, period. Your best bet is to get fast back-to-back
(fb2b) going. Even then it's very chipset dependent: your chip and the
chipset both have to satisfy some strict timing conditions during the
consecutive accesses. Moving to PCIe doesn't help perf much in this case.
I'd rather invest in the bus master engine first.
--
Calvin Guan
Broadcom Corporation
Connecting Everything(r)
"Colin Grant" <ColinGrant@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:9D314EBA-D2AB-4D1D-8AD9-F67EC1ACDAFC@xxxxxxxxxxxxxxxx
Please let me know if you have any suggestions regarding faster ways to
write to memory in this setup:
I have a user mode application that loops round as fast as it can (over 20K
cycles per second, preferably 100K cycles per second) doing a bit of
processing, then writing a few (2 or 3) resulting chunks of memory (each
chunk is only of the order of 16 bytes) followed by a read before processing
again.
To avoid the cost of switching to kernel mode, I have written a device
driver to map PCI (old 33MHz variety) memory space into the user space of
the application (calling MmMapIoSpace with the MmWriteCombined parameter).
The application writes data to the mapped memory at consecutive addresses to
encourage PCI bridge(s) to do write combining.
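
As a rough illustration of the write pattern described above, here is a
minimal sketch in C. It assumes the pointer refers to the write-combined
mapping of BAR0; the helper name wc_write_chunk and the test buffer are
hypothetical and not from the original post. It stores 32-bit words to
ascending addresses and then issues SFENCE, which is one of the events that
drains the processor's write-combining buffers. Whether the data then
appears on the PCI bus as one burst is still up to the chipset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64) || defined(_M_IX86) || \
    (defined(__i386__) && defined(__SSE__))
#include <xmmintrin.h>                 /* _mm_sfence (SSE) */
#define WC_FLUSH() _mm_sfence()
#else
/* Stand-in so the sketch still compiles on non-x86 GCC/Clang builds. */
#define WC_FLUSH() __sync_synchronize()
#endif

/*
 * Hypothetical helper: copy `words` 32-bit values to ascending addresses
 * in a (presumed) write-combined mapping, then flush the WC buffers.
 * `dst` would be the user-space pointer handed back by the driver.
 */
static void wc_write_chunk(volatile uint32_t *dst, const uint32_t *src,
                           size_t words)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    /* Strictly ascending, gap-free stores give the CPU the best chance
     * of merging them in a single write-combining buffer. */
    for (i = 0; i < words; ++i)
        dst[i] = src[i];

    /* Push the combined data out now rather than whenever the buffer
     * happens to be evicted. */
    WC_FLUSH();
}
```

A 16-byte chunk would be written as wc_write_chunk(bar0, chunk, 4), where
bar0 is the mapped pointer. Flushing once per chunk, rather than once per
word, is the point of the design: the fence marks where combining stops.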
The PCI card has BAR0 declaring memory with prefetch (but, sadly, not fast
back-to-back).
When I get on to reading larger amounts of memory (maybe 32 32-bit words) I
will try to use some kind of prefetch or (non?) temporal hint to improve the
read time. It is a command/response 'protocol' to the PCI card so I expect
there is a danger of ending up with stale data, but I guess a ?fence
instruction will cope with that and leave no time for a prefetch to fetch
anything before the application is reading it anyway.
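
The command-then-read ordering mentioned above could be sketched roughly as
follows. This is only an illustration under assumptions: the two-word
register layout (CMD_WORD, RESP_WORD) and the function name are
hypothetical, and MFENCE is picked here as one candidate for the "?fence"
because it orders earlier stores ahead of later loads. Whether that alone is
enough to avoid stale data depends on the card and the chipset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64) || defined(_M_IX86) || \
    (defined(__i386__) && defined(__SSE2__))
#include <emmintrin.h>                 /* _mm_mfence (SSE2) */
#define FULL_FENCE() _mm_mfence()
#else
/* Stand-in so the sketch still compiles on non-x86 GCC/Clang builds. */
#define FULL_FENCE() __sync_synchronize()
#endif

/* Hypothetical layout: word 0 takes the command, word 1 holds the reply. */
enum { CMD_WORD = 0, RESP_WORD = 1 };

/*
 * Write a command word, fence, then read the response word.
 * `regs` is assumed to point at the mapped device memory; `volatile`
 * keeps the compiler from caching or reordering the two accesses.
 */
static uint32_t command_then_read(volatile uint32_t *regs, uint32_t command)
{
    assert(regs != NULL);

    regs[CMD_WORD] = command;

    /* The store must leave the CPU before the load is issued, so the
     * load cannot be satisfied early with an old value. */
    FULL_FENCE();

    return regs[RESP_WORD];
}
```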
Sadly I can't change the application algorithm at this time to overlap
processing and I/O, so it is process|I/O|process|I/O|process...
I also cannot change the PCI card to bus master result(s) back up to the
application.
This is on Windows XP writing to PCI on 2+ GHz processors (P4 in one PC,
Core Duo in another PC).
There are plans to move to PCI Express, so I'm trying to find out how to
provoke posted writes in the same application + device driver setup.
At the moment I can see (using a PCI bus analyzer) that I'm having some
success with burst writes: sometimes there's a single 32-bit write followed
by a burst of (say) 7 x 32-bit writes. However, other times the same 8 x
32-bit writes to the same addresses end up as a single write followed by
(say) a burst of 4 writes and then a few trailing single writes. That's even
when I simplify the application to consecutive assembly statements that
write register data to the consecutive addresses. I also ran it several
times to rule out something else stealing my processor timeslice.
I'm hoping that there are better ways to write the data, either by using
different assembly instructions or DMA (but can that only be provoked while
in the device driver?) or something else.
TIA,
(a bit long but it's my first post so please be gentle :)