Re: memcpy to PCI Adapter Memory

From: Stephan Wolf [MVP] (stewo68_at_hotmail.com)
Date: 07/02/04


Date: Fri, 02 Jul 2004 16:18:46 +0200

IIRC, REP MOVSD on the 80386 performed very differently depending on
the alignment of the source and destination memory addresses, i.e.
(E)SI and (E)DI. I don't know whether later CPUs still show this
behaviour. I thus always dword-align the destination before REP MOVSD
in assembler code (which I rarely write these days).
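
For illustration, here is roughly what "dword align the destination
first" looks like in plain C rather than assembler. This is only a
sketch: copy_dst_aligned() is a made-up name, not an existing API, and
it works on ordinary memory. A copy to real adapter memory would
presumably also need volatile accesses or the platform's own access
routines, which are left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an existing API): copy n bytes from src to dst,
 * byte-copying until dst is 4-byte aligned, then moving 32-bit words. */
void copy_dst_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: single bytes until the destination is dword aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: one aligned 32-bit store per iteration. The source may still
     * be misaligned here, so it is read via memcpy instead of a cast. */
    while (n >= 4) {
        uint32_t w;
        memcpy(&w, s, sizeof w);
        *(uint32_t *)(void *)d = w;
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Tail: the remaining 0 to 3 bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

Only the destination is aligned, as described above. If source and
destination happen to share the same alignment, the reads in the body
loop become aligned as well.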

Also, I guess RtlCopyMemory() is actually implemented in the HAL. Not
sure, though.

I was hoping someone with more insight into this would jump in here...

Stephan

---
On Fri, 02 Jul 2004 12:21:09 GMT, "cd" <cd@junk.com> wrote:
>My guess is that byte copy was the easiest thing to do to get it out the
>door. Looking at Linux code, some processors have great memcpy_*io support
>(i.e. checking for alignment, using largest word size possible), and others
>default to the simple byte copy. Even *I* could make a slightly faster model
>just by looking for similar alignment on source and destination to use
>larger word sizes. This is why a HAL routine to optimize memory moves
>to/from IO bus is a must to optimize system performance especially as
>processors get so much faster than IO busses.
>
>
>"Stephan Wolf [MVP]" <stewo68@hotmail.com> wrote in message
>news:jvaae0duueorjqp0svhucr3hvj53fbqqfh@4ax.com...
>> On Thu, 01 Jul 2004 18:24:22 GMT, "cd" <cd@junk.com> wrote:
>>
>> >Those are either memcpy or byte copies which may not be the most efficient
>> >(especially on RISC architectures).
>>
>> Exactly. But memcpy() or, in this case, RtlCopyMemory(), for sure uses
>> 32-bit or 64-bit memory move instructions on x86 and ia64,
>> respectively.
>>
>> Sorry, but I am not aware of why RtlCopyMemory() cannot be used on
>> RISC for mapped memory as the comment in "ndis.h" states.
>>
>> All I know is that NdisMoveMappedMemory() is probably the best choice
>> for *network* drivers, which is my major field of activity.
>>
>> Stephan

