Re: Curious about loop optimization C++ - assembly


Egbert Nierop (MVP for IIS) wrote:
Hi,

Out of curiosity, I sometimes look at the assembly produced by a
release-mode build.

What you often see is that the C++ compiler always copies values from
a to b with explicitly addressed register moves...

Meanwhile stosb, stosw, stosd etc., and likewise movs[x], are single
instructions that implicitly use registers ESI and EDI (source,
destination) to copy data.
This seems (imho) more efficient, yet the C++ compiler never uses this
construct... it always uses a lot more instructions.
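
For readers who don't have the string instructions in their head, here is a rough C++ model of what they do with the direction flag clear: MOVSD copies one dword from [ESI] to [EDI] and advances both pointers by 4, STOSD stores EAX at [EDI] and advances EDI only, and a REP prefix repeats the instruction ECX times. The StringRegs struct and the function names are made up for illustration; this is only a sketch of the semantics, not of how the CPU implements them.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of the registers the string instructions use implicitly.
struct StringRegs {
    const std::uint32_t* esi;  // source pointer (ESI)
    std::uint32_t*       edi;  // destination pointer (EDI)
    std::uint32_t        eax;  // value written by STOSD (EAX)
    std::size_t          ecx;  // repeat count (ECX)
};

// MOVSD, direction flag clear: copy one dword, then advance both pointers.
inline void movsd(StringRegs& r) {
    *r.edi = *r.esi;
    ++r.esi;  // ESI += 4
    ++r.edi;  // EDI += 4
}

// STOSD, direction flag clear: store EAX, then advance the destination only.
inline void stosd(StringRegs& r) {
    *r.edi = r.eax;
    ++r.edi;  // EDI += 4
}

// REP MOVSD: repeat MOVSD until ECX reaches zero.
inline void rep_movsd(StringRegs& r) {
    while (r.ecx != 0) {
        movsd(r);
        --r.ecx;
    }
}
```

Note that MOVSD always moves to the next consecutive dword, so on its own it copies a contiguous block rather than every second element.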

Imagine this loop (I simplified the idea; of course, memcpy would
normally be used):


DWORD anArray[10000];
DWORD somesource[10000];
int element = 0;

// copy array while skipping odd element positions

for (int mycounter = 5000; mycounter != 0; mycounter--, element += 2)
    anArray[element] = somesource[element];


could be optimized to

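set up source and destination

    MOV EDI, OFFSET anArray      ; destination address
    MOV ESI, OFFSET somesource   ; source address
    MOV ECX, 5000                ; myCounter (LOOP runs ECX times)
    CLD                          ; forward copy

mylabel:
    MOVSD                        ; actual copy instruction, ESI and EDI += 4
    ADD ESI, 4                   ; skip the odd element
    ADD EDI, 4
    LOOP mylabel                 ; decrement ECX, repeat until ECX == 0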


Q: Is the mentioned construct simply not that efficient, or is there a
reason the C++ compiler team decided not to try to optimize to this
level?

The LOOP and MOVS instructions are horribly slow on modern CPUs because they
don't make effective use of the deep pipeline in the CPU. The longer
instruction sequence actually executes many times faster.
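
As a minimal sketch of the "longer instruction sequence" side, here is the loop from the question as a self-contained function (the name copy_even_elements and the use of std::uint32_t in place of DWORD are my own choices for illustration). Each iteration is an independent load and store through an ordinary register, which is the form an optimizer typically emits as plain MOV instructions. The exact output depends on compiler and flags, so it is worth checking the listing yourself, e.g. with cl /O2 /FA or g++ -O2 -S.

```cpp
#include <cstddef>
#include <cstdint>

// Copies the elements at even indices from src to dst and leaves the odd
// positions of dst untouched. n is the total number of elements in each array.
void copy_even_elements(std::uint32_t* dst, const std::uint32_t* src,
                        std::size_t n) {
    for (std::size_t i = 0; i < n; i += 2) {
        // One explicit load and one explicit store per iteration; no ESI/EDI
        // string instruction and no LOOP are required to express this.
        dst[i] = src[i];
    }
}
```

Calling copy_even_elements(anArray, somesource, 10000) would correspond to the 5000 iterations of the original loop.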

IIRC, VC++ did generate LOOP/MOVS years ago (VC1-4 maybe?), but has moved
away from using those constructs since maybe the Pentium.

-cd

