Re: Optimization! Where?


"Alan Carre" <alan@xxxxxxxxxxxxxxxxx> wrote in message
news:eOGEmWUWJHA.4252@xxxxxxxxxxxxxxxxxxxxxxx
If you look at the way branch prediction works, you'll see that there's no
mapping from registers to cached instructions and/or micro-op sequences.
ret, on the other hand is not equivalent, the addresses on the stack are
fixed numbers (not registers) and may have associated entries in the
prediction table [table mappings are from cached table entries to actual
memory locations. Lookups are based on addresses, not registers]. It
doesn't work by doing "cmp edx, [address]" or anything like that. No actual
cpu instructions are involved in these lookups (otherwise what's the
point?).


From the Intel Optimization Manual:

"The BPU makes the following types of predictions:

- Direct calls and jumps;
- Indirect calls and jumps."

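The target address doesn't have to be evaluated for the prediction. The BP
cache key is the address of the jump/call instruction itself, which is
already known when the instruction is fetched, whether the target comes
from an immediate, a register, or memory.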
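
To make the point concrete, here is a toy model of a branch target buffer in C. It is only a sketch: the names (btb, btb_predict, btb_update), the 16-entry direct-mapped layout, and the "last target wins" policy are my own assumptions for illustration and do not describe how any particular Intel CPU is built. What it does show is that the lookup takes nothing but the address of the branch instruction, so an indirect "jmp edx" can be predicted without knowing what is in edx.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy branch target buffer: direct-mapped and keyed by the address of the
   branch instruction. Size and layout are illustrative assumptions only. */
#define BTB_ENTRIES 16u

typedef struct {
    bool      valid;
    uintptr_t branch_addr;  /* tag: address of the jmp/call instruction */
    uintptr_t target;       /* target observed the last time it executed */
} btb_entry;

typedef struct {
    btb_entry e[BTB_ENTRIES];
} btb;

/* Pick a slot from the instruction address alone. */
static size_t btb_index(uintptr_t branch_addr)
{
    return (size_t)(branch_addr % BTB_ENTRIES);
}

/* Predict a target. Only the instruction address is consulted; the register
   or memory operand that holds the real target is never read here. */
static bool btb_predict(const btb *t, uintptr_t branch_addr,
                        uintptr_t *predicted)
{
    const btb_entry *e = &t->e[btb_index(branch_addr)];
    if (e->valid && e->branch_addr == branch_addr) {
        *predicted = e->target;
        return true;
    }
    return false;  /* no history for this branch yet */
}

/* Once the branch has resolved, remember where it actually went. */
static void btb_update(btb *t, uintptr_t branch_addr, uintptr_t actual_target)
{
    btb_entry *e = &t->e[btb_index(branch_addr)];
    e->valid = true;
    e->branch_addr = branch_addr;
    e->target = actual_target;
}
```

In this model the first execution of a given branch misses, btb_update records the resolved target, and later executions at the same instruction address are predicted to go to that recorded target. The prediction is wrong only when the indirect target actually changes between executions.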

