Re: Double-Checked Locking pattern issue

"George" <George@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:AFF9AD5D-1564-4099-AA56-C36A23EA124B@xxxxxxxxxxxxxxxx
Thanks Igor,


to memory is expensive, it is conceivable the CPU may reorder the write
to the pointer as early as possible.

The only reason I can think of is that writing to memory earlier frees the
register, so the register can be reused for something else later.

Why do you think writing early would improve performance? Could you give a
fuller explanation or some pseudocode, please?

You aren't considering pipelining, cache effects, speculative execution, or
any of the other things that modern CPUs use to retire multiple instructions
per clock.

In short, while some CPUs can retire four instructions per clock, there
aren't four copies of every execution unit, and the CPU certainly can't
transfer that much to or from memory at once. So the CPU reorders
instructions so that instructions using different parts of the CPU execute
together. In essence, this is what hyperthreading also does, except that it
interleaves a separate flow of execution instead of reordering a single flow.

All of this is part of the reason that function calls are so very expensive.
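
To tie this back to the double-checked locking question, here is a minimal
sketch of one way to keep the pointer store from moving ahead of the object's
construction. It assumes a C++11 or later compiler with std::atomic, which
nobody in this thread has mentioned; Widget, g_instance, g_lock and
get_instance are made-up names for illustration only.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical payload type, used only for illustration.
struct Widget {
    int value;
    Widget() : value(42) {}
};

std::atomic<Widget*> g_instance{nullptr};  // hypothetical shared pointer
std::mutex g_lock;                         // hypothetical lock guarding creation

Widget* get_instance() {
    // First check, without the lock. The acquire load pairs with the release
    // store below: a thread that sees a non-null pointer also sees the
    // fully constructed Widget behind it.
    Widget* p = g_instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> guard(g_lock);
        // Second check, under the lock: another thread may have created the
        // object while we were waiting.
        p = g_instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Widget();  // construct the object completely first
            // Release store: the constructor's writes may not be reordered
            // after this publication of the pointer.
            g_instance.store(p, std::memory_order_release);
        }
    }
    assert(p != nullptr);  // sanity check for the sketch
    return p;
}
```

The release store is the part that forbids the early pointer write discussed
above, and the acquire load is the matching half on the reader side. With a
plain pointer and no barriers, neither the compiler nor the CPU is obliged to
keep that order.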



regards,
George

