Re: Memory ordering and flag/state variables



Hi Tim

On Intel processors, a
write will always complete before a read from the same address.

In fact, the above is 100% true only for one-byte reads and writes -
a one-byte memory access is always atomic. However, when it comes to
words and dwords, your statement applies if and only if the memory
access is properly aligned (i.e. on a word or dword boundary,
respectively), or the LOCK prefix is explicitly used - a non-aligned
memory access is not guaranteed to be atomic.
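
To make the alignment condition concrete, here is a minimal sketch in C. The helper name is_naturally_aligned is hypothetical (it is not a WDK or CRT function); it only checks whether an address falls on a boundary of the given access size, which is the condition described above for words and dwords.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: returns true when `addr`
 * lies on a `size`-byte boundary. `size` must be a power of two
 * (1 for a byte, 2 for a word, 4 for a dword). */
static bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (uintptr_t)(size - 1)) == 0;
}
```

For example, a dword at address 0x1000 passes the check and a dword at 0x1001 does not, while any address passes for a one-byte access.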


Anton Bassov

Tim Roberts wrote:
BubbaGump <> wrote:

If a flag (in a device extension for instance) is set in some driver
routine and checked in another, shouldn't the first routine
technically perform some sort of memory barrier (KeMemoryBarrier() or
an InterlockedExchange()) to make sure the new value of the flag would
be seen by all CPUs before the function returns?
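
For reference, the pattern being asked about might look roughly like the sketch below. It uses portable C11 atomics as a stand-in for the kernel primitives (KeMemoryBarrier(), InterlockedExchange()), so it is an analogy rather than driver code; struct device_ext, ext_init, publish and try_consume are all hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device extension; not a real WDK type. */
struct device_ext {
    int         payload; /* ordinary data written before the flag */
    atomic_bool ready;   /* flag set in one routine, checked in another */
};

static void ext_init(struct device_ext *ext)
{
    ext->payload = 0;
    atomic_init(&ext->ready, false);
}

/* Writer: store the data, then set the flag with release ordering so
 * the payload write cannot become visible after the flag write. */
static void publish(struct device_ext *ext, int value)
{
    ext->payload = value;
    atomic_store_explicit(&ext->ready, true, memory_order_release);
}

/* Reader: load the flag with acquire ordering; the payload is read
 * only once the flag has been observed as set. */
static bool try_consume(struct device_ext *ext, int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ext->ready, memory_order_acquire))
        return false;
    *out = ext->payload;
    return true;
}
```

The release store pairs with the acquire load: a reader that sees the flag set is also guaranteed to see the payload written before it. Note that nothing here is tied to the moment the writer's function returns.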

Usually, "before the function returns" is not particularly important, since
multiple threads could be inside the function at the same time.

Any time you have a piece of data that can be read and written in different
threads, you need to worry about interlocking. However, KeMemoryBarrier
really applies only in very special situations. On Intel processors, a
write will always complete before a read from the same address.
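
As an illustration of the interlocking mentioned above, a shared counter is the usual case: a plain increment is a read-modify-write and can lose updates when two threads race. The sketch below uses C11 atomic_fetch_add as a portable analogue of an interlocked increment; the function name interlocked_increment is hypothetical and is not the Windows API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: atomically adds 1 to *counter and returns the
 * new value. atomic_fetch_add returns the previous value, so 1 is
 * added to the result before returning it. */
static long interlocked_increment(atomic_long *counter)
{
    assert(counter != NULL);
    return atomic_fetch_add(counter, 1) + 1;
}
```

Because the read, the add and the write happen as one indivisible operation, concurrent callers each get a distinct result.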

I found out about memory ordering a little while ago, and I've heard
the theories about how out-of-order loads and stores can cause
problems, but when I look at a lot of code I don't see as many
barriers as I'd expect and I wonder why it works. I often just see
plain old C assignment statements, and I wonder if it happens to work
either because current CPUs order their memory operations or because
there are already so many implicit barriers from other intervening
operations.

Right. Intel CPUs do it for you. There are certainly lots of chips that
don't; if you ever do embedded programming, you'll have to learn the proper
habits...
--
- Tim Roberts, timr@xxxxxxxxx
Providenza & Boekelheide, Inc.
