Re: volatile char


"Dan Baker" <dbmail> wrote in message
news:utZQvRXgFHA.1048@xxxxxxxxxxxxxxxxxxxx
> Useless? What about the following case? There are several worker
> threads, working away. They don't have much interaction with the
> main thread. But there are times (like application shutdown) when
> the main thread simply wants all worker threads to stop. I have
> worker thread code like the following:
>
> class CMyThread
> {
> ...
>     // Set to TRUE by master thread to abort this worker thread
>     volatile BOOL m_bRequestToStop;
>     void run();
> };
>
> void CMyThread::run()
> {
>     while (!m_bRequestToStop)
>     {
>         ... perform calculations here ...
>     }
> }

"volatile" here just makes your program more likely to appear to work.
It exploits a particular feature of the x86 CPU family: strong cache
coherence. As soon as one CPU writes to a memory location, the caches of
all the other CPUs are forcibly updated with the new value. This is one
of the reasons x86 does not scale beyond 4-way SMP systems: cache sync
traffic grows quadratically with the number of CPUs.

Many modern CPUs, some of which can run Windows, exhibit weak cache
coherence. Under this model, even after one CPU writes to a memory
location, the other CPUs may observe old values when reading that
location (if the location happens to be in the reading CPU's cache) for
an indeterminate (though finite) amount of time. A special machine
instruction called a "memory barrier" forces the CPU to synchronize its
cache contents with the main memory.
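
To make the barrier idea concrete, here is a minimal sketch in portable
C++11. It is an assumption on my part, not something from the code under
discussion: std::atomic_thread_fence is the language-level way to request
a barrier, and the names payload, ready, producer, consumer and
publishAndRead are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // ordinary, non-atomic data (hypothetical)
std::atomic<bool> ready{false};  // flag used to publish the payload

void producer() {
    payload = 42;
    // Release fence: the write to payload is ordered before the flag store.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

int consumer() {
    // Spin until the flag becomes visible to this thread.
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the read of payload is ordered after the flag load.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;
}

// Runs one producer and one consumer and returns what the consumer saw.
int publishAndRead() {
    payload = 0;
    ready.store(false);
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    assert(seen == 42);  // the fences guarantee the payload is visible
    return seen;
}
```

Without the two fences, a reader on a weakly ordered machine could see
the flag set and still read a stale payload.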

So, on such a system, once the main thread sets m_bRequestToStop to
true, the other threads may continue observing the value of false for
some time (though not indefinitely, so they will break out of the loop
sooner or later). Well, I guess this is not the end of the world.

Still, the correct way to implement this is one of the following:

1. Use InterlockedIncrement or InterlockedExchange to bump the flag from
0 to 1 in the main thread; use InterlockedCompareExchange in the threads
to check the value of the flag.

2. Set and read the flag under a critical section.

3. Use an event (see CreateEvent) instead of a boolean flag.
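
For illustration, the first option might look roughly like the sketch
below. Note the substitution: it uses C++11 std::atomic<bool> as a
portable stand-in for the Interlocked* functions, so it is not the Win32
code itself. The Worker class and runAndStopWorker are hypothetical names
standing in for CMyThread and the main thread's shutdown path.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the CMyThread class quoted above.
class Worker {
public:
    // Called by the main thread to ask the worker to stop.
    void requestStop() { stop_.store(true, std::memory_order_release); }

    // Worker loop: checks the flag between units of work.
    void run() {
        while (!stop_.load(std::memory_order_acquire)) {
            ++iterations_;  // placeholder for "perform calculations here"
            std::this_thread::yield();
        }
    }

private:
    std::atomic<bool> stop_{false};
    unsigned long iterations_ = 0;  // touched only by the worker thread
};

// Starts a worker, lets it run briefly, then stops and joins it.
bool runAndStopWorker() {
    Worker w;
    std::thread t(&Worker::run, &w);
    std::this_thread::sleep_for(std::chrono::milliseconds(10));
    w.requestStop();
    t.join();  // returns only after run() has observed the flag
    assert(!t.joinable());
    return true;
}
```

The atomic store and load play the roles of InterlockedExchange and
InterlockedCompareExchange respectively: both the write and the read go
through an operation that carries the required barrier, so no volatile
is needed.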
--
With best wishes,
Igor Tandetnik

With sufficient thrust, pigs fly just fine. However, this is not
necessarily a good idea. It is hard to be sure where they are going to
land, and it could be dangerous sitting under them as they fly
overhead. -- RFC 1925

