Re: volatile and win32 multithreading
From: Jerry Coffin (jcoffin_at_taeus.us)
Date: 01/26/05
Date: Tue, 25 Jan 2005 20:52:53 -0700
In article <esk28GxAFHA.2712@TK2MSFTNGP15.phx.gbl>,
MikeAThon2000@nospam.hotmail.com says...
[ ... ]
> Thanks for your thorough reply. I have seen some of those comments before
> (guess where <g!>) and I admit that I still do not fully understand all of
> the points you make.
Perhaps it would help to step back for a moment.
Volatile makes the compiler produce code that reads from memory when
the value is read and writes to memory when the value is written. The
problems arise primarily due to caching: when the code attempts to
read or write memory, it will typically only REALLY read from/write
to the cache. What goes into the cache normally doesn't get written
out to memory right away.
In fact, the processor typically attempts to keep things in the cache
as long as it can. It will flush something out to memory only when it
needs to. The primary reason is that the cache is full, and it needs
to make room to load something new. In this case, it attempts to find
the least recently used line in the right part of the cache, and
flushes it out to memory. There are two problems with this: first of
all, keeping track of the time any given item is used takes too much
space, so it really only takes a guess at which line was least
recently used.
Second, an item at any particular address always gets put into one of
a (fairly small) number of specific places in the cache.
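
To make the "specific places" point concrete, here is a minimal sketch
of how an address could be mapped to a set in a set-associative cache.
The CacheGeometry type, the function names and the sizes in the usage
note are assumptions made up for illustration; real processors differ
in the details.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical cache geometry, chosen only for illustration.
struct CacheGeometry {
    std::size_t line_bytes;  // bytes per cache line
    std::size_t num_sets;    // number of sets in the cache
    std::size_t ways;        // lines per set ("four-way" means 4)
};

// Index of the single set in which an address may be cached.
// Inside that set, the line may occupy any one of g.ways slots.
inline std::size_t set_index(const CacheGeometry& g, std::uintptr_t address) {
    assert(g.line_bytes > 0 && g.num_sets > 0);
    return (address / g.line_bytes) % g.num_sets;
}

// Two addresses compete for the same few slots when their sets match.
inline bool same_set(const CacheGeometry& g,
                     std::uintptr_t a, std::uintptr_t b) {
    return set_index(g, a) == set_index(g, b);
}
```

With an assumed geometry of CacheGeometry{64, 128, 4}, addresses that
are 64 * 128 = 8192 bytes apart land in the same set, so at most four
of them can be cached at once no matter how empty the rest of the
cache is.
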
Assume I have two volatile variables X and Y. I update them in that
order, and since they're volatile, I assume other processors will see
the updates in that order as well. Having finished that, I do
something else that reads from A, B, C and D. Just for the sake of
argument, we'll assume this is a four-way set-associative cache, and
that A, B, C and D all happen to map to the same set of cache lines
as Y did. Since we read A, B and C after we updated Y, when we read
D, Y is now the oldest item in that set, so it gets flushed out to
memory. Meanwhile, we haven't done anything that touches the part
of the cache that X is in, so it hasn't been written out yet, and we
haven't even a good idea of when it will be.
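
The X/Y scenario above can be replayed with a toy model. This is only
a sketch of the simplified picture used in this post: a write-back
cache with exact LRU replacement, where X sits in one four-way set and
Y, A, B, C and D all map to another. The CacheSet class and
replay_example function are hypothetical names; real hardware only
approximates LRU and does more than this model shows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Line {
    std::string name;  // stands in for the address tag
    bool dirty;        // modified in cache, not yet written to memory
};

// One set of a write-back cache with exact LRU replacement (toy model).
class CacheSet {
public:
    explicit CacheSet(std::size_t ways) : ways_(ways) { assert(ways > 0); }

    // Touches `name`. Returns the name of a dirty line written back to
    // memory as a result, or an empty string if nothing was written back.
    std::string access(const std::string& name, bool is_write) {
        for (std::size_t i = 0; i < lines_.size(); ++i) {
            if (lines_[i].name == name) {        // hit: becomes most recent
                Line hit = lines_[i];
                hit.dirty = hit.dirty || is_write;
                lines_.erase(lines_.begin() + i);
                lines_.push_back(hit);
                return "";
            }
        }
        std::string written_back;
        if (lines_.size() == ways_) {            // miss in a full set
            if (lines_.front().dirty) written_back = lines_.front().name;
            lines_.erase(lines_.begin());        // evict least recently used
        }
        lines_.push_back(Line{name, is_write});
        return written_back;
    }

    // True if `name` is still cached with an unwritten modification.
    bool holds_dirty(const std::string& name) const {
        for (const Line& l : lines_)
            if (l.name == name) return l.dirty;
        return false;
    }

private:
    std::size_t ways_;
    std::vector<Line> lines_;  // front = least recently used
};

// Replays: write X, write Y, read A, B, C, D.
// Returns the names of lines in the order they reached memory.
std::vector<std::string> replay_example(CacheSet& set_of_x,
                                        CacheSet& set_of_y) {
    std::vector<std::string> memory_order;
    auto note = [&](const std::string& wb) {
        if (!wb.empty()) memory_order.push_back(wb);
    };
    note(set_of_x.access("X", true));
    note(set_of_y.access("Y", true));
    for (const char* n : {"A", "B", "C", "D"})
        note(set_of_y.access(n, false));
    return memory_order;
}
```

Calling replay_example with two CacheSet objects built with 4 ways
returns a list containing only Y: in this model the later write
reaches memory first, while X is still sitting dirty in its own set.
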
Worse: memory is already a bottleneck in most situations. When you
add more processors, writing lots of updates out to memory becomes
even more of a bottleneck. Therefore, the more processors you want to
support, the harder you work at keeping things in caches, and the
more you (usually) relax how up-to-date you keep memory. Now that
processor clock speeds aren't going up constantly like they used to,
we can expect nearly all machines to start to have more and more
processors. Code that can't take advantage of them will be considered
poor, and code that works incorrectly on an MP machine will become
essentially unusable.
--
Later,
Jerry.
The universe is a figment of its own imagination.