Re: Is the following code MT-Safe?
From: Joseph M. Newcomer (newcomer_at_flounder.com)
Date: 06/30/04
- Next message: Joseph M. Newcomer: "Re: Sandard dll"
- Previous message: GuitarBill: "Re: Sharing a socket between threads?"
- In reply to: Ken Durden: "Is the following code MT-Safe?"
- Next in thread: Trevor: "Re: Is the following code MT-Safe?"
- Reply: Trevor: "Re: Is the following code MT-Safe?"
- Reply: Ken Durden: "Re: Is the following code MT-Safe?"
- Messages sorted by: [ date ] [ thread ]
Date: Wed, 30 Jun 2004 14:35:28 -0400
It strikes me as overkill. See below.
On 30 Jun 2004 10:06:40 -0700, creepiecrawlies@hotmail.com (Ken Durden) wrote:
>Explanation follows code example, MFC synchronization objects used
>because thats what I know how to use. Some psuedo-code may be present
>where coding in all the parameters wasn't meaningful.
>
>class A
>{
>public:
> A() :
> _bRunning(false),
> _stopEvent(FALSE,FALSE),
> _stoppedEvent(FALSE,FALSE)
> {}
>
> void start()
> {
> CSingleLock( &crit, TRUE );
>
> AfxBeginThread(ThreadFunc, (LPVOID) this, ... );
****
There is no reason to lock the critical section before starting the thread. I see nothing
here that accesses the variables of the thread within the scope of the critical section.
****
> }
>
> void stop()
> {
> CSingleLock( &crit, TRUE );
>
> _stoppedEvent.Reset();
> _stopEvent.Set();
>
> ::WaitForSingleObject( _stoppedEvent, INFINITE );
****
This is actually dangerous. There is no reason to set the critical section, since nothing
in here manipulates shared variables. And the lock won't be released until this exits.
But the bug is deeper; see the comment below.
****
> }
>
> void test()
> {
> CSingleLock( &crit, TRUE );
>
> if( _bRunning )
> stop();
>
> assert( !_bRunning );
> }
*****
There is no need to set the critical section here, since it makes it IMPOSSIBLE to stop
the thread. As long as the thread is running, the ciritcal section is locked, so the
stop() function will block (if called from another thread), and not be able to stop the
thread. However, on the call to stop(), it attempts to re-acquire the critical section
that the main thread already locked, so it passes. This is risky, because it means you
don't actually have a guarantee about lockout, and this leads to potential disasters.
*****
>
>private:
> static UINT ThreadFunc( LPVOID pParam )
> {
> A * a = (A*) pParam;
>
> a->_bRunning = true;
> ::WaitForSingleObject( a->_stopEvent, INFINITE );
> a->_bRunning = false;
>
> a->_stoppedEvent.Set();
> }
*****
I presume there is intended to be more to this thread than what I see here. All that
happens is that this thread blocks until someone executes stop(), Then it sets an event
that says "I have successfully stopped". Which you are also waiting on. I believe that
following this logic, the assert (which is a fundamental design error: NEVER use 'assert'
in a program, because if the assertion fails, the program exits. A program should exit
only on user command, never spontaneously) will not be taken.
There is a
****
>
> CCriticalSection crit;
> CEvent _stopEvent;
> CEvent _stoppedEvent;
> bool _bRunning;
>};
>
>int main()
>{
> A a;
> a.start();
> a.test();
>}
>
>In this simplistic example, assume the only two threads you see are
>the only two which are relevant. In this simple example, am I
>guaranteed that the assert in the test function will not fire?
*****
There is no guarantee. Follow the logic. You have _stopped set FALSE. You start the thread
in a.start(). You then call a.test(), and the value is still FALSE. So test does not call
stop(). It then falls into the next line, which asserts that !m_bRunning. Fine, except
that right before that test was executed, the timeslice of the thread that called test()
ran out, and then, and ONLY then, is the thread permitted to run. It starts up, sets the
running flag to true, then blocks on the stop event (which will never be set). This allows
the first thread to resume, which then takes the assert.
Anything this complex to reason about is almost certainly wrong, and there is far too much
unnecessary synchronization going on here. Get rid of all of it, and rethink how you would
handle the notifications. My own preference is to use I/O Completion Ports as event queues
(when I don't have a GUI, and often when I do). You are trying to force threads into
lockstep behavior, which is always risky and error-prone, and in this case you have a
serious defect. Fall back to a message-driven model and most of these complexities
evaporate. This sort of make-my-parallel-problem-look-single-threaded solution is a common
error, and almost always leads to either major synchronization failures (as in this case)
or deadlocks. I try to avoid it these days.
*****
>
>I'm worried that the main thread will read the value of _bRunning in
>the test() function, and then when it comes time to do the assert,
>simply re-use the existing value it has already read, rather than
>going back to main memory to get the new value which was set by the
>other thread. Is this a valid concern?
*****
Well, to make sure this doesn't happen, I would add volatile to the declaration, but that
would not solve the fundamental error in logic here. The error is so deep that worrying
about memory access is not even on the radar.
Rewrite the code using a better mechanism. This code cannot be made to work as it is, and
it is extremely clunky, and even prone to complete deadlock if the second thread fails to
terminate cleanly. The critical section serves no useful purpose at all, and is a
potential hazard. The events have serious timing problems in what you are managing.
Two things to remember about AfxBeginThread()
> By the time you return from it the thread could have already finished
> The thread may not start for tens or hundreds of milliseconds after
you return
Which of these applies will depend on whether you have a multiprocessor, a hyperthreaded
processor, or a heavy system load. In your case the second one was violated. I did not
work out the logic to see if the first could apply, because there is no point in looking
for a second bug when this one is a killer.
*****
>
>Does _bRunning need to be tagged as volatile?
*****
It doesn't hurt. But the problem is so deep already that it is not even worth worrying
about
*****
>
>My understanding from school and reading is that it does, but that
>supposition blows so many holes in the way most MT programs I've seen
>are written its scary. For example, here's another example.. even
>simpler:
>
>class B
>{
> public:
> B() : m_pList(NULL) {}
>
> void add( int n )
> {
> CSingleLock( &crit, TRUE );
>
> if( m_pList == NULL )
> m_pList = new std::list<int>();
>
> m_pList->push_back( n );
> }
>
> int size()
> {
> CSingleLock( &crit, TRUE );
>
> if( m_pList == NULL )
> return 0;
>
> return m_pList->size();
> }
>
> private:
> CCriticalSection crit;
> std::list<int> * m_pList;
>};
>
>In this example, does m_pList need to be declared as
>std::list<int> * volatile m_pList
****
volatile is unrelated to synchronization. Synchronization keeps two threads from touching
the same object at the same time. volatile keeps the compiler from optimizing accesses and
caching intermediate results. We tend not to see volatile-related errors because the
Microsoft C compilers are far more conservative than other C compilers I've worked on,
where volatile was essential for correctness. Note that the C language would permit the
compiler to cache values, but only during the execution of a function; the values would be
recomputed on the next call. It just happens that the Microsoft compiler will generate
thread-safe optimizations when many other compilers will not.
****
>
>Suppose I have a dozen threads all competing to add items to this
>list, without volatile, is there a possibility that one thread will
>have a "cached" value of m_pList which is out of date? If the threads
>only called add(), it seems that there is no way for the NULL value of
>m_pList to be cached in a register; what if the threads also called
>size() prior to calling add, however, would this introduce a
>possibility for the pointer value to be stored in a register?
*****
Apparently not in the Microsoft C compiler. Note also that the structure of compilers is
such that these cached values are only retained during the execution of a function, and
not across invocations. Because of the explicit synchronization, you cannot have two
functions accessing the variables at the same time, and in fact the function blocks before
it would have the opportunity to fetch a value which could be made stale. In the presence
of optimizations, code motion outside loops, etc., where the synchronization primitives,
since they appear as ordinary function calls, might allow code motions around them (I've
worked on one such compiler, years ago, and we had to manually force volatile semantics
because the notion of the keyword "volatile" had not been invented yet), could lead to
problems. But in the examples you show, there can be no caching of intermediate results.
******
>
>Given the relatively small set of registers on Intel processors, how
>big an issue is volatile on that platform? What effect does locality
>of referencing have? If I reference the same variable 30 C lines apart
>from each other does that make a difference from accessing it 2 lines
>apart? Do non-inlined function-calls change this behavior?
****
It has nothng to do with the size of the register set. Locality of reference is
irrelevant; that is handled by caching, which has guaranteed correct semantics even in a
multiprocessor.
The distance of the reference is completely irrelevant. Compilers don't care about
distance, they only care about intermediate value lifetimes, which are computed by flow
analysis. While increasing distance increases the probability that another computation
could invalidate the value, thus shortening its lifetime, the last line of a thousand-line
function could just as easily reuse a common subexpression computed on the first line. A
compiler performs a number of tradeoff metrics in computing temporary lifetimes, including
the cost of maintaining the value across lines of code (would the intervening code get
larger as a consequence? If optimizing for size, then the cse would be discarded and
recomputed because even with the recomputation the code would be smaller).
*****
>
>What are the general guidelines for using volatile? Even if I've got
>my locks and events all lined up perfectly, it seems I can still get
>burnt without it. I've got a beefed up scenario which models class A
>(the first one), I'm going to add volatile to see if I can change the
>behavior, but its (dread) low-frequency intermittent problem, so I
>won't know for a month if it helped probably.
****
First, learn how to use locks correctly. I see no valid use of locks in the above example.
I see a potential deadlock, and a place where the lock provides no synchronization at all,
and I don't see any way that other forms of lock,ing, or the use of volatile, can salvage
this code.
First, fix the code. Then, you only need to worry about volatile when you have common
subexpressions in contexts where code motions can do you in, and a compiler less
conservative than Microsoft C6 and C7 appear to be. Remember that Microsoft is still free
to change the implementation of the compiler as long as it remains consistent with the C
standard, and the C standard does not require sequencing points produce consistent results
in non-volatile variables (see the C99 standard). The earlier standards were vague, but
the C99 standard really tightens up the specification and gives compiler writers a lot
more freedom to cache cses.
*****
>
>Thanks alot,
>-Ken
Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
- Next message: Joseph M. Newcomer: "Re: Sandard dll"
- Previous message: GuitarBill: "Re: Sharing a socket between threads?"
- In reply to: Ken Durden: "Is the following code MT-Safe?"
- Next in thread: Trevor: "Re: Is the following code MT-Safe?"
- Reply: Trevor: "Re: Is the following code MT-Safe?"
- Reply: Ken Durden: "Re: Is the following code MT-Safe?"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|