Re: Is the following code MT-Safe?

From: Trevor (trevor_at_spam.com)
Date: 06/30/04


Date: Wed, 30 Jun 2004 17:25:32 -0400

Joe,

How is a tool designed to help the programmer spot errors during debug
builds a bad thing? My asserts pop up a message box asking if I want to
ignore, break or halt the program. These only show up in a debug build
too. How is that bad?

Curious
Trevor

Joseph M. Newcomer wrote:

> It strikes me as overkill. See below.
> On 30 Jun 2004 10:06:40 -0700, creepiecrawlies@hotmail.com (Ken Durden) wrote:
>
>
>>Explanation follows code example, MFC synchronization objects used
>>because thats what I know how to use. Some psuedo-code may be present
>>where coding in all the parameters wasn't meaningful.
>>
>>class A
>>{
>>public:
>> A() :
>> _bRunning(false),
>> _stopEvent(FALSE,FALSE),
>> _stoppedEvent(FALSE,FALSE)
>> {}
>>
>> void start()
>> {
>> CSingleLock( &crit, TRUE );
>>
>> AfxBeginThread(ThreadFunc, (LPVOID) this, ... );
>
> ****
> There is no reason to lock the critical section before starting the thread. I see nothing
> here that accesses the variables of the thread within the scope of the critical section.
> ****
>
>> }
>>
>> void stop()
>> {
>> CSingleLock( &crit, TRUE );
>>
>> _stoppedEvent.Reset();
>> _stopEvent.Set();
>>
>> ::WaitForSingleObject( _stoppedEvent, INFINITE );
>
> ****
> This is actually dangerous. There is no reason to set the critical section, since nothing
> in here manipulates shared variables. And the lock won't be released until this exits.
> But the bug is deeper; see the comment below.
> ****
>
>> }
>>
>> void test()
>> {
>> CSingleLock( &crit, TRUE );
>>
>> if( _bRunning )
>> stop();
>>
>> assert( !_bRunning );
>> }
>
> *****
> There is no need to set the critical section here, since it makes it IMPOSSIBLE to stop
> the thread. As long as the thread is running, the ciritcal section is locked, so the
> stop() function will block (if called from another thread), and not be able to stop the
> thread. However, on the call to stop(), it attempts to re-acquire the critical section
> that the main thread already locked, so it passes. This is risky, because it means you
> don't actually have a guarantee about lockout, and this leads to potential disasters.
> *****
>
>>private:
>> static UINT ThreadFunc( LPVOID pParam )
>> {
>> A * a = (A*) pParam;
>>
>> a->_bRunning = true;
>> ::WaitForSingleObject( a->_stopEvent, INFINITE );
>> a->_bRunning = false;
>>
>> a->_stoppedEvent.Set();
>> }
>
> *****
> I presume there is intended to be more to this thread than what I see here. All that
> happens is that this thread blocks until someone executes stop(), Then it sets an event
> that says "I have successfully stopped". Which you are also waiting on. I believe that
> following this logic, the assert (which is a fundamental design error: NEVER use 'assert'
> in a program, because if the assertion fails, the program exits. A program should exit
> only on user command, never spontaneously) will not be taken.
>
> There is a
> ****
>
>> CCriticalSection crit;
>> CEvent _stopEvent;
>> CEvent _stoppedEvent;
>> bool _bRunning;
>>};
>>
>>int main()
>>{
>> A a;
>> a.start();
>> a.test();
>>}
>>
>>In this simplistic example, assume the only two threads you see are
>>the only two which are relevant. In this simple example, am I
>>guaranteed that the assert in the test function will not fire?
>
> *****
> There is no guarantee. Follow the logic. You have _stopped set FALSE. You start the thread
> in a.start(). You then call a.test(), and the value is still FALSE. So test does not call
> stop(). It then falls into the next line, which asserts that !m_bRunning. Fine, except
> that right before that test was executed, the timeslice of the thread that called test()
> ran out, and then, and ONLY then, is the thread permitted to run. It starts up, sets the
> running flag to true, then blocks on the stop event (which will never be set). This allows
> the first thread to resume, which then takes the assert.
>
> Anything this complex to reason about is almost certainly wrong, and there is far too much
> unnecessary synchronization going on here. Get rid of all of it, and rethink how you would
> handle the notifications. My own preference is to use I/O Completion Ports as event queues
> (when I don't have a GUI, and often when I do). You are trying to force threads into
> lockstep behavior, which is always risky and error-prone, and in this case you have a
> serious defect. Fall back to a message-driven model and most of these complexities
> evaporate. This sort of make-my-parallel-problem-look-single-threaded solution is a common
> error, and almost always leads to either major synchronization failures (as in this case)
> or deadlocks. I try to avoid it these days.
> *****
>
>
>>I'm worried that the main thread will read the value of _bRunning in
>>the test() function, and then when it comes time to do the assert,
>>simply re-use the existing value it has already read, rather than
>>going back to main memory to get the new value which was set by the
>>other thread. Is this a valid concern?
>
> *****
> Well, to make sure this doesn't happen, I would add volatile to the declaration, but that
> would not solve the fundamental error in logic here. The error is so deep that worrying
> about memory access is not even on the radar.
>
> Rewrite the code using a better mechanism. This code cannot be made to work as it is, and
> it is extremely clunky, and even prone to complete deadlock if the second thread fails to
> terminate cleanly. The critical section serves no useful purpose at all, and is a
> potential hazard. The events have serious timing problems in what you are managing.
>
> Two things to remember about AfxBeginThread()
> > By the time you return from it the thread could have already finished
> > The thread may not start for tens or hundreds of milliseconds after
> you return
>
> Which of these applies will depend on whether you have a multiprocessor, a hyperthreaded
> processor, or a heavy system load. In your case the second one was violated. I did not
> work out the logic to see if the first could apply, because there is no point in looking
> for a second bug when this one is a killer.
> *****
>
>>Does _bRunning need to be tagged as volatile?
>
> *****
> It doesn't hurt. But the problem is so deep already that it is not even worth worrying
> about
> *****
>
>>My understanding from school and reading is that it does, but that
>>supposition blows so many holes in the way most MT programs I've seen
>>are written its scary. For example, here's another example.. even
>>simpler:
>>
>>class B
>>{
>> public:
>> B() : m_pList(NULL) {}
>>
>> void add( int n )
>> {
>> CSingleLock( &crit, TRUE );
>>
>> if( m_pList == NULL )
>> m_pList = new std::list<int>();
>>
>> m_pList->push_back( n );
>> }
>>
>> int size()
>> {
>> CSingleLock( &crit, TRUE );
>>
>> if( m_pList == NULL )
>> return 0;
>>
>> return m_pList->size();
>> }
>>
>> private:
>> CCriticalSection crit;
>> std::list<int> * m_pList;
>>};
>>
>>In this example, does m_pList need to be declared as
>>std::list<int> * volatile m_pList
>
> ****
> volatile is unrelated to synchronization. Synchronization keeps two threads from touching
> the same object at the same time. volatile keeps the compiler from optimizing accesses and
> caching intermediate results. We tend not to see volatile-related errors because the
> Microsoft C compilers are far more conservative than other C compilers I've worked on,
> where volatile was essential for correctness. Note that the C language would permit the
> compiler to cache values, but only during the execution of a function; the values would be
> recomputed on the next call. It just happens that the Microsoft compiler will generate
> thread-safe optimizations when many other compilers will not.
> ****
>
>>Suppose I have a dozen threads all competing to add items to this
>>list, without volatile, is there a possibility that one thread will
>>have a "cached" value of m_pList which is out of date? If the threads
>>only called add(), it seems that there is no way for the NULL value of
>>m_pList to be cached in a register; what if the threads also called
>>size() prior to calling add, however, would this introduce a
>>possibility for the pointer value to be stored in a register?
>
> *****
> Apparently not in the Microsoft C compiler. Note also that the structure of compilers is
> such that these cached values are only retained during the execution of a function, and
> not across invocations. Because of the explicit synchronization, you cannot have two
> functions accessing the variables at the same time, and in fact the function blocks before
> it would have the opportunity to fetch a value which could be made stale. In the presence
> of optimizations, code motion outside loops, etc., where the synchronization primitives,
> since they appear as ordinary function calls, might allow code motions around them (I've
> worked on one such compiler, years ago, and we had to manually force volatile semantics
> because the notion of the keyword "volatile" had not been invented yet), could lead to
> problems. But in the examples you show, there can be no caching of intermediate results.
> ******
>
>>Given the relatively small set of registers on Intel processors, how
>>big an issue is volatile on that platform? What effect does locality
>>of referencing have? If I reference the same variable 30 C lines apart
>
>>from each other does that make a difference from accessing it 2 lines
>
>>apart? Do non-inlined function-calls change this behavior?
>
> ****
> It has nothng to do with the size of the register set. Locality of reference is
> irrelevant; that is handled by caching, which has guaranteed correct semantics even in a
> multiprocessor.
>
> The distance of the reference is completely irrelevant. Compilers don't care about
> distance, they only care about intermediate value lifetimes, which are computed by flow
> analysis. While increasing distance increases the probability that another computation
> could invalidate the value, thus shortening its lifetime, the last line of a thousand-line
> function could just as easily reuse a common subexpression computed on the first line. A
> compiler performs a number of tradeoff metrics in computing temporary lifetimes, including
> the cost of maintaining the value across lines of code (would the intervening code get
> larger as a consequence? If optimizing for size, then the cse would be discarded and
> recomputed because even with the recomputation the code would be smaller).
> *****
>
>>What are the general guidelines for using volatile? Even if I've got
>>my locks and events all lined up perfectly, it seems I can still get
>>burnt without it. I've got a beefed up scenario which models class A
>>(the first one), I'm going to add volatile to see if I can change the
>>behavior, but its (dread) low-frequency intermittent problem, so I
>>won't know for a month if it helped probably.
>
> ****
> First, learn how to use locks correctly. I see no valid use of locks in the above example.
> I see a potential deadlock, and a place where the lock provides no synchronization at all,
> and I don't see any way that other forms of lock,ing, or the use of volatile, can salvage
> this code.
>
> First, fix the code. Then, you only need to worry about volatile when you have common
> subexpressions in contexts where code motions can do you in, and a compiler less
> conservative than Microsoft C6 and C7 appear to be. Remember that Microsoft is still free
> to change the implementation of the compiler as long as it remains consistent with the C
> standard, and the C standard does not require sequencing points produce consistent results
> in non-volatile variables (see the C99 standard). The earlier standards were vague, but
> the C99 standard really tightens up the specification and gives compiler writers a lot
> more freedom to cache cses.
> *****
>
>>Thanks alot,
>>-Ken
>
>
> Joseph M. Newcomer [MVP]
> email: newcomer@flounder.com
> Web: http://www.flounder.com
> MVP Tips: http://www.flounder.com/mvp_tips.htm



Relevant Pages

  • Re: Is the following code MT-Safe?
    ... the assert (which is a fundamental design error: ... and almost always leads to either major synchronization failures ... >Does _bRunning need to be tagged as volatile? ... compiler to cache values, but only during the execution of a function; ...
    (microsoft.public.vc.mfc)
  • Re: Share .cpp and .h along projects
    ... pvector = InterlockedExchangePointerAcquire(&g_sharedVector, ... I really would like to hear what you think volatile accomplishes ... You can't require people to use volatile on top of synchronization. ... compiler useful for multithreaded programming. ...
    (microsoft.public.vc.language)
  • Re: Is the following code MT-Safe?
    ... I meant his use of synchronization. ... build is called a "compiler". ... ASSERT, the MFC macro, does what you describe. ... >> volatile is unrelated to synchronization. ...
    (microsoft.public.vc.mfc)
  • Re: Share .cpp and .h along projects
    ... when memory barriers are not used, volatile exactly prevents the ... source files between DLLs, changing optimization settings, etc. ... will be correct on any version of the C++ compiler (even non-Microsoft ... You can't require people to use volatile on top of synchronization. ...
    (microsoft.public.vc.language)
  • Re: Volatile + multithreading
    ... The volatile keyword has little use in multithreaded programming. ... > field or global seems to have no impact because it seems the compiler will ... operations as "special" and suppress optimizations around them. ... > majority of the multithreading issues that could have resulted from ...
    (microsoft.public.vc.language)