Re: CMutex /CEvent (multiple threads)



Comment below...
On Fri, 25 Sep 2009 00:41:47 -0700 (PDT), Goran <goran.pusic@xxxxxxxxx> wrote:

The problem is that scoped locking allows bad code to give the illusion of being correct,
whereas explicit lock/unlock at least forces bad code to fail in ways less likely to
propagate damage.  The one thing Java got right and C# got wrong was the way the compiler
deals with exception detection.  If your function does not handle an exception in Java,
and you call a method that can throw an exception, the compiler tells you that you have an
error.  In C# (and of course, in C++ in general) this does not happen.  I think this is
bad design.

Wow, that's a surprise! (Thinking that Java has it right, as opposed
to... well, others, in mainstream). I think you'd be surprised to hear
that there's quite a few Java people who seriously dislike Java
exception specifications, and think that language design decision was
an error. And there's at least one very important Java library
(Spring) that just does everything through RuntimeException.

I think it is essential that any error such as an access fault or divide by zero should be
turned into a clean C++ exception throw.  We should not have the weird mix of SEH and C++
Exceptions that we seem to live with now.

For purpose of rigor, that idea it's IMO a mixed bag. VCL of Delphi
and C++Builder does that. Perhaps. It should be noted that, once we
step onto an SE, code already dropped the ball, and that, at an
unknown time in the past. That moment is quite recent only if we are
lucky.

No.  What I'm saying is that if a mutex is abandoned, it is essential that we know exactly
that this is what happened.  I am insisting that the MFC library suppor the design of the
Mutex, not some weird abstraction that is a subset of the Mutex, because the original
designer did not understand what a Mutex was and that the notion that Lock could return
only TWO values was a deep and fundamental design flaw.  Because of this design flaw, the
implementation disguises the correct behavior, which is to distinguish the two cases,
"Lock acquired normally" and "Lock acquired after abandonment".  Note that you cannot
actually use virtual functions to correct this defect because if you override
CSyncObject::Lock in your derived mutex class, it still has to return TRUE or FALSE.  So
you need an out-of-band communication mechanism, such as using SetLastError to indicate
the state actually represented by "TRUE".  Note that (a) none of this is obvious to far
too many programmers, who can barely grasp the concept of synchronization and (b) because
none of this is discussed, they naively pretend they are working with an abstraction that
is a correct reflection of reality.  Don't protect me from information I need!  And don't
try to "simplify" an interface that actually NEEDS to be complex enough to represent the
underlying semantics of the object!

As a fan of exceptions, I must disagree. CSyncObject::Lock should IMO
work like this:

1. when lock is acquired, returns true
2. when timeout expires, returns false
3. anything else provokes an exception.
****
Actually, this solves the problem of out-of-band communication I alluded to. Now, if the
CMutex class been done right, we wouldn't be having this debate. So here's a design for a
CMutex class that makes sense (alas, this is what SHOULD have been done in the original
design, had the original designer not be handicapped by being (a) clueless and (b) having
a really-badly-broken C++ compiler at the time this design was done [exceptions didn't
work well in the 4.x compilers and earlier])

The implementation could be

class CSyncException : public CException {
public:
CSyncException(DWORD r) { reason = r; }
DWORD GetExceptionReason() { return reason; }
protected:
DWORD reason;
};

class CSyncFailed : public CSyncException {
public:
CSyncFailed : CSyncException(WAIT_FAILED) {}
};

class CSyncAbandoned : public CSyncException {
public
CSyncAbandoned : CSyncException(WAIT_ABANDONED) {}
};

(and, frankly, I don't see why the function should not be
void Lock()
and we throw a CSyncTimeout exception on timeout!)

class CSyncTImeout : public CSyncException {
public:
CSyncTimeout : CSyncException(WAIT_TIMEOUT) {}
};

and in the Lock() code it would look something like

DWORD result = WFSO(mutex, timeout);
switch(result)
{
case WAIT_TIMEOUT:
throw new CSyncTimeout
case WAIT_OBJECT_0:
return;
case WAIT_FAILED:
throw new CSyncFailed;
case WAIT_ABANDONED:
throw new CSyncAbandoned;
default:
throw new CSyncException(result);
}



So now my code looks like

{
CMutex lock;
... stuff
lock.Lock();
...synchronized stuff
// unlock not needed
}
note I have no explicit Unlock because I rely on the destructor CMutex::~CMutex to unlock.
So I can legitimately write things like
if(condition)
return;
inside the body of the ...synchronized stuff and the mutex willl be unlocked. Sounds
clean and elegant. But it has nasty implications.

* If, at the point of return, the data structure I was locking is consistent, I win.
* If, at the point of return, the data structure I was locking is inconsistent, it is
unlocked anyway and I lose big, because everyone who is waiting eventually gets a "you
have acquired the lock successfully" notification.
* If the ...synchronized stuff calls some method which throws an exception, I *might* be
left in an inconsistent state, but fail to notice, because I wasn't catching the exception
locally and therefore did not detect that it had happened.
* If the Lock() throws an exception because WAIT_FAILED or WAIT_TIMEOUT, I have not
changed the data and it is consistent (we will assume the data is consistent upon entry to
this function)
* If the Lock() throws an exception because of WAIT_ABANDONED, *someone else* has
potentially changed the data, and I have no guarantee of its integrity.

Note that last condition: I actually own the lock, but the data is potentially corrupted.
So first, I have to distinguish the case where I have WAIT_FAILED (data is safe) and
WAIT_ABANDONED (data may be corrupt), and then I have to deal with releasing the mutex.
OK, suppose I had written

CMutex lock;
try {
lock.Lock();
...synchronized stuff
return;
}
catch(CSyncTImeout * e)
{ /* timeout */
...do something if appropriate
e->Delete();
...probably tell our caller we failed
} /* timeout */
catch(CSyncFailed * e)
{ /* failed */
...do something if appropriate
e->Delete();
...probably tell our caller we failed
} /* failed */
catch(CSyncAbandoned * e)
{ /* abandoned */
...roll back transaction
e->Delete();
...probably tell our caller we failed
} /* abandoned */
catch(CSyncException * e)
{ /* other sync error */
...do something if appropriate
e->Delete();
....probably tell our caller we failed
} /* other sync error */
catch(...)
{ /* anything else */
...roll back transaction
throw;
} /* anything else */

What do I do under WAIT_ABANDONED? Actually, what I *want* to do is to say "signal all
threads that are waiting on this mutex that we are screwed" but there is no mechanism for
that. So the only remaining correct thing to do is to "roll back" the transaction. And
that's what I have to do to maintain robustness. And I win. Problem solved.

BUT (and there's always a BUT) I've actually inherited another problem. Suppose that the
synchronized stuff calls someone who throws an exception. I'm screwed. The exception
goes flying by and I never see it. The data might be inconsistent, but I never catch it
and never get a chance to roll it back.

Worse still, suppose I call a function that does a Lock and doesn't have a handler, and
that lock fails with WAIT_FAILED. All I get is WAIT_FAILED. Naively thinking it is my
lock, I know that this means the data is intact. OK, so I have to do

CMutex lock;
try { /* lock */
lock.Lock();
try { /* synchronized section */
...open transaction
...synchronized stuff
...commit transaction;
return;
} /* synchronized section */
catch(CSyncException * e)
{ /* sync problem */
...roll back transaction
e->Delete();
} /* sync problem */
catch(...)
{
...roll back transaction
throw;
}
} /* lock */
catch(CSyncTImeout * e)
{ /* timeout */
...do something if appropriate
e->Delete();
...probably tell our caller we failed
} /* timeout */
catch(CSyncFailed * e)
{ /* failed */
...do something if appropriate
e->Delete();
...probably tell our caller we failed
} /* failed */
catch(CSyncAbandoned * e)
{ /* abandoned */
...roll back transaction
e->Delete();
...probably tell our caller we failed
} /* abandoned */
catch(CSyncException * e)
{ /* other sync error */
...do something if appropriate
e->Delete();
....probably tell our caller we failed
} /* other sync error */
catch(...)
{ /* anything else */
...roll back transaction
throw;
} /* anything else */

And actually, that is the correct code for doing this. Quite a distance from the current
mythos that

CMutex lock;
if(lock.Lock())
...synchronized stuff

is correct and sufficient

And, by the way, just to simplify the code, I didn't show how to deal with the situations
where the start-transaction, commit, and/or rollback operations can themselves throw
exceptions.

This is why the terminal multiplexor went from a listing 3/4" thick to one 4" thick in a
year. Since every kernel call could throw an exception each location had a try/catch
block that was preserving semantic consistency; every change was treated as a change that
had to be rolled back upon error. If there was some reason the transaction could not be
rolled back (including the fact that kernel calls had created state changes that were too
complex to roll back) my solution, and this was considered the rare, drastic solution, was
to throw a TOTAL_DISASTER exception. The recovery from this was to zero the heap,
complete all pending messages with an error, reconstruct the heap, do my best to restore
the kernel state to a known starting point (and I'm referring only to the kernel state of
this one process; imagine in Windows this might entail CancelIo, for example) and start
the main "message pump" loop all over again (the Tmux had a message loop that was driven
by the message-based I/O system we used)

Now look at the "management problem" here. If I fail to understand the classes of
exceptions I can get from called functions, I can have a disaster. The Java approach was
to recognize this and FORCE the programmer to handle exceptions, either by writing a
handler or stating that their function could pass an unknown exception back. This
meticulousness interferes with the ability of a programmer to dash off
quick-and-dirty-and-incorrect code. Instead, they are forced to write correct code. Wow!
(I have taught Java-based courses and written enough Java to appreciate this. Based on my
experience, I find the notion that the compiler forces me to be correct to be refreshing)
joe


Then, an abnormal condition can be signaled in overrided classes, in a
proper C++ manner, additional failure modes included, without changing
the basic specification, and with possibility to handle any "derived"
special case separately. That's how I would override Lock in CMutex-
derivative to eventually handle WAIT_ABANDONED.

Goran.
Joseph M. Newcomer [MVP]
email: newcomer@xxxxxxxxxxxx
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
.



Relevant Pages

  • Re: commit after select
    ... > turned of the autocimmit in java;) ... > This seems to work but now he also seems to lock the tables when I ... Why do you set autocommit to false if you're only selecting? ... Just set it directly before you start a transaction and switch ...
    (comp.lang.java.databases)
  • =?iso-8859-1?q?D=E9j=E0_vu_bug:_The_Matrix_uses_the_.NET_Framework!?=
    ... encounters a .NET Framework bug. ... intermittent lock leak that was not the kind you find in text book. ... exception is thrown in either A, ... private static void Throw ...
    (microsoft.public.dotnet.framework)
  • =?iso-8859-1?q?Re:_Beyond_Java_Tiger_-_Defekte_und_L=FCcken_in_Java?=
    ... > synchronized-Block erweitert und dann eine Bibliothek für Threads zur ... man kann nicht sicher mehrere locks requiren ... > Java machen kann, wenn man denn will. ... > eine Exception soll abgefangen werden damit ich noch ein CleanUp machen ...
    (de.comp.lang.java)
  • BEGIN TRANSACTION problem
    ... there was no BEGIN TRANSACTION. ... The test prior to the exception is the first to execute the parent ... The is the exception the unit test is expecting. ... Rollback statement faile with "The ROLLBACK TRANSACTION request has no ...
    (microsoft.public.dotnet.framework.adonet)
  • RE: HELP on New request is not allowed to start because [1264822]
    ... allowed to start because it should come with valid transaction descriptor. ... System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean ... TRXOrders.CLogin.GetTradexUserIDFromQube(String QubeUserID, String GroupCode) ... This error always occurs when we issue the command to SQL Server. ...
    (microsoft.public.dotnet.framework.adonet)