Re: output.c error in multithreaded program
From: Joseph M. Newcomer (newcomer_at_flounder.com)
Date: 12/29/04
- Next message: Pierre Couderc: "Re: Programmatically selecting in a ClistCtrl"
- Previous message: Doug Harrison [MVP]: "Re: Programmatically selecting in a ClistCtrl"
- In reply to: Joe: "Re: output.c error in multithreaded program"
- Next in thread: Joe: "Re: output.c error in multithreaded program"
- Reply: Joe: "Re: output.c error in multithreaded program"
- Messages sorted by: [ date ] [ thread ]
Date: Wed, 29 Dec 2004 13:41:50 -0500
See below...
On Mon, 27 Dec 2004 12:27:19 -0000, "Joe" <not@home.com> wrote:
>
>"Joseph M. Newcomer" <newcomer@flounder.com> wrote in message
>news:81dvs0hgpp195fhu2usp9j6pj37i772srt@4ax.com...
>
>
>> >GetHttpFile and all the other internet stuff is basically a cut and paste
>> >from the demo. I think I am getting problems when the site does not reply
>or
>> >a thread has to wait too long it seems to hang. In such a case I *think*
>I
>> >am blocking other threads from internet access as the semaphores are not
>> >getting released. This problem then cascades and I get lockout of the low
>> >priority threads ( I think the type of webpages I give a priority of type
>> >LOW return a lot of info and are prone to delay).
>> *****
>> Generally, it is best to use asynchronous communication; I know there's a
>set of
>> asynchronous HTTP classes lurking somewhere, but I've not used them.
>Otherwise, you get
>> into serious problems of indefinite resource consumption. In this case, a
>given thread
>> could have several outstanding connections, which is quite feasilble if
>you use UI threads
>> (AfxBeginThread(RUNTIME_CLASS(CMyThreadClass), params) will do the job).
>> *****
>
>Yes this area is another major source of confusion for me. I have tried to
>treat the internet specific HTTP stuff as a "black box" by integrating the
>demo code from the codeproject article "as-is". I think it is when the site
>is slow to respond that I get into some sort of "hang" scenario and the
>GlobalSemaphore ends up blocking any other attempts at site access by other
>threads/processes.
>Isn't the demo code asynchronous or do you mean something else?
*****
I'm not familiar with the example, so I can't answer that question.
*****
>
>As I only get occassional errors and then only when I am running multiple
>threads/processes and the target site is slow (I think this is the scenario
>for the "hang" type error I get anyway). I do not see the error immediately.
>I find an access violation windows message and start the JIT debugger to
>find out what went wrong. The trouble is quite often the JIT quits with an
>error before getting in to the source code!!!!
>So it become very hard for me to know how to track this error down and I
>have ended up just hoping it would "go away" - buit it hasn't :)
*****
I've not seen the debugger fail to come up. The problem with this sort of timing-dependent
error is that it is indicative of some deeper bug, whether one of storage allocation,
thread management, or concurrency, as the most typical causes. Your logging of events is a
good start on looking for the conditions.
Things I would look for:
Concurrent access to shared structures from multiple threads without
proper synchronization.
Using variables in a CWinThread, particularly pointer variables,
after the CWinThread has been deleted.
Suggested technique: Set the m_bAutoDelete flag to
FALSE. If the bug "goes away", you have this sort of
error. Note that you will be leaking storage like crazy
during this diagnostic procedure, but if the bug goes
away, you've isolated a cause. Then you can fix it
and let the m_bAutoDelete remain TRUE again.
Standard dangling-pointer problems (using an object after it has
been deleted), which just happen to show up on slow-response
connections (a generalization of the above problem)
Uninitialized local variables leading to bad-pointer situations.
Uninitialized class member variables leading to bad-pointer situations.
Instead of using JIT debugging, just run the app under the debugger. This should be more
reliable than bringing it in after the fact.
In addition, you can consider putting a C++ exception handler in your thread. While it
isn't among the best of techniques, because you have to use "..." to indicate that you are
catching all exceptions, it would help you isolate the thread that is causing the problem.
I've not done this in C++, only using C exceptions, so I'm not sure of all the details.
*****
>
>
>> >
>> >At time of spotting your reply I was just about to investigate this -
>> >however I have limited experience with "internet programming" (well just
>> >this program - so I'm not sure how I can adequately test this -
>particularly
>> >in this multithreaded environment!).
>> >
>> >Have you (anyone) any observations?
>> >
>> >
>> >Secondly
>> >#######
>> >
>> >As I said I have dependencies betwen the threads.
>> >
>> >For example. I have one thread A which downloads a "header page" from the
>> >site kicks off B with a list of pages to
>> >return. A waits. B kicks off a number of threads (type C ( max ~40 )) to
>> >return the corresponding "detail" pages. B limits the maximum number of
>> >concurrent type C's running
>> >( I do this as one process may have many groups of ABC) and I have
>several
>> >processes. This is also why I have this global semaphore above as I don't
>> >wish to overload the site for obvious reasons).
>> *****
>> All you have here is a set of servers which you trigger by putting
>requests in their
>> queues. There would be no need for a global semaphore under any
>conditions; one semaphore
>> per queue would do nicely. With asynchronous communications you don't need
>a lot of
>> threads, so look into that.
>
>
>Sorry, I have used confusing terminology. This semaphore is not "Global" in
>the true meaning it is specing to the ABC "cluster". It is a member variable
>of the "heap object" I use for coordination of the cluster throughout its
>lifetime. Therefore I think I do have what you recommend in place.
>
>
>
>
>> *****
>> >
>> >So A waits until all the pages have been successfully returned ( all C's
>> >completed) as it must process all pages together - remember I can/will
>have
>> >several groups ABC in a process so I can have a lot going on at the same
>> >time.
>> *****
>> Making A wait is probably a Bad Idea. Instead, have a way of determining
>when all pages
>> have completed and then fire off an event that causes them to be
>processed. You're still
>> caught up in thinking in a synchronous model, which is rarely a good idea.
>
>Yes, and my mindset is not OOP orientated either!!
>
>
> > It leads to
>> confusing threading, which is what you have here, and unnecessarily
>complex interthread
>> signalling. Given B has the list of pages, the number is known, so there
>is no need to
>> wait at all. Just let A build the requests and go away, or at least just
>wait for a need
>> to generate more requests. There are a lot of interesting
>fully-asynchronous structures
>> that are a lot easier to deal with under situations like this.
>
>
>Well, what I have is A in a loop, if the pages returned are "bad" - say the
>site is down - it sleeps for a while and then requests the header page,
>which in turn generates another request for the detail pages (C via B).
*****
Another reason you should not be using syncrhonous models. The notion that the thread is
in a loop and waiting is also suspect. In a fully asynchronous model, you would just queue
up an event for a future time and process it at that point, without needing to sleep at
all. Note that when a thread is sleeping, no SendMessage to it will work; the sending
thread will block until the Sleep() expires, which then makes the sending thread
non-responsive to messages. The problem then cascades.
*****
>
>I request A every n seconds/minutes in any case, depending upon
>circumstances I may generate requests for C's. I looped A because I thought
>it a good enough approach to handle repeated requests.
>
>The volume I'm talking about is on an average day, is say 5 processes (could
>be treble this on a "busy" day) each with 15 type A's (could be thirty).
>Several 'A's in a process can be awake at any one time. Each A can request
>data at a maximum of say once per minute.
>
>If it didn't sleep but just terminated I would have to set up some sort of
>structure which "knows" the status of each A(failed last time etc) and when
>to kick of type A's. I saw this as being just as complex. What problems does
>A waiting then sleeping cause?
*****
Actually, if you think of "service threads" as servicing requests, you don't need to keep
creating them. You now have a "thread pool" model. This is actually simpler than models
that sleep, because the thread is always working unless there really is nothing to do.
I've used this technique several times to handle this sort of problem; for example, I have
a case in which I have to resolve IP addresses and re-establish a connection. The remote
site may be down and non-responsive. So I basically queue up requests to a connection
handler. What the connection handler does is handle an event that comes in to say "Wake up
and handle an event". It dequeues the first event on the queue, and attempts to establish
a connection (asynchronously). That's all it does. At some future point, the connection
either succeeds, in which case it passes it off to the traffic-handling thread, or it
fails, in which case it schedules a future attempt. It starts with 5-minute intervals,
then goes to 10, 15, and 30 minute intervals. All threads were very simple, and the
scheduler thread (done as a UI thread) was handled with WM_TIMER messages, because that
was the easiest way. I just did a SetTimer that reset the time so that the next WM_TIMER
message came in at the right time to trigger the next event.
THis is one case where I can't show the code, because someone else actually owns the
rights to it. Otherwise, I'd publish it.
*****
>
>
>> *****
>> >
>> >Anyway, here is the structure I have dreamed up to implement this
>dependency
>> >waiting, does it look OK?
>> >
>> >class CEventState is a heap object avaliable to all threads in a group
>ABC
>> >through a passed pointer in the input thread params.
>> >
>> >
>> >CEventState
>> >
>> >BOOL createBlockingEventSemaphore(int iSem, long lTotal, long lBlocked)
>> >{
>> >::EnterCriticalSection(&lock);
>> >handle[iSem] = CreateSemaphore( NULL,0,lBlocked,NULL);
>> >lSemCount[iSem] = lTotal;//block until this ct decrements to zero
>> >::LeaveCriticalSection(&lock);
>> >return(true);
>> >}
>> *****
>> Why enter a critical section before creating a semaphore?
>
>No reason. Mr Cut-and-Paste told me to do so :)
>
>>Why create an array of
>> semaphores? Why have an array of handles and an array of counts when the
>natural
>> representation would be an array of struct/class objects that had a handle
>and count
>> member, but there is no reason I can think of to have an array at all!
>
>Well as I said earlier I stick both semaphores in together. I have been
>experimenting with various combinations of semaphores to get the blocking
>working and its just a hang over from that really. It doesn't "hurt" though
>does it?
>
>
>
>> *****
>> >
>> >
>> >void maintainBlockingSemaphores(int iAmt)
>> >{
>> >::EnterCriticalSection(&lock);
>> >lSemCount[SEM_BLOCKING]+=iAmt; //negative iAmt decreases the blocking
>count
>> >if(lSemCount[SEM_BLOCKING]==0) //all done, so release semaphore lock
>that
>> >thread(s) are waiting on
>> >{ ReleaseSemaphore( handle[SEM_BLOCKING],1,NULL);
>> >::LeaveCriticalSection(&lock);
>> >}
>> *****
>> All previous comments apply here. Something is Very Wrong with this
>picture. The whole
>> notion of an array is deeply suspect.
>
>Again I don't really see the array as an issue as such. It just holds the
>"blocking" semaphore (wait till all C completed) and the "concurrent"
>semaphore (maximum number of concurrent C threads).
>I think it may look as though I'm trying to do something more complicated
>than I actually am.
****
Yes. If you have two different semaphores, unrelated to each other, use two separate
variables. Don't introduce gratuitous complexity if it is not needed; don't introduce
apparent dependencies if they are not needed. You would only need an array if you were
doing WaitForMultipleObjects, and I don't see that here, since the two semaphores seem
unrelated.
This is another reason that a thread pool makes more sense. You create a bunch of threads
in the thread pool, and that sets your upper bound. So you don't need a semaphore to limit
the number of threads at all by an explicit mechanism. You just dump events into a queue,
and as many threads as you have wake up and handle as many events as you have in the
queue, or as many events as you have allowed threads, whichever is smaller.
If you have different kinds of operations, you use separate queues and separate handler
threads blocked on those queues. THe result is an overall simpler structure to the entire
program.
*****
>
>What it should look like is this
>
>void maintainBlockingSemaphore(int iSem,int iAmt)
>{
>::EnterCriticalSection(&lock);
>lSemCount[iSem]+=iAmt;//negative iAmt decreases the blocking count
>
>if(lSemCount[iSem]==0)//all done, so release semaphore lock that thread(s)
>are waiting on
> ReleaseSemaphore( handle[iSem],1,NULL);
>::LeaveCriticalSection(&lock);
>}
*****
I find this code confusing, because it isn't really using a semaphore as a semaphore is
intended to be used. And for shutdown events, you should use an event, not a semaphore.
*****
>
>and called by C when its about to terminate:
> pEvent->maintainBlockingSemaphore(SEM_BLOCKING_DL,-1;
>
>pEvent is the pointer to the heap object.
>
>
>
>> *****
>> >
>> >
>> >Type A creates B then waits on
>> >
>> >WaitForSingleObject(pEvent->handle[SEM_BLOCKING],INFINITE);//waits until
>all
>> >type C threads terminated
>> >
>> *****
>> Why not allocate an object like this:
>>
>> class Waiter {
>> public:
>> Waiter(long n) { count = n; event = ::CreateEvent(NULL, TRUE,
>FALSE, NULL); }
>> // check parameters: it should be manual-reset, initial
>non-signalled
>> void Completed() {
>> if(InterlockedDecrement(&count) == 0)
>> ::SetEvent(event);
>> }
>> void Wait() { ::WaitForSingleObject(event, INFINITE); }
>> protected:
>> long count;
>> HANDLE event;
>> }
>>
>> Create a new one of these objects when you launch a thread, e.g.,
>>
>> Waiter * w = new Waiter;
>>
>> then you could follow your model and simply pass a pointer to this into
>each thread that
>> is collecting a page. As each thread collects a page, it calls the
>Completed() method.
>> Meahwnile, the thread A has done a Wait() call on the method, such as
>>
>> w->Wait();
>> delete w;
>>
>> What I'd do is have the Completed() method simply enqueue a request to the
>completion
>> handler thread when the processing was completed., and it would pass a
>pointer to the
>> Waiter object in so it could then be deleted. Or it might simply call
>> delete this;
>> internally.
>
>I looked at CreateEvent() but I thought better the devil you know and stuck
>to semaphores :)
>Of course, looking at your code suggestion it shows that Events lend
>themselves to what I'm after in a cleaner way than semaphores.
>
*****
No, better to understand the correct mechanisms and choose one based on understanding,
rather than select one because you recognize its name, no matter how inappropriate it is.
*****
>
>> Note that the need to do synchronization on the array is unnecessary.
>Actually, it should
>> be unnecessary anyway, since the element of the array is all that is being
>manipulated,
>> and it is manipulated outside any concurrent access. (If it isn't, you are
>already in such
>> deep trouble that there is no recovery, because it means that two or more
>threads are
>> going to try to access the array element at the same time).
>
>The array element in question is the blocking semaphore. Each terminating C
>thread tries to update it, and when the count reaches zero
>the semaphore is released and A is unblocked.
******
Use an event. Use InterlockedIncrement/InterlockedDecrement
******
>
>So there *is* attemped concurrent access on the semaphore as each C
>terminates, but as it is protected by the critical section, why is the fact
>that several threads may try to update concurrently it a problem? They will
>just wait untill they gain access wont they?
******
If you use InterlockedDecrement on the reference count, this is multiprocessor safe. And
there is no such concept as "concurrent access" to a semaphore; all attempts to use the
semaphore are serialized in the kernel and if two threads try to access a semaphore, only
one of them wins. So you don't need any form of concurrency protection around the
semaphore; it IS concurrency protection.
******
>
>Really the "heap object" I have talked about which has the semaphores as
>member variables is analogous to your waiter method isn't it?
>It is passed to A B C through a pointer and protected by critical sections.
*****
As I indicated, you don't need to protect a semphore against multiple threads because it
is inherently a synchronoization mechanism and is thus safe
*****
>The waiter completed() method is analogous to my
>maintainBlockingSemaphores(). When A terminates the heap object is deleted.
>The way I update the semaphore - though convoluted - *should* work shouldn't
>it?
*****
Possibly. But recall that a WaitFor on a manual-reset event has no effect on the event,
but a WaitFor on a semaphore implicitly resets it. This makes it suspect in being used for
shutdown.
*****
>
>Although I know something isn't!!!!!
>
>I appreciate you observations and recommendations.
>
>Cheers.
>
>
>
>
>
>
>
>
Joseph M. Newcomer [MVP]
email: newcomer@flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
- Next message: Pierre Couderc: "Re: Programmatically selecting in a ClistCtrl"
- Previous message: Doug Harrison [MVP]: "Re: Programmatically selecting in a ClistCtrl"
- In reply to: Joe: "Re: output.c error in multithreaded program"
- Next in thread: Joe: "Re: output.c error in multithreaded program"
- Reply: Joe: "Re: output.c error in multithreaded program"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|