Re: ThreadPool Questions

Tech-Archive recommends: Fix windows errors by optimizing your registry



Sorry...I posted my much more general reply before seeing this post.

Anyway...

reeset wrote:
I can definitely tell you that IO is insignificant.

So, having a thread for each CPU should help. As I mentioned, more threads than that won't.

This is very important. If you are running this code on a computer with only one CPU, and your code is truly CPU-bound, then more threads are only going to make things worse.

For the rest of this post, let's assume you've got extra CPUs you can use.

The record processing itself makes up the lionshare of the "work" being done by the application. It was part of the reason why I'd thought placing the items in a threadpool would provide some optimizations.

It could. What design you'll have to use depends on what the actual cost per-record for processing is. If each record takes seconds to process, then I think queuing a work item for each record is a very easy, workable solution.

You will have ordering issues, which are easily addressed. You will also want to take out the part of your code that stops every five records and waits for the other threads to catch up. Doing that negates much of the benefit of multi-threading. Finally, you will want to limit the thread pool size to have no more active threads than you have CPUs.

If each record processed takes something like 500 to 1000 milliseconds, then you're in a bit of a gray area. You _should_ see a benefit even if you queue one worker thread item per record, but your overhead will start to reduce your efficiency in a noticeable way.

If each record processed only takes tens of milliseconds or less, then thread scheduling and context switching could easily offset whatever gain you have from parallelizing the work, if you are only assigning a single record to each worker queue item.

Now, with all that out of the way...the design Samuel suggests seems like it should work reasonably well for you. Specifically, having a thread for i/o, feeding a queue that multiple worker threads can consume, should work nicely and prevents any need to create new worker items. You just start the worker threads and let them continue as long as there is data to process. This avoids having to worry about the exact relative cost of processing each record.

The one modification to his suggestion that I'd make is this: ordering the data is a lot easier than it seems that he's suggesting. The data can be kept in some kind of ordered list from the outset, left there in order the whole time. Here's a pseudo-code description of what I mean, using Samuel's design as the framework:

Reader thread:

while not end of data:
read record
add record to ordered list
pulse a monitor that the workers wait on

Worker thread:

while not end of data:
scan forward through the ordered list to find first unprocessed record
if the end of the list is reached, wait on the monitor and then scan again
process the record, mark it as having been processed
pulse a monitor that the consumer waits on

Writing thread:

while not end of data:
scan forward through the ordered list, handling records until reaching an unprocessed record or the end of the list
wait on the monitor and then scan again

In this way, the records always remain in the correct order. There is a slight inefficiency in that the consumer thread will occasionally have to wait at an unprocessed record even though another record later in the sequence has already been done, but assuming that the processing itself is really the bottleneck, this shouldn't really be a problem. IMHO, the simpler design more than makes up for any minor delays there (and of course the delay is only really noticed if it occurs at the very end of the processing sequence, since otherwise the consumer thread should easily catch up at the next opportunity).

You could use a LinkedList or List<> collection to keep the data ordered according to the input order of the records. Just add new data to the end of the collection. The LinkedList would be more efficient if you want to remove items from the front of the list as they are finally handled by the consumer.

You might be tempted to use a couple of Queues, but the need to keep the records in the same order as they are initially read would make that impractical, as you'd have to create some sort of intermediate data structure to ensure that processed records are added to the consumer's Queue in the same order in which they were removed from the reading thread's queue.

Other than that change, I think Samuel's suggestion may in fact be ideal for your situation.

Pete
.



Relevant Pages

  • Re: Producer-consumer threading problem
    ... in the queue and the consumer checks for ... data) pairs to the input Queue. ... terminate each worker thread. ... subprocesses; I want to terminate the subprocesses but keep the worker ...
    (comp.lang.python)
  • Re: [PATCH] sched: Fix adverse effects of NFS client on interactive response
    ... >>That's more or less independent of this issue as the distribution of CPU ... >>raised is largely solved by the time tasks spend on the active queue ... >>before moving to the expired queue rather than the order in which they ... > rescheduled on the active queue instead of the expired queue at the end ...
    (Linux-Kernel)
  • Re: Multithreading : Main thread gets blocked ? What could be the reason ?
    ... > // Start a worker thread to searck keyword in browsed page ... I do not see the corresponding lock. ... If you are consuming 70% of the CPU, that is because THERE IS NOTHING ELSE FOR IT TO DO! ... >processsing in While loop, It does some operation in every iteration. ...
    (microsoft.public.vc.mfc)
  • Re: schedstat-2.6.8.1 [was: Re: 2.6.9-rc1-mm1]
    ... +statistics for each cpu listed, and there may well be more than one ... the active queue is considered empty ... +#ifdef CONFIG_SCHEDSTATS ...
    (Linux-Kernel)
  • Re: OSDL Bug 3770
    ... one CPU caused all lower priority tasks on that CPU to be starved, ... > is an environment whose tasks scheduling follow a specific ... > you can proceed in two ways to serve the clients: ... > make an unique FIFO queue for all servers. ...
    (Linux-Kernel)