Re: Rowset solution is sought. Please help

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance

From: Hugo Kornelis (hugo_at_pe_NO_rFact.in_SPAM_fo)
Date: 09/24/04


Date: Sat, 25 Sep 2004 00:37:32 +0200

On Thu, 23 Sep 2004 22:41:35 -0500, Farmer wrote:

>But back your solution, Hugo.
>I am thankful to you that not only that you have a "theoretical solution"
>but a practical one too.
>I am just enjoying learning from your code.
>
>However, there some kind of issue in your code. I am not sure if I can
>figure it out by myself; I will try; I am quite curious to explore and
>figure it out; but here is an issue.
>
>select ordCustRequestDate, BatchID
>, count(*) OrdersQty, sum(Part_Qty) PartTotal
>from #tmp
>group by ordCustRequestDate, BatchID
>having count(*) > 1 and sum(Part_Qty) > 400

(snip)

Hi Vladimir,

Ouch! Guess I didn't look close enough after all, before posting the thing
of yesterday.... I used your query above and in my test data, I even got
batches of well over 15,000 parts !!

As is so often the case, the error was very small. I had left out just one
tiny line of code. I won't copy the full script again, but I'll put the
relevant part here, so that you can make the change in your copy of the
code as well.

  -- Assign one order to each new batch, starting from the largest
quantity.
  -- To satisfy ANSI standard syntax, the subquery has to be repeated;
  -- this will probably hurt performance (much!)
  UPDATE t
  SET BatchID = b.BatchID
  FROM #tmp AS t
  INNER JOIN #batches AS b
        ON b.SpaceLeft IS NULL
        AND b.ordCustRequestDate = t.ordCustRequestDate
        AND (SELECT COUNT(*)
              FROM #batches AS b2
              WHERE b2.ordCustRequestDate = t.ordCustRequestDate
              AND b2.SpaceLeft IS NULL
              AND b2.BatchID <= b.BatchID)
           = (SELECT COUNT(*)
              FROM #tmp AS t2
              WHERE t2.ordCustRequestDate = t.ordCustRequestDate
              AND t2.BatchID IS NULL
              AND (t2.Part_Qty > t.Part_Qty
                OR (t2.Part_Qty = t.Part_Qty AND t2.ordID <= t.ordID)))
  WHERE t.BatchID IS NULL

(The last line: "WHERE t.BatchID IS NULL" was missing in my previous post
and should be added. This could cause orders already assigned to a batch
to be reassigned to a new batch - and in some cases, many orders got all
inserted into the same batch....)

The one thing that still worries me about my solution is the degration of
performance as the number of rows in the table increases. I'm currently
running a quick test to see if significant gain can be accomplished by
moving orders that are assigned to a batch from #tmp to another table, but
I think that execution time will always increase exponentially with the
size of the input.

I'd be interested to hear what execution times you see against your real
production data and how that compares to the cursor. I'd also like to hear
how big the actual difference is in number of batches used by my solution
vs yours. Would you mind sharing this information with me, once you've
tested my (now hopefully completely debugged) code on your data?

Best, Hugo

-- 
(Remove _NO_ and _SPAM_ to get my e-mail address)


Relevant Pages

  • Re: Multithreaded dataprocessing too slow... Help!
    ... Breaks up data into batches and feeds the individual batches to the ... > worker threads when they have finished processing a batch. ... >> status is Finished it assignes a new chunk of data to it for processing. ...
    (microsoft.public.dotnet.framework.performance)
  • Re: Multithreaded dataprocessing too slow... Help!
    ... Breaks up data into batches and feeds the individual batches to the ... worker threads when they have finished processing a batch. ... > status is Finished it assignes a new chunk of data to it for processing. ...
    (microsoft.public.dotnet.framework.performance)
  • Re: Multithreaded dataprocessing too slow... Help!
    ... Breaks up data into batches and feeds the individual batches to the ... >> worker threads when they have finished processing a batch. ... >>> status is Finished it assignes a new chunk of data to it for ... >>> 1012 then this is the chunk that's being assigned to the thread to ...
    (microsoft.public.dotnet.framework.performance)
  • Re: When is autocovariance small indicating independent values?
    ... So I divide my set of values in batches of same duration. ... the challenge is to find a batch size that satisfies these ... autocovariance is increasing, but not monotonically. ...
    (sci.stat.math)
  • Re: When is autocovariance small indicating independent values?
    ... So I divide my set of values in batches of same duration. ... the challenge is to find a batch size that satisfies these ... autocovariance is increasing, but not monotonically. ...
    (sci.stat.math)