Re: Counting with distinct - Optimizer Problems

From: Steve Kass (skass_at_drew.edu)
Date: 03/17/04


Date: Wed, 17 Mar 2004 00:53:16 -0500

Nice trick, Uri. ;)

I'll add that Shawn isn't missing anything, in my opinion. Queries with
both count(*) and count(distinct ...) are often real trouble for the
optimizer, and some user intervention is probably necessary. The view
trick will help with the count(*) and can be joined to the rest of the
query, but it won't speed up the count(distinct..). I don't know if
NOEXPAND and a join will force a good plan - my guess is it will help,
but whether this solution is better than user intervention without a
view depends on how often this query needs to be run and on how
frequently there are inserts and deletes from the table. In any case,
if Shawn uses this trick, the column to replace T.i should be the
primary key column or any column that is never or very infrequently
updated, so that only deletes and inserts will contribute to the overhead.

SK

Uri Dimant wrote:

>Shawn
>I'd recommend you create an indexed view .
>create table T (
> i int
>)
>go
>
>create view T_count with schemabinding
>as
> select cast(i as bit) as val, count_big(*) T_count from dbo.T
> group by cast(i as bit)
>go
>
>create unique clustered index T_count_uci on T_count(val)
>go
>insert into T
>select OrderID
>from Northwind..[Order Details]
>go
>
>select sum(T_count) from T_count with (noexpand)
>-- uses an efficient query plan on the materialized view
>
>go
>drop view T_count
>drop table T
>
>
>"Shawn Meyer" <smeyer@interelate.com> wrote in message
>news:eKMZoO9CEHA.3256@TK2MSFTNGP09.phx.gbl...
>
>
>>I have a table defined as :
>>
>>CREATE TABLE [dbo].[job_13_event] (
>> [id] [int] IDENTITY (1, 1) NOT NULL ,
>> [recipient_id] [int] NULL ,
>> [event_type] [int] NULL ,
>> [mail_type] [int] NULL ,
>>) ON [PRIMARY]
>>
>>There is about 5 million rows in this table;
>>1,063,337 have event_type=3;
>>
>>When I run the folowing statement it takes 13 seconds
>>to run. The execution plan shows a table spool/eager spool taking
>>32%.
>>
>>SELECT
>> count(recipient_id) mail_type_count,
>> count(distinct recipient_id) dist_count,
>> mail_type
>>from
>> job_13_event where event_type = 3
>>GROUP BY mail_type
>>
>>When I run the two counts as seperate statements, the count returns in 2
>>seconds
>>and the count distinct returns in 3.
>>
>>SELECT
>> count(recipient_id) mail_type_count,
>> mail_type
>>from
>> job_13_event where event_type = 3
>>GROUP BY mail_type
>>
>>SELECT
>> count(distinct recipient_id) dist_count,
>> mail_type
>>from
>> job_13_event where event_type = 3
>>GROUP BY mail_type
>>
>>If I do a join of the two statements and use FORCEPLAN,
>>I can get the results to come back in 7 Seconds. Here is the sql for
>>
>>
>that:
>
>
>>SET FORCEPLAN ON
>>select a.dist_count, a.mail_type, b.mail_type_count from
>>( SELECT
>> count(distinct recipient_id) dist_count,
>> mail_type
>> from job_13_event
>> where event_type = 3
>> GROUP BY mail_type) a,
>>( SELECT
>> count(1) mail_type_count,
>> mail_type
>> from job_13_event
>> where event_type = 3
>> GROUP BY mail_type ) b
>>where a.mail_type = b.mail_type
>>
>>I would think that the optimizer could calculate the first simple statment
>>by :
>>sorting by the mail_type and then recipient_id and doing a count for one
>>
>>
>and
>
>
>>a distinct count for the recipient_id
>>in one scan. In that case you would get the faster operation the count(1)
>>essentially for free, resulting in a near 4 second
>>result time.
>>
>>Is there something that I am missing that would make my first statement
>>
>>
>run
>
>
>>as fast or faster than using
>>the forceplan ???
>>
>>Thanks.
>>Shawn
>>
>>
>>
>>
>
>
>
>



Relevant Pages

  • Re: DAO and Access97 WHERE clause fails
    ... That did the trick. ... BTW, I had searched extensivly on line for the answer before posting, but ... was missing the obvious, stupid coding error. ...
    (comp.lang.python)
  • Re: Retrieving registry value
    ... going to do the trick. ... I've found the CRegKey Class and its methods ... (specifically QueryStringValue), but I'm missing the code to alias the ...
    (microsoft.public.vb.general.discussion)
  • Re: Stator CB750 DOHC
    ... to get it in it is very tight and does not want to seat all the way am ... I missing something is their some trick to getting it in all the way ...
    (rec.motorcycles)
  • Re: Red gingham is so hot
    ... Look at the feedback though, ... I'm missing a trick here...! ... Get size 10 tops, get a size 20 to model them (wearing only a small ...
    (uk.people.consumers.ebay)
  • Re: Vista Forgets
    ... Hey, Shawn, this looks great. ... forum at all before. ... Your tutorial looks like it should do the trick. ...
    (microsoft.public.windows.vista.installation_setup)

Loading