Re: Index with strange statistics?

From: Gert-Jan Strik (sorry_at_toomuchspamalready.nl)
Date: 03/22/04


Date: Mon, 22 Mar 2004 20:58:28 +0100

Karl,

Sorry about the late response.

It is true that the index has no appropriate statistics to make any
reliable estimation. These statistics would not be of much use anyway,
because for this query the index cannot be seeked or partially scanned
anyway.

However, because (the leaf level of) the index does contain the insaddr
column, a full scan of this index is still possible, and will satisfy
the query. The clustered index (it's branches) does not hold the
relevant column, so here, an index seek or partial scan is also out of
the question. So assuming comparable fragmentation and unused space, a
nonclustered index scan will always need less logical reads than the
clustered index scan, which IMO makes it the better strategy. In this
case, outdated statistics should not matter.

Gert-Jan

Karl Gram wrote:
>
> Gert-Jan,
>
> If you look at the query and the index you will see that the first column of
> the index is not used by the query. Therefore the optimizer will probably
> not use the non-clustered index.Also the B-tree's intermediate levels only
> contain the first column of the index which SQL Server can use for a index
> seek. Only the leaf level contains all the other columns of the index. Also
> an index only contains selectivity information for the first column of the
> index and not for the other columns. Therefore SQL Server can rely on the
> index to determine the selectivity and density of the data in the table.
>
> However, if the first column of the index is not used in the query, the
> optimizer can still sometimes choose the non-clustered index but it will
> always do a index scan. Choosing a non-clustered index this way however is
> not so obvious and relies on the non-index statistics SQL Server maintains.
>
> --
> HTH
>
> Karl Gram
> http://www.gramonline.com
>
> "Gert-Jan Strik" <sorry@toomuchspamalready.nl> wrote in message
> news:405B4489.E36A5880@toomuchspamalready.nl...
> > If the query is exactly as the one you mentioned, then one would expect
> > the nonclustered index scan, because the index is covering the query.
> > Even if the statistics were not up to date, SQL-Server should still know
> > that the NC index is faster (because the mutation rate is not very
> > high).
> >
> > How big is the size difference between the clustered index and the NC
> > index?
> >
> > Note that if your query needs columns that are not covered by the index,
> > then it is a whole different ball game.
> >
> > HTH,
> > Gert-Jan
> >
> >
> > Thomas Kratz wrote:
> > >
> > > Hi,
> > >
> > > I have a relatively large (12 Million entries) table in a SQL Server
> 2000
> > > (SP2) database. There are 2 indexes:
> > >
> > > 1. primary key index on column 'prinbr'
> > >
> > > 2. combined index:
> > > CREATE UNIQUE
> > > INDEX [price_hst_insaddr_fk] ON [dbo].[price_hst]
> > > ([day], [insaddr], [curr], [ptynbr])
> > > WITH
> > > FILLFACTOR = 80
> > > ,DROP_EXISTING
> > > ON [PRIMARY]
> > >
> > > It now happened a few times that a query like
> > >
> > > select count(*) from price_hst where insaddr=<number>
> > >
> > > took 90 seconds instead of the usual 5-15 seconds.
> > > The result count equals about 34000 entries.
> > >
> > > In the Profiler I could see, that a full table scan was issued.
> > > How can that happen? Surely using the index is *always* faster than a
> full
> > > table scan for such a query.
> > >
> > > Recreating the index, didn't solve the problem, but an
> > >
> > > update statistics price_hst
> > >
> > > did. Is it possible for the table statistics to get garbled?
> > >
> > > By the way:
> > >
> > > - 55000 rows are inserted into the price_hst table once a night
> > > - 'update statistics price_hst' is run every morning before the
> > > users connect
> > >
> > > Thomas
> >
> > --
> > (Please reply only to the newsgroup)

-- 
(Please reply only to the newsgroup)


Relevant Pages

  • Re: WHERE clause question
    ... > no reason that select * with where should run slower. ... There are two parts to executing a query. ... SQL Server can start spooling rows as soon as it finds matches and has the ... if the statistics is outdated, ...
    (microsoft.public.sqlserver.programming)
  • Re: Index with strange statistics?
    ... the index is not used by the query. ... contain the first column of the index which SQL Server can use for a index ... Therefore SQL Server can rely on the ... not so obvious and relies on the non-index statistics SQL Server maintains. ...
    (microsoft.public.sqlserver.server)
  • Re: Optimizer not using the good index
    ... I would guess that the statistics are inaccurate. ... use Query Analyzer and look at the estimated execution plan ... Open another window and get the actual execution plan. ... not work always since SQL Server only maintains distribution stats for 200 ...
    (microsoft.public.sqlserver.server)
  • Re: WHERE clause question
    ... There are two parts to executing a query. ... > SQL Server can start spooling rows as soon as it finds matches and has the ... > "Quentin Ran" wrote in message ... if the statistics is outdated, ...
    (microsoft.public.sqlserver.programming)
  • Re: Extremely Slow Table
    ... In this case, the table is the road, and the query plan is the car. ... The optimizer makes it choice of plan from the statistics SQL Server ...
    (comp.databases.ms-sqlserver)

Quantcast