Re: Leftmost column in an index

From: Itzik Ben-Gan (itzik_at_REMOVETHIS.SolidQualityLearning.com)
Date: 11/15/04


Date: Mon, 15 Nov 2004 11:44:17 +0200

David,

You're right. Even if you're filtering on a secondary index column the
non-covering index can still be used. As you mentioned, the leaf level of
the index can be fully scanned (as opposed to using an ordered operation
like a seek or an ordered scan), and then for all filtered rows, a bookmark
lookup is used to fetch the pages containing the data rows.
That's one example where the optimizer creates these _WA% statistics to
estimate the number of matching rows and determine whether an index scan +
lookups would be more efficient than a table scan.
Here's an example:

900 rows
use tempdb
set nocount on
if object_id('t1') is not null drop table t1
create table t1(c1 int, c2 int, c3 char(1000))
declare @i as int
set @i = 1
while @i <= 1000
begin
  insert into t1 values(@i, case when @i <= 10 then 1 else 2 end, 'a')
  set @i = @i + 1
end
create index idx1 on t1(c1, c2)

The above t1 table is populated with 1000 rows, each ~1K in size. The table
consumes 143 pages, which is the cost of a table scan.
The distribution of the data in c2 is: 10 rows with the value 1, 990 rows
with the value 2.
An index is created on (c1, c2). The index row size is obviously much
smaller then the full row size (~15 bytes vs. ~1K). Within the leaf level of
the index, the 1000 index rows currently fit in only 4 pages, which is the
cost of an index scan.

Now examine the plans and the IO stats for the following queries:

select * from t1 where c2 = 1
plan + I/O cost: index scan (4 logical reads) + 10 lookups (10 logical
reads) = 14 logical reads.
Obviously this is more efficient than a table scan which would have cost 143
logical reads.

select * from t1 where c2 = 2
plan + I/O cost: table scan = 143 logical reads.
Here, performing 990 lookups would have been much more costly than a table
scan.

The optimizer had to know the distribution of the values in c2, and for this
reason it created statistics on it. You can see the statistics created by
querying sysindexes:

select * from sysindexes
where id = object_id('t1')
  and indexproperty(id, name, 'IsAutoStatistics') = 1

You'll get back a row with a pointer to the statistics named _WA_Sys%

-- 
BG, SQL Server MVP
www.SolidQualityLearning.com
"DWalker" <None> wrote in message 
news:Xns95A1EB4E04ACCDWalker@207.46.248.16...
> Hi, please see below.
>
> "David Gugick" <davidg-nospam@imceda.com> wrote in
> news:#uj46atyEHA.3408@TK2MSFTNGP10.phx.gbl:
>
>> DWalker wrote:
>>> Thanks for the quick response.
>>>
>>> Suppose the Authors table has a composite index on au_lname,au_fname.
>>> Then obviously a query that selects based on lname can do an index
>>> seek.  But this query:
>>>
>>> select au_fname, au_lname from authors where au_fname = 'Bob'
>>>
>>> can still use an index scan, right?  The presence of the composite
>>> index helps some, but not as much as when you're querying on the
>>> leftmost column.
>>>
>>> Am I right?
>>>
>>> Thanks.
>>>
>>> David Walker
>>>
>>
>> If you're dealing with a covering index, you might see an index scan
>> operation. In your example, you have a query that references only
>> those columns in the index. SQL Server would probably choose to use an
>> index scan in that case because there is no need for a bookmark lookup
>> operation to get the real data. This assumes this is a non-clustered
>> index. If it were clustered, then you'll see a clustered index scan,
>> which is really the same as a table scan operation. If you added
>> another column to your query that was not in the index, the index scan
>> would likely change to a table scan or clustered index scan.
>>
>
> Suppose I do this:
>
> Select au_fname, au_lname, phone from authors where au_fname = 'Bob'
>
> Assuming SQL can't do a clustered index scan, why would it have to do a
> table scan?  Why couldn't it scan the index on fname to find which data
> pages it needs to read, get the phone number out of?
>
> I tried some of this on my own tables and it seems to be doing an index
> scan in these cases.
>
> Thanks for any enlightenment here.
>
> David Walker
> 


Relevant Pages

  • Re: Covering indexes versus column order in Delaney
    ... yes - if the index is not useful to ANY query or statement, ... the cost of a table scan (or ... a clustered index scan) is 10,000 page reads (plus some overhead ... >consisting of account number, transaction date, and transaction sequence ...
    (microsoft.public.sqlserver.programming)
  • Re: QUERY TUNING
    ... IN THE ABOVE QUERY I HAVE A NORMAL INDEX IN THE BRAN_CODE COLUMN.. ... Cost  | ... statistics are not ran by default. ... Otherwise the explain plan can't be used. ...
    (comp.databases.oracle.server)
  • Re: Statistics and Index Choice
    ... You mentioned that there is no query which is "run first". ... is said that SQL Server uses statistics to determine which part of the query ... If the I/O cost is the most important issue for choosing an strategy, ...
    (microsoft.public.sqlserver.server)
  • Re: Cumulative cost
    ... When you say "running total per student", I believe this means the same ... You can use a Totals query to get a "per student" Sum ... AND calculate something you're calling "cumulative cost". ...
    (microsoft.public.access.forms)
  • Re: FTS Performance in SQL 2005
    ... Now if I do a query directly to the field I would theoretically need: ... SQL Server MVP ... Because i've set the MAX sql-server memory to 3.5 GB instead of 4.0 GB ... cost relative to the whole batch, ...
    (microsoft.public.sqlserver.fulltext)