Re: Statistics and Index Choice
From: Liz (liz_at_tiredofspam.com)
Date: 05/06/04
- Next message: Brett Robichaud: "Re: Runiing an EXE from an Agent Job"
- Previous message: SQLNerdz: "Re: avoid Timeout expired?"
- In reply to: Amin Sobati: "Re: Statistics and Index Choice"
- Next in thread: Liz: "Re: Statistics and Index Choice"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 6 May 2004 08:45:18 -0500
"Amin Sobati" <amins@morva.net> wrote in message
news:OQAW6N1MEHA.2628@TK2MSFTNGP12.phx.gbl...
> Tibor,
> Your explanation gave me better understanding of index choice strategy.
> But I'd like to know that if I/O cost is important for execution plan, is
> SQL server able to estimate the I/O cost before executing the query? I
mean
> how it will determine that for example the 100# of rows are scattered in
how
> many pages? Is it done using RowIDs in leaf pages?
> Thanks,
> Amin
you might find it interesting to run:
DBCC SHOW_STATISTICS (orders, ShippedCountry_OrderDate)
and
DBCC SHOW_STATISTICS (orders, OrderDate_ShippedCountry)
(assuming that your first index is named "ShippedCountry_OrderDate" and the
second "OrderDate_ShippedCountry" ... make the adjustments)
>
>
> "Tibor Karaszi" <tibor_please.no.email_karaszi@hotmail.nomail.com> wrote
in
> message news:eso1uttMEHA.1340@TK2MSFTNGP12.phx.gbl...
> > A query can be executed in many ways. In your case, two search condition
> with AND, we have some options. When
> > I say "we" below, I'm referring to the optimizer's desicion which the
> execution engine later carries out:
> >
> > * We can look at every page, every row to see if we match both
conditions.
> Table Scan.
> >
> > * We can use an index on one column and through that index narrow which
> pages we need to search. On these
> > pages we check if both conditions are met. Index search.
> >
> > * We can other the other index, the rest is ad above. Index search.
> >
> > *We can combine both indexes (do a join) and then only go to the pages
> where there are rows which meet both
> > conditions. Index search.
> >
> > *One index can cover the query, i.e., all columns the query refers to is
> included in a non-clustered index. No
> > need to go to a data page. Index search.
> >
> > *We can have two index, one for each search condition, and the two
> combined covers the query. Again, SQL
> > Server does a join between the indexes but doesn't have to visit any
data
> pages. The indexes together covers
> > the query. Index search.
> >
> > It is correct that SQL Server uses statistics to determine which of
above
> strategies has the lowest cost. If
> > you return 100 rows through an NC index, then SQL Server have to visit
one
> page (to get the data) for each row
> > to be returned. Another search condition might onely return 1 row, end
> result is much cheaper.
> >
> > In your case, both indexes has both the ShipContry and the OrderDate in
> them. So in each case, SQL Server will
> > find all rows to be return in the index, and only go to the pages which
> has only the rows that patches both
> > criterias. Why do you say that one index should be cheaper than the
other?
> > --
> > Tibor Karaszi, SQL Server MVP
> > http://www.karaszi.com/sqlserver/default.asp
> >
> >
> > "Amin Sobati" <amins@morva.net> wrote in message
> news:eL8RLitMEHA.620@TK2MSFTNGP10.phx.gbl...
> > > Tibor,
> > > You mentioned that there is no query which is "run first". In some
books
> it
> > > is said that SQL Server uses statistics to determine which part of the
> query
> > > has smaller result, then continues finding other records that match
with
> > > other parts of query in the first result(therefore other
> > > conditions(where...) perform
> > > the search in smaller range of rows).
> > > If the I/O cost is the most important issue for choosing an strategy,
so
> > > what's the usage of statistics? How does it help the SQL Server to
make
> a
> > > decision?
> > > After you reply, I myself concluded that the estimation of a smaller
> result
> > > does not necessarily mean an smaller I/O cost, is it true?
> > > Thanks :)
> > >
> > > Amin
> > >
> > >
> > >
> > > "Tibor Karaszi" <tibor_please.no.email_karaszi@hotmail.nomail.com>
wrote
> in
> > > message news:eOnQqzsMEHA.556@TK2MSFTNGP10.phx.gbl...
> > > > It is all about cost. You need to look at how much it cost to
> implement
> > > the query one way or the other. Cost
> > > > is mainly page accesses, but CPU can be a factor that the optimizer
> > > considers as well (some operations costs a
> > > > lot CPU but are less I/O intensive). In your case, both plans where
> only
> > > costs 4 page accesses, i.e., the same
> > > > cost. Also, there is no query which is "run first". The optimizer
> > > estimates the cost for each strategy and
> > > > picks the cheapest one. However, there is code in the optimizer to
> "bail
> > > out soon" if it finds a cheap enough
> > > > plan. No use spend 100 seconds extra optimizing if you only gain .5
> > > seconds on that extra 100 seconds.
> > > >
> > > > Also, make sure you understand how an index is constructed and how
SQL
> > > Server will access the data, using the
> > > > index... :-)
> > > > --
> > > > Tibor Karaszi, SQL Server MVP
> > > > http://www.karaszi.com/sqlserver/default.asp
> > > >
> > > >
> > > > "Amin Sobati" <amins@morva.net> wrote in message
> > > news:%233c9tEsMEHA.3972@TK2MSFTNGP10.phx.gbl...
> > > > > Tibor,
> > > > > I appreciate you for your fast reply.
> > > > > Yes, my indexes are non-clustered. You mentioned that the table
has
> only
> > > 830
> > > > > rows, does it mean that I must expect the criteria that I've
learned
> if
> > > > > there are considerable amount of rows? What's the criteria for
> choosing
> > > > > which part of a query to run first? In fact, what's the relation
> between
> > > I/O
> > > > > cost and index choice?
> > > > > Thanks in advance,
> > > > > Amin
> > > > >
> > > > > "Tibor Karaszi" <tibor_please.no.email_karaszi@hotmail.nomail.com>
> wrote
> > > in
> > > > > message news:ugow3trMEHA.2500@TK2MSFTNGP12.phx.gbl...
> > > > > > I assume that both indexes are non-clustered?
> > > > > >
> > > > > > This table only has 830 rows. So don't draw any conclusions out
of
> > > that.
> > > > > The I/O cost was only 4 pages. The
> > > > > > optimizer has code to bail out early if it find a "cheap enough"
> plan.
> > > I
> > > > > actually get the same amount of page
> > > > > > I/O regardless of I force index 1 or index 2.
> > > > > >
> > > > > > --
> > > > > > Tibor Karaszi, SQL Server MVP
> > > > > > http://www.karaszi.com/sqlserver/default.asp
> > > > > >
> > > > > >
> > > > > > "Amin Sobati" <amins@morva.net> wrote in message
> > > > > news:%23v2kDerMEHA.1188@TK2MSFTNGP12.phx.gbl...
> > > > > > > Hi,
> > > > > > > I have created two indexes on Northwind.Orders for
experimenting
> > > > > statistics
> > > > > > > and index choices:
> > > > > > > Index1: Orders(OrderDate,ShiptCountry)
> > > > > > > Index2: Orders(ShiptCountry,OrderDate)
> > > > > > >
> > > > > > > In my query:
> > > > > > > select * from orders where
> > > > > > > shipcountry='germany' and
> > > > > > > (orderdate between '1996-07-01' and '1996-07-05')
> > > > > > >
> > > > > > > I expected to see and index seek on first index, because
> [orderdate
> > > > > between
> > > > > > > '1996-07-01' and '1996-07-05'] has smaller result and
> > > > > > > [shipcountry='germany'] has more.
> > > > > > > But execution plan always indicates that there has been index
> seek
> > > on
> > > > > the
> > > > > > > second index.
> > > > > > > What's the real strategy for choosing and index? I have
learned
> that
> > > SQL
> > > > > > > Server will first run the part of query that has smaller
result,
> but
> > > > > it's
> > > > > > > not true for my query.
> > > > > > > Thanks in advance,
> > > > > > > Amin
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
>
>
- Next message: Brett Robichaud: "Re: Runiing an EXE from an Agent Job"
- Previous message: SQLNerdz: "Re: avoid Timeout expired?"
- In reply to: Amin Sobati: "Re: Statistics and Index Choice"
- Next in thread: Liz: "Re: Statistics and Index Choice"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|