Re: Indexes Confuses me!!
From: Itzik Ben-Gan (itzik_at_REMOVETHIS.SolidQualityLearning.com)
Date: 09/27/04
- Next message: Tinoco: "Restoring DTS structures file"
- Previous message: Michael Culley: "Re: Is sql server as good as access 97?"
- In reply to: Drew: "Re: Indexes Confuses me!!"
- Next in thread: Itzik Ben-Gan: "Re: Indexes Confuses me!!"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 27 Sep 2004 18:22:27 -0400
Sure. Kalen Delaney has a column called Inside SQL Server in SQLMag
(www.sqlmag.com) where she often covers indexing issues. You may want to
start with the first issue dating back to March 1999.
-- BG, SQL Server MVP www.SolidQualityLearning.com "Drew" <agarwalp@lucent.com> wrote in message news:ejuoGiFpEHA.3464@tk2msftngp13.phx.gbl... > Thanks Ben-Gan for your patience. > Yes you are right in saying that i need a bit of reading. Do you know > about > some online articles or notes relating to this? > Also one more question >> heap: 1 x n >> clustered: L x n > Should i infer that if you are using (Select * from table1) the heap is > better than the clustered index and clustered index is more useful in > situations like (Select * from table where col1='Drew') where clustered > index is on col1. > > Thanks Again > > > "Itzik Ben-Gan" <itzik@REMOVETHIS.SolidQualityLearning.com> wrote in > message > news:u#quFvxoEHA.896@TK2MSFTNGP12.phx.gbl... >> As David mentioned, Indexing is a big topic... >> >> > 1,002,500= 1,000,000 rows + 2500 index pages. Exactly whats happening >> > here? >> >> A couple of things that might help you understand the calculations: >> >> The minimum I/O unit used by SQL Server is a page. So, if SQL Server >> needs >> to read a certain row, it has to read the whole page first. >> If you're using an index for filtering or sorting purposes, and the index >> does not cover the query (meaning that the index does not include all >> attributes referred to by the query), SQL Server grabs the row locator > from >> the index row, and performs a bookmark lookup. Bookmark lookup means >> following the row locator to read the data row. >> >> Now, the row locator has a different format depending on whether the >> table >> is clustered or not. If the table doesn't have a clustered index (aka > heap), >> the row locator is a physical pointer consisting File#:Page#:Row#. With > such >> a pointer, reading the actual data row costs a single logical page read. >> If the table is clustered, the row locator is more logical; it contains > the >> clustered index key of the same row. So, a bookmark lookup in a clustered >> table cannot locate the row directly with a single page read, rather, it > has >> to issue a whole seek operation within the clustered index. If the > clustered >> index has 3 levels, each bookmark lookup translates to 3 logical reads. >> >> So depending on whether the table is a heap or clustered (with L levels), >> the total number of logical reads required to perform bookmark lookups >> for > n >> keys is: >> >> heap: 1 x n >> clustered: L x n >> >> > Q1. >>clustered index has three levels. What does this mean. >> > Q2. How does 3,002,500 comes out? >> >> 3 was just an example. An index might have a different number of levels > than >> 3. You really need to get familiar with Indexes and balanced trees to > figure >> this out. Recommended reading on the subject: Kalen's Inside SQL Server > 2000 >> and Donald E. Knuth's 3rd volume of The Art of Computer Programming - >> Sorting & Searching. >> You can calculate the number of levels in an index if you know the number > of >> rows in the table (r), and the index row size (i), fragmentation aside >> for >> now. >> First calculate the number of index rows that fit in an index page, p = > r/i. >> So p is the number of rows that can reside in the root page of the tree >> (level = 0). Since each index row points to a whole page in the next >> level >> of the balanced tree, the number of index rows in level l of the index is >> p^(l+1). >> The leaf level of the index contains a key for each row from the base > table, >> so you can calculate the number of levels in the index: >> >> ceiling(log(i)r) = L >> >> If you also want to take index fragmentation into consideration, you need > to >> divide r by the average page density (f): >> >> ceiling(log(i)(r/f)) = L >> >> With fairly short index row sizes (~20 bytes), you'll end up with 3 or 4 >> index levels for any given table size in the range of millions to >> billions >> of rows. >> That's why I used 3 in my example. >> >> > Does this mean the number of data reads depends on number of data >> > pages? >> > So >> > if i have a Datapage with 10 rows and a Datapage with 100 rows, the >> > efficiency is same. >> >> A couple of things... >> Try to distinguish between just scanning data where you're after all rows > of >> a page and reading a row using a bookmark lookup. The former will read > less >> pages if more rows fit in a page, while the latter won't since a page >> will >> be read for each target row. >> Things get even more complicated when you consider the fact that a >> logical >> read (a read from cache) is much more costly than a physical read (a read >> from disk). >> So, to answer your question, it's not the same when you're scanning data; > it >> is the same in terms of logical reads when you're using lookups, but >> since >> you're reading less pages physically in total, there will be some > difference >> in this case as well. >> >> -- >> BG, SQL Server MVP >> www.SolidQualityLearning.com >> >> >> "Drew" <agarwalp@lucent.com> wrote in message >> news:%235upYAwoEHA.3712@TK2MSFTNGP15.phx.gbl... >> >>>Then, performing a bookmark lookup using relative identifiers (RIDs) >> >>>costs >> >>>one page logical read for each key, or 1 x 1,000,000. This plan costs >> >>>a >> >>>total of 1,002,500, logical reads. >> > 1,002,500= 1,000,000 rows + 2500 index pages. Exactly whats happening >> > here? >> > >> >>>If the table has a clustered index on a column other than col1 and >> >>>that >> >>>clustered index has three levels, scanning the leaf level of the >> >>>nonclustered index in an ordered fashion costs 2,500 logical reads. > Then, >> >>>performing a bookmark lookup using clustering keys (i.e., a seek in >> >>>the >> >>>clustered index) costs three page logical reads per key, or 3 x >> >>>1,000,000. >> >>>This plan costs a total of 3,002,500 logical reads. >> > >> > Q1. >>clustered index has three levels. What does this mean. >> > Q2. How does 3,002,500 comes out? >> > >> >>>However, the I/O cost of a plan that doesn't use the nonclustered >> >>>index >> >>>is >> >>>50,000 logical reads for a table scan, plus the cost of sorting the >> >>>data-much more efficient than plans that use the nonclustered index" >> > >> > Does this mean the number of data reads depends on number of data >> > pages? >> > So >> > if i have a Datapage with 10 rows and a Datapage with 100 rows, the >> > efficiency is same. >> > >> > Thanks for your patience. >> > >> > >> > "David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in >> > message >> > news:ujYnwxvoEHA.1300@TK2MSFTNGP12.phx.gbl... >> >> In a nutshell, SQL Server's optimizer has to weigh-up the cost of > reading >> >> the index pages against the cost of scanning the table (or clustered >> > index). >> >> Using the index isn't always the fastest way to retrieve the data. > Which >> >> point of Itzik's article didn't you follow? If you want to understand >> > index >> >> optimizations you should study a good book as it is a complex topic. >> > "Inside >> >> SQL Server" (Delaney) goes into this in depth. >> >> >> >> -- >> >> David Portas >> >> SQL Server MVP >> >> -- >> >> >> >> >> > >> > >> >> > >
- Next message: Tinoco: "Restoring DTS structures file"
- Previous message: Michael Culley: "Re: Is sql server as good as access 97?"
- In reply to: Drew: "Re: Indexes Confuses me!!"
- Next in thread: Itzik Ben-Gan: "Re: Indexes Confuses me!!"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|
Loading