Re: IS NULL on field is not using index placed on that field
From: Mike Hodgson (mike.hodgson_at_mallesons.nospam.com)
Date: 02/23/05
- Next message: Alejandro Mesa: "RE: how to select rows in a table"
- Previous message: madhivanan2001_at_gmail.com: "Re: how to select rows in a table"
- In reply to: Mike Hodgson: "Re: IS NULL on field is not using index placed on that field"
- Messages sorted by: [ date ] [ thread ]
Date: Wed, 23 Feb 2005 23:55:43 +1100
Upon reflection, most of this sounds like crap to me now (not sure what
I was on at the time). As such, disregard most of it. (Not having a
good day.)
-- *mike hodgson* |/ database administrator/ | mallesons stephen jaques *T* +61 (2) 9296 3668 |* F* +61 (2) 9296 3885 |* M* +61 (408) 675 907 *E* mailto:mike.hodgson@mallesons.nospam.com |* W* http://www.mallesons.com Mike Hodgson wrote: > Indexes (clustered and nonclustered) include nulls (they're just > another value, albeit a special one). But, when you have the 2 > criteria in your WHERE clause the query engine needs to look through 2 > separate ranges of values in the index (the null values and the > '2004-2-3' values)...sort of. So that's 2 seek operations (plus the > associated lookups into the clustered index to get the rest of the > data you request in your column list). So that's twice as much work > as if you just have the single condition in your WHERE clause > (DeliverDate = '2004-2-3'). If you just say "WHERE DeliverDate IS > NULL" you get a nice quick plan using your nonclustered index too. > When the query optimiser comes up with an execution plan (in your > case) it figures out pretty quickly that it can get the data much more > efficiently by just scanning the whole clustered index until it's got > all the data it needs, rather than flicking through the nonclustered > index (twice) and looking up the matching leaf nodes in the clustered > index as well. > > You'd have to test it out to be sure but I suspect the UNION ALL that > David suggested would not really be any more efficient just because on > the surface of it I'm guessing it would do pretty much the same > operations (just in 2 separate SELECT statements) as the single query > with the OR. The OR operator can be nasty and should be avoided if > possible but sometimes it has it's place. > > The only way the nulls in the index affect the index selection process > is that they serve as yet another value to sort in the index. At > least I think that's the only affect they have. With the non-null > values, the optimiser can reorder the WHERE clauses so that it doesn't > have to keep reversing direction in the index to find the > noncontinuous rows; not so with the nulls it would seem. > > If you have a big enough, fragmented enough index it will get avoided > all together. I repeated the exercise but included a 3rd column > (char(3000)), just to pad out the rows so I'd get lots of page splits > & index fragmentation, and inserted 3600 rows instead of the 10 rows I > had before (the 10 rows would have been on 1 page but the 3600 rows, > each 3000+ bytes, would be split over at least 1800 pages - 2 rows per > page). Selectivity was still the same but the optimiser thought (in > all 3 cases (where date = x, where date = x or date is null, and where > date is null)) that it would be better to just do a table scan. I > even took out the null values and made the DeliverDate column NOT NULL > and it made no difference. > > Sometimes you just have to trust that the optimiser knows best (which > it usually does). Of course, this could all be crap (I'm getting > tired) but it sounds logical to me. > > -- > *mike hodgson* |/ database administrator/ | mallesons stephen jaques > *T* +61 (2) 9296 3668 |* F* +61 (2) 9296 3885 |* M* +61 (408) 675 907 > *E* mailto:mike.hodgson@mallesons.nospam.com |* W* > http://www.mallesons.com > > > > Adrian Bezzina wrote: > >>Mike, >> >>Thanks for the prompt reply. >> >>after executing (i know the tables and fields have changed) the query on the >>following table: >> >>Consign (table) >> | >> |_ [UNIQUE] GUID Field - PrimaryKey. >> | >> |_ ORIGINAL_DATE DateTime Field - Clustered Index on this field alone. >> | >> |_ DELIVER_DATE DateTime Field - non clustered index on this field. >> >>set showplan_text on >>go >> >>SELECT * FROM Consign WHERE Deliver_date = '2004-2-3' >>SELECT * FROM Consign WHERE Deliver_date = '2004-2-3' OR Deliver_date IS >>NULL >>go >> >>set showplan_text off >>go >> >>the plan for the first query is doing a 'Index Scan' on the non-clustered >>index on the deliver date field >> >>the plan for the second query is doing a 'Clustered Index Scan' on a >>clustered index - original_date and i suppose this is the equivalent to a >>table scan as I have moved the clustered index off the primary key to a more >>ordered field (in this case the original_date field). >> >>I suppose what I am really asking is: >> >>Are null values placed in an Index ? and if so does that effect the qeury if >>IS NULL (or IS NOT NULL) affect the selection of which index to use (sorry >>about changing the table structure, but i am showing you the indexes in >>question). >> >>Regards, >>Adrian. >> >>-------------------------- >> >>"Mike Hodgson" <mike.hodgson@mallesons.nospam.com> wrote in message >>news:e$tD6MWGFHA.1932@TK2MSFTNGP14.phx.gbl... >>What does it do? A table/clustered index scan? We don't really have enough >>info to answer - can you post the execution plans & the DDL (table schema >>and schema for all indexes on the table)? To get the execution plans just >>turn on showplan_text before the select statements and off after the select >>statements. That is, >> >>set showplan_text on >>go >> >>SELECT * FROM Deliveries WHERE DeliverDate = '2004-2-3' >>SELECT * FROM Deliveries WHERE DeliverDate = '2004-2-3' OR DeliverDate IS >>NULL >>go >> >>set showplan_text off >>go >> >>When I tried it on a tmp table (I had only a nonclustered index on >>DeliverDate) it chose: >> >>a nonclustered index seek for: >>SELECT * FROM Deliveries WHERE DeliverDate = '2004-2-3' >> >>a table scan for: >>SELECT * FROM Deliveries WHERE DeliverDate = '2004-2-3' OR DeliverDate IS >>NULL >> >>a nonclustered index seek for: >>SELECT * FROM Deliveries WHERE DeliverDate IS NULL >> >>I guess the optimiser decided that it was more efficient to scan the table >>once than to do 2 seek passes (+ associated lookups) in the DeliverDate >>nonclustered index for the query with the 2 where criteria. When I added a >>clustered index on OrderID, the table scan operation turned into a scan on >>the DeliverDate nonclustered index because the index suddenly became a >>covering index (since all nonclustered indexes also contain the clustering >>key, thereby implicitly making the DeliverDate index really (DeliverDate, >>OrderID)). >> >>My quick tests only had 10 rows of data. It would make a difference if I >>was running the SELECT queries against a bigger table. >> >>-- >> >>mike hodgson | database administrator | mallesons stephen jaques >>T +61 (2) 9296 3668 | F +61 (2) 9296 3885 | M +61 (408) 675 907 >>E mailto:mike.hodgson@mallesons.nospam.com | W http://www.mallesons.com >> >> >>Adrian Bezzina wrote: >>Hi All, >> >>I have the table structure as follows >> >>->Deliveries (table) >> |__OrderID (Guid) >> |__DeliverDate (DateTime) (ALLOW NULLS) >> >>Index is on DeliverDate (ASC) >> >>60000 records >> >>look at the query plan for the following and is using the proper index: >> >>SELECT * FROM Deliveries WHERE DeliverDate = '2004-2-3' >> >>If I now change the query to: >> >>SELECT * FROM Deliveries WHERE DeliverDate = '2004-2-3' OR DeliverDate IS >>NULL >> >>The query plan does not choose the DeliverDate index anymore? >> >>What is the reason for this? >> >>Regards, >>Adrian. >> >> >> >> >> >> >>
- Next message: Alejandro Mesa: "RE: how to select rows in a table"
- Previous message: madhivanan2001_at_gmail.com: "Re: how to select rows in a table"
- In reply to: Mike Hodgson: "Re: IS NULL on field is not using index placed on that field"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|