Re: Performance: Query optimizer producing sub-optimal result?
From: Gert-Jan Strik (sorry_at_toomuchspamalready.nl)
Date: 09/24/04
- Next message: James Goodwin: "Re: Basic "not in" query"
- Previous message: Adam Machanic: "Re: Basic "not in" query"
- In reply to: Mark Ayliffe: "Performance: Query optimizer producing sub-optimal result?"
- Next in thread: Mark Ayliffe: "Re: Performance: Query optimizer producing sub-optimal result?"
- Reply: Mark Ayliffe: "Re: Performance: Query optimizer producing sub-optimal result?"
- Messages sorted by: [ date ] [ thread ]
Date: Fri, 24 Sep 2004 20:33:10 +0200
Mark,
Have you tried adding the query hint OPTION (FORCE ORDER) to your query
A?
HTH,
Gert-Jan
Mark Ayliffe wrote:
>
> Apologies for the long post, I'm trying to describe my problem as fully as I
> can:
>
> As part of a suite of SQL statements I am trying to write a data integrity
> validation statement[2] to check the consistency of a legacy unique ID[1]. I
> have managed to produce a working solution basued on a user defined function
> which validates a set of unique IDs in a small group (2 to 100) of related
> records. My problem is that almost whatever I try, the Query optimizer is
> reorganising things so that my function is called more frequently than
> necessary. The function declares and opens a cursor on to the table so that
> I can navigate between the records checking each row as I go, so of course
> it has a non-trivial cost. I've tried several different approaches:
>
> A.
> SELECT UID, fnCheck(UID)
> FROM (SELECT DISTINCT MainPart from myTable)
> WHERE fnCheck(UID) <> 0
> [3]
>
> The optimizer seems to move the "DISTINCT" action in the subquery to the end
> of the process. Using the Server Trace shows that fnCheck(UID) is called
> once per record in MyTable, not once per DISTINCT MainPart as I'd expected.
> OK, on to...
>
> B. Define a View which performs the SELECT DISTINCT part of option A. My SQL
> statement becomes:
>
> SELECT UID, fnCheck(UID)
> FROM MAINUIDVIEW
> WHERE fnCheck(UID) <> 0
>
> Amazingly this *still* calls fnCheck(UID) once per record in the base table,
> not only once per row in the view.
>
> C. Take the contents out of the function and make the whole check an inline
> procedure, making the cursor include the whole table and cope with the
> switch of MainPart as we proceed. This appears to be as slow as the original
> version, although it doesn't have the unnecessary repeats. I suspect that
> the cursor on a large table (typically the table involved can have between
> 10's of thousands of records to a few million) is making the process rather
> slow.
>
> D. Use a temporary table, put the DISTINCT main parts in there and use that
> in the query. This is my current method, but it seems a little over the top
> for what I'm trying to do. And though the table generation is very quick,
> the whole process is still disappointingly slow. This is why I'd like the
> function calls to be minimised.
>
> Obviously my (scalar) function is rather expensive and the Optimizer isn't
> able to take this into account. Have I missed something obvious here? Is
> there a better technique I could be using? I've tried fiddling with the
> NOEXPAND view hint, but as I don't have an index on the view (I think) SQL
> Server doesn't accept it. Are there other ways that I can bend the Optimizer
> to my will :-)
>
> Thanks for reading this far!
>
> Mark
>
> [1] The Unique Id takes the form: [Main part][sep][Minor part]. A related
> set of records have the same [Main part], the (single character) [sep] is
> the same for all but the last record in the set and the [Minor part] is a
> "stringified" number, starting at 1 and increasing as necessary. A valid set
> of UIDs might look like:
> Wibble-0001
> Wibble-0002
> Wibble=0003
> This format has been in use in our systems for well over a decade, I would
> have done it differently, but I'm stuck with it.
>
> [2] I need to validate:
> - Valid characters only used in each position of the string. This seems to
> work fine in a separate function and isn't really part of my question.
> - [Minor part] starts at 1 and increments sequentially
> - [sep] is the correct value for each record in a group, the highest
> numbered [Minor Part] has the "terminating" separator.
>
> [3] There are a couple of short where clauses which rule out some forms of
> UID which I don't want to check. I've left these out for clarity.
>
> --
> Real email address |
> is mark at | What happens if you get scared half to death twice?
> ayliffe dot org |
-- (Please reply only to the newsgroup)
- Next message: James Goodwin: "Re: Basic "not in" query"
- Previous message: Adam Machanic: "Re: Basic "not in" query"
- In reply to: Mark Ayliffe: "Performance: Query optimizer producing sub-optimal result?"
- Next in thread: Mark Ayliffe: "Re: Performance: Query optimizer producing sub-optimal result?"
- Reply: Mark Ayliffe: "Re: Performance: Query optimizer producing sub-optimal result?"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|
|