Re: Performance: Query optimizer producing sub-optimal result?

From: Gert-Jan Strik (sorry_at_toomuchspamalready.nl)
Date: 09/24/04


Date: Fri, 24 Sep 2004 20:33:10 +0200

Mark,

Have you tried adding the query hint OPTION (FORCE ORDER) to your query
A?

HTH,
Gert-Jan

Mark Ayliffe wrote:
>
> Apologies for the long post, I'm trying to describe my problem as fully as I
> can:
>
> As part of a suite of SQL statements I am trying to write a data integrity
> validation statement[2] to check the consistency of a legacy unique ID[1]. I
> have managed to produce a working solution basued on a user defined function
> which validates a set of unique IDs in a small group (2 to 100) of related
> records. My problem is that almost whatever I try, the Query optimizer is
> reorganising things so that my function is called more frequently than
> necessary. The function declares and opens a cursor on to the table so that
> I can navigate between the records checking each row as I go, so of course
> it has a non-trivial cost. I've tried several different approaches:
>
> A.
> SELECT UID, fnCheck(UID)
> FROM (SELECT DISTINCT MainPart from myTable)
> WHERE fnCheck(UID) <> 0
> [3]
>
> The optimizer seems to move the "DISTINCT" action in the subquery to the end
> of the process. Using the Server Trace shows that fnCheck(UID) is called
> once per record in MyTable, not once per DISTINCT MainPart as I'd expected.
> OK, on to...
>
> B. Define a View which performs the SELECT DISTINCT part of option A. My SQL
> statement becomes:
>
> SELECT UID, fnCheck(UID)
> FROM MAINUIDVIEW
> WHERE fnCheck(UID) <> 0
>
> Amazingly this *still* calls fnCheck(UID) once per record in the base table,
> not only once per row in the view.
>
> C. Take the contents out of the function and make the whole check an inline
> procedure, making the cursor include the whole table and cope with the
> switch of MainPart as we proceed. This appears to be as slow as the original
> version, although it doesn't have the unnecessary repeats. I suspect that
> the cursor on a large table (typically the table involved can have between
> 10's of thousands of records to a few million) is making the process rather
> slow.
>
> D. Use a temporary table, put the DISTINCT main parts in there and use that
> in the query. This is my current method, but it seems a little over the top
> for what I'm trying to do. And though the table generation is very quick,
> the whole process is still disappointingly slow. This is why I'd like the
> function calls to be minimised.
>
> Obviously my (scalar) function is rather expensive and the Optimizer isn't
> able to take this into account. Have I missed something obvious here? Is
> there a better technique I could be using? I've tried fiddling with the
> NOEXPAND view hint, but as I don't have an index on the view (I think) SQL
> Server doesn't accept it. Are there other ways that I can bend the Optimizer
> to my will :-)
>
> Thanks for reading this far!
>
> Mark
>
> [1] The Unique Id takes the form: [Main part][sep][Minor part]. A related
> set of records have the same [Main part], the (single character) [sep] is
> the same for all but the last record in the set and the [Minor part] is a
> "stringified" number, starting at 1 and increasing as necessary. A valid set
> of UIDs might look like:
> Wibble-0001
> Wibble-0002
> Wibble=0003
> This format has been in use in our systems for well over a decade, I would
> have done it differently, but I'm stuck with it.
>
> [2] I need to validate:
> - Valid characters only used in each position of the string. This seems to
> work fine in a separate function and isn't really part of my question.
> - [Minor part] starts at 1 and increments sequentially
> - [sep] is the correct value for each record in a group, the highest
> numbered [Minor Part] has the "terminating" separator.
>
> [3] There are a couple of short where clauses which rule out some forms of
> UID which I don't want to check. I've left these out for clarity.
>
> --
> Real email address |
> is mark at | What happens if you get scared half to death twice?
> ayliffe dot org |

-- 
(Please reply only to the newsgroup)


Relevant Pages

  • Re: SQL Update with aggregate MIN function
    ... and the second to find and mark just the items you want marked. ... If you were trying to restrict this to just one athlete and session you ... AND TrainingSession = "Weights1" ... > in the same table .v. a query, this setup is quite central to how I pull ...
    (microsoft.public.access.queries)
  • RE: "The operation was cancelled at the users request" error during mass insertions
    ... | I am doing the insertions in a for loop, ... | setNextUid(nextUID); //update nextUid in the table Uid ... The error message suggests that the query being executed is simply taking ...
    (microsoft.public.sqlserver.jdbcdriver)
  • Re: Bizarre subquery syntax problem
    ... With Access, the official syntax for a subquery always requires an alias, so ... SELECT p.AS UID, p.Firstname AS Firstname, p.Lastname AS Lastname, ... I am seeing bizarre problems with a subquery. ... core query runs fine- ...
    (microsoft.public.access.queries)
  • Re: Access 2007
    ... Arnold ... "Mark A. Sam" wrote: ... Now I am crashing every time I open a table or query in ...
    (microsoft.public.access.tablesdbdesign)
  • Re: long running queries
    ... Mark is right -- you don't have much of an option with SQL Server 2000. ... >> that would result in a large data set and complex query, ... > Some of the client tools will warn you if the query looks like its going ...
    (microsoft.public.sqlserver.olap)