Re: SQL Server 2000 BUG?

From: Steve Kass (skass_at_drew.edu)
Date: 01/09/05


Date: Sun, 09 Jan 2005 12:51:26 -0500

Juraj,

I think using subselects will generate the same query plan as using
variables, though it may be harder to read. (There is one exception -
if you need to use columns of the table you are updating in the
intermediate expressions, you may not be able to reference those in
nested subselects without adding an extra join.)

For example, this:

create table target (
  pk int not null primary key,
  c1 int,
  c2 int
)

create table helper (
  pk int not null primary key,
  h1 int,
  h2 int
)

UPDATE target
SET c1 = ax,
    c2 = a4
FROM (
  SELECT pk, dbo.f(a1,a2,a3,a4) as ax, a4 FROM (
    SELECT pk, a1, a2, a3, a1*a3 as a4 FROM (
      SELECT pk, a1, a2, a2-a1 as a3 FROM (
        SELECT pk, a1, h2*a1 as a2 FROM (
          SELECT pk, h2, h1+1 as a1
          FROM helper
        ) H
      ) H
    ) H
  ) H
) H
WHERE target.pk = H.pk

gives the same query plan as

declare @Tmp1 int, @Tmp2 int, @Tmp3 int, @Tmp4 int, @TmpX int

UPDATE target
SET @Tmp1 = h1+1,
    @Tmp2 = h2*@Tmp1,
    @Tmp3 = @Tmp2-@Tmp1,
    @Tmp4 = @Tmp1*@Tmp3,
    @TmpX = dbo.f(@Tmp1,@Tmp2,@Tmp3,@Tmp4),
    c1 = @TmpX,
    c2 = @Tmp4
FROM target
JOIN helper ON target.pk = helper.pk

The subselect version might be more bug-resistant, though harder to
maintain. If you need to use one of the target table columns in the
expression, say "@Tmp4 = @Tmp1*@Tmp3+c2", it won't work to put "a1*a3+c2
as a4" into the subselect, and you will have to add an join to [target]
in the FROM clause of the update, which will add another join operator
to the query plan. (I'm not sure why you can't refer to target table
columns nested this way.)

I try to avoid UPDATE .. FROM whenever possible, but since T-SQL doesn't
support UPDATE target SET (c1,c2) = (SELECT this, that FROM ...),
there's sometimes no convenient alternative. This gives a terrible
query plan, with or without the use of a view:

CREATE VIEW helperV AS
  SELECT pk, dbo.f(a1,a2,a3,a4) as ax, a4 FROM (
    SELECT H.pk, a1, a2, a3, a1*a3+c2 as a4 FROM (
      SELECT pk, a1, a2, a2-a1 as a3 FROM (
        SELECT pk, a1, h2*a1 as a2 FROM (
          SELECT pk, h2, h1+1 as a1
          FROM helper
        ) H
      ) H
    ) H JOIN target ON H.pk = target.pk
  ) H
go

UPDATE target
SET c1 = (SELECT ax FROM helperV WHERE helperV.pk = target.pk),
    c2 = (SELECT a4 FROM helperV WHERE helperV.pk = target.pk)
WHERE EXISTS (
  SELECT * FROM helperV
  WHERE helperV.pk = target.pk
)

Assuming there is not some way to reformulate the entire problem (maybe
using indexed views or indexed computed columns somewhere?), you seem to
be doing as well as possible, and just encountered some bad luck because
of the UPDATE bug. While SQL Server 2005 still doesn't support SET (c1,
c2) = in its UPDATE syntax, common table expressions or the possibility
of writing some of the calculations in a CLR language instead of T-SQL
might make live a little easier. Also good to know is that this UPDATE
bug does not exist in the latest build of SQL Server 2005.

If the @variable = col = <expression> syntax helps, great. I don't know
if it's any less buggy than the two-step version, but it might be.

SK

Juraj Rojko wrote:

>Hi Steve & Gert-Jan,
>
>thank you very much for the very interesting comments. I definitely will try
>to avoid code like this is my future projects.
>
>Your comments also inspired me to another fix for the problem. This update
>works well:
>
>UPDATE xTable1
>SET @Tmp1 = Col2 = dbo.xFunc1(Col1)
>
>I can simply rewrite my code to this form.
>
>Well, I think, I should more explain what I am actually doing. My code is
>something like this
>
>UPDATE Table
>SET @Tmp1 = <Expression>,
> @Tmp2 = <Expression>(@Tmp1),
> @Tmp3 = <Expression>(@Tmp2, @Tmp1),
> @Tmp4 = <Expression>(@Tmp3, @Tmp1),
> @TmpX = <Func>(@Tmp1, @Tmp2, @Tmp3, @Tmp4),
> Col1 = @Tmp1,
> Col2 = @Tmp2,
> Col3 = @Tmp3,
> Col4 = @Tmp4
>FROM Table
>JOIN AnotherTable ON Table.Key = AnotherTable.Key
>
>Each expression is dependent on previous expression and I need to store
>result of each step. Some expressions are quite complicated (I mean long,
>with CASE etc).
>
>How to write code like this without helper variables? Only way I see is to
>use nested sub SELECT, but I think it will lead to less effective execution.
>Am I wrong, or have you better idea?
>
>Juraj
>
>
>
>



Relevant Pages