Re: ORDER BY and IDENTITY
From: Itzik Ben-Gan (itzik_at_REMOVETHIS.SolidQualityLearning.com)
Date: 12/21/04
- Next message: Doug Kent: "how to 'join' two tables during insert, without using a cursor?"
- Previous message: Adam Machanic: "Re: Problem with sql statement"
- In reply to: Questar: "Re: ORDER BY and IDENTITY"
- Messages sorted by: [ date ] [ thread ]
Date: Tue, 21 Dec 2004 19:59:55 +0200
I use it for test purposes; my production code looks a bit cleaner <g>, and
I use singular DROP statements.
But I have to say that when I used this plural form in test code I never
gave it a second thought. I didn't realize that it's not documented in BOL.
Strangely, BOL documents the syntax for all other DROP statements with
multiple objects. Maybe it's just a documentation bug. I'll report it.
-- BG, SQL Server MVP www.SolidQualityLearning.com "Questar" <Questar@newsgroup.nospam> wrote in message news:FE76359D-943D-4C5B-9B9F-20812A7B9DD5@microsoft.com... >I just noticed something interesting from your example. > > Would you use the syntax... > > drop table ta, tb, #t > > in production code? That usage does not appear to be documented in BOL! > ;-) > > "Itzik Ben-Gan" wrote: > >> Hi, >> >> I guess I'm the MVP you're referring to here, and that your question >> relates >> to my comments on the subject in the following article: >> http://www.windowsitpro.com/SQLServer/Article/ArticleID/44138/SQLServer_44138.html >> >> I see that you already got the pointers to the following articles >> discussing >> the SELECT INTO part of the problem: >> http://support.microsoft.com/default.aspx?scid=kb;en-us;273586 >> http://www.windowsitpro.com/SQLServer/Article/ArticleID/43553/43553.html >> >> The situation is somewhat similar to mathematics, where someone comes up >> with an assumption/conjecture based on observations, e.g., Fermat's last >> theorem which was proven right, and Goldbach's conjecture which is still >> unsolved. In Mathematics, suffice to come up with one example that >> contradicts the assumption to prove it wrong. With a T-SQL behavior, I'd >> say >> that it's nice to have an example, but suffice that Microsoft says that >> something works in a certain way to prove it so. I understand that you >> suspect that the SELECT INTO + IDENTITY + ORDER BY (call it Technique 1) >> variation might produce identity values that do not reflect the sort, but >> think it's wrong of INSERT INTO + ORDER BY into table with IDENTITY (call >> it >> Technique 2) to produce identity values that don't reflect the sort. >> While we're at it, many programmers also believe that SELECT INTO + >> IDENTITY >> FROM (SELECT + TOP 100 PERCENT + ORDER BY) (call it Technique 3) >> generates >> identity values that reflect the desired sort. >> >> I just generated a contradicting (on my machine) example for the SELECT >> INTO >> statement: >> >> set nocount on >> use tempdb >> go >> drop table ta, tb, #t >> go >> create table ta(a int not null) >> insert into ta values(3) >> insert into ta values(1) >> insert into ta values(2) >> >> create table tb(b int not null primary key) >> insert into tb values(4) >> insert into tb values(1) >> insert into tb values(3) >> insert into tb values(2) >> >> select b, identity(int, 1, 1) as rn >> into #t >> from ta join tb on a = b >> order by b >> >> select * from #t order by b >> >> -- Output: >> b rn >> ----------- ----------- >> 1 2 >> 2 3 >> 3 1 >> >> -- Plan: >> |--Table Insert(OBJECT:([#t]), SET:([#t].[rn]=[Expr1005], >> [#t].[b]=[tb].[b])) >> |--Top(ROWCOUNT est 0) >> |--Compute Scalar(DEFINE:([Expr1005]=setidentity([Expr1004], -7, 0, >> '#t'))) >> |--Sort(ORDER BY:([tb].[b] ASC)) >> |--Compute Scalar(DEFINE:([Expr1004]=getidentity(-7, 0, '#t'))) >> |--Nested Loops(Inner Join, OUTER REFERENCES:([ta].[a])) >> |--Table Scan(OBJECT:([tempdb].[dbo].[ta])) >> |--Clustered Index >> Seek(OBJECT:([tempdb].[dbo].[tb].[PK__tb__522A84E8]), >> SEEK:([tb].[b]=[ta].[a]) ORDERED FORWARD) >> >> You can see in the plan that the getidentity() function is invoked prior >> to >> the sort. That's the case on my machine (SQL2K Dev / SP3, single CPU) >> with >> my data. I don't know how this code would behave on other machines and >> with >> other data. But that was exactly my point. >> >> But you mentioned that you can accept that Technique 1 is not guaranteed, >> but not Technique 2. Why? Because you assume that the SELECT must be >> first >> fully processed and then loaded into the table? There's nothing in the >> relational model or ANSI SQL to support this because such an INSERT is >> non-relational and non-standard. So you expect an undocumented >> proprietary >> feature to behave in a certain way because that's what seems right to >> you. >> You can clearly see in the plan that the Compute Scalar operator with the >> getidentity() function is invoked before the data is actually loaded, so >> why >> doesn't it make sense that it might be invoked before the sort? >> The way I see it, if Technique 1 doesn't guarantee anything, I'd be very >> reluctant to assume Technique 2 does without Microsoft explicitly >> guaranteeing it. >> >> I'm afraid I currently don't have contradicting examples for Techniques 2 >> and 3, and since I'm at home, I don't have access to machines with >> multiple >> CPUs. >> I'll keep looking for ones, and if someone has contradicting examples, >> I'd >> be glad to hear about those. >> >> I posted a request in the private MVP forum to get Microsoft's official >> stand on the subject for all variations of the techniques. I'll post any >> info that can be shared publicly. >> >> Cheers, >> -- >> BG, SQL Server MVP >> www.SolidQualityLearning.com >> >> >> "Questar" <Questar@discussions.microsoft.com> wrote in message >> news:7A556581-9D66-4E2E-8634-CA4E1342AE44@microsoft.com... >> > It has come to our attention that the expected (and observed) behavior >> > of >> > a >> > specific T-SQL statement is not guaranteed. You have probably >> > encountered >> > the issue before, but I will provide two examples here. >> > >> > Syntax Example 1: >> > >> > CREATE TABLE #T1 (T1 int, Z1 varchar(20)) >> > >> > INSERT #T1 VALUES (1,'X') >> > INSERT #T1 VALUES (2,'Y') >> > INSERT #T1 VALUES (3,'Z') >> > >> > SELECT T1 AS T2, Z1 AS Z2, IDENTITY(int,1,1) AS I2 INTO #T2 FROM #T1 >> > ORDER >> > BY T1 DESC >> > >> > DROP TABLE #T1 >> > DROP TABLE #T2 >> > >> > Syntax Example 2: >> > >> > CREATE TABLE #T1 (T1 int, Z1 varchar(20)) >> > CREATE TABLE #T2 (T2 int, Z2 varchar(20), I2 int IDENTITY(1,1)) >> > >> > INSERT #T1 VALUES (1,'X') >> > INSERT #T1 VALUES (2,'Y') >> > INSERT #T1 VALUES (3,'Z') >> > >> > INSERT #T2 (T2,Z2) SELECT T1,Z1 FROM #T1 ORDER BY T1 DESC >> > >> > DROP TABLE #T1 >> > DROP TABLE #T2 >> > >> > According to a SQL Server MVP, the SQL Server 2000 execution plans for >> > these >> > two examples are not guaranteed to have the sort occur before the >> > assignment >> > of IDENTITY values. The MVP said the expected behavior was guaranteed >> > in >> > SQL >> > Server 2005. These two examples produce very similar, but not >> > identical, >> > execution plans. Looking at the statements logically I could >> > understand >> > if >> > the syntax of Example 1 resulted in the Compute Scalar (IDENTITY) >> > operation >> > happening before the Sort operation, but that behavior would seem very >> > wrong >> > for Example 2. I believe there's a significant amount of T-SQL code >> > being >> > used (including some of ours) that depends upon the syntax of Example 2 >> > resulting in the Sort operation happening before the Compute Scalar >> > (IDENTITY) operation. >> > >> > Does anybody have any additional thoughts or information on this >> > matter? >> > >> >> >>
- Next message: Doug Kent: "how to 'join' two tables during insert, without using a cursor?"
- Previous message: Adam Machanic: "Re: Problem with sql statement"
- In reply to: Questar: "Re: ORDER BY and IDENTITY"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|