Re: XML performance extremely slow for no obvious reason
- From: Dr Pizza <DrPizza@xxxxxxxxxxxxxxxxx>
- Date: Sun, 04 Mar 2007 14:29:47 +0000
On Sat, 03 Mar 2007 22:57:07 -0500, Steve Kass <skass@xxxxxxxx> wrote:
I think can shed a little light on this, at least. Some of this has
nothing to do with XML, but with the inability of the optimizer
to deal with parameters like your @col1 that cause there to
be either 1 result or 10000, depending on whether @col1 is
null or not. But XML is also an issue, because there, too,
the optimizer can't deal, but you can do less about it, I think.
Further playing around suggests the XML makes it both better and worse,
depending.
I bumped up the data table to 100,000 rows, just to make issues more
apparent.
The reason behind the XML is to work around SQL Server's lack of arrays;
this is a stored procedure called through ADO.NET and the original
developer felt XML was the easiest way to pass multiple values in to the
SP. Seems fair enough, I think.
The example code generates the XML on the fly, of course. One thing I
tried out of curiosity was to replace the XML @idlist with a simple
subselect:
where (id = @id or @id is null)
and
(
@idlist is null
or
id in
(
select top (50) id from data
)
)
with all the other conditions removed. With the conditions removed, the
XML version is now considerably *faster* than with subselect above. Not
what I expected at all! The XML version takes under two seconds; the
subselect (which is equivalent, but without the conversion from XML to
tables) takes 33 seconds. A check of the execution plan shows that yet
again it's multiplying the table size and then having to read the best
part of 100 million rows. Now as far as I can tell, that should never be
the case. Reading 50 rows from a table should be quicker than reading 50
rows from XML, and given that that's the only difference between the two,
there's no good reason for the execution plans to differ (IMO).
The next thing was to try using a temp table:
select ParamValues.id.value('.','uniqueidentifier') as id
into #ids
from @idlist.nodes('/idlist/id') as ParamValues(id)
....
where (id = @id or @id is null)
and
(
@idlist is null
or
id in
(
select id from #ids
)
)
This seems to be far more consistent. The version with no other
conditions takes a couple of seconds, and even with all the conditions
added, it doesn't take more than about 12 seconds. And importantly,
there's no explosion in the number of rows scanned; it consistently scans
the right number of rows, no matter what.
There may be much simpler ways to handle this, but I hoped
my rambling was interesting, if not useful...
Thanks. The behaviour still seems downright weird (and totally
undesirable) but the workaround is satisfactory.
.
- Follow-Ups:
- Re: XML performance extremely slow for no obvious reason
- From: Dr Pizza
- Re: XML performance extremely slow for no obvious reason
- References:
- XML performance extremely slow for no obvious reason
- From: Dr Pizza
- Re: XML performance extremely slow for no obvious reason
- From: Steve Kass
- XML performance extremely slow for no obvious reason
- Prev by Date: Re: XML performance extremely slow for no obvious reason
- Next by Date: Re: XML performance extremely slow for no obvious reason
- Previous by thread: Re: XML performance extremely slow for no obvious reason
- Next by thread: Re: XML performance extremely slow for no obvious reason
- Index(es):
Relevant Pages
|