Re: What's the deal with PAGEIOLATCH_SH? - PAGEIOLATCHIssue20040426.zip (0/1)
From: Jami Bradley (jbradley_at_isa-og.com)
Date: 06/15/04
- Next message: Aaron [SQL Server MVP]: "Re: Alter table, insert"
- Previous message: Ayrin: "Re: Alter table, insert"
- Messages sorted by: [ date ] [ thread ]
Date: Tue, 15 Jun 2004 11:56:19 -0600
After a great deal of help with Microsoft/PSS we finally isolated the issue! We worked it down to
a reproducible table - the key driving factor to reproduce the problem is table width.
The table I am using has an average row size of 3211 bytes, so I basically get 2 of these per page.
When the query optimizer generates the execution plan, one of the factors that determines how it
should do the update is the percentage of rows per page that will be updated. In my case, I am
updating about 40% of the rows. For whatever reason (rounding?) the optimizer decides that I am
not updating too many rows/page (under 20%) so it chooses to use random I/O, not sequential I/O for
the update. This causes the update to *not* use the clustered index and fly through the data - it
instead chooses another index and jumps all over the place for 2 hours.
If I take the 3 relevant columns (average row size about 20) and run the update, it takes 37
seconds. When I add a single CHAR colum of 3200 bytes, the update then takes 2 hours.
Microsoft is looking into the issue now that it can be reproduced - hopefully they can fix it.
This is another good reason to keep those tables normalized and narrow!
Thanks to everyone for their help - I learned a great deal about SQL queries :-)
Jami
On Tue, 27 Apr 2004 20:19:55 -0600, James Bradley <jbradley@isa-og.com> wrote:
>Thanks Greg,
>
>Ahh - I see the confusion with the materialized view. I am misusing the term! I have a *real
>table* that I manually created from a view. They are not related. I literally do the following:
>
>truncate Table1
>insert into Table1 select * from View
>
>I am trying some experiment with batching - I built a temporary table of the unique IDs that need
>to be updated, then just update based on this list. That way, I should be able to delete top NNN
>from the temporary table and loop.
>
>I also opened a PSS call today with MS - I'll let you know what comes of that as well.
>
>This one is driving me crazy :-)
>
>Thanks for all your help,
>
>Jami
>
>On Wed, 28 Apr 2004 11:27:55 +1000, "Greg Linwood" <g_linwoodQhotmail.com> wrote:
>
>>Hi James
>>
>>I'm in Melbourne Australia so yes, I might be asleep whilst you're in work
>>hours (:
>>
>>I reviewed the thread this morning & it seems we're working with a bit of a
>>moving target here. I wasn't aware that your "table" was a "join of tables"
>>at first & yesterday you said that you've implemented it as a materialized
>>view so I'm not sure if that's still the case today. I'd suggest that having
>>the materialzed view could explain the different behaviours you're seeing &
>>might not be a good idea overall. Generally, materialized views help read
>>performance & hinder update performance. I suggested testing the joined
>>table as a materialized view in an effort to see if somehow the update was
>>being delayed by whatever read activity is coupled with the statement from
>>whatever's going on in that view.
>>
>>Before going further, I suggest that you provide the actual base table
>>structures so I or others can make better suggestions, based on the actual
>>underlying data structures.
>>
>>I have a feeling that the best advice I can offer with what I know of your
>>problem right now, is that you should try out batching the updates. You've
>>said that you don't need to do the update atomically & that batching's ok,
>>so the advantage of this will be that the SQL engine won't need to maintain
>>so many locks during the update within a single transaction. I don't have
>>all of the actual columns in your table/s but I suggest that you look for a
>>column that has something like a date value which you can use to segment the
>>update, then write a loop to iterate through however segments you define,
>>performing the update in batches. Given the time delay between our posts, I
>>suggest you at least give this a go. It's a fairly commonly used technique &
>>might provide a silver bullet to your problem.
>>
>>Regards,
>>Greg Linwood
>>SQL Server MVP
>>
>>"James Bradley" <jbradley@isa-og.com> wrote in message
>>news:r48t801jr1vkfnskq5jvvmrt9mmvfmpl8s@4ax.com...
>>> Hi Greg
>>>
>>> Ok, I've been working with it today, and nothing seems to be helping :-(
>>>
>>> I recreated the large table with a clustered index on the unique field.
>>This is *not* the same
>>> field that we need to use for the update. I verified that Table2 is clean
>>and well indexed (54K
>>> records).
>>>
>>> I took the query and the execution plan is identical to the plan before
>>the CI on Table1, except
>>> there is now a clustered index update and the end of the execution.
>>>
>>> I figured out how to get the text version:
>>>
>>> |--Clustered Index Update(OBJECT:([DB1].[dbo].[Table1].[Table1PkC]),
>>> SET:([Table1].[hasChildren]=1))
>>> |--Top(ROWCOUNT est 0)
>>> |--Hash Match(Right Semi Join,
>>HASH:([Expr1007])=([Table1].[ID]),
>>> RESIDUAL:([Table1].[ID]=[Expr1007]))
>>> |--Compute
>>Scalar(DEFINE:([Expr1007]=Convert([Table2].[parentID])))
>>> | |--Index
>>Scan(OBJECT:([DB1].[dbo].[Table2].[Table2Idx2]))
>>> |--Index Scan(OBJECT:([DB1].[dbo].[Table1].[Table1Idx4]),
>>ORDERED FORWARD)
>>>
>>> Now when I try to run this query, it runs *very* slowly. I'm still seeing
>>very little disk I/O
>>> (1MB/sec at most, often < 100KB/sec), and CPU is around 1.5%.
>>>
>>> The big differences today are:
>>> 1. I am seeing a lot of checkpoints (I saw very few the other days).
>>Perfmon is showing an average
>>> of 100 checkpoints/sec with peaks to 700. This is *still* without any
>>significan disk i/o(!)
>>>
>>> 2. If I let the command run in QA for 5 seconds, then attempt to cancel,
>>QA is 'attempting to
>>> cancel the query' for a few minutes. It finally times out and the
>>connection is broken. This is
>>> new today, usually it doesn't take long to cancel when I stop it so early.
>>>
>>> 3. I tried running ITW on the specific update statement and it had no
>>recommendations. It really
>>> is a simple update and I can't see how much it can be tweaked.
>>>
>>> I think the root of all of this is identifying why the server is waiting.
>>It doesn't appear to be
>>> disk I/O, or CPU. It seems to be the PAGEIOLATCH_SH, but the wait doesn't
>>consume any noticeable
>>> CPU time. Am I waiting for someone to sleep? :-)
>>>
>>> Thanks!
>>>
>>> Jami
>>>
>>>
>>> On Tue, 27 Apr 2004 10:23:39 +1000, "Greg Linwood" <g_linwoodQhotmail.com>
>>wrote:
>>>
>>> >Hi James.
>>> >
>>> >ok - the PK issue is likely at the heart of your performance problem. I
>>> >suggest that you materialise that table, then use the ITW to collect
>>index
>>> >recommendations, test them & see what performance you can achieve.
>>Without a
>>> >PK, internal iterative steps in the query plan probably require
>>individual
>>> >scans rather than seeks which can be very expensive.
>>> >
>>> >The fact that the table is "generated by joining other tables" is
>>probably
>>> >contributing to the problem as well. What does this mean? Are you
>>referring
>>> >to a view? If the joined tables are a view, you might consider even
>>> >materialising the view, which could help out as well.
>>> >
>>> >It's vital that you understand the implications of NOT having a clustered
>>> >index on a table where other indexes (non-clustered) exist. It's not just
>>> >about having ranges - the physical implementation of NC indexes is
>>actually
>>> >different where they exist on a table that does / doesn't have a C index.
>>> >This can have a big impact on performance alone, especially if the
>>udpates
>>> >are causing page splits that cause physical NC leaf pointers to be
>>updated
>>> >where there's no C index.. You should read up on that issue a bit..
>>> >
>>> >Regards,
>>> >Greg Linwood
>>> >SQL Server MVP
>>> >
>>> >"James Bradley" <jbradley@isa-og.com> wrote in message
>>> >news:fu8r80p436u13enj91lvgtehtsvs09m2fl@4ax.com...
>>> >> Hi Greg
>>> >>
>>> >> That table has no primary key and no clustered index. This table is
>>> >> currently being generated by joining other tables, but it will soon be
>>> >> detached. I do have one field that is a good candidate for a PK -
>>> >> shown as Idx2. I know it is a good practice to have a PK, but could
>>> >> that affect the performance?
>>> >>
>>> >> This table used to have very poorly chosen clustered index, which I
>>> >> have since removed. If I were to cluster it, it would probably be by
>>> >> Idx2 as well, but I don't see any real benefit for that. I don't have
>>> >> any ranges on that ID anyway.
>>> >>
>>> >> Thanks!
>>> >>
>>> >> Jami
>>> >>
>>> >> On Tue, 27 Apr 2004 08:54:15 +1000, "Greg Linwood"
>>> >> <g_linwoodQhotmail.com> wrote:
>>> >>
>>> >> >Hi James.
>>> >> >
>>> >> >I can't see any Primary Keys on those tables - do they not have PKs or
>>> >have
>>> >> >you just missed this in the script? This is very important. It's also
>>> >> >important to know if any PK / indexes are clustered indexes or
>>> >> >non-clustered.
>>> >> >
>>> >> >Your isolation of 500k rows and execution of the update taking 1
>>second
>>> >is
>>> >> >also important - it suggests to me that there's something seriously
>>wrong
>>> >> >with the original query plan. SQL Server is certainly capable of
>>handling
>>> >> >large updates like this, but I'd suggest something fundamental (such
>>as
>>> >PK
>>> >> >being missing) is messing up the main query.
>>> >> >
>>> >> >Use of a clustered index might also help. If the table has a clustered
>>> >index
>>> >> >on the correct column, the query optimizer may be able to isolate the
>>> >1.5m
>>> >> >rows updated by this query using a range scan and treat them sort of
>>as
>>> >if
>>> >> >they're in their own table. Getting the choice of the clustered PK can
>>be
>>> >> >made manually or with the Index Tuning Wizard, which is a smart tool
>>that
>>> >> >understands the SQL Query Optimizer's internal workings.
>>> >> >
>>> >> >To use the ITW, you put the update statement in the Query Analyser,
>>> >> >highlight it, then select the Query / Index Tuning Wizard and follow
>>the
>>> >> >prompts. Try it with / without the "Keep all existing indexes" option
>>> >> >ticked. This tool is very powerful for making index suggestions, but
>>also
>>> >be
>>> >> >aware that when you use it like this, it's only tuning for the
>>individual
>>> >> >query you highlighted in QA & might make suggestions that are good for
>>> >that
>>> >> >query but not for the overall work done by the db. It can also be used
>>> >with
>>> >> >workloads captured by SQL Profiler for better overall index
>>suggestions,
>>> >but
>>> >> >you'll probably get joy for now from just tuning for this query.
>>> >> >
>>> >> >There is a substantial difference between having a clustered index on
>>a
>>> >> >table or not - it affects how ALL indexes on the table behave during
>>> >major
>>> >> >update statements like yours, so I suggest you spend some time reading
>>up
>>> >on
>>> >> >clustered indexes & their relationship to non-clustered indexes as
>>this
>>> >will
>>> >> >help you to understand what's going on under the hood. There's lots of
>>> >> >information in SQL Books Online & in SQL Server books on this
>>important
>>> >> >topic.
>>> >> >
>>> >> >Not having Primary Key is such a major mistake that I'm assuming
>>you've
>>> >just
>>> >> >missed this from your script - could you clarify that please?
>>> >> >
>>> >> >It would also be handy if you posted the actual query execution plan
>>> >rather
>>> >> >than the general statistics. Those stats aren't much use. The Query
>>Plan
>>> >> >gives us absolute information about precisely HOW the query is
>>physically
>>> >> >running & will likely lead to solving your problem. Do post this
>>please -
>>> >> >either in graphical or text format won't matter.
>>> >> >
>>> >> >Regards,
>>> >> >Greg Linwood
>>> >> >SQL Server MVP
>>> >> >
>>> >> >"James Bradley" <jbradley@isa-og.com> wrote in message
>>> >> >news:gkaq80lj2914utqu0s9ts7nf220jau4nvf@4ax.com...
>>> >> >> Hi Greg
>>> >> >>
>>> >> >> It took me a while, but I put together the create table/index calls
>>> >> >> for these tables. I had to rename the fields because of
>>> >> >> confidentiality issues.
>>> >> >>
>>> >> >> I've also attached the output from
>>> >> >> dbcc show_statistics for the relevant indices.
>>> >> >>
>>> >> >> I ran the full update statement, it took 2 hours, 24 minutes. I've
>>> >> >> attached the text output from the statistics for that run.
>>> >> >>
>>> >> >> I tried all 3 queries (the current one and the two you suggested
>>> >> >> below), and the query execution plans are identical!
>>> >> >>
>>> >> >> I don't need to do this atomically - I have exclusive use of the
>>> >> >> database when I need to run this. I could easily break it into
>>> >> >> several batches.
>>> >> >>
>>> >> >> One thing that is still bothering me -- let's say SQL is choosing a
>>> >> >> bad execution plan (hash join, etc.). Wouldn't I see the server
>>busy
>>> >> >> doing this, even if it is inefficient? Right now, the server is not
>>> >> >> limited by either CPU or Disk I/O, and I would assume one of those
>>> >> >> must be limiting, unless locks/latches are blocking it.
>>> >> >>
>>> >> >> Thanks for all the help!
>>> >> >>
>>> >> >> On Mon, 26 Apr 2004 11:07:06 +1000, "Greg Linwood"
>>> >> >> <g_linwoodQhotmail.com> wrote:
>>> >> >>
>>> >> >> >Hi James
>>> >> >> >
>>> >> >> >Can you please send the full DDL for these two tables, including
>>> >create
>>> >> >> >table, indexes, keys, relations & DBCC SHOW_STATISTICS for these
>>> >tables /
>>> >> >> >indexes? I'm sceptible about it's choice of a hash join to perform
>>> >this
>>> >> >> >work.
>>> >> >> >
>>> >> >> >Another question for you - does this work need to be performed
>>> >atomically
>>> >> >as
>>> >> >> >a single unit of work or is it possible to break it into two or
>>three
>>> >> >> >batches? This may increase performance tremendously.
>>> >> >> >
>>> >> >> >I'd suggest that the query written differently might yield a better
>>> >plan,
>>> >> >> >eg:
>>> >> >> >
>>> >> >> >UPDATE Table1
>>> >> >> >SET hasChildren=1
>>> >> >> >WHERE ID in (SELECT parentID FROM Table2)
>>> >> >> >
>>> >> >> >or
>>> >> >> >
>>> >> >> >UPDATE Table1
>>> >> >> >SET hasChildren=1
>>> >> >> >WHERE exists (SELECT * FROM Table2 where parentid = Table1.id)
>>> >> >> >
>>> >> >> >I'd suggest that batching updates might help even further if either
>>of
>>> >> >those
>>> >> >> >re-written queries did yield better plans.
>>> >> >> >
>>> >> >> >Regards,
>>> >> >> >Greg Linwood
>>> >> >> >SQL Server MVP
>>> >> >>
>>> >> >
>>> >>
>>> >
>>>
>>
- Next message: Aaron [SQL Server MVP]: "Re: Alter table, insert"
- Previous message: Ayrin: "Re: Alter table, insert"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|