Re: What's the deal with PAGEIOLATCH_SH?
From: James Bradley (jbradley_at_isa-og.com)
Date: 04/25/04
- Next message: Kevin3NF: "Re: Sr. SQL DBA job open in Texas"
- Previous message: John Kane: "Re: Optional CONTAINS"
- In reply to: Greg Linwood: "Re: What's the deal with PAGEIOLATCH_SH?"
- Next in thread: Greg Linwood: "Re: What's the deal with PAGEIOLATCH_SH?"
- Reply: Greg Linwood: "Re: What's the deal with PAGEIOLATCH_SH?"
- Reply: Andrew J. Kelly: "Re: What's the deal with PAGEIOLATCH_SH?"
- Messages sorted by: [ date ] [ thread ]
Date: Sun, 25 Apr 2004 14:47:42 -0600
I took a look at the disk queue length (all of this work is on a
single temp drive, not on the RAID), and it is happily running about
0.7 for most of the query, with occasional spike to about 13 for a few
seconds. About 20 minutes into the query everything goes bad. The
disk I/O *drops* to 1MB/second in and out, and the disk queue length
jumps to 60-70 and stays there! CPU utilization stays low -- less
than 20% of one CPU.
That sounds like a disk I/O problem to me, but basic test of reading
and writing the disk seem to work just fine (good queue lengths and
data rates). One thing I have noticed in PerfMon is when disk writes
go over some threshold, which seems to vary, the disk queue length
skyrockets. Sometimes I see this occur around 10MB/s write, just now
I the queue length going > 100 when the writes go over 4MB/s. Bizarre
behavior :-)
I'll work on getting the real execution plan, instead of the estimated
one out of Query Analyzer, but I have never let the command finish (I
gave up after two hours). When I try a smaller table size (say 500K
rows instead of 4.6M), the query executes very quickly, and I don't
see these problems. This makes me think that maybe it is an in-memory
versus out-of-memory execution issue, except that the CPU utilization
is much too low.
Here is the current statement I'm using:
UPDATE Table1
SET hasChildren=1
FROM
(
SELECT DISTINCT parentID FROM Table2
) T
WHERE Table1.ID = Table2.parentID
The tables are indexed (NC) on ID and parentID. There are no triggers
on Table1 and hasChildren is not reference by any index, clustered or
otherwise.
I don't know how to post the estimated execution plan, but I can
describe it pretty easily.
1. Index scan on Table1 (31% of time - this is the big table)
2. Index scan on Table2 (0% of time - this is a small lookup table)
3. Hash Match/Right Semi Join (67% of time).
#3 above shows an estimated row count of 888K, and a row size of 24.
Total cost of the entire execution is 48.0 for 888K rows.
Thanks for any tips!
Jami
On Sun, 25 Apr 2004 14:12:28 +1000, "Greg Linwood"
<g_linwoodQhotmail.com> wrote:
>A fairly generally benchmark is up 3 to 4 queued requests per spindle are
>ok, but more than that on average represents the disk subsystem lagging. It
>seems that you're system's not experiencing this given this basic
>observation. I always like to take a quick "flashing lights" or audible
>noises from the disk box to make sure that the counters are working properly
>though (:
>
>SQL Server performs much of it's work in cache, so the absence of disk i/o
>doesn't equate to an absence of work - the next thing I'd suggest you check
>is the actual execution plans for those queries. To do this, use the
>profiler if your queries are being submitted by an application or select the
>Query / Show Execution Plan in Query Analyser if you're submitting the
>queries from there. Look for the most costly part of the query & look for
>scans or inefficient joins such as hash joins. It may well be that your
>query is executing completely in memory, but running on an inefficient plan?
>The observations you've made on the pageiolatch_sh waitstate may be
>incidental to the overall query execution plan being inefficient. If you're
>not sure on analysing execution plans, perhaps post the output from the
>query with "set statistics profile on" & we might be able to help there..
>
>Regards,
>Greg Linwood
>SQL Server MVP
>
>"James Bradley" <jbradley@isa-og.com> wrote in message
>news:krbm805uq3f2mkelo8adpn6ti9iuvbh3er@4ax.com...
>> Thanks Greg.
>>
>> DB ID 9 is my target database (not the source), and the disk I/O does
>> seem to behave unusually. When I watch the bytes in/out I see it
>> cranking along at 20-60MB/sec normally for about 20 minutes, then it
>> just hits a brick wall and drops to 1MB/sec. I'll check the queue
>> lengths - if they are small (< 1), then I presume it is SQL. If they
>> are over 1, then it is waiting for the disk system. Correct?
>>
>> Jami
>>
>> On Sun, 25 Apr 2004 11:39:00 +1000, "Greg Linwood"
>> <g_linwoodQhotmail.com> wrote:
>>
>> >I noticed in your original post that you asked how to identify if the
>> >contention was in tempdb & I didn't answer that. I'm not sure where that
>> >resource id came from: 9:1:1072396 but that's likely db#9, filegroup#1,
>> >object#1072396. You cacn translate the db# in master..sysdatabases, then
>the
>> >objectid from whatever database's sysobjects.
>> >
>> >Given you don't have any blocking, I'd suggest that you use Perfmon to
>check
>> >the physical disk object's read / write queue lengths to see if they're
>> >high. The reason I'm saying this is that you've said you're not getting
>the
>> >throughput you're expecting. If SQL Server is queuing more requests than
>the
>> >disk subsystem can keep up with, the Avg. Disk Read / Write Queue Length
>> >counters will identify this. If that's your problem, then you likely have
>> >only one of two problems - check the query plan to make sure that SQL
>Server
>> >isn't creating excessive i/o requests by reading more data than it neads
>(eg
>> >table scans or other inefficient query plans) or you might have to tune
>the
>> >disk sub-system.
>> >
>> >What "high" means for these counters depends on what disk configuration
>you
>> >have, but basically, if more than a few requests per physical disk
>spindle
>> >are queued on average then your disk system isn't keeping up or sql
>server's
>> >making too many requests..
>> >
>> >Regards,
>> >Greg Linwood
>> >SQL Server MVP
>> >
>> >"James Bradley" <jbradley@isa-og.com> wrote in message
>> >news:rl1l8091v3pg99t0iirj3h3stamvhsfroq@4ax.com...
>> >> From what I understand, a latch is a lightweight lock, which seems
>> >> fine to me. I checked the spids from Enterprise Manager and there is
>> >> no blocking/blocked by listed. When I looked, there were only two
>> >> connections to the databases - one by Query Analyzer when it was
>> >> running my command, and the second by Enterprise Manager when I was
>> >> looking at what was happening.
>> >>
>> >> Could the PAGIOLATCH_SH be waiting for data to come in from the disk?
>> >> I was monitoring total disk I/O and it was only 1MB/sec each way, not
>> >> like the 20-60MB/sec that I was expecting.
>> >>
>> >> I really need to figure out what is causing these delays, because I
>> >> can't wait hours for a statement that should take 20 minutes :-(
>> >>
>> >> Thanks,
>> >>
>> >> Jami
>> >>
>> >> On Sat, 24 Apr 2004 09:30:17 +1000, "Greg Linwood"
>> >> <g_linwoodQhotmail.com> wrote:
>> >>
>> >> >Hi James
>> >> >
>> >> >A PAGEIOLATCH_SH simply indicates that the connection your statement
>is
>> >> >running on is waiting on acquiring a shared lock on a storage page.
>There
>> >> >are a variety of reasons why this could be occurring, but the most
>common
>> >is
>> >> >blocking. Have you checked sp_lock to see if the spid is blocked by
>> >another
>> >> >spid? You can also easily see if the spid is actually doing work by
>> >checking
>> >> >the CPU activity at the same time.
>> >> >
>> >> >Regards,
>> >> >Greg Linwood
>> >> >SQL Server MVP
>> >> >
>> >> >"James Bradley" <jbradley@isa-og.com> wrote in message
>> >> >news:ti3j801v1otant21ph1f5l8pq3t4apm4bv@4ax.com...
>> >> >> I'm seeing some strange performance problems with my SQL server (8.0
>> >> >> SP3). I have some large data conversion work to do and sometimes
>the
>> >> >> server runs pretty quickly, and sometimes is simply appears to go
>idle
>> >> >> in the middle of a long statement!
>> >> >>
>> >> >> Whenever I check the current activity on the process, I'm seeing
>that
>> >> >> it is waiting (usually 50-400 mSec) for a PAGEIOLATCH_SH. The
>system
>> >> >> has no load (1-5% CPU, 1MB/sec disk I/O). I am using two databases,
>> >> >> and I am the only user on those two. There are 3-5 other users on a
>> >> >> different database, usually idle, and we are running reporting
>> >> >> services.
>> >> >>
>> >> >> What is the server waiting for? Is there some conflict with tempdb?
>> >> >> Is there some command to figure out what the resource is that it is
>> >> >> wairing for (e.g., 9:1:1072396)?
>> >> >>
>> >> >> Thanks!
>> >> >>
>> >> >> Jami
>> >> >
>> >>
>> >
>>
>
- Next message: Kevin3NF: "Re: Sr. SQL DBA job open in Texas"
- Previous message: John Kane: "Re: Optional CONTAINS"
- In reply to: Greg Linwood: "Re: What's the deal with PAGEIOLATCH_SH?"
- Next in thread: Greg Linwood: "Re: What's the deal with PAGEIOLATCH_SH?"
- Reply: Greg Linwood: "Re: What's the deal with PAGEIOLATCH_SH?"
- Reply: Andrew J. Kelly: "Re: What's the deal with PAGEIOLATCH_SH?"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|