Re: Extremely high % Disk Time .. How is this possible??
From: Geoff N. Hiten (SRDBA_at_Careerbuilder.com)
Date: 02/10/05
- Next message: Andrew J. Kelly: "Re: Extremely high % Disk Time .. How is this possible??"
- Previous message: Tim: "Re: Extremely high % Disk Time .. How is this possible??"
- In reply to: Tim: "Re: Extremely high % Disk Time .. How is this possible??"
- Next in thread: Andrew J. Kelly: "Re: Extremely high % Disk Time .. How is this possible??"
- Reply: Andrew J. Kelly: "Re: Extremely high % Disk Time .. How is this possible??"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 10 Feb 2005 09:32:30 -0500
As Andrew pointed out, it is the physical drives that appear to be limited.
Slicing them into logical drives may look better but it does not increase IO
capability. Separating data and logs onto different physical disks with the
correct RAID layouts will increase performance. Andrew also suggested
profiling for performance to find if there are just a few really ugly
queries that are killing your system. I agree completely.
As for the original question on Percent IO Time, that counter has been
largely irrelevent since SCSI Overlapped IO was introduced way back when.
Caching controllers really killed it. I prefer to look at disk queue length
and read bytes/write bytes per second as true measures of disk performance.
I believe that you will be running with write cache off in a non-SAN cluster
to prevent data loss during a cluster failover. This is a big performance
killer and one major reason I do not recommend a cluster without a SAN. All
indicators are that you need to modify your IO system. You may find other
performance bottlenecks as you remove the IO limit, but for right now, that
is the worst offender.
-- Geoff N. Hiten Microsoft SQL Server MVP Senior Database Administrator Careerbuilder.com I support the Professional Association for SQL Server www.sqlpass.org "Tim" <tim@pds.notanaddress.ca> wrote in message news:%23kwePe3DFHA.2608@TK2MSFTNGP10.phx.gbl... > I do have 5 logical drives, 1 for logs and 1 for dbases for each server, and > a quorum drive. Been evaluating the benefits of adding a dedicated array > for tempDB and indexes. Roughly 30-40 % of ransactions occur in tempDB > (still have a lot of tempDbases as it seems to be the most efficient way to > maintain/handle the passing of tables between stored procedures. In many > cases, there is no easy way to eliminate the passing of the tables as a > couple of stored procedures (the big offenders) require a large amount of > processing. > > > "Andrew J. Kelly" <sqlmvpnooospam@shadhawk.com> wrote in message > news:e65DCOtDFHA.2632@TK2MSFTNGP12.phx.gbl... > > It sure sounds like your storage array is not able to handle the load. > > Having a Raid 1+0 is great but only having 6 drives is on the low end. > > Since you are not on a SAN you only have the cache on the controllers > > themselves to help deal with all the writes. Make sure the cache on the > > controllers is set to 100% write back and not read. It also sounds like > you > > only have a single drive array and everything is on it (logs , data and > > tempdb). That also is not good. You should have at least a Raid1 or > > Raid1+0 for just the log files. The spikes can be caused by lots of > things > > but you are probably seeing some of this due to checkpoints. You can use > > the Checkpoint pages per second perfmon counter to correlate this to the > > disk queue spikes. You might also try using connection pooling instead of > > that many individual connections. You will need multiple pools to handle > > the different dbs but it should cut down the actual number of connections > > quite a bit.Have you ran any traces to see what kind of queries are being > > run and who the worst offenders are? You may have a few really poorly > tuned > > queries that are hurting everything else. > > > > > > -- > > Andrew J. Kelly SQL MVP > > > > > > "Tim" <tim@pds.notanaddress.ca> wrote in message > > news:%23v7AhlsDFHA.2804@TK2MSFTNGP14.phx.gbl... > > > Hi Group > > > > > > SYSTEM: > > > > > > Clustered SQL Server 2000 on Windows 2000 (2 servers), with RAID 1+0, 6 > > > drives (Total of 24 20GB hard drives with mirroring/striping). 4 GB RAM > / > > > Server. All fiber back end, with Fibre Channel Storage Hub 7 ( @ 100 > > > MB/sec ). Our SQL servers have fiber cards @ 400 MB/sec connected to > hub. > > > About 50 DBases, ranging in size from about 50-100 MB up to 6 GB. They > > > are > > > shared by 15 clients (they are fairly balanced over a 24 hour period, > when > > > 1 > > > client is busy, the other isn't). The total size of dbases is ~ 65 GB, > > > Log > > > File ~ 35 GB. 100-250 connected users (to our application), with > 300-500 > > > SQL connections. Dual XEON 2.8Ghz Processors / Server. > > > > > > PROBLEM: > > > > > > We are experiencing disk problems but are having a very difficult time > > > diagnosing 'exactly' what is the cause and solution. On the data > drives, > > > thru PerfMon, the % Disk Times are averaging between 250 and 1500, with > > > peaks of over 30,000 (How do I get 30,000 % disk use??). Average disk > > > queues are setting at about 100-250 during the problem times, but > > > occasionally see it spike well over 1000. The transactions/sec hover > > > between 50-100. Processor hovers between 10 and 40%, spiking to 70-80%. > > > The disk bytes/sec for the data drives average betwwen 1 and 6 > MBytes/sec. > > > > > > I know there are a lot of variables that can be at play here, but where > > > should I start? Is the system overall too small? Over worked?? We did > > > have > > > memory issues but increased the RAM and there no longer appears to be > > > buffer > > > cache problems and we have not yet exceeded the available RAM. We have > > > removed most of our temp tables (in favor of table variables) and > removed > > > a > > > lot of the cursors that were in the system. We moved our reporting > server > > > to it's own box (completely seperate from prod servers). We reduced the > > > log > > > file backup intervals and staggered our database backup jobs. We > > > converted > > > portions of our reporting to Analysis services. All with short term > > > marginal improvements. We have to speed our system up before we start > > > losing clients. Does it sound/appear as though the next option would be > > > to > > > scale out? scale up? Goto a SAN environment? And info and advise > would > > > be > > > greatly appreciated! > > > > > > If you got this far, thanks for the time in reading this long post!!! > And > > > am > > > anxious to read any responses. Thank you. > > > > > > > > > > > >
- Next message: Andrew J. Kelly: "Re: Extremely high % Disk Time .. How is this possible??"
- Previous message: Tim: "Re: Extremely high % Disk Time .. How is this possible??"
- In reply to: Tim: "Re: Extremely high % Disk Time .. How is this possible??"
- Next in thread: Andrew J. Kelly: "Re: Extremely high % Disk Time .. How is this possible??"
- Reply: Andrew J. Kelly: "Re: Extremely high % Disk Time .. How is this possible??"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|