Re: Extremely high % Disk Time .. How is this possible??
From: Andrew J. Kelly (sqlmvpnooospam_at_shadhawk.com)
Date: 02/10/05
- Next message: donotspam: "4 Queries about DB clustering"
- Previous message: Geoff N. Hiten: "Re: Extremely high % Disk Time .. How is this possible??"
- In reply to: Geoff N. Hiten: "Re: Extremely high % Disk Time .. How is this possible??"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 10 Feb 2005 10:18:06 -0500
Completely agree with Geoff.
-- Andrew J. Kelly SQL MVP "Geoff N. Hiten" <SRDBA@Careerbuilder.com> wrote in message news:%237$Gb13DFHA.2824@tk2msftngp13.phx.gbl... > As Andrew pointed out, it is the physical drives that appear to be > limited. > Slicing them into logical drives may look better but it does not increase > IO > capability. Separating data and logs onto different physical disks with > the > correct RAID layouts will increase performance. Andrew also suggested > profiling for performance to find if there are just a few really ugly > queries that are killing your system. I agree completely. > > As for the original question on Percent IO Time, that counter has been > largely irrelevent since SCSI Overlapped IO was introduced way back when. > Caching controllers really killed it. I prefer to look at disk queue > length > and read bytes/write bytes per second as true measures of disk > performance. > > I believe that you will be running with write cache off in a non-SAN > cluster > to prevent data loss during a cluster failover. This is a big performance > killer and one major reason I do not recommend a cluster without a SAN. > All > indicators are that you need to modify your IO system. You may find other > performance bottlenecks as you remove the IO limit, but for right now, > that > is the worst offender. > > -- > Geoff N. Hiten > Microsoft SQL Server MVP > Senior Database Administrator > Careerbuilder.com > > I support the Professional Association for SQL Server > www.sqlpass.org > > "Tim" <tim@pds.notanaddress.ca> wrote in message > news:%23kwePe3DFHA.2608@TK2MSFTNGP10.phx.gbl... >> I do have 5 logical drives, 1 for logs and 1 for dbases for each server, > and >> a quorum drive. Been evaluating the benefits of adding a dedicated array >> for tempDB and indexes. Roughly 30-40 % of ransactions occur in tempDB >> (still have a lot of tempDbases as it seems to be the most efficient way > to >> maintain/handle the passing of tables between stored procedures. In many >> cases, there is no easy way to eliminate the passing of the tables as a >> couple of stored procedures (the big offenders) require a large amount of >> processing. >> >> >> "Andrew J. Kelly" <sqlmvpnooospam@shadhawk.com> wrote in message >> news:e65DCOtDFHA.2632@TK2MSFTNGP12.phx.gbl... >> > It sure sounds like your storage array is not able to handle the load. >> > Having a Raid 1+0 is great but only having 6 drives is on the low end. >> > Since you are not on a SAN you only have the cache on the controllers >> > themselves to help deal with all the writes. Make sure the cache on >> > the >> > controllers is set to 100% write back and not read. It also sounds >> > like >> you >> > only have a single drive array and everything is on it (logs , data and >> > tempdb). That also is not good. You should have at least a Raid1 or >> > Raid1+0 for just the log files. The spikes can be caused by lots of >> things >> > but you are probably seeing some of this due to checkpoints. You can > use >> > the Checkpoint pages per second perfmon counter to correlate this to >> > the >> > disk queue spikes. You might also try using connection pooling instead > of >> > that many individual connections. You will need multiple pools to > handle >> > the different dbs but it should cut down the actual number of > connections >> > quite a bit.Have you ran any traces to see what kind of queries are > being >> > run and who the worst offenders are? You may have a few really poorly >> tuned >> > queries that are hurting everything else. >> > >> > >> > -- >> > Andrew J. Kelly SQL MVP >> > >> > >> > "Tim" <tim@pds.notanaddress.ca> wrote in message >> > news:%23v7AhlsDFHA.2804@TK2MSFTNGP14.phx.gbl... >> > > Hi Group >> > > >> > > SYSTEM: >> > > >> > > Clustered SQL Server 2000 on Windows 2000 (2 servers), with RAID 1+0, > 6 >> > > drives (Total of 24 20GB hard drives with mirroring/striping). 4 GB > RAM >> / >> > > Server. All fiber back end, with Fibre Channel Storage Hub 7 ( @ 100 >> > > MB/sec ). Our SQL servers have fiber cards @ 400 MB/sec connected to >> hub. >> > > About 50 DBases, ranging in size from about 50-100 MB up to 6 GB. > They >> > > are >> > > shared by 15 clients (they are fairly balanced over a 24 hour period, >> when >> > > 1 >> > > client is busy, the other isn't). The total size of dbases is ~ 65 > GB, >> > > Log >> > > File ~ 35 GB. 100-250 connected users (to our application), with >> 300-500 >> > > SQL connections. Dual XEON 2.8Ghz Processors / Server. >> > > >> > > PROBLEM: >> > > >> > > We are experiencing disk problems but are having a very difficult >> > > time >> > > diagnosing 'exactly' what is the cause and solution. On the data >> drives, >> > > thru PerfMon, the % Disk Times are averaging between 250 and 1500, > with >> > > peaks of over 30,000 (How do I get 30,000 % disk use??). Average >> > > disk >> > > queues are setting at about 100-250 during the problem times, but >> > > occasionally see it spike well over 1000. The transactions/sec hover >> > > between 50-100. Processor hovers between 10 and 40%, spiking to > 70-80%. >> > > The disk bytes/sec for the data drives average betwwen 1 and 6 >> MBytes/sec. >> > > >> > > I know there are a lot of variables that can be at play here, but > where >> > > should I start? Is the system overall too small? Over worked?? We > did >> > > have >> > > memory issues but increased the RAM and there no longer appears to be >> > > buffer >> > > cache problems and we have not yet exceeded the available RAM. We > have >> > > removed most of our temp tables (in favor of table variables) and >> removed >> > > a >> > > lot of the cursors that were in the system. We moved our reporting >> server >> > > to it's own box (completely seperate from prod servers). We reduced > the >> > > log >> > > file backup intervals and staggered our database backup jobs. We >> > > converted >> > > portions of our reporting to Analysis services. All with short term >> > > marginal improvements. We have to speed our system up before we >> > > start >> > > losing clients. Does it sound/appear as though the next option would > be >> > > to >> > > scale out? scale up? Goto a SAN environment? And info and advise >> would >> > > be >> > > greatly appreciated! >> > > >> > > If you got this far, thanks for the time in reading this long post!!! >> And >> > > am >> > > anxious to read any responses. Thank you. >> > > >> > > >> > >> > >> >> > >
- Next message: donotspam: "4 Queries about DB clustering"
- Previous message: Geoff N. Hiten: "Re: Extremely high % Disk Time .. How is this possible??"
- In reply to: Geoff N. Hiten: "Re: Extremely high % Disk Time .. How is this possible??"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|