Re: IOPS and megacycles calculations



Often (especially in an environment where users connect briefly, pull
mail, then disconnect seemingly at random), you won't find that one- or
two-hour peak period. An alternative is to gather data over time. When you
plot the data, you are looking for a normal distribution, or bell curve. If
you have a normal distribution, great; if not, you may have to correct for
skew to make it a normal distribution.

Once you have a bell curve, the average (AVERAGE function in Excel) is
the value that is valid for 50% of the data points. If you add 1 standard
deviation (STDEV function in Excel) to the average, the sum will be a value
that is valid for about 84% of the data points. As you add more and more
standard deviations, the sum becomes a value that is valid for more and more
data points. In this way you can come up with a value that is valid
to a degree of accuracy that will meet your SLA.
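
As a rough illustration of the "average plus k standard deviations" idea, here
is a minimal Python sketch. It assumes the samples really are close to normally
distributed; the function names and the sample values are invented for the
example and are not taken from any tool mentioned in this thread. For a normal
distribution, the one-sided coverage at 1, 2 and 3 standard deviations above
the mean works out to roughly 84.1%, 97.7% and 99.87%.

```python
from statistics import NormalDist, mean, stdev

def sizing_value(samples, k):
    """Return the sample mean plus k sample standard deviations."""
    return mean(samples) + k * stdev(samples)

def coverage(k):
    """Fraction of a normal distribution at or below mean + k * stdev."""
    return NormalDist().cdf(k)

# Hypothetical IOPS samples, invented for illustration only.
iops_samples = [410, 455, 470, 500, 505, 520, 540, 560, 590, 650]

for k in range(4):
    value = sizing_value(iops_samples, k)
    print(f"k={k}: size for {value:.1f} IOPS, covers {coverage(k):.4%}")
```

The same numbers come out of AVERAGE and STDEV in Excel; the script just makes
it easy to see how much coverage each extra standard deviation buys.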

It's important to note the response time along with the number of I/Os per
spindle when you collect the data. If your response time is less than 20
ms, fine. If the average response time is more than 20ms, then the system
from which you are collecting data has a bottleneck and you'll need to add
spindles to correct the problem. Every storage vendor has a plot for drives
of different types that shows response time on one axis and IOPS on the
other. If I take a 15K FC SCSI spindle for example, I get 130 IOPS at
20ms and over 200 IOPS at 60ms. When the spindles get overloaded to a point
(about 400 IOPS for our example 15K FC spindle) the IOPS/spindle doesn't go
up any more and the response time gets longer and longer as the load
increases. If you are in this range, the data you collect will be invalid.
That's why it's important to look at response time as well; you want to
ensure that the data you are collecting is valid.
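
One way to apply this check to collected data is sketched below in Python. It
is only an illustration: the (IOPS, response time) tuple layout and the sample
values are hypothetical, and treating a sample at exactly 20 ms as usable is an
assumption, since the text only distinguishes "less than" and "more than".

```python
# Threshold taken from the post; the sample layout below is hypothetical.
LATENCY_LIMIT_S = 0.020  # 20 ms

def split_by_latency(samples, limit_s=LATENCY_LIMIT_S):
    """Split (iops, avg_response_seconds) samples into usable and suspect.

    Samples at or under the limit are treated as usable; anything slower
    suggests the disks were already a bottleneck when it was recorded.
    """
    usable = [s for s in samples if s[1] <= limit_s]
    suspect = [s for s in samples if s[1] > limit_s]
    return usable, suspect

# Invented sample data: (IOPS per spindle, average response time in seconds).
samples = [(95, 0.008), (120, 0.015), (210, 0.060), (130, 0.019)]
usable, suspect = split_by_latency(samples)
print(len(usable), "usable;", len(suspect), "suspect")
```

If a large share of the samples land in the suspect list, the source system is
probably spindle-bound and the measured IOPS understate the real demand.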

Disk latency is the primary cause of RPC latency; however, it is not the only
possible cause. You can look at RPC latency as well if you are worried
about network or CPU sources of latency. With the exception of the network
source of RPC latency caused by the early version of MS05-019, I don't
believe I've ever seen a source of RPC latency other than disk. That's not to
say that it can't happen, just that it's rare.

I personally prefer the physical disk counters. In the old NT days, you
actually had to do diskperf -y to turn on the physical disk counters, and
many folks who have been around a while instinctively refer to logical disk
counters because they're always there. In Windows Server 2003, the physical
counters are there by default as well.

You specifically have to know the read/write ratio to apply the write
penalty when figuring total IOPS. If you don't measure, how would you know
what it is? It's implied, because you need it in order to do this:

"
· Depending on the hardware RAID configuration you use, plan for I/O
penalties. In general, for each write request, hardware RAID generates the
following I/O:

· RAID-0 = 1 write

· RAID-1 or RAID-10 = 2 writes

· RAID-5 = 4 writes

Use the following formula to calculate your I/O penalty:

(IOPS/mailbox × READ RATIO) + ((IOPS/mailbox × WRITE RATIO) × RAID penalty)

For example, if you have 1,500 IOPS per mailbox (as calculated using
procedures earlier in this guide), your read ratio is 66%/33% (two requests
out of every three are read requests and the remaining one request is a
write request), and you are using a RAID-1 or RAID-10 array, your actual
hardware IOPS is:

(1,500 × 2/3) + ((1,500 × 1/3) × 2) = 2,000

Applying the same scenario on a RAID-5 array, your actual hardware IOPS is:

(1,500 × 2/3) + ((1,500 × 1/3) × 4) = 3,000

If all of your drives are 10,000 RPM, you will need at least 30 drives to
obtain your required IOPS in a RAID-5 configuration. If you implement RAID-1
or RAID-10 instead, then you would need at least 20 drives (you can't have a
RAID-1 or RAID-10 solution with an odd number of disks).

"

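The arithmetic in the quoted passage can be sketched in Python as below. This
is an illustration under stated assumptions, not code from the guide: the
function names are made up, and the figure of 100 IOPS per 10,000 RPM drive is
inferred from the quote's own result (3,000 IOPS needing 30 drives). Fractions
are used so that the 2/3 and 1/3 ratios stay exact.

```python
from fractions import Fraction
import math

# Writes generated per write request, as listed in the quoted text.
RAID_WRITE_PENALTY = {"RAID-0": 1, "RAID-1": 2, "RAID-10": 2, "RAID-5": 4}

def hardware_iops(host_iops, read_ratio, raid_level):
    """Apply the RAID write penalty to the write share of the load."""
    read_ratio = Fraction(read_ratio)
    write_ratio = 1 - read_ratio
    penalty = RAID_WRITE_PENALTY[raid_level]
    return host_iops * read_ratio + host_iops * write_ratio * penalty

def drives_needed(hw_iops, iops_per_drive, raid_level):
    """Round up to whole drives; mirrored sets need an even drive count."""
    count = math.ceil(Fraction(hw_iops) / iops_per_drive)
    if raid_level in ("RAID-1", "RAID-10") and count % 2:
        count += 1
    return count

# Assumed per-drive figure, inferred from the quote (3,000 IOPS -> 30 drives).
IOPS_PER_10K_DRIVE = 100

for level in ("RAID-10", "RAID-5"):
    need = hardware_iops(1500, Fraction(2, 3), level)
    print(level, int(need), drives_needed(need, IOPS_PER_10K_DRIVE, level))
```

With the measured read/write ratio plugged in instead of the 2/3 and 1/3
example values, the same two functions give the spindle count for your own
environment.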

John




"TonyP" <TonyP@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:AE121735-B102-4F0E-8F31-6F3CCFD98964@xxxxxxxxxxxxxxxx
Hi John

Sorry John, I am going to start again with you. I really got lost with the
whole mean, deviation and bell curve stuff. I will come back to this later!

I really want to follow your approach, so I need you to bear with me, please.

It has been a while since I did any statistical work (at university in the
UK a large part of my degree was in Statistical Economics, but I did not
really study too hard!!).

Also, this is my first crack at really being a solution architect, finally
coming off the senior Windows network support team, so I really want to
make the grade and do it correctly.

I also want to finish off the approach in "Optimizing Storage for Microsoft
Exchange" that Microsoft mentions.

I have four weeks of data (it's not perfect), and while I wait to collect
data using your method I might as well finish off their way and compare the
difference. I really enjoy this job and I have lots of time to finish off
the method that was mentioned in "Optimizing Storage for Microsoft Exchange".

I can keep you updated on the results of the two methods if you like?

Anyway back to my FIRST concern, in the "Optimizing Storage for Microsoft
Exchange" Paper they talk about looking at the following counters:

1) MSExchangeIS -> RPC Operations/sec
2) LogicalDisk -> Disk Transfers/sec -> Instance = drive letter (the drive
that houses the Exchange database)
3) Processor -> % Processor Time -> Instance = _Total

They say to look for a 1-2 hour period in which all values are the highest,
and use this data to populate the "server_sizing.xls" spreadsheet.

They recommend doing this for 3 Exchange nodes. Use the server with the
highest load as the baseline for server/processor/storage etc. (as you
mentioned, the I/O penalty for RAID and the read and write ratios must be
accounted for).

They do not mention measuring the read and write ratios for the IOPS
penalty, and suggest you base it on a recommended value!!

That is why I prefer the route suggested by you, which allows us to
calculate this penalty.

Note: I found an article on an Exchange blog site which mentioned that if
you have more than one database drive per server then you would record the
average Disk Transfers/sec for EACH database drive for the 1-2 hour period
and then take the average of these to populate the IOPS/sec in the
"server_sizing.xls" spreadsheet. Phew!!

I have data for each node, collected at 15-second intervals as suggested in
the white paper, for 4-5 weeks already (on some days the data collection is
not valid). I want to populate my "server_sizing.xls" for each day and each
node for 3-4 weeks and then obtain, as I mentioned, a master set of IOPS and
Megacycle numbers for 3-4 weeks.

I will then analyse these using the statistical approach you mentioned to
get a "Normalized Bell Curve" and come up with a more accurate value.

I will do some research into how to obtain a "Normalized Bell Curve" and
play around with the values for IOPS and Megacycles from above while
waiting to re-collect values for 2-3 weeks using your method.

OK now to the main point (sorry about the waffle above!) .. Going back to
collecting values as you have mentioned, I am a little unclear on the
Physical Disk values you wish me to collect and sampling interval.

1) What time interval do you suggest? 5 mins? 1 mins? 15 secs?
2) Log individual logs on each physical node?
3) Also what time frame should each log last ? 24 hours? Weekly? 9 to 5?

Actually, I have also been logging some long-term values for about 2 weeks
on a separate server at 10-minute intervals. Individual logs gather data
from Monday 0:00 to Friday 23:59 for each of EVS01, EVS02, EVS03, and EVS04,
for a weekly period excluding weekends.

I logged a series of recommended Exchange and server counters. I intend to
use these for a "Long Term Trend Analysis". I wonder if I could extract
some data from this?

I have already noticed latency issues in the current 4-node cluster
containing 20,000 users, which I am using to calculate IOPS and Megacycles
to capacity plan for the 30,000 users who need to be migrated from iPlanet.

Over the weekend I will examine the last 2 weeks of "Long Term Trend
Analysis" data and see what is happening with the present environment.

What if I see a lot of latency on the database drives I am using to
calculate IOPS?

Will my figures for IOPS be underestimated or overestimated?

The design and configuration of the database, transaction logs, SMTP drive
locations and MTA is terrible by the previous team!! TERRIBLE!!!

I can email you my Excel spreadsheet of the setup if you like?

They implemented 4 storage groups by server (3A-1P) and each storage group
has 4 mailbox stores.

I finally made them move each set of 4 mailbox store databases within each
Storage Group to a dedicated drive letter.

When they set up the existing 20,000-user environment they ran out of drive
letters, hence the following:
A single drive contains two transaction log sets from two Storage Groups.
The SMTP locations are on the same drive as the transaction logs.

Ever heard of Mount Points!!!

Oh, by the way John, they implemented two 136GB drives in RAID 1 for each
drive letter in the SAN .. I know, I know!!! .. and I wonder why it took me
years to get a job at this level when other guys are setting things up so
terribly?

Now what was that saying, it is who you know NOT what you know!!

The MTA is actually on the same drive as a mailbox store database; I will
move it to a transaction log drive.
I know this is not recommended, but I want each set of 4 mailbox stores
(hence each Storage Group) on a dedicated drive for the purposes of
calculating IOPS without interference from transaction log, SMTP and MTA
drive activity.


Also on the counters you mention, do you mean the following:

1) reads/sec - % Disk Reads/sec ??
2) write/sec - % Disk Write/sec ??
3) sec\write - % Average Disk sec/Write ??
4) sec\read - % Average Disk sec/Read ??

OK, enough for now, I have been at work for nearly 14 hours!! I still have
a few more questions in regard to:

1) "The MSExchangeIS counter for user count and active user count can be
used to derive your concurrency rate" ... "I use 100% unless there is a
clear case to do otherwise."

2) "Do be aware of the response time requirement. Many folks make the
mistake of using the maximum IOPS a spindle can support when determining
spindle count. This number is not the same as the number of IOPS at a 20ms
response time a spindle can support."

But I will save them for later!!

Thanks John for the advice so far, it is great to have someone to talk to
about this stuff who actually understands!

Regards, and I so look forward to reading your response,

Tony

P.S. - Here is to passing on knowledge in our industry, and I can't wait
for the day I can be on the MVP team.


"John Fullbright [MVP]" wrote:

An example of bell curves that are skewed:

http://pirate.shu.edu/~wachsmut/Teaching/MATH1101/Descriptives/box.html

If the data is skewed you would likely apply a non-linear transform to
correct for the skew. It would be worth picking up a statistics package if
you go this route.
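
As one example of such a transform, the Python sketch below takes logarithms
before computing the mean and standard deviation, then maps the result back.
The choice of a log transform is an assumption made for illustration (it suits
right-skewed, strictly positive data); the post does not say which transform
to use, and the function name is invented.

```python
import math
from statistics import mean, stdev

def sizing_value_log(samples, k):
    """Mean + k standard deviations computed in log space, mapped back.

    Assumes every sample is positive and that the logged values are
    closer to a bell curve than the raw values are.
    """
    logs = [math.log(x) for x in samples]
    return math.exp(mean(logs) + k * stdev(logs))

# Invented right-skewed IOPS samples, for illustration only.
skewed = [300, 320, 340, 350, 380, 400, 450, 600, 900, 1500]
print(round(sizing_value_log(skewed, 0), 1))
print(round(sizing_value_log(skewed, 2), 1))
```

A statistics package will also tell you whether the transformed data is
actually close to normal, which this sketch does not check.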


"TonyP" <TonyP@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:02ADB368-132C-47B5-84C8-9ED58305589D@xxxxxxxxxxxxxxxx

OK, a little more complicated than I had envisaged, but like you said it is
worth the extra mile.

So how do I obtain a normalized bell curve, and then find the mean and
standard deviation? I found this link:

http://support.microsoft.com/kb/213930

Also, I don't understand change delta?

Tony



"John Fullbright [MVP]" wrote:

1. In some cases, I have found it necessary to take a statistical approach
to sizing. In this case, you would collect data points over a period of
time (weeks to months) at a lower sampling frequency (5-minute interval or
so). Once the data is collected and saved to CSV format, you can work with
the numbers in your favorite statistics package to create a normalized bell
curve, then find the mean and the standard deviation. From that point, you
would add enough standard deviations to the mean to reach the desired level
of accuracy and use that figure for your sizing. I would go the extra mile
and do this. If you use the average alone then 50% of the time you will be
undersized, ensuring an undesirable end user experience and the failure of
your project. By adding 3 standard deviations to the mean, you will meet
your performance objective about 99.87% of the time, provided the data
really is normally distributed. If you use four standard deviations, you'll
pass four nines.

2. From a disk perspective, the important numbers are:

reads/sec
writes/sec
sec/write
sec/read
user count
active user count

I use the physical disk counters to obtain these numbers. These counters
are collected using perfmon. The sum of reads and writes per second (or
transfers per second) tells you the total IOPS, while the read and write
numbers are used to generate the read/write ratio. Seconds per read and
write tell you about I/O latency. You want latency to average less than
20ms, with no spikes lasting more than a few seconds over 50ms.
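
The relationship between these numbers can be sketched in a few lines of
Python. The function names and the sample figures are invented for the
example; only the 20ms target comes from the text above.

```python
def io_profile(reads_per_sec, writes_per_sec):
    """Return (total IOPS, read ratio, write ratio) for one sample."""
    total = reads_per_sec + writes_per_sec
    if total <= 0:
        raise ValueError("no I/O recorded in this sample")
    return total, reads_per_sec / total, writes_per_sec / total

def latency_acceptable(avg_latency_s, limit_s=0.020):
    """True when the average latency is under the 20 ms target."""
    return avg_latency_s < limit_s

# Hypothetical sample: 200 reads/sec and 100 writes/sec.
total, read_ratio, write_ratio = io_profile(reads_per_sec=200, writes_per_sec=100)
print(total, round(read_ratio, 3), round(write_ratio, 3))
print(latency_acceptable(0.012), latency_acceptable(0.035))
```

The read and write ratios from this step are the inputs the RAID write
penalty calculation needs.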

The MSExchangeIS counter for user count and active user count can be used
to derive your concurrency rate. Again, don't use averages or you'll be
doomed to failure. Unless there is a clear case (an organization such as a
hospital that works in clear shifts, or an airline where a large percentage
of employees are in the air and don't have access to mail), you should err
on the side of caution. I use 100% unless there is a clear case to do
otherwise.
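
If you do want a measured concurrency rate, one possible approach is sketched
below in Python. Taking the peak ratio rather than the average is an
assumption made here to stay on the cautious side; the sample layout and
values are hypothetical.

```python
def concurrency_rate(samples):
    """Peak ratio of active users to total users across all samples.

    samples: iterable of (user_count, active_user_count) pairs.
    Samples with no users are skipped to avoid dividing by zero.
    """
    ratios = [active / users for users, active in samples if users > 0]
    if not ratios:
        raise ValueError("no usable samples")
    return max(ratios)

# Invented samples: (user count, active user count).
samples = [(1000, 250), (1000, 400), (0, 0), (1000, 320)]
print(concurrency_rate(samples))
```

Anything well below 100% should be backed by a clear reason, such as shift
work, before it is used for sizing.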

3. You'll need to take the read numbers, plus the write numbers (with the
write penalty for your RAID type factored in) and determine the IOPS that
your storage subsystem must be capable of supporting (the math is in
"Optimizing Storage for Exchange Server 2003", so I won't repeat it here).
Your primary concern will be adequate spindle count to support the
performance requirement. Do be aware of the response time requirement.
Many folks make the mistake of using the maximum IOPS a spindle can support
when determining spindle count. This number is not the same as the number
of IOPS at a 20ms response time a spindle can support.
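
To show how much the choice of per-spindle figure matters, here is a small
Python sketch. The 130 and 400 IOPS figures are the ones quoted elsewhere in
this thread for a 15K FC spindle (at 20ms and at saturation), and the 3,000
IOPS requirement is borrowed from the white paper's RAID-5 example; the
function name is invented.

```python
import math

def spindle_count(required_iops, iops_per_spindle):
    """Whole spindles needed to deliver the required IOPS."""
    return math.ceil(required_iops / iops_per_spindle)

REQUIRED_IOPS = 3000   # example figure from the white paper's RAID-5 case
IOPS_AT_20MS = 130     # 15K FC spindle at a 20 ms response time
IOPS_AT_MAX = 400      # same spindle when saturated

print("sized at 20 ms:", spindle_count(REQUIRED_IOPS, IOPS_AT_20MS))
print("sized at max:  ", spindle_count(REQUIRED_IOPS, IOPS_AT_MAX))
```

Sizing on the saturation figure gives a third of the spindles, which is the
mistake described above: the array would only deliver the load at response
times far beyond 20ms.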

4. Change delta will be a significant factor with a sizable POP/IMAP user
base. Typically these users pull the mail from the server to the client
each time they check their mail. If users check their mail each day, this
translates to a 100% change delta without deleted items retention or a
12.5% change delta with 7 days deleted items retention. As you migrate to
MAPI clients, the size of the databases will grow as more mail is retained
on the server; however, the change deltas will shrink. Change deltas impact
space and time for incremental disk or tape backup, or space for snapshots.
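
The two percentages above are consistent with a simple model in which the
store holds one day of new mail plus the retained days, so the daily change
is one part in (retention days + 1). That model is an inference from the
numbers, not something stated in the post, and the Python sketch below only
illustrates it; the function names and the 100 GB database size are made up.

```python
def change_delta(retention_days):
    """Daily change as a fraction of the store, for daily POP/IMAP pulls.

    Inferred model: the store holds today's mail plus `retention_days`
    days of deleted items, and one day's worth turns over each day.
    """
    return 1 / (retention_days + 1)

def daily_incremental_gb(database_gb, retention_days):
    """Rough space needed per day for an incremental backup or snapshot."""
    return database_gb * change_delta(retention_days)

print(change_delta(0), change_delta(7))
print(daily_incremental_gb(100, 7))
```

MAPI users who leave mail on the server do not fit this model; for them the
store is larger and the daily change is a much smaller share of it.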

John


"TonyP" <TonyP@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:46B30508-7444-41F4-BAEB-E9961C83ACB5@xxxxxxxxxxxxxxxx
Hello


I am trying to capacity plan for a migration of 30,000 users to Exchange
2003. Currently the users are based on a Sun iPlanet environment using only
POP and IMAP connections.

In the Exchange environment they will initially use POP/IMAP and MAPI; then
over time we forecast a MAXIMUM of about 20,000 users being configured for
MAPI. Actually, we cannot even see more than 15,000 ever being moved to
MAPI.

Currently we also have a four-node Exchange 2003 cluster, 3 Active 1
Passive, hosting 20,000 MAPI users with a backend SAN environment.

Our clustered environment is hosted at our Data Centre and all our offices
across the city connect back to us via fibre. All users connect via MAPI
and there are a few using RPC over HTTPS. There is also a front-end NLB
consisting of 3 servers, but this is mainly used for OWA and OMA
connections.

I am using the Microsoft white paper "Optimizing Storage for Microsoft
Exchange Server 2003" to work out IOPS and Megacycles per mailbox.

I have monitored data for a 2-week period on the current cluster and I seem
to have no consistent 2-hour busy period (such as Monday mornings); it
seems to change from day to day and hour to hour .. my users are in Spain
and working habits here are not the same!!

I intend to collect data for about one month, then analyse this data and
use statistical averages etc. to come up with a figure. I will come up with
an IOPS and Megacycles figure for a 2-hour busy period every day, then
analyse this set of information to come up with average IOPS and Megacycles
based on the data collected every day.

My intention is to come up with IOPS and Megacycles figures for the current
20,000 users, then use this calculation to size hardware for the 30,000
users I intend to migrate.

I wanted to check on two things:

1) Can the IOPS and Megacycles from the 20,000 users be used for planning
for the 30,000 users?

2) MORE importantly, looking at tonnes of documentation on the web, I am
seeing some people suggesting to monitor logical disk transfers and others
suggesting physical disk transfers in Event Viewer. Which do I use? The
four-node cluster is on HP hardware; the backend is a StorageTek SAN.

Thanks

T








