Re: IOPS and megacycles calculations
- From: "John Fullbright [MVP]" <fjohn@donotspamnetappdotcom>
- Date: Tue, 13 Mar 2007 18:56:47 -0700
An example of bell curves that are skewed:
http://pirate.shu.edu/~wachsmut/Teaching/MATH1101/Descriptives/box.html
If the data is skewed you would likely apply a non-linear transform to
correct for the skew. It would be worth picking up a statistics package if
you go this route.
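
As a rough illustration of what such a transform does, here is a minimal sketch using only the Python standard library. The sample values are invented, and the natural log is just one common choice of non-linear transform, not necessarily the right one for your data.

```python
import math
import statistics

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed IOPS samples: mostly quiet, a few busy spikes.
samples = [120, 130, 135, 140, 150, 155, 160, 180, 400, 900]

# Log transform: compresses the long right tail.
logged = [math.log(v) for v in samples]

raw_skew = skewness(samples)
log_skew = skewness(logged)
print(f"skewness before: {raw_skew:.2f}, after log transform: {log_skew:.2f}")
```

A skewness near zero suggests the transformed data is closer to symmetric. Remember to transform any sizing figure back (here with math.exp) before using it.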
"TonyP" <TonyP@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:02ADB368-132C-47B5-84C8-9ED58305589D@xxxxxxxxxxxxxxxx
OK, a little more complicated than I had envisaged, but like you said it is
worth the extra mile.
So how do I obtain a normalized bell curve, and then find the mean and
standard deviation? I found this link:
http://support.microsoft.com/kb/213930
Also, I don't understand change delta.
Tony
"John Fullbright [MVP]" wrote:
1. In some cases, I have found it necessary to take a statistical approach
to sizing. In this case, you would collect data points over a period of
time (weeks to months) at a lower sampling frequency (5 minute interval or
so). Once the data is collected and saved to CSV format, you can work with
the numbers in your favorite statistics package to create a normalized bell
curve, then find the mean and the standard deviation. From that point, you
would add enough standard deviations to the mean to reach the desired level
of accuracy and use that figure for your sizing. I would go the extra mile
and do this. If you use the average alone, then 50% of the time you will be
undersized, ensuring an undesirable end user experience and the failure of
your project. For normally distributed data, adding 3 standard deviations
to the mean meets your performance objective about 99.87% of the time;
adding about 4.3 standard deviations reaches 5 nines.
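
The mean-plus-standard-deviations approach can be sketched as follows, assuming the samples are approximately normal. The CSV layout, the column name "iops" and the sample values are made up for illustration; the coverage figures come from the standard normal distribution.

```python
import csv
import io
from statistics import NormalDist, fmean, stdev

# Hypothetical perfmon export: total IOPS sampled every 5 minutes.
CSV_TEXT = """timestamp,iops
2007-03-01 09:00,1000
2007-03-01 09:05,1100
2007-03-01 09:10,900
2007-03-01 09:15,1200
2007-03-01 09:20,800
"""

def load_iops(text):
    """Read the (assumed) 'iops' column from CSV text into a list of floats."""
    return [float(row["iops"]) for row in csv.DictReader(io.StringIO(text))]

def sizing_figure(samples, k):
    """Mean plus k sample standard deviations."""
    return fmean(samples) + k * stdev(samples)

def coverage(k):
    """Fraction of a normal distribution at or below mean + k sigma."""
    return NormalDist().cdf(k)

iops = load_iops(CSV_TEXT)
for k in (0, 1, 2, 3, 4, 5):
    print(f"k={k}: size for {sizing_figure(iops, k):7.1f} IOPS, "
          f"covers {coverage(k):.5%} of samples")
```

With k=0 you are sizing for the average, which a symmetric distribution exceeds half the time. Each additional standard deviation buys a higher percentage of intervals in which the load stays within the sizing figure.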
2. From a disk perspective, the important numbers are:
reads/sec
writes/sec
sec/write
sec/read
user count
active user count
I use the physical disk counters to obtain these numbers. These counters
are collected using perfmon. The sum of reads and writes per second (or
transfers/sec) tells you the total IOPS, while the read and write numbers
are used to generate the read/write ratio. Seconds per read and write tell
you about IO latency. You want latency to average less than 20 ms, with no
spikes over 50 ms lasting more than a few seconds.
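
A minimal sketch of how those counter values combine. The function names and sample values are invented, and "a few seconds" is taken to mean 5 seconds purely as an assumption.

```python
def disk_summary(reads_per_sec, writes_per_sec, sec_per_read, sec_per_write):
    """Derive total IOPS, read fraction and latencies (ms) from raw counters."""
    total = reads_per_sec + writes_per_sec
    return {
        "total_iops": total,
        "read_fraction": reads_per_sec / total,
        "read_latency_ms": sec_per_read * 1000,
        "write_latency_ms": sec_per_write * 1000,
    }

def latency_ok(latencies_ms, interval_sec,
               avg_limit_ms=20.0, spike_limit_ms=50.0, max_spike_sec=5.0):
    """True if the average is under avg_limit_ms and no run of consecutive
    samples above spike_limit_ms lasts longer than max_spike_sec (assumed)."""
    if sum(latencies_ms) / len(latencies_ms) >= avg_limit_ms:
        return False
    run = 0
    for value in latencies_ms:
        run = run + 1 if value > spike_limit_ms else 0
        if run * interval_sec > max_spike_sec:
            return False
    return True

summary = disk_summary(600, 200, 0.012, 0.008)
print(summary)
print(latency_ok([10, 12, 60, 11, 9, 8, 10, 12], interval_sec=1))
```

Note that the spike check only means something if the latency samples are taken at a short interval; a 5 minute average will hide a spike of a few seconds.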
The MSExchangeIS counters for user count and active user count can be used
to derive your concurrency rate. Again, don't use averages or you'll be
doomed to failure. Unless there is a clear case (an organization such as a
hospital that works in clear shifts, or an airline where a large percentage
of employees are in the air and don't have access to mail), you should err
on the side of caution. I use 100% unless there is a clear case to do
otherwise.
3. You'll need to take the read numbers, plus the write numbers (with the
write penalty for your RAID type factored in), and determine the IOPS that
your storage subsystem must be capable of supporting (the math is in
"Optimizing Storage for Exchange Server 2003", so I won't repeat it here).
Your primary concern will be adequate spindle count to support the
performance requirement. Do be aware of the response time requirement.
Many folks make the mistake of using the maximum IOPS a spindle can support
when determining spindle count. This number is not the same as the number
of IOPS a spindle can support at a 20 ms response time.
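
For orientation only, the usual shape of that calculation looks like the sketch below. It is not the white paper's worksheet; the write penalties are the commonly cited ones, and the per-spindle IOPS figures (130 at the target response time, 180 maximum) are hypothetical placeholders that you would replace with numbers for your own disks.

```python
import math

# Commonly cited write penalties: back-end I/Os generated per host write.
WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}

def backend_iops(reads_per_sec, writes_per_sec, raid):
    """Host reads plus host writes multiplied by the RAID write penalty."""
    return reads_per_sec + writes_per_sec * WRITE_PENALTY[raid]

def spindles_needed(required_iops, iops_per_spindle):
    """Round up: a fraction of a spindle is still a whole disk."""
    return math.ceil(required_iops / iops_per_spindle)

required = backend_iops(600, 200, "raid5")

# Hypothetical per-spindle figures for the same disk model.
at_target_latency = spindles_needed(required, 130)
at_maximum = spindles_needed(required, 180)

print(f"back-end IOPS required: {required}")
print(f"spindles using IOPS at target response time: {at_target_latency}")
print(f"spindles using maximum IOPS (undersized): {at_maximum}")
```

The gap between the last two numbers is the undersizing you get from using the maximum IOPS figure.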
4. Change delta will be a significant factor with a sizable POP/IMAP user
base. Typically these users pull the mail from the server to the client
each time they check their mail. If users check their mail each day, this
translates to a 100% change delta without deleted items retention, or a
12.5% change delta with 7 days of deleted items retention. As you migrate to
MAPI clients, the size of the databases will grow as more mail is retained
on the server; however, the change deltas will shrink. Change deltas impact
the space and time needed for incremental disk or tape backup, and the space
needed for snapshots.
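
One way to read the 100% and 12.5% figures is the simplified model below. It assumes every user downloads and deletes all mail once a day, so the database holds one day of new mail plus N days of retained deleted items, and one day's worth turns over each day. The function name and the model itself are assumptions for illustration.

```python
def daily_change_delta(retention_days):
    """Fraction of the database that changes per day under the assumed model:
    one day of new mail out of (1 + retention_days) days held on the server."""
    return 1 / (1 + retention_days)

for days in (0, 7, 14, 30):
    print(f"{days:2d} days retention: {daily_change_delta(days):.1%} change per day")
```

Real mailboxes will not behave this uniformly, so treat the result as a starting estimate for backup and snapshot space rather than a measurement.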
John
"TonyP" <TonyP@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:46B30508-7444-41F4-BAEB-E9961C83ACB5@xxxxxxxxxxxxxxxx
Hello
I am trying to capacity plan for a migration of 30,000 users to Exchange
2003. Currently the users are based on a Sun iPlanet environment using only
POP and IMAP connections.
In the Exchange environment they will initially use POP/IMAP and MAPI; over
time we forecast a MAXIMUM of about 20,000 users being configured for MAPI.
Actually, we cannot even see more than 15,000 ever being moved to MAPI.
Currently we also have a four-node Exchange 2003 cluster, 3 active and 1
passive, hosting 20,000 users in MAPI with a backend SAN environment.
Our clustered environment is hosted at our data centre, and all our offices
across the city connect back to us via fibre. All users connect via MAPI and
there are a few using RPC over HTTPS. There is also a front-end NLB
consisting of 3 servers, mainly used for OWA and OMA connections.
I am using the Microsoft white paper "Optimizing Storage for Microsoft
Exchange Server 2003" to work out IOPS and megacycles per mailbox.
I have monitored data for a 2-week period on the current cluster, and I
seem to have no consistent 2-hour busy period (such as Monday mornings); it
seems to change from day to day and hour to hour. My users are in Spain and
working habits here are not the same!
I intend to collect data for about one month, then analyse this data and use
statistical averages etc. to come up with a figure. I will come up with an
IOPS and megacycles figure for a 2-hour busy period every day, then analyse
this set of daily figures to come up with an average IOPS and megacycles.
My intention is to come up with IOPS and megacycles figures for the current
20,000 users, then use this calculation to size hardware for the 30,000
users I intend to migrate.
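
The arithmetic behind that plan can be sketched as below. The measured totals are invented placeholders, and the sketch assumes the 30,000 new users generate the same per-mailbox load as the 20,000 measured MAPI users, which may not hold for POP/IMAP clients.

```python
def per_mailbox(total, mailboxes):
    """Average load per mailbox for a measured total."""
    return total / mailboxes

def scale_to(total, measured_mailboxes, target_mailboxes):
    """Project a measured total onto a different mailbox count."""
    return per_mailbox(total, measured_mailboxes) * target_mailboxes

MEASURED_MAILBOXES = 20_000
TARGET_MAILBOXES = 30_000

# Hypothetical busy-period totals for the existing cluster.
measured_iops = 16_000.0
measured_megacycles = 40_000.0

iops_target = scale_to(measured_iops, MEASURED_MAILBOXES, TARGET_MAILBOXES)
mc_target = scale_to(measured_megacycles, MEASURED_MAILBOXES, TARGET_MAILBOXES)

print(f"IOPS per mailbox: {per_mailbox(measured_iops, MEASURED_MAILBOXES):.2f}")
print(f"projected IOPS for {TARGET_MAILBOXES} mailboxes: {iops_target:.0f}")
print(f"projected megacycles for {TARGET_MAILBOXES} mailboxes: {mc_target:.0f}")
```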
I wanted to check on two things:
1) Can the IOPS and megacycles from the 20,000 users be used for planning
for the 30,000 users?
2) MORE importantly, looking at tonnes of documentation on the web, I am
seeing some people suggesting to monitor logical disk transfers and others
suggesting physical disk transfers in Event Viewer. Which do I use? The
four-node cluster is on HP hardware; the backend is a StorageTek SAN.
Thanks
T