Re: Pending timeout v. Reprieve
From: Alex L (AlexL_at_discussions.microsoft.com)
Date: 01/14/05
- In reply to: Iype Isac [MSFT]: "Re: Pending timeout v. Reprieve"
Date: Fri, 14 Jan 2005 13:25:08 -0800
Iype,
In my online thread I set the CheckPoint to 1, so I do expect to get a
reprieve each time. In all my observations, I have seen it fail to get a
reprieve only once. Thanks for all your help; your explanations have helped
tremendously.
Regards,
Alex
"Iype Isac [MSFT]" wrote:
> Alex
> The only possible explanation that I can think of is this.
> From the documentation:
> "Resource DLLs initially set CheckPoint to zero".
> If you make sure that the first SetResourceStatus call in the online pending
> state has a CheckPoint greater than 0, then you are sure to get a reprieve.
> If you make the first call with CheckPoint = 0, then you will not get the
> reprieve. Besides this, there should be no other reason that you are not
> getting a reprieve.
>
> Iype
>
>
> "Alex L" <AlexL@discussions.microsoft.com> wrote in message
> news:8238D7E4-83FC-41DB-BB57-A2E0399FEA7B@microsoft.com...
> > Hi Iype,
> >
> > Thanks again for your valuable information. I understood that merely
> > providing an update is sufficient to get a reprieve. However, I did
> > not know that:
> >
> >> Also, the reprieve is given by the resource monitor at the end of the
> >> PendingTimeout.
> >
> > What I do currently is call SetResourceStatus() (OnlinePending) at the
> > beginning of my online thread. Is this good enough to get a reprieve at the
> > end of the pending timeout, since that might be considered an update
> > "sometime in the previous PendingTimeout"? If so, 99% of my cases now make
> > sense. However, I still don't know why, 1% of the time, it actually does
> > time out without any reprieve being given. Does this make sense?
> >
> > Thanks again,
> > Alex
> >
> >
> >
> > "Iype Isac [MSFT]" wrote:
> >
> >> Alex
> >> By providing an update on the status of the resource, what we mean here is
> >> that you just let the resource monitor know what your status is. So if you
> >> are online pending, and you send another update saying that you are still
> >> online pending, the resource monitor will give you a reprieve. So a status
> >> update does not have to mean a change in the status.
> >> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/mscs/mscs/resource_status.asp
> >> The CheckPoint member in the RESOURCE_STATUS structure plays an important
> >> role during the OnlinePending state. If you have called SetResourceStatus
> >> with an incremented value of the CheckPoint member, the resource monitor
> >> picks up that update and you will get a reprieve if you have not come online
> >> yet. Otherwise it is not treated as a new update and the resource will time
> >> out if it does not come online.
> >>
> >> Also, the reprieve is given by the resource monitor at the end of the
> >> PendingTimeout. It is given to the resource if the resource has given the
> >> resource monitor a status update sometime in the previous PendingTimeout
> >> interval. The reprieve is not given exactly when the resource state is
> >> updated.
> >>
> >> HTH
> >> Iype
> >>
> >> "Alex L" <AlexL@discussions.microsoft.com> wrote in message
> >> news:219CBACB-5157-46A2-9E39-C1C8915079C9@microsoft.com...
> >> > Iype,
> >> >
> >> > Thanks for your response. This information is good to know. However, my
> >> > code does not call SetResourceStatus until AFTER the resource is online.
> >> > Therefore, from my point of view, the 2 situations below should behave the
> >> > same because the online hasn't been completed so no updates would be sent.
> >> >
> >> > Also, it is strange that the reprieve always comes EXACTLY when the pending
> >> > timeout period has elapsed. According to the description of PendingTimeout,
> >> > that would mean my code is providing an update (SetResourceStatus) at EXACTLY
> >> > that time.
> >> >
> >> > Perhaps there's something else going on here?
> >> >
> >> > Thanks again,
> >> > Alex
> >> >
> >> > "Iype Isac [MSFT]" wrote:
> >> >
> >> >> Alex
> >> >> PendingTimeout is not the time given to the resource to come online or
> >> >> offline. It is the time interval for which the service waits for an update
> >> >> on the status of the resource. If the resource updates the status, then it
> >> >> gets a reprieve. If your resource does not give the Resource Monitor an
> >> >> update about its status within this PendingTimeout interval, then the
> >> >> resource is marked failed.
> >> >>
> >> >> Below is a link to the explanation...
> >> >> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/mscs/mscs/resources_pendingtimeout.asp
> >> >>
> >> >> "The PendingTimeout property sets the number of milliseconds that a Resource
> >> >> Monitor will wait for a resource DLL to update the status of a resource in
> >> >> an OnlinePending or OfflinePending state before terminating the resource.
> >> >> The PendingTimeout property does not necessarily limit the time that a
> >> >> resource can spend in a ClusterOnlinePending or ClusterOfflinePending
> >> >> state"
> >> >>
> >> >> So in the first case, it looks like the resource did not update its status
> >> >> within the time interval of 180 seconds (PendingTimeout), so it failed.
> >> >> In the second scenario, it looks like the resource did update its status in
> >> >> the first time interval, so it got a 'reprieve', but it did not send any more
> >> >> status updates. So after the second time interval it failed.
> >> >>
> >> >> Iype
> >> >>
> >> >> "Alex L" <AlexL@discussions.microsoft.com> wrote in message
> >> >> news:29638B2B-571E-4782-AC49-126D97E76545@microsoft.com...
> >> >> > Hello,
> >> >> >
> >> >> > I am confused as to why sometimes I get a "reprieve" after the pending
> >> >> > timeout period and sometimes I actually get timed out after the resource
> >> >> > pending timeout period. To clarify, below are 2 situations, both of which I
> >> >> > have seen occur.
> >> >> >
> >> >> > Situation 1: pending timeout:
> >> >> > 1. set resource pending timeout = 180 seconds
> >> >> > 2. online resource
> >> >> > 3. after 180 seconds (and resource not yet online) resource is timed out and
> >> >> > fails, cluster.log:
> >> >> >
> >> >> > WARN [RM] RmpTimerThread: Resource testres1 pending timed out, CP 1 - setting state to failed.
> >> >> >
> >> >> >
> >> >> > Situation 2: reprieve after timeout
> >> >> > 1. set pending timeout = 180
> >> >> > 2. online resource
> >> >> > 3. after 180 seconds, resource gets a "reprieve" and is NOT failed.
> >> >> > cluster.log:
> >> >> >
> >> >> > INFO [RM] RmpTimerThread: Giving a reprieve for resource "testres1"
> >> >> >
> >> >> > 4. after ANOTHER 180 seconds, if the resource still isn't online, the
> >> >> > resource is now timed out and fails.
> >> >> >
> >> >> > Situation 1 makes sense to me. Does anyone know why a reprieve is given in
> >> >> > situation 2, effectively making the actual pending timeout 2x the
> >> >> > configured pending timeout?
> >> >> >
> >> >> > Thanks in Advance,
> >> >> > Alex
> >> >>
> >> >>
> >> >>
> >>
> >>
> >>
>
>
>
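
The reprieve rule Iype describes in this thread can be sketched as a small
simulation. This is a toy model in Python, not the Resource Monitor's actual
code: the class MonitorModel, its methods, and the simulate helper are
hypothetical names, and the rule itself (at the end of each PendingTimeout
interval, compare the reported CheckPoint with the last one seen) is an
assumption drawn from the explanations above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MonitorModel:
    """Toy model of the reprieve rule (hypothetical, not the real Resource Monitor)."""
    last_seen_checkpoint: int = 0   # "Resource DLLs initially set CheckPoint to zero"
    reported_checkpoint: int = 0
    online: bool = False

    def set_resource_status(self, checkpoint: int, online: bool = False) -> None:
        # Stand-in for the resource DLL reporting its status while pending.
        self.reported_checkpoint = checkpoint
        self.online = online

    def on_pending_timeout(self) -> str:
        # Assumed rule: the decision is made only when the interval expires,
        # not at the moment the status update arrives.
        if self.online:
            return "online"
        if self.reported_checkpoint > self.last_seen_checkpoint:
            self.last_seen_checkpoint = self.reported_checkpoint
            return "reprieve"
        return "failed"


def simulate(reports_per_interval: List[List[int]]) -> List[str]:
    """Each inner list holds the CheckPoint values reported during one interval."""
    monitor = MonitorModel()
    outcomes = []
    for reports in reports_per_interval:
        for checkpoint in reports:
            monitor.set_resource_status(checkpoint)
        outcome = monitor.on_pending_timeout()
        outcomes.append(outcome)
        if outcome == "failed":
            break  # the resource is marked failed; no further intervals
    return outcomes


if __name__ == "__main__":
    # One update with CheckPoint = 1 at the start of the online thread,
    # then silence (the pattern Alex describes).
    print(simulate([[1], []]))
    # First update sent with CheckPoint = 0.
    print(simulate([[0], []]))
    # CheckPoint incremented in every interval.
    print(simulate([[1], [2], [3]]))
```

Under this model, reporting CheckPoint = 1 once and then sending nothing more
yields one reprieve followed by a failure, which matches Situation 2. The model
does not account for the rare timeout logged with "CP 1" in Situation 1.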