Re: Jobs don't run and are stuck with request pending
From: Scott (Scott_at_discussions.microsoft.com)
Date: 11/15/04
- Next message: stoko: "Time Out"
- Previous message: John Bandettini: "RE: task manager, processes wrong?"
- In reply to: Sue Hoegemeier: "Re: Jobs don't run and are stuck with request pending"
- Next in thread: Sue Hoegemeier: "Re: Jobs don't run and are stuck with request pending"
- Reply: Sue Hoegemeier: "Re: Jobs don't run and are stuck with request pending"
- Messages sorted by: [ date ] [ thread ]
Date: Mon, 15 Nov 2004 05:54:06 -0800
How specifically would you like us to capture the trace?
"Sue Hoegemeier" wrote:
> There are fixes in MS03-031 as well as the security patches.
> It's a good idea to install it even though I doubt it has
> anything to do with your issues.
> All the threading looks fine. The jobs are in the job cache
> and are fine. It's just something at the end of running the
> jobs that hangs up.
> I'd still suspect some contention somewhere. Or something is
> erroring out. It will likely be a pain with all the jobs you
> have running but you probably need to run a trace on the
> jobs after getting everything running again - even if it's
> for just one run on some of the jobs. You'll need to use
> that to track down where the issues are. You could try
> catching whatever is going on or additional info by querying
> sysprocesses as well while you run a trace. Watch for wait
> times and blocking in sysprocesses. Watch for errors and
> capture both starting and completing statements in the trace
> so you can see if something times out or never completes.
>
> -Sue
>
> On Thu, 11 Nov 2004 14:06:50 -0800, "Scott"
> <Scott@discussions.microsoft.com> wrote:
>
> >1) Yes we are on SP3a for 2000 Enterprise.
> >
> >2) No we didn't have the patch. We will install tonight but it just looks
> >like a security patch and does not look like it is related to the problem. ???
> >
> >3) Here is the output for the dbcc command when the jobs are "stalled":
> >
> >
> >Scheduler ID 0
> > num users 6
> > num runnable 0
> > num workers 12
> > idle workers 11
> > work queued 0
> > cntxt switches 3975785
> > cntxt switches(idle) 9990418
> >Scheduler ID 1
> > num users 6
> > num runnable 0
> > num workers 12
> > idle workers 11
> > work queued 0
> > cntxt switches 1.01E+07
> > cntxt switches(idle) 2.31E+07
> >Scheduler ID 2
> > num users 7
> > num runnable 0
> > num workers 12
> > idle workers 11
> > work queued 0
> > cntxt switches 7623701
> > cntxt switches(idle) 1.17E+07
> >Scheduler ID 3
> > num users 6
> > num runnable 0
> > num workers 12
> > idle workers 11
> > work queued 0
> > cntxt switches 5559495
> > cntxt switches(idle) 1.04E+07
> >Scheduler ID 4
> > num users 5
> > num runnable 0
> > num workers 12
> > idle workers 11
> > work queued 0
> > cntxt switches 1.28E+07
> > cntxt switches(idle) 1.75E+07
> >Scheduler ID 5
> > num users 5
> > num runnable 0
> > num workers 12
> > idle workers 11
> > work queued 0
> > cntxt switches 1.34E+07
> > cntxt switches(idle) 1.52E+07
> >Scheduler ID 6
> > num users 5
> > num runnable 1
> > num workers 11
> > idle workers 9
> > work queued 0
> > cntxt switches 1.46E+07
> > cntxt switches(idle) 1.98E+07
> >Scheduler ID 7
> > num users 4
> > num runnable 0
> > num workers 11
> > idle workers 10
> > work queued 0
> > cntxt switches 2.49E+07
> > cntxt switches(idle) 1.77E+07
> >Scheduler Switches 0
> >Total Work 1.34E+07
> >
> >
> >It looks okay to me. What is your interpretation?
> >Is there anything else we can check?
> >
> >
> >
> >"Sue Hoegemeier" wrote:
> >
> >> Pretty much like you are hitting a max number of jobs but
> >> you aren't getting any errors on running out of threads on
> >> the subsystem or running out of worker threads on the
> >> server. Are you on all the latest service packs? Did you
> >> apply MS03-031?
> >> Did you check worker threads on the server itself? You can
> >> use dbcc sqlperf(umsstats)
> >>
> >> -Sue
> >>
> >> On Thu, 11 Nov 2004 12:24:05 -0800, "Scott"
> >> <Scott@discussions.microsoft.com> wrote:
> >>
> >> >They all got set to a date in the future.
> >> >Of the 5 jobs 1 ran and 4 never ran.
> >> >And the one that started running made another of the jobs (that were
> >> >running) stop.
> >> >
> >> >
> >> >
> >> >"Sue Hoegemeier" wrote:
> >> >
> >> >> On the jobs that are no longer running, what happens if you
> >> >> update the schedule with:
> >> >> EXEC msdb.dbo.sp_update_jobschedule
> >> >>     @job_name = 'YourJob',
> >> >>     @name = 'ScheduleNameforJob'
> >> >> Does the next run time get set to a time in the future rather
> >> >> than in the past? And does the job then run at the next run time?
> >> >>
> >> >> -Sue
> >> >>
> >> >> On Thu, 11 Nov 2004 08:41:01 -0800, "Scott"
> >> >> <Scott@discussions.microsoft.com> wrote:
> >> >>
> >> >> >1) Event viewer was okay. log files were not full.
> >> >> >
> >> >> >2) Jobs are NOT set to run on idle CPU
> >> >> >
> >> >> >3) Everything about the job is enabled. If we disable other jobs the ones
> >> >> >that were not running start to run.
> >> >> >
> >> >> >4) There were no errors in the Agent log file.
> >> >> >
> >> >> >5) We turned on the verbose option and saw that the jobs were requested to run:
> >> >> >
> >> >> >Job Application ID: XXX Report Generation has been requested to run by
> >> >> >Schedule XXX (Run Every Minute)
> >> >> >
> >> >> >But we did not see the "being logged to sysjobhistory" message when it
> >> >> >should have completed running.
> >> >> >
> >> >> >Note that we turned on the verbose option and then turned on the jobs. For
> >> >> >each job the request-to-run message appears once and never again.
> >> >> >
> >> >> >
> >> >> >
> >> >> >"Sue Hoegemeier" wrote:
> >> >> >
> >> >> >> When you say "The jobs that don't run have a next scheduled
> >> >> >> run time as the time they were suppose to run" do you mean
> >> >> >> dates and times in the past?
> >> >> >> A few other odd things to check: Make sure the Windows event
> >> >> >> logs aren't filled up and no longer logging. Make sure the
> >> >> >> jobs aren't set to run only on idle cpu conditions. Make
> >> >> >> sure everything in the job is enabled - job, schedules.
> >> >> >> For other issues you would typically find something in the
> >> >> >> SQL Agent log (SQLAgent.out file). You may want to turn on
> >> >> >> more verbose logging in Agent - from properties select
> >> >> >> Include execution trace messages. You only want to keep that
> >> >> >> on for troubleshooting though -especially if you have a lot
> >> >> >> of jobs on the box.
> >> >> >>
> >> >> >> -Sue
> >> >> >>
> >> >> >> On Wed, 10 Nov 2004 10:59:05 -0800, "Scott"
> >> >> >> <Scott@discussions.microsoft.com> wrote:
> >> >> >>
> >> >> >> >Okay, I ran sp_help_job several times.
> >> >> >> >All the jobs are 1's and 4's. The ones that don't run are 4's which seems
> >> >> >> >odd to me considering that if you try to start it manually it says there is
> >> >> >> >already a request.
> >> >> >> >
> >> >> >> >
> >> >> >> >The jobs that don't run have a next scheduled run time as the time they were
> >> >> >> >supposed to run.
> >> >> >> >
> >> >> >> >
> >> >> >> >"Sue Hoegemeier" wrote:
> >> >> >> >
> >> >> >> >> The T-SQL job subsystem defaults to 20. I doubt that is the
> >> >> >> >> issue as well - I'm pretty sure something gets logged in the
> >> >> >> >> SQL Agent log when you run out of job worker threads.
> >> >> >> >> You need to check the execution status of the jobs using
> >> >> >> >> sp_help_job. If there are execution statuses of 2, then it's
> >> >> >> >> related to worker threads. If there are execution statuses
> >> >> >> >> of 7, then the jobs are getting hung up performing something
> >> >> >> >> - sometimes it's on emailing results and mail gets hung up.
> >> >> >> >> There can be other reasons as well. The other thing to check
> >> >> >> >> would be the next scheduled run time. If the scheduler gets
> >> >> >> >> confused, this would be listed as not available or something
> >> >> >> >> like that. Can't remember the exact wording.
> >> >> >> >>
> >> >> >> >> -Sue
> >> >> >> >>
> >> >> >> >> On Wed, 10 Nov 2004 08:02:05 -0800, "Scott"
> >> >> >> >> <Scott@discussions.microsoft.com> wrote:
> >> >> >> >>
> >> >> >> >> >I doubt that it is the TSQL subsystem running out of worker threads, since it is
> >> >> >> >> >currently set at 200 and there are no more than 100 jobs running at any one
> >> >> >> >> >time. But I will verify.
> >> >> >> >> >I believe the default was 20 or 25 before I added the registry entry and
> >> >> >> >> >increased it to 200.
> >> >> >> >> >
> >> >> >> >> >The job history does not show anything out of the ordinary. If the job was
> >> >> >> >> >running and then stopped running it just shows the times that it ran. If the
> >> >> >> >> >job is created and does not run the first time then there is no job history to
> >> >> >> >> >look at.
> >> >> >> >> >
> >> >> >> >> >There is no blocking or locks. Another thing that I tried is changing the
> >> >> >> >> >transaction isolation level on the sp running to do dirty reads and this did
> >> >> >> >> >not help either.
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> >Thanks again Sue for your suggestions.
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> >"Sue Hoegemeier" wrote:
> >> >> >> >> >
> >> >> >> >> >> If it's actually the subsystem running out of worker
> >> >> >> >> >> threads, you could verify this by executing:
> >> >> >> >> >> sp_help_job @execution_status = 2
> >> >> >> >> >> If it is a worker thread issue, you have to increase the
> >> >> >> >> >> value of threads for the appropriate job subsystem depending
> >> >> >> >> >> on the jobs steps and what subsystem is used.
> >> >> >> >> >> If it's not related to threads, did you check the job
> >> >> >> >> >> history and check for any blocking?
> >> >> >> >> >>
> >> >> >> >> >> -Sue
> >> >> >> >> >>
> >> >> >> >> >> On Wed, 10 Nov 2004 07:13:05 -0800, "Scott"
> >> >> >> >> >> <Scott@discussions.microsoft.com> wrote:
> >> >> >> >> >>
> >> >> >> >> >> >1) Yes, I rebooted the system to make sure, after adding the registry entry
> >> >> >> >> >> >for the worker threads.
> >> >> >> >> >> >
> >> >> >> >> >> >2) Verified that the worker threads were set in the SQL system after the reboot.
> >> >> >> >> >> >
> >> >> >> >> >> >3) I don't know if that subsystem is where the problem is. It is just one
> >> >> >> >> >> >thing that I tried.
> >> >> >> >> >> >
> >> >> >> >> >> >4) Nothing in the agent log file.
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> >
> >> >> >> >> >> >"Sue Hoegemeier" wrote:
> >> >> >> >> >> >
> >> >> >> >> >> >> Did you restart SQL Agent after you created the registry key
> >> >> >> >> >> >> and changed the max worker threads in the registry for TSQL?
> >> >> >> >> >> >> Did you verify the max worker thread setting with
> >> >> >> >> >> >> sp_enum_sqlagent_subsystems? Is that the job subsystem that
> >> >> >> >> >> >> is being maxed out on threads?
> >> >> >> >> >> >> Anything in the SQL Agent log?
> >> >> >> >> >> >>
> >> >> >> >> >> >> -Sue
> >> >> >> >> >> >>
> >> >> >> >> >> >> On Wed, 10 Nov 2004 06:09:08 -0800, "Scott"
> >> >> >> >> >> >> <Scott@discussions.microsoft.com> wrote:
> >> >> >> >> >> >>
> >> >> >> >> >> >> >Hi,
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >I have a strange SQL Server 2K problem and am looking for suggestions on
> >> >> >> >> >> >> >how to resolve it.
> >> >> >> >> >> >> >In the SQL job scheduler I have 178 jobs. Of these, around 80 are
> >> >> >> >> >> >> >scheduled to run every minute. We started to experience a problem where
> >> >> >> >> >> >> >some of these jobs (about 4) do not run. They get stuck in a pending
> >> >> >> >> >> >> >request status.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >System is an 8 CPU Win2K with SQL Server 2K SP3a Enterprise
> >> >> >> >> >> >> >CPU utilization never goes over 36% and memory (4GB) and disk I/O all look
> >> >> >> >> >> >> >fine.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >What I have tried:
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >1) I have increased the worker threads to 100 and 200 with no apparent
> >> >> >> >> >> >> >difference.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >2) Deleting the job and re-creating it is hit or miss. If the job does start
> >> >> >> >> >> >> >running after I have recreated it then another one of the 80 jobs stops. If
> >> >> >> >> >> >> >it doesn’t start running it goes into a pending state and trying to start the
> >> >> >> >> >> >> >job returns the following error:
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >Error 22020: SQL Server Agent Error: Request to run job XXX refused because
> >> >> >> >> >> >> >the job already has a pending request from Schedule XXX.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >3) I created a very simple job “select getdate()” and that is hit or miss
> >> >> >> >> >> >> >too. If it works then one of the other jobs stops running. If it does not run
> >> >> >> >> >> >> >it just stays in the pending state and attempts to start it return an error
> >> >> >> >> >> >> >like the one described in #2.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >4) When the job is stuck in pending I can run the scheduled job(s) in the
> >> >> >> >> >> >> >Query Analyzer without any problems. (These jobs that run every minute only
> >> >> >> >> >> >> >take 1-5 seconds to run each)
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >5) I can disable a couple of jobs and then the pending ones start to run.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >It seems to me with what I have tried that these problems point to some kind
> >> >> >> >> >> >> >of threshold that I am hitting.
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >Does anybody know what problem we are running into?
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >Thanks,
> >> >> >> >> >> >> >
> >> >> >> >> >> >> >Scott
> >> >> >> >> >> >>
> >> >> >> >> >> >>
> >> >> >> >> >>
> >> >> >> >> >>
> >> >> >> >>
> >> >> >> >>
> >> >> >>
> >> >> >>
> >> >>
> >> >>
> >>
> >>
>
>
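The sysprocesses check suggested in the quoted reply can be sketched as follows. This is only a sketch, assuming SQL Server 2000 (where master.dbo.sysprocesses is the available system table) and that it is run from Query Analyzer while the jobs are stalled and a trace is capturing statement starting/completed events. The filter and the column selection are illustrative choices, not something specified in the thread.

```sql
-- Snapshot of sessions that are currently blocked or waiting.
-- Assumes SQL Server 2000; all columns below exist in master.dbo.sysprocesses.
SELECT spid,
       blocked,        -- spid of the blocking session, 0 if not blocked
       waittime,       -- current wait time in milliseconds
       lastwaittype,   -- name of the last or current wait type
       waitresource,   -- resource being waited on, if any
       status,
       cmd,
       program_name,   -- SQL Agent connections typically identify themselves here
       last_batch      -- when the session last started a batch
FROM   master.dbo.sysprocesses
WHERE  blocked <> 0
   OR  waittime > 0
ORDER BY waittime DESC;
```

Rerunning the query every few seconds alongside the trace makes it easier to line up a nonzero blocked value or a growing waittime with a statement that started but never completed.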