Re: Bugcheck 101
- From: Alberto <Alberto@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Fri, 23 Jan 2009 13:54:04 -0800
Hi, Stefan,
Sorry for the delay in answering! We haven't fully characterized the speed
of the Vp2000 yet; however, a 4-node Vp2000 with 4 GB of memory can spin a
512x512x512 16-bit CT dataset at 20 frames per second. Actually, the
AquariusNet application has a 3D viewer that's awesome to use: you see real
3D images of real human beings, and the software allows you to cut through
volumes, make holes in them, embed other images into them, and do all kinds
of other manipulations. But all of this is too high level for me! I'm just
the driver guy.
Alberto.
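
For a rough sense of scale, the figure quoted above can be turned into a data rate. This is only a back-of-the-envelope sketch: it assumes each 16-bit voxel occupies two bytes and that every voxel is touched once per frame, neither of which is stated in the post.

```python
# Back-of-the-envelope data rate for the quoted figure.
DIM = 512                # 512 x 512 x 512 voxels (from the post)
BYTES_PER_VOXEL = 2      # assumption: a 16-bit voxel is stored in 2 bytes
FRAMES_PER_SECOND = 20   # from the post


def volume_bytes(x, y, z, bytes_per_voxel):
    """Size of an uncompressed volume in bytes."""
    return x * y * z * bytes_per_voxel


size = volume_bytes(DIM, DIM, DIM, BYTES_PER_VOXEL)
# Assumption: the whole volume is traversed once for every rendered frame.
rate = size * FRAMES_PER_SECOND

print(f"volume size: {size / 2**20:.0f} MiB")
print(f"data touched per second: {rate / 2**30:.1f} GiB/s")
```

Under those assumptions the dataset is 256 MiB and the board would be traversing about 5 GiB of voxel data per second.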
"stefanbanev@xxxxxxxxx" wrote:
Hello Alberto,
Thanks for some insightful information about the status of VolumePro 2000
development.
A> I wouldn't have bothered too much about this crash, because our
A> Vp2000 board doesn't appear anywhere when I look into it with
A> Windbg; it looks more like an OS bug than anything else.
Definitely OS bugs are quite annoying but customers are so picky ;o)
Unfortunately tech info (link below) regarding board performance is
very limited:
http://www.terarecon.com/downloads/products/datasheet_VP2000_120506.pdf
Besides native perspective volume rendering, what performance
enhancement may we expect vs. VolumePro 1000? In particular, what level of
super-sampling (IC case) for interactive rendering of, let's say, a
4000x512x512 volume (12 bits/pixel) with a 1024x1024 viewport?
It took quite a while to develop this board, and it seems to be finally
getting close... When do you expect the VolumePro 2000 board to be
technically ready for deployment?
Thanks for doing a great job...,
Stefan
On Dec 9, 6:34 am, alberto <amore...@xxxxxxxx> wrote:
Windbg shows processor 0 doing something at IRQL 13 and all other 7
processors halted. The actual function or thread in Processor 0's
stack varies from crash to crash, but it's always some memory
management thread running under the system process, for example, the
zero memory thread, working set balance, or similar. Everything looks
quite normal, and !analyze -v doesn't give much information. If you
Google for "Bugcheck 101", you will see a few reports by !analyze -v
that say that an unknown device driver generated the bugcheck, and
that's what I invariably get.
I wouldn't have bothered too much about this crash, because our Vp2000
board doesn't appear anywhere when I look into it with Windbg; it
looks more like an OS bug than anything else. However, the bugcheck
doesn't happen unless our Vp2000 board is running. The system has a
Vp1000 board and a Vp2000 board, side by side: when we run the app on
the Vp1000 board, things go ok, but when we run on the Vp2000 board we
get the crash. The bugcheck 101 happens both in Vista64 and XP64, and
it has been reported by many people outside the development
organization; but it does not happen on a 32-bit system. We thought it
was a power issue, and we powered boards from outside; no change in
behavior. We thought it was a heat problem and put a big fan near the
machine; no change in behavior. We turned Verifier on; no change in
behavior. In fact, turning Verifier on unearthed a minor IRP snag in
the Vp1000 driver that has probably been there for the last 5 years or
so, but no issues with the Vp2000 driver!
I'm trying to catch the issue upstream, that is, before we get to the
bugcheck. What happens is, the app is running in the system - for
example, rotating a 3d image - and at some point in time it freezes.
We see the system visibly upset with things, the mouse moves jerkily
and slowly, keystrokes are delayed, and after 10 or 15 seconds, bang,
we get the bugcheck.
Alberto.
On Dec 8, 11:20 pm, "Alexander Grigoriev" <al...@xxxxxxxxxxxxx> wrote:
If you check call stack locations in disassembly window, do you see any
meaningful commands there?
The HLT instruction provides a way to put an idle processor into a
lower-power state (C1).
Connect to the system with a debugger and run !analyze -v
"alberto" <amore...@xxxxxxxx> wrote in message
news:fe0ef786-f26f-4a08-9616-08f176b23eff@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Yet that's the processor whose PRCB is given by the bluescreen
parameters.
At the time of the crash the processor isn't idle, it's actually
halted. The intelppm.sys driver issues a hlt instruction. Interrupts
at that point are actually enabled.
My current reading is that somehow the timer interrupt got lost,
either because another processor is stuck at a high IRQL or because
something happened at hardware level that caused the interrupt not to
be generated.
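
The "lost timer interrupt" reading can be illustrated with a toy model. This is a sketch of the general watchdog idea only, not the Windows kernel's implementation; the function name, the trace, and the limit of 5 ticks are all made up for illustration.

```python
# Toy model of a clock watchdog. NOT how the Windows kernel implements it.
def first_timeout(serviced, limit):
    """Find the first CPU that misses `limit` clock ticks in a row.

    serviced[t][cpu] is True if `cpu` handled clock tick `t`.
    Returns (tick, cpu) at the moment the limit is reached, or None.
    """
    if not serviced:
        return None
    missed = [0] * len(serviced[0])
    for tick, row in enumerate(serviced):
        for cpu, ok in enumerate(row):
            # A serviced tick resets the counter; a missed one increments it.
            missed[cpu] = 0 if ok else missed[cpu] + 1
            if missed[cpu] >= limit:
                return tick, cpu
    return None


# Hypothetical trace: 4 CPUs, CPU 3 stops seeing clock ticks from tick 2 on.
trace = [[True, True, True, t < 2] for t in range(10)]
print(first_timeout(trace, limit=5))
```

In this model a tick masked by a CPU stuck at high IRQL and a tick the hardware never delivered look identical: both are just a missed tick. That is consistent with the dump showing only the symptom and not the cause.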
I ran it under Verifier; nothing changed. I was somehow hopeful that
memory corruption was to blame, but no cigar!
Alberto.
On Dec 5, 10:06 pm, "Alexander Grigoriev" <al...@xxxxxxxxxxxxx> wrote:
This is the stack of an idle processor, doing nothing. intelppm is a
processor-specific driver that provides, among other things, a
power-saving idle loop.
Your problem is on a different processor.
"alberto" <amore...@xxxxxxxx> wrote in message
news:74926d8c-808e-4d7f-baca-fb2709c5f3b7@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
This is what the stack in the hung processor looks like:
fffffa60`00b92685 : 00000000`dfe5f0a2 00000000`00000008 fffffa60`017d8180 fffff800`02055979 : intelppm!C1Halt+0x2
fffff800`0208f7f8 : fffffa60`017db580 fffffa60`017e1d40 fffffa60`0000040e fffffa60`017ffd40 : intelppm!C1Idle+0x9
fffff800`0207eb21 : fffffa60`017d8180 00000000`00061c82 00000000`00000000 00000000`00000000 : nt!PoIdle+0x148
fffff800`0224c5c0 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x21
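
A listing like this can be split into frames mechanically, which helps when comparing stacks from several crashes. The sketch below is a hypothetical helper, not a WinDbg feature. It assumes the first column is the return address, followed by four arguments and the call site (the usual WinDbg layout with the Child-SP column missing), and it tolerates frames wrapped across lines, as newsreaders often do.

```python
import re

# The stack text from the post, with frames wrapped across two lines each.
RAW = """\
fffffa60`00b92685 : 00000000`dfe5f0a2 00000000`00000008
fffffa60`017d8180 fffff800`02055979 : intelppm!C1Halt+0x2
fffff800`0208f7f8 : fffffa60`017db580 fffffa60`017e1d40
fffffa60`0000040e fffffa60`017ffd40 : intelppm!C1Idle+0x9
fffff800`0207eb21 : fffffa60`017d8180 00000000`00061c82
00000000`00000000 00000000`00000000 : nt!PoIdle+0x148
fffff800`0224c5c0 : 00000000`00000000 00000000`00000000
00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x21
"""

# Assumed layout: <return address> : <arg1> <arg2> <arg3> <arg4> : <call site>
FRAME = re.compile(r"^(\S+) : (\S+) (\S+) (\S+) (\S+) : (\S+)$")


def parse_frames(text):
    """Rejoin wrapped lines and split each frame into its fields."""
    frames = []
    pending = ""
    for line in text.splitlines():
        # Accumulate lines until the buffer matches one complete frame.
        pending = f"{pending} {line.strip()}".strip()
        match = FRAME.match(pending)
        if match:
            frames.append({
                "ret": match.group(1),
                "args": list(match.group(2, 3, 4, 5)),
                "site": match.group(6),
            })
            pending = ""
    return frames


frames = parse_frames(RAW)
sites = [frame["site"] for frame in frames]
print(sites)
```

Here every call site belongs to intelppm or the kernel's idle path, which is what you would look for when deciding whether a processor was idle.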
I was assuming that the problem was created by the other processor,
but, thanks! This gives me new food for thought. I'll disable
intelppm.sys and see what happens!
Alberto.
On Dec 5, 4:07 pm, "Scott Noone" <sno...@xxxxxxx> wrote:
I've never actually hit this bugcheck, but I'll bite.
The bugcheck information should show the hung processor. Have you looked at
the call stack on that processor to see why it's locked up?
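
As a reminder of where that information lives: Microsoft documents the four parameters of bug check 0x101 (CLOCK_WATCHDOG_TIMEOUT) as the time-out interval in nominal clock ticks, zero, the PRCB address of the unresponsive processor, and the index of the hung processor. A minimal sketch that labels them follows; the type name, helper names, and sample values are hypothetical and not taken from any real dump.

```python
from typing import NamedTuple


class ClockWatchdogTimeout(NamedTuple):
    """Labels for the four documented parameters of bug check 0x101."""
    timeout_ticks: int    # parameter 1: time-out interval, nominal clock ticks
    reserved: int         # parameter 2: documented as 0
    prcb_address: int     # parameter 3: PRCB of the unresponsive processor
    hung_processor: int   # parameter 4: index of the hung processor


def parse_windbg_addr(text):
    """Convert a WinDbg-style 64-bit value such as fffffa60`017d8180 to int."""
    return int(text.replace("`", ""), 16)


def decode_bugcheck_101(p1, p2, p3, p4):
    """Attach names to the raw parameters; no validation is attempted."""
    return ClockWatchdogTimeout(p1, p2, p3, p4)


# Hypothetical parameter values, for illustration only.
bc = decode_bugcheck_101(0x61, 0, parse_windbg_addr("fffffa60`01234180"), 3)
print(f"hung processor: {bc.hung_processor}, PRCB at {bc.prcb_address:#x}")
# In a kernel debugger, "~3s" followed by "kv" would then switch to that
# processor and show its stack.
```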
-scott
--
Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com
"Alberto" <more...@xxxxxxxxxxxxx> wrote in message
news:38960766-493e-4629-8459-02f4e82d662f@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hi, All,
I bumped into a nasty, hard to debug crash. It's a Bugcheck 101. It
happens when we run one of our products on our Vp2000 volume rendering
board: after we play with images for ten or fifteen minutes, the
machine gets unresponsive and after a few more seconds we get the blue
screen. This is a 4-processor Dell 5400 with hyperthreading on and
running 64-bit Vista. There's a lot going on in there at the time of
the crash, and all 8 virtual processors are busy at that time.
By the time the dump gets taken, the system's long gone into la-la
land, and there isn't much in the dump that's useful to diagnose
what's going on. The crash is supposed to be a processor timeout
waiting for a timer interrupt, and while processor 3 is the timed out
processor, a thread in processor 0 seems to be at IRQL 13, which is
the level for the Amd64 timer interrupt. If that's a sustained
situation, that might explain what's going on, although actually
tracking it requires more work.
There isn't much on the web about this Bugcheck, except the normal
"make sure your hw is not overheating or this or that" or "download
your latest bios and video drivers". No indication of what in those
new versions might actually have fixed the problem!
My user has Daemon Tools installed, and I hear that they install a
hard-to-get-rid-of driver called sptd.sys. People on the web say that
sptd.sys sometimes interacts with the rest of the system in ways that
end up generating a Bugcheck 101. My user uninstalled Daemon Tools but
the crash is still there, and I'm pretty sure that sptd.sys has not
been disabled.
My question is, do any of you have any experience with this Bugcheck
that you might be willing to share? At this point, any information,
however minor, will be highly appreciated!
Thanks,
Alberto.