Re: Help to evaluate the system crash resistance method



As someone who has worked in the fault-tolerant part of the computer
industry, I can tell you the problem is a lot harder than you imagine. I know
of a number of companies working on potential solutions (I am a founder of
one of them), but you are not going to see their technology discussed here;
you won't get that without an NDA.

What I can say is that you either have to wrap a layer of protection around
the whole driver (such as moving it into its own address space) or provide a
way to capture enough system state to go back to a point before the crash and
do something to avert it. Neither of these is a small task like tweaking an
IDT entry, and neither can easily be explained in a newsgroup.
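
To make the first option a little more concrete, here is a minimal user-mode
sketch of the "own address space" idea. It is only an analogy under stated
assumptions: it uses POSIX fork() and waitpid() rather than anything in the
Windows kernel, it is not any vendor's technology, and the names
run_isolated, isolated_result, good_driver and bad_driver are hypothetical.
The point it illustrates is that a fault inside the isolated routine kills
only the child process, and the caller gets to observe the failure and decide
what to do next.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical "driver" routine: takes an argument, returns a status 0..255. */
typedef int (*driver_fn)(int arg);

/* What the supervisor learns about one isolated call. */
typedef struct {
    int crashed;   /* 1 if the child was terminated by a signal */
    int signal_no; /* terminating signal when crashed == 1, else 0 */
    int status;    /* exit status when the child exited normally, else -1 */
} isolated_result;

/*
 * Run fn(arg) in a child process, i.e. in its own address space.
 * Returns 0 if the call could be supervised, -1 if fork/waitpid failed.
 */
int run_isolated(driver_fn fn, int arg, isolated_result *out)
{
    assert(fn != NULL && out != NULL);

    fflush(NULL); /* do not duplicate buffered stdio output in the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: run the untrusted routine and report its status. */
        _exit(fn(arg) & 0xff);
    }

    /* Parent: wait for the child and classify how it ended. */
    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;

    out->crashed = WIFSIGNALED(wstatus) ? 1 : 0;
    out->signal_no = out->crashed ? WTERMSIG(wstatus) : 0;
    out->status = WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
    return 0;
}

/* A well-behaved routine. */
int good_driver(int arg)
{
    return arg + 1;
}

/* A buggy routine; abort() stands in for a fault such as a bad pointer. */
int bad_driver(int arg)
{
    (void)arg;
    abort();
    return 0; /* not reached */
}
```

Calling run_isolated(bad_driver, 0, &r) leaves the caller running with
r.crashed set, whereas calling bad_driver directly would take the caller
down with it. Even in this toy form the hard parts are missing: the isolated
code shares no memory with its caller, so every request and result has to be
marshalled across the boundary, and nothing here cleans up hardware or shared
state that the failed routine left half-updated.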


--
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Remove StopSpam from the email to reply

"darkside" <zlfeng@xxxxxxxxxxx> wrote in message
news:u53hE9qGGHA.1100@xxxxxxxxxxxxxxxxxxxxxxx
> Hi,
>
> I just thought of an approach to contain system crashes caused by small
> software problems. Please help me evaluate it:
>
> 1. Hook the IDT, replacing exception vectors such as memory access
> violation and divide by zero with our handling logic.
>
> 2. In our handler, if the IRQL is above dispatch level, transfer control
> to the original OS handler, which will eventually show the blue screen;
> if not, use KeDelayExecutionThread() to hold the problem thread for a
> while, so the kernel reschedules to other threads. In this case, the
> system stays alive instead of going to a blue screen.
>
> I've been fixing kernel bugs for several years, and it is a pity to see
> the system die so often just because of a tiny driver problem. I would
> like to develop a kernel component that can help with this. It is NOT
> meant to handle every system crash; I'm aware that many crashes are so
> severe that holding the thread for a while is of no use. It targets
> software problems such as divide by zero and memory access violations,
> each of which has a separate IDT entry that can be selectively replaced.
>
> My questions are:
> 1. Is there any formal way to get the IDT address and selectively
> replace some IDT entries?
> 2. How long can KeDelayExecutionThread() hold the problem thread in
> practice?
> 3. Will the overall mechanism work when driver code causes a kernel crash?
>
> Thank you!
>
> TR
>

