Re: how to interpret memory dump





Now I don't see how any corruption of the user-mode program or
corruption of the data which the user-mode program feeds to a system call
can result in an incorrect jump INSIDE some instruction of a
kernel-mode procedure -- this looks more like a hardware quirk sending
the cpu to LeaveCriticalSection+4 instead of LeaveCriticalSection to
me. Or is this impression too far-fetched?


The byte code for `call win32k!LeaveCrit (a0002667)` is `e8 6a fc ff ff`.
It is encoded with an EIP-relative offset. I'd bet that by flipping one bit
at a time in the offset you can easily get the `+4` displacement.
This looks like single-bit corruption of a code page.
Unless you have ECC memory with MCA events,
it's hard to make further progress.
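
The bet above can be checked with a little arithmetic. The following is a minimal sketch, assuming the call site address a00029f8 and the bytes `e8 6a fc ff ff` from the quoted disassembly; the helper names `call_target` and `single_bit_flips` are made up for illustration. It decodes the relative call and tries every one-bit flip in the 32-bit offset to see whether any of them lands on LeaveCrit+4.

```python
# Addresses and bytes come from the quoted disassembly; helper names are hypothetical.
CALL_ADDR = 0xA00029F8                     # address of the call instruction
CALL_BYTES = bytes.fromhex("e86afcffff")   # opcode e8 followed by a little-endian rel32
LEAVE_CRIT = 0xA0002667                    # win32k!LeaveCrit


def call_target(addr: int, insn: bytes) -> int:
    """Decode the target of a 5-byte `e8 rel32` near call."""
    assert len(insn) == 5 and insn[0] == 0xE8
    rel32 = int.from_bytes(insn[1:], "little", signed=True)
    # The offset is relative to the address of the NEXT instruction.
    return (addr + len(insn) + rel32) & 0xFFFFFFFF


def single_bit_flips(addr: int, insn: bytes):
    """Yield (byte_index, bit, target) for each one-bit flip in the rel32 field."""
    for i in range(1, 5):
        for bit in range(8):
            flipped = bytearray(insn)
            flipped[i] ^= 1 << bit
            yield i, bit, call_target(addr, bytes(flipped))


intact_target = call_target(CALL_ADDR, CALL_BYTES)
hits = [(i, bit) for i, bit, target in single_bit_flips(CALL_ADDR, CALL_BYTES)
        if target == LEAVE_CRIT + 4]

print(hex(intact_target))   # target of the uncorrupted call
print(hits)                 # (byte index, bit number) of flips that reach LeaveCrit+4
```

Under these assumptions exactly one flip qualifies: bit 2 of the first offset byte, which turns `6a` into `6e`. That is consistent with a single-bit error in the code page, although it does not prove one.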

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of any included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm


"Dirk Zabel" <dzabel@xxxxxxxxxxxxxxxx> wrote in message
news:8FF85CDB-71F8-4D1E-A25F-9B66B5B03841@xxxxxxxxxxxxxxxx
Hi,
first of all, thanks to all who responded.
I played around with the disassembly command and got the following
information:
the instruction '18 a0 ff 15 b0 6b' is indeed suspicious, as Ivan
Brugiolo wrote. The address is at a00029fd = LeaveCrit+0x4, i.e. not
inside my binary. This address is not at the beginning but in the middle
of the first instruction of LeaveCriticalSection.

If I disassemble LeaveCriticalSection, I see:
win32k!LeaveCrit:
a0002667 8b0de02318a0 mov ecx,dword ptr [win32k!gpresUser (a01823e0)]
a000266d ff15b06b17a0 call dword ptr [win32k!_imp_ExReleaseResourceLite (a0176bb0)]

This explains the MISALIGNED_IP line in the output of the !analyze -
command.
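
To make the misalignment concrete, here is a small sketch, assuming only the two instruction encodings listed above for win32k!LeaveCrit. It shows which bytes the CPU would fetch if execution started at LeaveCrit+4 rather than at an instruction boundary. The variable names are illustrative.

```python
# Bytes of the two instructions shown in the LeaveCrit disassembly above.
LEAVE_CRIT = 0xA0002667
insn_lengths = [6, 6]   # mov ecx,[...] and call dword ptr [...]
code = bytes.fromhex("8b0de02318a0" "ff15b06b17a0")

# Offsets (relative to LeaveCrit) at which a real instruction begins.
boundaries = [sum(insn_lengths[:i]) for i in range(len(insn_lengths))]

fault_offset = 4   # LeaveCrit+4, the address reported as suspicious
fetched = code[fault_offset:fault_offset + 6]

print(boundaries)                   # valid instruction starts
print(fault_offset in boundaries)   # whether +4 is a valid start
print(fetched.hex(" "))             # bytes seen when decoding from LeaveCrit+4
```

Offset 4 falls inside the first `mov`, so the decoder sees the tail of its address operand followed by the start of the `call`, which is the '18 a0 ff 15 b0 6b' sequence quoted earlier.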

Unfortunately, the whole stack trace does not include instructions
from user mode address space, so I cannot see what user program called
KiThreadStartup eventually.

The next lower stack frame reads

bfd6fb5c a00ad527 00000000 00000000 00000000 win32k!TimersProc+0x133
and disassembly of a00ad520 gives

a00ad520 75a6 jne win32k!RawInputThread+0x4ea (a00ad4c8)
a00ad522 e8ab53f5ff call win32k!TimersProc (a00028d2)
a00ad527 a1dc9e18a0 mov eax,dword ptr [win32k!gnRetryReadInput (a0189edc)]
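
As a sanity check on this frame, the return address can be compared with the call that precedes it. This is a rough sketch using only the addresses and bytes from the listing above: a near `e8 rel32` call is 5 bytes long, so the return address should be the call address plus 5, and the decoded target should be the start of TimersProc.

```python
# Values copied from the stack frame and disassembly above.
RETURN_ADDR = 0xA00AD527                   # RetAddr column of the stack frame
CALL_ADDR = 0xA00AD522                     # instruction just before the return address
CALL_BYTES = bytes.fromhex("e8ab53f5ff")   # call win32k!TimersProc
TIMERS_PROC = 0xA00028D2

# Signed 32-bit displacement, relative to the instruction after the call.
rel32 = int.from_bytes(CALL_BYTES[1:], "little", signed=True)
target = (CALL_ADDR + len(CALL_BYTES) + rel32) & 0xFFFFFFFF

print(CALL_ADDR + len(CALL_BYTES) == RETURN_ADDR)   # return address follows the call
print(hex(target))                                  # decoded call target
```

Both checks hold for this frame, so this particular call site and its return address look intact in the dump.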

Disassembly of TimersProc begins with
win32k!TimersProc:
a00028d2 55 push ebp
a00028d3 8bec mov ebp,esp
a00028d5 83ec0c sub esp,0Ch
a00028d8 53 push ebx
a00028d9 56 push esi
a00028da 57 push edi
a00028db e8b8fdffff call win32k!EnterCrit (a0002698)
a00028e0 ba0000fe7f mov edx,offset SharedUserData (7ffe0000)
a00028e5 8b02 mov eax,dword ptr [edx]
a00028e7 f76204 mul eax,dword ptr [edx+4]
a00028ea 0facd018 shrd eax,edx,18h
a00028ee 8b350c2518a0 mov esi,dword ptr [win32k!gptmrFirst (a018250c)]
a00028f4 8bf8 mov edi,eax

i.e. TimersProc calls EnterCrit(icalSection?), some pages later I see

a00029db 99 cdq
a00029dc 68f0d8ffff push 0FFFFD8F0h
a00029e1 52 push edx
a00029e2 50 push eax
a00029e3 e8dcfcffff call win32k!_allmul (a00026c4)
a00029e8 6a00 push 0
a00029ea 52 push edx
a00029eb 50 push eax
a00029ec ff35042518a0 push dword ptr [win32k!gptmrMaster (a0182504)]
a00029f2 ff15ec6d17a0 call dword ptr [win32k!_imp__KeSetTimer (a0176dec)]
a00029f8 e86afcffff call win32k!LeaveCrit (a0002667)
a00029fd 5f pop edi
a00029fe 5e pop esi
a00029ff 5b pop ebx
a0002a00 c9 leave
a0002a01 c3 ret


Now I don't see how any corruption of the user-mode program or
corruption of the data which the user-mode program feeds to a system call
can result in an incorrect jump INSIDE some instruction of a
kernel-mode procedure -- this looks more like a hardware quirk sending
the cpu to LeaveCriticalSection+4 instead of LeaveCriticalSection to
me. Or is this impression too far-fetched?

Another question, though: I could not get an overview of which
processes were active when the fault occurred. I had thought the
command View | Processes and Threads does this. The result is only a
window showing (transcribed as ASCII art):
[-] 000:f0f0f0f0 ntoskrnl.exe
+-000:1

This does not seem to be a problem of this particular dump, however, as I
got the same when I deliberately produced a dump from another W2k
system running inside VirtualPC (using NotMyFault from Mark
Russinovich), loaded the resulting full memory dump into windbg and
tried the "Processes and Threads" command on this dump. I guess I am
doing something simple wrong, but what? I did set up windbg to use the
MS symbol server and the symbol cache seems to be ok.

Thank you for any comments,

- Dirk






