Heap corruption error.



Hi all,

I'm battling a heap corruption error. I know you're probably thinking
it's either a buffer overrun or writing to a freed block. I've gone
after enough of these to know that's what it usually is, and I know
how to find and fix such errors. However, this one appears to be
different, and I'll explain why.

First of all, I'm using MS Visual Studio 6.0.

When the error manifests, the debug heap's linked list of allocated
objects is corrupted. More specifically, following the blocks'
pBlockHeaderNext pointers quickly takes me to heap memory that is not
a block. I can often, but not always, find the real next block close
to where the pointer says it should be. I know it's the next block
because its pBlockHeaderPrev pointer points to the block I just came
from. Sometimes it's 8 or 16 bytes away. Usually more. Occasionally I
can't find it at all.
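
The "next block's Prev points back at me" test can be written down as
a small check. This is only a sketch: BlockHeader is a simplified,
hypothetical stand-in that models just the two link fields named
above, and find_first_bad_link and maxBlocks are names made up for
illustration.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for the debug heap's block header (assumption:
// only the two link fields from the post are modeled).
struct BlockHeader {
    BlockHeader* pBlockHeaderNext;  // toward older allocations
    BlockHeader* pBlockHeaderPrev;  // toward newer allocations
};

// Walk from `first` along pBlockHeaderNext and return the first block
// whose successor does not point back at it, or NULL if every link
// visited is consistent. `maxBlocks` bounds the walk so that a cycle
// cannot loop forever.
const BlockHeader* find_first_bad_link(const BlockHeader* first,
                                       std::size_t maxBlocks) {
    const BlockHeader* cur = first;
    for (std::size_t i = 0; cur != NULL && i < maxBlocks; ++i) {
        const BlockHeader* next = cur->pBlockHeaderNext;
        if (next != NULL && next->pBlockHeaderPrev != cur) {
            return cur;  // cur's Next does not lead to a block that links back
        }
        cur = next;
    }
    return NULL;
}
```

On a real corrupted heap the Next pointer may lead into arbitrary
memory, so a check like this is safer run against a memory dump or
from the debugger than inside the failing process.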

The pBlockHeaderNext pointer looks valid in that its most significant
byte is 04, like all of my heap's blocks, and the least significant
byte is 00 or 08, since the heap blocks are on 8-byte boundaries. So
the pointer looks valid; it just doesn't point to anything
recognizable as a block.
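
For reference, the "looks valid" test amounts to something like the
sketch below. The function name is hypothetical, and the 0x04 top
byte is an observation about this one process on a 32-bit build, not
a general rule. The alignment part only constrains the low three bits
of the address.

```cpp
#include <cassert>
#include <stdint.h>

// Heuristic only: in the process described here, heap blocks had 0x04
// in the top byte of a 32-bit address and sat on 8-byte boundaries.
bool looks_like_heap_pointer(uint32_t addr) {
    const bool topByteMatches = (addr >> 24) == 0x04u;
    const bool aligned8 = (addr & 0x7u) == 0;
    return topByteMatches && aligned8;
}
```

A pointer can pass this test and still not be the address of a block
header, which is exactly the situation described above.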

In the cases where I could find the next block, in all but one case
the pBlockHeaderPrev links were fine. So following the linked list
from most recent allocation back was broken, but following the list
from earliest allocation forward was fine. In the one case where the
pBlockHeaderPrev wasn't fine, I found the Prev pointer in the middle
of another block (corrupting it, of course).

I've also had one case where the head of the linked list,
_pFirstBlock, pointed into the heap, but not at a block. I'm guessing
that the program was freeing the last allocated block, and that
block's pBlockHeaderNext was already spurious, and the free operation
copied the block's spurious pBlockHeaderNext to _pFirstBlock.
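
That guess can be modeled with a textbook doubly linked list unlink.
The sketch below is not the CRT's source; BlockHeader, BlockList and
unlink_block are hypothetical names, and only the two link fields are
modeled.

```cpp
#include <cassert>
#include <cstddef>

struct BlockHeader {
    BlockHeader* pBlockHeaderNext;
    BlockHeader* pBlockHeaderPrev;
};

// Hypothetical list state; the field names mirror the post.
struct BlockList {
    BlockHeader* pFirstBlock;  // most recently allocated block
    BlockHeader* pLastBlock;   // oldest block
};

// Textbook unlink. It trusts the links stored in `blk` without
// validating them, so a bad Next in `blk` is propagated.
void unlink_block(BlockList& list, BlockHeader* blk) {
    if (blk->pBlockHeaderNext != NULL)
        blk->pBlockHeaderNext->pBlockHeaderPrev = blk->pBlockHeaderPrev;
    else
        list.pLastBlock = blk->pBlockHeaderPrev;

    if (blk->pBlockHeaderPrev != NULL)
        blk->pBlockHeaderPrev->pBlockHeaderNext = blk->pBlockHeaderNext;
    else
        list.pFirstBlock = blk->pBlockHeaderNext;  // head inherits blk's Next
}
```

In this model, unlinking the head block when its Next is already
spurious leaves the list head pointing at the spurious address. The
first branch also stores a Prev value through that same pointer, which
would be one way for a Prev-like value to land inside unrelated
memory.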

So I don't think it's a buffer overrun because:
a) If the block containing the pBlockHeaderNext was overrun onto, that
block should look trashed. There's very little chance the
pBlockHeaderNext pointer would still look like a valid heap pointer.
Even if just the first byte of that block was overwritten (the Next
pointer's LSB), it would have to be changing a 00 to an 08 or vice
versa to cause a problem and still look like a valid heap pointer.
b) If the next block were overrun so that it was no longer
recognizable as a block, I would never be able to find the block
elsewhere in the heap.
c) I haven't found a block yet that had its no-man's-land bytes
overwritten. I've gotten pretty good at spotting no-man's-land bytes,
and using them to find the start and end of blocks.
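
For point c), the guard check amounts to the sketch below. It assumes
the MSVC debug heap's usual layout of four 0xFD bytes on each side of
the user data; guards_intact and the two constant names are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed debug-heap guard layout: 4 bytes of 0xFD before and after
// the user data.
const unsigned char kNoMansLandFill = 0xFD;
const std::size_t kNoMansLandSize = 4;

// Returns true if the guard bytes immediately before and after a user
// region of `size` bytes starting at `data` are all intact.
bool guards_intact(const unsigned char* data, std::size_t size) {
    const unsigned char* before = data - kNoMansLandSize;
    const unsigned char* after = data + size;
    for (std::size_t i = 0; i < kNoMansLandSize; ++i) {
        if (before[i] != kNoMansLandFill) return false;  // underrun
        if (after[i] != kNoMansLandFill) return false;   // overrun
    }
    return true;
}
```

The caller has to guarantee that `data` really is preceded and
followed by guard regions; the function cannot verify that itself.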

It may be a write-after-free, in which the freed area has been
reallocated and the write is corrupting the linked list. Again, I
think that it would be extremely unlikely that writing over the
pBlockHeaderPrev would result in something that still looks like a
valid heap pointer. Also, I've tried setting the
_CRTDBG_DELAY_FREE_MEM_DF flag. This causes the bug to stop
manifesting, and ASSERT(AfxCheckMemory()), which should be checking to
make sure the freed blocks contain 0xdd, never trips.
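
For anyone who wants to reproduce the setup, a minimal sketch of
turning the flag on through the debug CRT follows. The wrapper names
enable_delayed_free and heap_is_consistent are hypothetical, and
_CrtCheckMemory is used here as the plain CRT counterpart of the MFC
call above. The guards make the code a no-op outside an MSVC debug
build.

```cpp
#include <cassert>
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Turns on delayed frees in the MSVC debug heap. Returns true if the
// debug CRT is available; otherwise does nothing and returns false.
bool enable_delayed_free() {
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);  // read current flags
    flags |= _CRTDBG_ALLOC_MEM_DF | _CRTDBG_DELAY_FREE_MEM_DF;
    _CrtSetDbgFlag(flags);
    return true;
#else
    return false;
#endif
}

// Asks the debug CRT to validate the heap. Returns true when the heap
// looks consistent, or when there is no debug CRT to ask.
bool heap_is_consistent() {
#if defined(_MSC_VER) && defined(_DEBUG)
    return _CrtCheckMemory() != 0;
#else
    return true;
#endif
}
```

With this flag set, freed blocks stay in the heap's linked list and
are filled with 0xDD instead of being unlinked and reused, so the
pattern of list updates is different from a normal run. That may be
part of why the bug stops manifesting.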

So this is looking like a type of heap corruption that I've not seen
before. It almost looks like a concurrency issue in which an
allocation and a free, or two frees, occur in different threads and
corrupt the updating of the linked list pointers. However, my
application and all of my libraries are using the Multithreaded Debug
DLL, so allocs and frees should be protected from concurrent access.
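
To illustrate the kind of damage meant here, the sketch below replays,
on a single thread, one possible interleaving of two unlocked
insertions at the head of the list. It is a hypothetical model, not a
reproduction of the actual bug: BlockHeader and replay_racy_inserts
are made-up names, and the real heap does take a lock.

```cpp
#include <cassert>
#include <cstddef>

struct BlockHeader {
    BlockHeader* pBlockHeaderNext;
    BlockHeader* pBlockHeaderPrev;
};

// Replays one interleaving of two unsynchronized head insertions of
// blocks `a` and `b` in front of `oldHead`. Returns the final head.
BlockHeader* replay_racy_inserts(BlockHeader* oldHead,
                                 BlockHeader* a, BlockHeader* b) {
    BlockHeader* head = oldHead;

    // Both "threads" read the head before either one publishes.
    BlockHeader* seenByA = head;
    BlockHeader* seenByB = head;

    // Thread B links itself in front of the head it saw.
    b->pBlockHeaderNext = seenByB;
    b->pBlockHeaderPrev = NULL;
    seenByB->pBlockHeaderPrev = b;

    // Thread A does the same, overwriting oldHead's Prev.
    a->pBlockHeaderNext = seenByA;
    a->pBlockHeaderPrev = NULL;
    seenByA->pBlockHeaderPrev = a;

    // A publishes first, then B overwrites the head.
    head = a;
    head = b;
    return head;
}
```

After this replay the head is `b` and its Next is `oldHead`, but
`oldHead`'s Prev points at `a`, so the forward and backward links
disagree and `a` is no longer reachable from the head. This shows
links going out of step; it does not by itself produce a Next pointer
into non-block memory.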

If you have any insight into this, or any suggestions, or if you can
find any holes in my logic, please let me know.

Thanks in advance.
