Operator new failing in windows after several days of operation
- From: Adrian Vacaliuc <adrian.vacaliuc@xxxxxxxxxxxxxxx>
- Date: Tue, 09 Aug 2005 10:25:56 -0700
I've been trying to track down a possible heap corruption/operator new
bug for the last three weeks. I'm working on a large Windows
application that is controlling several sensors, receiving
Command/Control information and reporting target information to
another computer via XML messages over a socket. The problem is that
under extremely heavy load, the program crashes with an access
violation after approximately 3 days of operation.
The project makes heavy use of OO techniques, and there are many small
objects that are constantly being allocated on the heap. In
particular, our application generates many instances of a std::string
as part of building outgoing messages. We're using Xerces v.2.6 as our
XML parser. The application manages to make over 4 billion memory
allocations/deallocations in around 24-36 hours. I know this because
if I run the application under a debug build, it trips the debug
breakpoint in the C runtime under Windows when its internal
allocation counter rolls over. (What a wonderful timebomb, btw)
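As a rough check on those numbers, the arithmetic can be sketched as follows. It assumes the debug runtime's allocation counter is 32 bits wide, which the "over 4 billion" figure suggests but the post does not state. The names kCounterRange and allocs_per_second are made up for this illustration.

```cpp
#include <cassert>   // used only by the usage checks
#include <cstdint>

// Assumption: the allocation counter is a 32-bit value, so it wraps
// after 2^32 allocation requests.
constexpr std::uint64_t kCounterRange = std::uint64_t{1} << 32;

// Average allocations per second needed to wrap such a counter
// in the given number of hours.
constexpr double allocs_per_second(double hours) {
    return static_cast<double>(kCounterRange) / (hours * 3600.0);
}
```

Under that assumption, a rollover in 24 to 36 hours works out to somewhere between about 33,000 and 50,000 allocations per second on average.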
Regardless, the application crashes consistently in the constructor of
a std::string, in std::string::_Copy. Here's the disassembly of the
function that it always seems to crash in:
?_Copy@?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AAEXI@Z:
780C1B48 mov eax,offset _DisableThreadLibraryCalls@4+45Ah (780e875c)
780C1B4D call __EH_prolog (780c1535)
780C1B52 sub esp,0Ch
780C1B55 push ebx
780C1B56 push esi
780C1B57 push edi
780C1B58 mov edi,dword ptr [ebp+8]
780C1B5B or edi,1Fh
780C1B5E mov esi,ecx
780C1B60 cmp edi,0FFFFFFFDh
780C1B63 mov dword ptr [ebp-10h],esp
780C1B66 mov dword ptr [ebp-14h],esi
780C1B69 ja std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+23h (780c87c
780C1B6F and dword ptr [ebp-4],0
780C1B73 lea eax,[edi+2]
780C1B76 test eax,eax
780C1B78 jl std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+31h (780c87c
780C1B7E push eax
780C1B7F call operator new (780c1a01)
780C1B84 pop ecx
780C1B85 mov dword ptr [ebp+8],eax
780C1B88 mov eax,dword ptr [esi+8]
780C1B8B or dword ptr [ebp-4],0FFFFFFFFh
780C1B8F test eax,eax
780C1B91 ja std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+6Fh (780c87f
780C1B97 mov ebx,dword ptr [esi+8]
780C1B9A push 1
780C1B9C mov ecx,esi
780C1B9E call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Tidy (780c1a79)
780C1BA3 mov eax,dword ptr [ebp+8]
780C1BA6 inc eax
780C1BA7 mov dword ptr [esi+4],eax
780C1BAA and byte ptr [eax-1],0 <---- Crashes here
780C1BAE cmp ebx,edi
780C1BB0 mov dword ptr [esi+0Ch],edi
780C1BB3 ja std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+0A6h (780c1b
780C1BB5 mov edi,ebx
780C1BB7 mov eax,dword ptr [esi+4]
780C1BBA mov ecx,dword ptr [ebp-0Ch]
780C1BBD mov dword ptr [esi+8],edi
780C1BC0 and byte ptr [eax+edi],0
780C1BC4 pop edi
780C1BC5 pop esi
780C1BC6 mov dword ptr fs:[0],ecx
780C1BCD pop ebx
780C1BCE leave
780C1BCF ret 4
It always crashes at the instruction at address 0x780C1BAA, and the
value of EAX is always 1. From the assembly, you can see that the
return value of operator new is stored at [ebp+8]. Later, after the
call to std::string::_Tidy, that pointer is reloaded into EAX and
incremented, and the byte at [eax-1] is then written to. With EAX
equal to 1, that write goes to address 0, which causes the access
violation.
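The reasoning from "EAX is 1" back to "operator new returned NULL" can be modelled in a few lines of C++. This is only an illustrative sketch, not the library's source: RawAllocFn, failing_alloc and address_after_first_byte are hypothetical names, and the arithmetic is done on integers so the example never forms a pointer from a null base.

```cpp
#include <cassert>   // used only by the usage checks
#include <cstddef>
#include <cstdint>

// Hypothetical allocator signature that reports failure by returning null.
using RawAllocFn = void* (*)(std::size_t);

// Stand-in for an allocator that cannot satisfy the request.
inline void* failing_alloc(std::size_t) { return nullptr; }

// Models "mov eax,[ebp+8]" followed by "inc eax": take the raw result
// of the allocation and step one byte past its start.
inline std::uintptr_t address_after_first_byte(RawAllocFn alloc,
                                               std::size_t bytes) {
    void* raw = alloc(bytes);
    return reinterpret_cast<std::uintptr_t>(raw) + 1;
}
```

With a failing allocator the result is 1, matching the EAX value seen at the crash, and the following write to [eax-1] lands on address 0.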
I think there is very strong evidence that operator new is returning a
NULL pointer on me. The application has a mix of very small
allocations and very large allocations. Right before operator new
returns NULL, a series of exceptions are thrown down in NTDLL in the
RtlAllocateHeap calls. These exceptions are handled down in NTDLL,
and I suspect they eventually lead to the call to operator new
failing. I only saw them because I had first-chance exceptions turned
on.
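If operator new really is returning NULL rather than throwing, one way to make the failure show up at the allocation site instead of at a later write is to check the result and throw. The following is a minimal sketch of that idea with the allocator passed in as a function pointer so a failure can be simulated. All names here (CheckedAllocFn, always_fails, malloc_alloc, checked_alloc, allocation_throws) are hypothetical, and whether a given compiler's operator new throws or returns null depends on the runtime in use.

```cpp
#include <cassert>   // used only by the usage checks
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocator signature that reports failure by returning null.
using CheckedAllocFn = void* (*)(std::size_t);

// Simulated allocator that always fails.
inline void* always_fails(std::size_t) { return nullptr; }

// Thin wrapper around std::malloc, used as the "working" allocator.
inline void* malloc_alloc(std::size_t bytes) { return std::malloc(bytes); }

// Turns a null result into std::bad_alloc so the caller never sees
// a null pointer.
inline void* checked_alloc(CheckedAllocFn alloc, std::size_t bytes) {
    void* p = alloc(bytes);
    if (p == nullptr) {
        throw std::bad_alloc();
    }
    return p;
}

// Returns true if checked_alloc throws for this allocator. Memory that
// was obtained is released with std::free, so the allocator passed in
// must be malloc-compatible or always fail.
inline bool allocation_throws(CheckedAllocFn alloc, std::size_t bytes) {
    try {
        void* p = checked_alloc(alloc, bytes);
        std::free(p);
        return false;
    } catch (const std::bad_alloc&) {
        return true;
    }
}
```

A catch block around the message-building code would then see std::bad_alloc where the allocation failed, which is easier to diagnose than an access violation inside std::string.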
Under heavy load, the application is maintaining about a 70MB memory
footprint, so I know I'm not leaking memory. We use boost shared ptrs
to hold all of our dynamic objects, so I have a very high confidence
I'm not dealing with a double-free kind of issue. We make extensive
use of STL containers, and very rarely use raw memory buffers.
The crash doesn't seem to manifest itself under a debug build, and I
do NOT get heap corruption errors under a debug build. It really just
seems to behave as if operator new failed to allocate some memory.
Are there any known problems with the default implementation of
operator new and memory fragmentation in Windows? The VM Page size in
Task Manager is also showing around a 70MB memory footprint. This
problem seems to consistently happen around the 8-10 billion allocation
mark.
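To illustrate why fragmentation is a plausible suspect even with a small footprint, here is a toy model of a heap as a row of equal-sized slots with first-fit placement. It is a deliberately simplified sketch and says nothing about how the Windows heap is actually implemented. ToyArena and make_fragmented_arena are names invented for this example.

```cpp
#include <cassert>   // used only by the usage checks
#include <cstddef>
#include <vector>

// Toy arena: a fixed row of slots, each either used or free.
class ToyArena {
public:
    explicit ToyArena(std::size_t slots) : used_(slots, false) {}

    // First-fit: marks `count` contiguous free slots as used and returns
    // the index of the first one, or -1 if no free run is long enough.
    long allocate(std::size_t count) {
        if (count == 0) return -1;
        std::size_t run = 0;
        for (std::size_t i = 0; i < used_.size(); ++i) {
            run = used_[i] ? 0 : run + 1;
            if (run == count) {
                const std::size_t first = i + 1 - count;
                for (std::size_t j = first; j <= i; ++j) used_[j] = true;
                return static_cast<long>(first);
            }
        }
        return -1;
    }

    // Marks `count` slots starting at `first` as free again.
    void release(std::size_t first, std::size_t count) {
        for (std::size_t j = first; j < first + count && j < used_.size(); ++j) {
            used_[j] = false;
        }
    }

    // Total number of free slots, contiguous or not.
    std::size_t free_slots() const {
        std::size_t n = 0;
        for (bool u : used_) {
            if (!u) ++n;
        }
        return n;
    }

private:
    std::vector<bool> used_;
};

// Fills the arena with one-slot blocks, then frees every other block.
inline ToyArena make_fragmented_arena(std::size_t slots) {
    ToyArena arena(slots);
    for (std::size_t i = 0; i < slots; ++i) arena.allocate(1);
    for (std::size_t i = 0; i < slots; i += 2) arena.release(i, 1);
    return arena;
}
```

In this model half of the arena is free after the alternating frees, yet a request for two contiguous slots fails because no two free slots are adjacent. A mix of many small and a few very large allocations can produce a similar pattern in a real address space.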
Am I dealing with a heap corruption issue? I don't think that's
what's happening here, because a heap corruption type of problem
usually throws an exception that your application can catch. That
isn't happening here.
Anyone have any ideas? Should I try using a third-party memory
allocator like SmartHeap? Any ideas would be greatly appreciated!
-Adrian