Re: Operator new failing in windows after several days of operation

"Adrian Vacaliuc" <adrian.vacaliuc@xxxxxxxxxxxxxxx> wrote in message
news:ckkhf1hnju705h5r16kg86tk73bllf2g6v@xxxxxxxxxx
> I've been trying to track down a possible heap corruption/operator new
> bug for the last three weeks. I'm working on a large Windows
> application that is controlling several sensors, receiving
> Command/Control information and reporting target information to
> another computer via XML messages over a socket. The problem is that
> under extremely heavy load, the program crashes with an access
> violation after approximately 3 days of operation.
>
> The project makes heavy use of OO techniques, and there are many small
> objects that are constantly being allocated on the heap. In
> particular, our application generates many instances of a std::string
> as part of building outgoing messages. We're using Xerces v.2.6 as our
> XML parser. The application manages to make over 4 billion memory
> allocations/deallocations in around 24-36 hours. I know this because
> if I run the application under a debug build, it trips the debug
> breakpoint in the C runtime under Windows when its internal
> allocation counter rolls over. (What a wonderful timebomb, btw)
>
> Regardless, the application crashes consistently in the constructor of
> a std::string, in std::string::_Copy. Here's the disassembly of the
> function that it always seems to crash in:
>
>
> ?_Copy@?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AAEXI@Z:
> 780C1B48 mov eax,offset _DisableThreadLibraryCalls@4+45Ah (780e875c)
> 780C1B4D call __EH_prolog (780c1535)
> 780C1B52 sub esp,0Ch
> 780C1B55 push ebx
> 780C1B56 push esi
> 780C1B57 push edi
> 780C1B58 mov edi,dword ptr [ebp+8]
> 780C1B5B or edi,1Fh
> 780C1B5E mov esi,ecx
> 780C1B60 cmp edi,0FFFFFFFDh
> 780C1B63 mov dword ptr [ebp-10h],esp
> 780C1B66 mov dword ptr [ebp-14h],esi
> 780C1B69 ja std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+23h (780c87c
> 780C1B6F and dword ptr [ebp-4],0
> 780C1B73 lea eax,[edi+2]
> 780C1B76 test eax,eax
> 780C1B78 jl std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+31h (780c87c
> 780C1B7E push eax
> 780C1B7F call operator new (780c1a01)
> 780C1B84 pop ecx
> 780C1B85 mov dword ptr [ebp+8],eax
> 780C1B88 mov eax,dword ptr [esi+8]
> 780C1B8B or dword ptr [ebp-4],0FFFFFFFFh
> 780C1B8F test eax,eax
> 780C1B91 ja std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+6Fh (780c87f
> 780C1B97 mov ebx,dword ptr [esi+8]
> 780C1B9A push 1
> 780C1B9C mov ecx,esi
> 780C1B9E call std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Tidy (780c1a79)
> 780C1BA3 mov eax,dword ptr [ebp+8]
> 780C1BA6 inc eax
> 780C1BA7 mov dword ptr [esi+4],eax
> 780C1BAA and byte ptr [eax-1],0 <---- Crashes here
> 780C1BAE cmp ebx,edi
> 780C1BB0 mov dword ptr [esi+0Ch],edi
> 780C1BB3 ja std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Copy+0A6h (780c1b
> 780C1BB5 mov edi,ebx
> 780C1BB7 mov eax,dword ptr [esi+4]
> 780C1BBA mov ecx,dword ptr [ebp-0Ch]
> 780C1BBD mov dword ptr [esi+8],edi
> 780C1BC0 and byte ptr [eax+edi],0
> 780C1BC4 pop edi
> 780C1BC5 pop esi
> 780C1BC6 mov dword ptr fs:[0],ecx
> 780C1BCD pop ebx
> 780C1BCE leave
> 780C1BCF ret 4
>
> It always crashes at the instruction at address 0x780C1BAA, and the
> value of EAX is always 1. From the assembly, you can see that the
> return value of operator new is stored at [ebp+8]. Later, after the
> call to std::string::_Tidy, that pointer is read back, incremented,
> and the byte at [eax-1] is written. With EAX equal to 1, that is a
> write to address 0, which causes the access violation.
>
> I think there is very strong evidence that operator new is returning a
> NULL pointer on me. The application has a mix of very small
> allocations and very large allocations. Right before operator new
> returns NULL, a series of exceptions are thrown down in NTDLL in the
> RtlAllocateHeap calls. These exceptions are handled down in NTDLL,
> and I suspect they eventually lead to operator new failing. I
> only saw them because I had first-chance exceptions turned on.
>
> Under heavy load, the application is maintaining about a 70MB memory
> footprint, so I know I'm not leaking memory. We use boost shared ptrs
> to hold all of our dynamic objects, so I have a very high confidence
> I'm not dealing with a double-free kind of issue. We make extensive
> use of STL containers, and very rarely use raw memory buffers.
>
> The crash doesn't seem to manifest itself under a debug build, and I
> do NOT get heap corruption errors under a debug build. It really just
> seems to behave as if operator new failed to allocate some memory.
>
> Are there any known problems with the default implementation of
> operator new and memory fragmentation in Windows? The VM Page size in
> Task Manager is also showing around a 70MB memory footprint. This
> problem seems to consistently happen around the 8-10 billion allocation
> mark.
>
> Am I dealing with a heap corruption issue? I don't think that's
> what's happening here, because a heap corruption type of problem
> usually throws an exception that your application can catch. That
> isn't happening here.
>
> Anyone have any ideas? Should I try using a third-party memory
> allocator like SmartHeap? Any ideas would be greatly appreciated!
>
> -Adrian
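
On the quoted analysis: a write to address 0 right after the call to operator new is consistent with a runtime whose operator new returns a null pointer on failure instead of throwing. Older Visual C++ runtimes behaved that way; whether yours does is an assumption on my part. Below is a minimal sketch, in standard C++, of making such a failure visible at its source rather than at a later write. The names g_new_failures, on_new_failure, and checked_alloc are hypothetical and only for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <new>

// Hypothetical counter of allocation failures seen by the handler below.
static unsigned long g_new_failures = 0;

// New-handler: the throwing operator new calls this when it cannot
// satisfy a request. It logs the failure and then throws, so the problem
// surfaces where the allocation failed.
void on_new_failure()
{
    ++g_new_failures;
    std::fprintf(stderr, "operator new failed (%lu failures so far)\n",
                 g_new_failures);
    throw std::bad_alloc();
}

// Hypothetical helper: allocate a buffer with the nothrow form and report
// a null result explicitly, instead of letting a null pointer reach code
// that writes through it.
char* checked_alloc(std::size_t bytes)
{
    char* p = new (std::nothrow) char[bytes];
    if (p == 0) {
        std::fprintf(stderr, "allocation of %lu bytes returned null\n",
                     static_cast<unsigned long>(bytes));
    }
    return p;
}
```

On a standard-conforming runtime, calling std::set_new_handler(on_new_failure) once at startup would be enough to get a log line at the first failed allocation. If the log never appears before the crash, that would point away from a plain out-of-memory condition.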

How many projects do you have when you build your application?
If more than one, have you checked your code generation settings?
What is the "Use run-time library" setting? Is it the same for all the
projects, and set to "multithreaded" or "multithreaded DLL"?
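
One way to check this from the code itself: a minimal sketch, assuming the Visual C++ predefined macros _MT, _DLL, and _DEBUG (which follow the run-time library option chosen for each project). The function name runtime_library_flavor is hypothetical. Each project could compile its own copy and log the result at startup.

```cpp
#include <string>

// Hypothetical helper: describe which C run-time flavor this translation
// unit was compiled against, based on Visual C++ predefined macros.
// On other compilers the macros are normally absent and the fallback
// string is returned.
std::string runtime_library_flavor()
{
#if defined(_MT) && defined(_DLL)
    std::string flavor = "multithreaded DLL";
#elif defined(_MT)
    std::string flavor = "multithreaded (static)";
#else
    std::string flavor = "single-threaded or non-MSVC";
#endif
#if defined(_DEBUG)
    flavor += " (debug)";
#endif
    return flavor;
}
```

If two modules report different flavors, they are probably linked against different copies of the run-time library, each with its own heap, so memory allocated in one module and freed in another is a likely suspect.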
