Re: String to byte[] reloaded

"Ben Voigt" <rbv@xxxxxxxxxxxxx> wrote in message news:%23nKmeowTHHA.3980@xxxxxxxxxxxxxxxxxxxxxxx

"nano2k" <adrian.rotaru@xxxxxxxxxxx> wrote in message news:1171270149.638283.256770@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hi, thanks all for your replies.
I will answer some of the ideas in this one place.
Indeed, most of them don't allocate, but rely on you to allocate.
So, from my perspective, it's the same.
I am using .NET Framework v1.1 and I need to compress my string.
Unfortunately, the compression library (no sources available to me)
takes a byte[] as its input parameter and I have a string to compress.
It is frustrating that I have to allocate new memory to perform this
operation. This sometimes leads to a webservice crash, as many requests
simultaneously require this operation => not enough memory.
I intend to use the buffer strictly for read-only operations.
I am aware that any "unmanaged" changes in such an intimate buffer
could cause unexpected behavior later.

Here's your loaded gun, plenty of rope to hang yourself (I looked at the C++/CLI function PtrToStringChars):

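string s = "This is a test string";

// Normal (non-pinned) handle: dereference the handle to get the object address,
// then skip the object header to reach the characters.
// The string is not pinned here, so the GC may move it and invalidate this pointer.
GCHandle ptr = GCHandle.Alloc(s);
byte* pString = *(byte**)GCHandle.ToIntPtr(ptr).ToPointer() + System.Runtime.CompilerServices.RuntimeHelpers.OffsetToStringData;
char* c = (char*)pString;
ptr.Free();

// Pinned handle: in my tests AddrOfPinnedObject already points at the character data.
// The pointer is only valid until the handle is freed.
GCHandle pinptr = GCHandle.Alloc(s, GCHandleType.Pinned);
pString = (byte*)pinptr.AddrOfPinnedObject().ToPointer();
c = (char*)pString;
pinptr.Free();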



Note that the Large Object Heap is not compacted by the garbage collector (its objects are collected, but not moved), in which case you ought not need to pin the object.

BTW, OffsetToStringData is 12 (at least in .NET 2.0), but it's a real property, not a constant, so the value comes from your actual runtime library at run time, not from whatever you compiled against. Of course, the JIT will inline that property access to nothing anyway.

The PtrToStringChars code is the same in both .NET 1.1 and 2.0, so this might be almost stable.

I don't know why the offset isn't needed when you use a pinning pointer. I did notice that the pinning action moves the string though... let me try with a larger string....

With a one million character string, there is no change in the pointer, and the AddrOfPinnedObject call still includes the correct offset. Probably small objects get moved to the Large Object Heap in order to pin them (and do they come back, maybe once pinned, always pinned until all references disappear?).

So that's how to get a zero-copy pointer to the internal data of a large string.

Note that everything here is based on my quick tests and reading vcclr.h and I may just have gotten lucky; pressure on the GC could move things around and mess things up, or other bad things could happen.
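
For what it's worth, the pinned-handle half of the snippet above can be written a bit more defensively. This is only a minimal sketch, assuming (as the tests above suggest) that AddrOfPinnedObject on a pinned string returns the address of the first UTF-16 character; the class name PinnedStringSketch is made up for illustration. It reads through Marshal rather than raw pointers, so it does not need unsafe code, and it frees the handle in a finally block.

```csharp
using System;
using System.Runtime.InteropServices;

class PinnedStringSketch
{
    static void Main()
    {
        string s = "This is a test string";

        // Pin the string so the GC cannot move it while we hold its address.
        GCHandle pin = GCHandle.Alloc(s, GCHandleType.Pinned);
        try
        {
            // Assumption: for a string this is the address of the first character.
            IntPtr p = pin.AddrOfPinnedObject();

            // Read-only access: each char occupies 2 bytes (UTF-16).
            char first = (char)Marshal.ReadInt16(p, 0);
            char last = (char)Marshal.ReadInt16(p, (s.Length - 1) * 2);
            Console.WriteLine("{0} ... {1}", first, last);
        }
        finally
        {
            // Always release the handle; a leaked pinned handle keeps the string pinned.
            pin.Free();
        }
    }
}
```

The address must not be used after Free() returns, and nothing should ever write through it, since strings are assumed immutable everywhere else in the runtime.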




The above assumes that the GC will never compact the LOH. IMO, no one ever said that future versions of the CLR would not attempt to compact the LOH, so I think it's dangerous to assume that no pinning is needed.

Anyway, why make it that complicated when you have the "fixed" statement in C#?

[DllImport("somedll")]
unsafe private static extern bool Foo(char* bytes);
...
string hugeString = ............
unsafe {
    fixed (char* phugeString = hugeString) {
        Foo(phugeString);
    }
}

Note that it's easy to corrupt the heap when passing native pointers to unmanaged code.
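
To make the "fixed" idea concrete without a native DLL, here is a self-contained sketch. SumBytes is a hypothetical stand-in for the read-only native routine (it is not part of any library), and the class name is made up as well. It shows the string's characters being viewed as raw bytes with no copy; it needs to be compiled with /unsafe.

```csharp
using System;

class FixedStringSketch
{
    // Hypothetical stand-in for a native routine that only reads the buffer.
    static unsafe int SumBytes(byte* data, int byteCount)
    {
        int sum = 0;
        for (int i = 0; i < byteCount; i++)
            sum += data[i];
        return sum;
    }

    static unsafe void Main()
    {
        string text = "AB";

        // The string stays pinned only for the duration of the fixed block.
        fixed (char* p = text)
        {
            // View the UTF-16 characters as bytes: two bytes per char, no copy.
            int byteCount = text.Length * sizeof(char);
            Console.WriteLine(SumBytes((byte*)p, byteCount));
        }
    }
}
```

One thing to keep in mind with this approach: the bytes seen by the callee are the string's in-memory UTF-16 representation, not UTF-8 or ASCII, so whatever is compressed this way has to be decoded as UTF-16 on the other side.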


Willy.

