Re: Fast way to allocate buffer for producer/consumer scenario
- From: "Tom Widmer [VC++ MVP]" <tom_usenet@xxxxxxxxxxx>
- Date: Mon, 09 Jan 2006 13:45:37 +0000
Ulrich Eckhardt wrote:
Hi!
The topic already says most of what I want, but here is a slightly more verbose explanation: I have a thread that performs network service by simply reading from a socket. At the moment, it then copies the received data into the internal queue and triggers another thread to handle the received data. The size of the received data is between roughly 6 and 500 bytes in 98% of all cases, but in rare cases it reaches a few megabytes.
What I now did was to restructure this so that instead of copying the buffer
to the queue, the data is moved into the queue, i.e. the queue assumes
ownership of the buffer and the network service thread allocates a new
buffer for the next transfer.
How about a hybrid approach - copy if up to 500 bytes, move if over 500 bytes?
While doing that, I noticed that the performance strongly depends on the way the buffers are allocated, which is why I wanted to ask for the best way to do it. I guess the fastest way would probably be a dedicated memory pool, i.e. one that is used only for this transfer. However, this is not a trivial task a) to get right and b) to test that you got it right, so I'd rather not do this but build on a good system-provided way.
The most performant method is not to make any memory allocations at all (e.g. allocate sufficient up front), though this sounds like it might have too much memory overhead for your situation.
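To make the "allocate sufficient up front" idea concrete, here is a minimal sketch of a pool that takes all its memory once and then only hands out fixed-size slots. The class name SlotPool and its members are hypothetical, the sketch is single-threaded (a real producer/consumer setup would need a lock around acquire() and release()), and it does nothing for the rare multi-megabyte case, which would still need a fallback allocation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical fixed-slot pool: all memory is allocated once in the
// constructor; acquire() and release() never touch the system allocator.
class SlotPool
{
public:
    SlotPool(std::size_t slot_size, std::size_t slot_count)
        : m_slot_size(slot_size),
          m_storage(slot_size * slot_count)
    {
        m_free.reserve(slot_count);
        for (std::size_t i = 0; i < slot_count; ++i)
            m_free.push_back(m_storage.data() + i * slot_size);
    }

    // Returns a free slot, or a null pointer if the pool is exhausted.
    char* acquire()
    {
        if (m_free.empty())
            return nullptr;
        char* slot = m_free.back();
        m_free.pop_back();
        return slot;
    }

    // Gives back a slot previously obtained from acquire().
    void release(char* slot) { m_free.push_back(slot); }

    std::size_t slot_size() const { return m_slot_size; }
    std::size_t available() const { return m_free.size(); }

private:
    std::size_t m_slot_size;
    std::vector<char> m_storage;  // the single up-front allocation
    std::vector<char*> m_free;    // stack of currently unused slots
};
```

With a slot size of 512 bytes, the memory overhead is simply 512 times the number of slots, whether or not they are in use, which is the trade-off mentioned above.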
So far I have tried GlobalAlloc, VirtualAlloc and 'operator new'. All three of them had comparable performance, but I'm not satisfied with any of them. GlobalAlloc() is marked as obsolete and slower than VirtualAlloc(), which is supposed to replace it. VirtualAlloc(), however, interoperates with the OS to reserve pages in virtual and physical memory, which in itself presents an overhead. Doing so repeatedly seems like a lot of overhead to me; I'd rather recycle the pages inside the process before interacting with the OS. 'operator new' does exactly that, but it uses byte granularity instead of page granularity, which imposes a certain overhead.
Given that your allocations are mostly in the range of 6-500 bytes, surely you want byte granularity?
I have taken a look at the heap functions for creating and accessing a heap, is that perhaps the right approach? To be honest, this API seems to be quite complicated, which is why I ask here first if it would solve my problems.
Any suggestions anyone?
In a release build, operator new calls the Heap* functions pretty much directly, but you might get a small benefit from using a specific heap. For small operator new allocations, you can get a speedup by calling:
_set_sbh_threshold(512);
or similar at program startup. However, I think you'll get the biggest benefit by using a "small buffer" optimization, where smaller allocations (e.g. up to 500 bytes) are held directly in the object and therefore copied rather than moved. This can be encapsulated easily enough in a class, of course, e.g.:
#include <cstddef>
#include <cstring>

class mybuffer
{
public:
    mybuffer(std::size_t size)
        : m_size(size)
    {
        // only large buffers live on the heap
        if (m_size > MAX_INTERNAL)
        {
            m_buffer.m_external = new char[m_size];
        }
    }

    ~mybuffer()
    {
        if (m_size > MAX_INTERNAL)
            delete[] m_buffer.m_external;
    }

    // transfers buffer: large buffers change owner, small ones are copied
    mybuffer(mybuffer& other)
        : m_size(other.m_size)
    {
        if (m_size > MAX_INTERNAL)
        {
            m_buffer.m_external = other.m_buffer.m_external;
            other.m_buffer.m_external = 0;
            other.m_size = 0;
        }
        else
        {
            std::memcpy(m_buffer.m_internal,
                        other.m_buffer.m_internal, m_size);
        }
    }

    // access members like size() and get().
    // operator=(mybuffer& other) should be done too

private:
    // static, and declared before the union, so that it is a
    // compile-time constant usable as an array bound
    static std::size_t const MAX_INTERNAL = 500;
    std::size_t m_size;
    union
    {
        char* m_external;
        char m_internal[MAX_INTERNAL];
    } m_buffer;
};

Tom
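With a C++11 or later compiler, the same small-buffer idea can be written with a move constructor instead of a transferring copy constructor. The following is a sketch under that assumption, not the code from the post: the names small_buffer, max_internal and m_data are made up here, and size() and get() are one possible reading of the accessors the post only mentions.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

class small_buffer
{
public:
    static constexpr std::size_t max_internal = 500;

    explicit small_buffer(std::size_t size)
        : m_size(size)
    {
        if (m_size > max_internal)
            m_data.external = new char[m_size];
    }

    ~small_buffer()
    {
        if (m_size > max_internal)
            delete[] m_data.external;
    }

    // Large buffers are handed over; small ones are copied byte by byte.
    small_buffer(small_buffer&& other) noexcept
        : m_size(other.m_size)
    {
        if (m_size > max_internal)
        {
            m_data.external = other.m_data.external;
            other.m_data.external = nullptr;
            other.m_size = 0;  // moved-from object no longer owns anything
        }
        else
        {
            std::memcpy(m_data.internal, other.m_data.internal, m_size);
        }
    }

    // Copying and assignment are left out of this sketch.
    small_buffer(const small_buffer&) = delete;
    small_buffer& operator=(const small_buffer&) = delete;

    std::size_t size() const { return m_size; }
    char* get()
    {
        return m_size > max_internal ? m_data.external : m_data.internal;
    }

private:
    std::size_t m_size;
    union
    {
        char* external;
        char internal[max_internal];
    } m_data;
};
```

The network thread would fill such a buffer and hand it over with std::move, for example when pushing it into a std::queue<small_buffer>. A multi-megabyte payload then changes owner without being copied, while a 6-byte one costs a short memcpy and no heap allocation.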