Re: Any good refs on dealing with tightly packed data in C#?
- From: Jeroen Mostert <jmostert@xxxxxxxxx>
- Date: Sat, 12 Apr 2008 13:50:37 +0200
_dee wrote:
I'm working on a port of a legacy app originally written in C. DataYes, but the resulting assembly code probably wasn't. :-) To access those bit fields, the compiler needs to generate tricky code for masking and shifting the values every time they're accessed. The resulting slowdown can be prohibitive, which is why storage space needs to really be at a premium for bit fields to be worth it.
was compacted into bit fields. Are there any sites or books that cover
optimized handling of this type of data? I'd need to develop optimized
functions for reading and writing, and given the volume of source, I
don't want to end up with pages of obscure code. (Surprisingly, the
original C was pretty clear).
But all that's neither here nor there, since you can't change the original. In order to use structures with bit fields in C#, you have to know how the compiler aligned, ordered and padded the fields. For example, if you have this structure:
struct bits {
unsigned short a: 4;
unsigned char b;
unsigned int c: 3;
unsigned int d: 14;
unsigned int e;
unsigned short f;
}
This might be laid out in any number of ways, and the standard leaves the compiler quite free. If we're specifically talking about Visual C++ on a 32-bit x86 platform with default packing (other compilers on Windows will probably follow its lead to remain compatible) then the layout will look like this (in 16-bit words, MSB first):
0: -- -- -- -- -- -- -- -- -- -- -- -- a3 a2 a1 a0
2: -- -- -- -- -- -- -- -- b7 b6 b5 b4 b3 b2 b1 b0
4: dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 c2 c1 c0
6: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- dd
8: ev eu et es er eq ep eo en em el ek ej ei eh eg
a: ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0
c: ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0
e: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
10
Here "--" indicate padding bits.
As you can see, the compiler operates by a few rules to determine the layout: the struct as a whole is aligned to a 4-byte boundary, member alignment is 2 bytes and bit fields are packed LSB-first in sequential order.
Accessing the fields is another matter. Code like this:
bits b;
b.d--;
is effectively implemented as:
bits b;
unsigned int i = ((unsigned int*) b)[2];
i = (i >> 3) & 0x3fff;
--i;
i = (i & 0x3fff) << 3;
((unsigned int*) b)[2] |= i;
This is the unoptimized version. There's room for optimization, but this is the general idea: mask and shift to get the field, perform the operation, mask and shift again, logical-or with existing data. As you can imagine, this gets tedious quickly.
Having to write the masking and shifting code in C# itself is a real pain, and you'll have to maintain separate code for 64-bit (if that's relevant). Some of the pain can be eased by judicious use of the BitArray and BitVector32 classes, but it's still not trivial.
It might be an option to just write those functions in pure C and p/invoke to them, or to write them in C++/CLI for easier integration with managed code. Doing it all in pure C# is possible, but probably not the most effective approach.
It depends on your scenario a bit what's best. In particular: what sort of operations you'll need, how frequently you need to use them and whether you're going to be doing many operations on individual structures or one big operation on an array of them.
--
J.
.
- Follow-Ups:
- References:
- Prev by Date: Any good refs on dealing with tightly packed data in C#?
- Next by Date: Completely Free online dating service, Completely free personals ...
- Previous by thread: Any good refs on dealing with tightly packed data in C#?
- Next by thread: Re: Any good refs on dealing with tightly packed data in C#?
- Index(es):
Relevant Pages
|