Re: Any good refs on dealing with tightly packed data in C#?

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance



_dee wrote:
I'm working on a port of a legacy app originally written in C. Data
was compacted into bit fields. Are there any sites or books that cover
optimized handling of this type of data? I'd need to develop optimized
functions for reading and writing, and given the volume of source, I
don't want to end up with pages of obscure code. (Surprisingly, the
original C was pretty clear).

Yes, but the resulting assembly code probably wasn't. :-) To access those bit fields, the compiler needs to generate tricky code for masking and shifting the values every time they're accessed. The resulting slowdown can be prohibitive, which is why storage space needs to really be at a premium for bit fields to be worth it.

But all that's neither here nor there, since you can't change the original. In order to use structures with bit fields in C#, you have to know how the compiler aligned, ordered and padded the fields. For example, if you have this structure:

struct bits {
unsigned short a: 4;
unsigned char b;
unsigned int c: 3;
unsigned int d: 14;
unsigned int e;
unsigned short f;
}

This might be laid out in any number of ways, and the standard leaves the compiler quite free. If we're specifically talking about Visual C++ on a 32-bit x86 platform with default packing (other compilers on Windows will probably follow its lead to remain compatible) then the layout will look like this (in 16-bit words, MSB first):

0: -- -- -- -- -- -- -- -- -- -- -- -- a3 a2 a1 a0
2: -- -- -- -- -- -- -- -- b7 b6 b5 b4 b3 b2 b1 b0
4: dc db da d9 d8 d7 d6 d5 d4 d3 d2 d1 d0 c2 c1 c0
6: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- dd
8: ev eu et es er eq ep eo en em el ek ej ei eh eg
a: ef ee ed ec eb ea e9 e8 e7 e6 e5 e4 e3 e2 e1 e0
c: ff fe fd fc fb fa f9 f8 f7 f6 f5 f4 f3 f2 f1 f0
e: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
10

Here "--" indicate padding bits.

As you can see, the compiler operates by a few rules to determine the layout: the struct as a whole is aligned to a 4-byte boundary, member alignment is 2 bytes and bit fields are packed LSB-first in sequential order.

Accessing the fields is another matter. Code like this:

bits b;
b.d--;

is effectively implemented as:

bits b;
unsigned int i = ((unsigned int*) b)[2];
i = (i >> 3) & 0x3fff;
--i;
i = (i & 0x3fff) << 3;
((unsigned int*) b)[2] |= i;

This is the unoptimized version. There's room for optimization, but this is the general idea: mask and shift to get the field, perform the operation, mask and shift again, logical-or with existing data. As you can imagine, this gets tedious quickly.

Having to write the masking and shifting code in C# itself is a real pain, and you'll have to maintain separate code for 64-bit (if that's relevant). Some of the pain can be eased by judicious use of the BitArray and BitVector32 classes, but it's still not trivial.

It might be an option to just write those functions in pure C and p/invoke to them, or to write them in C++/CLI for easier integration with managed code. Doing it all in pure C# is possible, but probably not the most effective approach.

It depends on your scenario a bit what's best. In particular: what sort of operations you'll need, how frequently you need to use them and whether you're going to be doing many operations on individual structures or one big operation on an array of them.

--
J.
.



Relevant Pages

  • Re: Operation recomputation
    ... The answer depends on whether your compiler can recognize the common ... in my case it's not a micro optimization since for my ... Profile first, then optimize. ... For instance, on ARM processors in ARM mode, shift is free for additions and array indexing but is not free for multiplications. ...
    (comp.lang.c)
  • Re: Bit Manipulation Tricks
    ... involving substituting bit manipulation for other things. ... These are optimized automatically by the compiler. ... optimization using the CPU view. ... "div 2" into some code which does a shift and checks if the result ...
    (borland.public.delphi.language.basm)
  • Re: 68k chip, Diab Compiler and odd behavior
    ... > problem is in the inner for loop. ... > help from Windriver (I think they bought Diab compiler) because we ... > 5 ulword shift; ... Compile this file with optimization OFF. ...
    (comp.arch.embedded)
  • Re: C and MISRA, blues... (arithmetic shifts)
    ... int ArithmeticShiftRight (int value, unsigned int shift) ... arithmetic shifts on signed integers in the last 20 years? ... if shift is a constant and 0 then the compiler would eliminate the ...
    (comp.arch.embedded)
  • Re: Brian Kernighan, maybe Im not worthy, maybe Im scum
    ... what experienced programmers do, ... optimization, ... Thugs" ad nauseum fits that a lot more closely than discussing compiler ... be modified outside a loop, and guessing ...
    (comp.programming)