Re: which way is more efficient to check bit value

Tech-Archive recommends: Repair Windows Errors & Optimize Windows Performance



When you have isolated the exact bit, you only need to look
for 9 values in the table - at offsets 0, 1, 2, 4, 8, 16, 32, 64
and 128 (the offset zero is if the original number was zero).
Thus you only need a table populated in the range 0-128
(that's 129 entries) and you can omit filling the other 127
entries in the complete lookup table. The values in the slots
not listed above are irrelevant and you should use zeroes
to allow for better compression.

The logarithmic bit search is simple. You have 4 steps:
- step 1: check if the bit is in the high order 4 bits or the
low order 4 bits of the 8-bit value; if it's the high order you
add 4 to your result and shift it to fit in the low order 4 bits;
in either case you end up with a 4-bit value
- step 2: check if the bit is in the high order 2 bits or the
low order 2 bits of the 4-bit value; if it's the high order you
add 2 to your result and shift it to fit in the low order 2 bits;
in either case you end up with a 2-bit value
- step 3: check if the bit is in the high order bit or the low
order bit of the 2-bit value; if it's the high order you add 1 to
your result and shift it one bit to the right; in either case you
end up with a 1-bit value
- step 4: add the resulting single bit number to the result

You end up with 1 + log2(n) for a non-zero number and
zero for a zero number, which is the one-based bit position
when present.

--
=====================================
Alexander Nickolov
Microsoft MVP [VC], MCSD
email: agnickolov@xxxxxxxx
MVP VC FAQ: http://vcfaq.mvps.org
=====================================

"George" <George@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:6B8D0CC3-2B74-44F6-852D-2DD582AFF36B@xxxxxxxxxxxxxxxx
Thanks Alexander,


Sorry for the late response, I studied your algorithms for a couple of
days
and they look great! Cool!

Two more comments after thinking,

code!). The fastest mechanism is still using a table lookup and
in this case you need only 129 bytes in the table (which is probably
the only improvement compared to the 256 bytes needed for
complete lookup).

1. why need 129 bytes? We need 2^8 bits to represent all possible values
of
a byte (8-bit), which is 256 bytes, right? How do you optimize the space
by
saving (256 - 128) bytes?

2. I am interested to learn the following algorithm, but can not
understand
after reading for a couple of times. :-) Could you describe the general
idea
of how this algorithm works please?

Bit position and logarithmic search in assembly:

xor eax, eax
mov bl, [source]
mov bh, bl
neg bl
and bl, bh
mov bh, 0f0h
and bh, bl
jz end4
rar bl, 4
add eax, 4
end4:
mov bh, 0ch
and bh, bl
jz end2
rar bl, 2
add eax, 2
end2:
mov bh, 2
and bh, bl
jz end1
rar bl, 1
inc eax
end1:
; here bl is either 1 after all the shifts or it was zero all along
add al, bl
ret 4


regards,
George

"Alexander Nickolov" wrote:

Still, you get a positonal bit, not a number. It still has to be
converted to a number, which you can now do in logarithmic
time, but for small numbers (8 in this case) there's not much gain
from sequential scan (can even be negative due to the support
code!). The fastest mechanism is still using a table lookup and
in this case you need only 129 bytes in the table (which is probably
the only improvement compared to the 256 bytes needed for
complete lookup).

Below are the implementations of the four cases in assembly.
They all return the bit position as a number in the range 0-8
(0 is for no bits, 1 is bit zero and 8 is bit 7). Consider them as the
body of the following C function:

__declspec(naked) int __stdcall GetBitPos(char source)
{
__asm {
...
}
}

In that case [source] is [esp - 4].

Linear search in assembly:

mov bl, [source]
xor eax, eax
; need to test for zero beforehand to speed up the loop
or bl, bl
jz end
loop:
inc eax
rcr bl
jnc loop
end:
ret 4

Lookup in assembly:

xor eax, eax
mov al, [source]
; here we need a full 256 bytes lookup table
mov ebx, [lookup_table]
xlat
ret 4

Bit position and logarithmic search in assembly:

xor eax, eax
mov bl, [source]
mov bh, bl
neg bl
and bl, bh
mov bh, 0f0h
and bh, bl
jz end4
rar bl, 4
add eax, 4
end4:
mov bh, 0ch
and bh, bl
jz end2
rar bl, 2
add eax, 2
end2:
mov bh, 2
and bh, bl
jz end1
rar bl, 1
inc eax
end1:
; here bl is either 1 after all the shifts or it was zero all along
add al, bl
ret 4

Bit position and lookup in assembly:

xor eax, eax
mov al, [source]
mov bl, al
neg al
and al, bl
; here the table needs to contain only 129 bytes
; since the value of al is between 0 and 128
mov ebx, [lookup_table]
xlat
ret 4

If you are looking for space savings, you'd go for linear search.
Speed is achieved by direct table lookup. Your solution is a
compromise since it saves almost half of the table at the minimal
speed expense of isolating the bit first. It should also compress very
well since it is sparse with lots of zeroes (unused bytes). of note, the
table for regular lookup should compress well too since it only
contains numbers in the range 1-8. The case with logarithmic lookup
should really be tested agains the linear lookup - all bets are off as
to its performance. It's certainly greater code size though.

--
=====================================
Alexander Nickolov
Microsoft MVP [VC], MCSD
email: agnickolov@xxxxxxxx
MVP VC FAQ: http://vcfaq.mvps.org
=====================================

"Steve Alpert" <sra@xxxxxxxxxxxxxxxxx> wrote in message
news:eDguQ4JJIHA.1620@xxxxxxxxxxxxxxxxxxxxxxx
play with it a bit. it works based on binary arithmetic and the fact
that
the "negative" of a number is the one's complement plus one!

when you take the 1's complement of a number, 1 => 0 and 0 => 1.
think about where the rightmost zero is in a 1's complement. It is
exactly where the rightmost one is in the original number. When you
add 1
to the one's complement, all 1's to the right of the rightmost zero
toggle
to 0 and the carry falls into the rightmost 0 and it becomes a 1. Thus
when you do the logical and, all of the zeros to the right of that one
yield zero and everything to the left of that one is still the one's
complement of the original and also yield zero. Thus you will only
have a
single 1 left in the result!

/steveA

George wrote:
Hi Steve,


You mean using AND operation and using the number and its negative
(-number) counterpart will find the 1st non-zero bit? Why? Could you
provide more description please?


regards,
George

"Steve Alpert" wrote:

I don't know which way the "bits" are important to you but I remember
tricks from my old PDP-11 days! zoiks!

do a logical AND of a number and -number

eg:
1 0 0 1 0 1 0

1's comp: 0 1 1 0 1 0 1
2's comp: 0 1 1 0 1 1 0 = -number

result: 0 0 0 0 0 1 0 (note lowest 1 bit from original is 1)
for speed, you could create an array to use the result to index to
get
0..6 (in this case)

/steveA

George wrote:
Hello everyone,


I have a long array (char*) and I need to check which bit is the
first
bit whose value is 1.

I have two ways to implement,

1. Iterate each byte, then iterate each bit in each byte one by one;
2. Iterate each byte, and check whether the value of the byte itself
is
non-zero, if yes, then iterate each bit in the byte to find which
bit
is the first bit which is set to 1, or else skip this byte and
continue
to iterate next byte.

I think (2) is always faster, right? But I do not know why (2) is
faster since it is just my personal estimation without any concrete
basis analysis.


thanks in advance,
George





.



Relevant Pages