473,386 Members | 1,823 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes and contribute your articles to a community of 473,386 developers and data experts.

Old tricks

11,448 Expert 8TB
Greetings,

Introduction

This week's tip describes a few old tricks that are almost forgotten by most
people around here. Sometimes there's no need for these tricks anymore because
processors nowadays are so fast and memory comes in abundance. But still, if
we implement an algorithm that is better, or more efficient, than another one,
those faster processors run the first algorithm faster than the other one. If
an algorithm takes less memory to run, we can run larger instances of a problem
to solve in our larger memory computers.

Larger and faster computers are still no excuse to implement slow and/or memory
hogging algorithms. The next paragraphs describe two little problems and the
way they can be solved. The first paragraph deals with speed and the following
paragraph deal with memory requirements without sacrificing speed requirements.

Bit counting

For a plethora of reasons people want to know how many bits are set in a byte
or word. A naive approach to count those bit in an int is the following:

Expand|Select|Wrap|Line Numbers
  1. public int count1(int word) {
  2.     int ones= 0; // the number of one-bits in the 'word'
  3.     for (int mask= 0x00000001; mask != 0; mask<<= 1)
  4.         if ((word&mask) != 0) ones++;
  5.     return ones;
  6. }
  7.  
This is a very simple approach: we check every single bit using the 'mask'
variable; if the corresponding bit in the word is set we increment the 'ones'
counter. The method returns the 'ones' variable when the loop has finished.

No matter the value of the word, the loop loops 32 times because an int contains
32 bits and all need to be checked.

We can improve on this a bit: when a one bit has been found, we set it back to
zero again. If no one bits remain in the word variable we're done. This is how
it can be implemented:

Expand|Select|Wrap|Line Numbers
  1. public int count2(int word) {
  2.     int ones= 0; // the number of one-bits in the 'word'
  3.     for (int mask= 0x00000001; word != 0; mask<<= 1)
  4.         if ((word&mask) != 0) {
  5.             word&= ~mask; // remove the bit
  6.             ones++;
  7.         }
  8.     return ones;
  9. }
  10.  
The loop now stops when there are no more bits in the word set to one. The count2
method certainly runs more efficient than the previous count1 method. How much
more efficient exactly? If we consider a random 32 bit pattern, 50% of those
bit patterns has their leftmost bit set to one (check it). So in 50% of all the
possible values of 'word', count2 doesn't run more efficient than method count1.

For 2^31 possible values count2 needs 32 iterations, for 2^30 possible values
count2 needs only 31 iterations etc. etc. all the way to 2^0 possible values
the count2 method needs zero iterations; that's for value 0 where the loop body
is skipped entirely.

Mathematically speaking, on average the count2 method needs:

sum(i= 0, 31: 2^i*(i+1))/2^32 == 31 iterations.

Big deal, on average we're saving one single iteration. We must be able to do
better than this. Have a look at the following notion:

For every bit pattern with at least one bit set to one, we can represent the
bit pattern as follows:

xxx ... xxxx1000 ... 0000

There are bits 'xxx' to the left of the 1 bit and zero or more 0 bits to the right
of the 1 bit.

If we subtract 1 from that bit pattern we obtain the bit pattern:

xxx ... xxxx0111 ... 1111

If we bitwise 'and' both bit patterns we get the bit pattern:

xxx ... xxxx0000 ... 0000

We have effectively removed the rightmost 1 bit from the original bit pattern.
We have done this:

Expand|Select|Wrap|Line Numbers
  1. pattern&= pattern-1;
  2.  
We can use this notion in our count3 method:

Expand|Select|Wrap|Line Numbers
  1. public int count3(int word) {
  2.     int ones= 0; // the number of one-bits in the 'word'
  3.     for (; word != 0; ones++)
  4.         word&= word-1;
  5.     return ones;
  6. }
  7.  
Every pass through the loop a rightmost 1 bit is removed from 'word' until 'word'
equals zero. On average in a 32 bit word there are 16 bits set to 1. So this
count3 method certainly does better than the count1 and count2 methods. count2
wasn't much better than count1 one, but count3 takes only half of the loop passes.

Can we do better than this? Sure, mathematically speaking we can: we simply build
a table with 2^32 elements such that table[i] == j if the bit pattern i contains
j bits set to 1. Sure, mathematicians are loonies: we can't build a table or array
with 2^32 elements; even nowadays we need humongous amounts of memory for that.

We can build such a table for the 256 byte values though; the table would look
something like this:

Expand|Select|Wrap|Line Numbers
  1. byte[] table= { 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4 ... 8 };
  2.  
We could use this table like this:

Expand|Select|Wrap|Line Numbers
  1. public int count4(int word) {
  2.     int ones;
  3.     for (int byte= 0; byte < 4; byte++) {
  4.         ones+= table[word&0xff];
  5.         word>>>= 8;
  6.     }
  7.     return ones;
  8. }
  9.  
This method simply counts all the one-bits in every byte (there are four bytes
in an int using Java) so this algorithm takes only four loop passes ever. But
it does need a 256 byte table. Can we do without such a table and still do better
than the count3 method?

We need 2 bits to store the number of ones in a two bit wide word. There are
at most 2 bits equal to 1 in a two bits wide word and the value 2 needs two bits
by itself. We only need 6 bits to store a value in the range [0, 32]. At most
32 bits can be equal to one in a 32 bits wide word.

How can we find the number of bits set to one in a two bit wide word? Easy:

Expand|Select|Wrap|Line Numbers
  1. int word; // just two bits in here, zero or one
  2. int ones= (word&1)+(word>>>1);
  3.  
But we can use that little trick in parallel; for a 32 bits wide word we can
store 16 two bit counters for the 16 two bit words that make up the single 32
bit word:

Expand|Select|Wrap|Line Numbers
  1. int word; // 32 bits wide word interpreted as 16 two bit words;
  2. int ones= (word&0x55555555)+((word&0xaaaaaaaa)>>>1);
  3.  
The 'ones' int now contains sixteen small two bit values representing the number
of bits set to one if we interpret the 32 bit word value as 16 two bit small words.

All we need to do now is add those 16 two bit values in the 'ones' int and we
have the number of bits set to 1 in the word int.

We can add 16 two bit values stored in one 32 bit word using the same trick:
instead of the previous masks we use 0x33333333 and 0xccccccc to mask out the
16 two bit values and add them together again; we have 4 bits to store the sum.

But then again we can repeat this little procedure over and over again:

Expand|Select|Wrap|Line Numbers
  1. public int count5(int word) {
  2.     word= (word&0x55555555)+((word&0xaaaaaaaa)>>> 1); // sum  1 bits
  3.     word= (word&0x33333333)+((word&0xcccccccc)>>> 2); // sum  2 bits
  4.     word= (word&0x0f0f0f0f)+((word&0xf0f0f0f0)>>> 4); // sum  4 bits;
  5.     word= (word&0x00ff00ff)+((word&0xff00ff00)>>> 8); // sum  8 bits;
  6.     word= (word&0x0000ffff)+((word&oxffff0000)>>>16); // sum 16 bits;
  7.  
  8.     return word;
  9. }
  10.  
This method doesn't use a loop anymore: it uses five steps and a table in disguise; let's see what we've got now:


  • count1: steps: 32
  • count2: steps: 31 on average
  • count3: steps: 16 on average
  • count4: steps: 4 + 256 byte explicit table
  • count5: steps: 5 + 40 byte implicit table
Now let's have a look what happens when words are n bits wide; the table will
look like this then:
  • count1: steps: n
  • count2: steps: n-1 on average
  • count3: steps: n/2 on average
  • count4: steps: n/8 + 256 byte explicit table
  • count5: steps: 2log(n) + n/8*2*2log(n) byte implicit table
When n gets larger and larger (and it will) the last algorithm's memory
requirements grows logarithmically with n. The first four algorithms grow
linearly proportional with n; pick your choice.

Memory manipulation

We like to move memory around almost continuously; it seems as if we can't make
up our mind where we want our data to be stored and we keep on dragging those
bytes around over and over again. For example: I wrote the first paragraph of
this article after I wrote this second paragraph.

Effecively I moved the second paragraph further down in memory to make room for
the first paragraph. And now I'm appending more data to the rest of the text.

Suppose I have a sequence of memory cell; for the examples I use plain text but
for the development of the algorithms I'll use a more abstract notion: memory
consists of cells that are indexed using the index numbers 0, 1, 2, ... n.

The first thing I want to be able to do is to swap the content of two cells i and j,
so this hypothetical memory implements the following interface:

Expand|Select|Wrap|Line Numbers
  1. public interface Swapeable {
  2.     public void swap(int i, int j);
  3. }
  4.  
Given this simple interface I can reverse chunks of memory, e.g. if this is the
memory content and two index values i and j

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | H | e | l | l | o |   | W | o | r | l | d | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.               ^                   ^
  5.               |                   |
  6.               i                   j
  7.  
The following method reverses the content of the memory between index values i and j:

Expand|Select|Wrap|Line Numbers
  1. public void reverse(Swapeable s, int i, int j) {
  2.  
  3.     for(; --j > i; i++)
  4.         s.swap(i, j);
  5. }
  6.  
And the memory content will look like this afterwards:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | H | e | l | o | W |   | o | l | r | l | d | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.               ^                   ^
  5.               |                   |
  6.               i                   j
  7.  
Well, big deal; that's CS101 and (almost?) everbody can do that. True, but there's
more we can do using just that simple reverse() method and not so many people
realize that fact.

Suppose I reverse this part:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | H | e | l | l | o |   | W | o | r | l | d | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.   ^                                           ^
  5.   |                                           |
  6.   i                                           j
  7.  
The result is:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | d | l | r | o | W |   | o | l | l | e | H | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.   ^                                           ^
  5.   |                                           |
  6.   i                                           j
  7.  
Next I reverse this part:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | d | l | r | o | W |   | o | l | l | e | H | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.   ^                   ^
  5.   |                   |
  6.   i                   j
  7.  
The result is:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | W | o | r | l | d |   | o | l | l | e | H | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.   ^                   ^
  5.   |                   |
  6.   i                   j
  7.  
And finally I reverse this part:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | W | o | r | l | d |   | o | l | l | e | H | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.                           ^                   ^
  5.                           |                   |
  6.                           i                   j
  7.  
Which leads to this:

Expand|Select|Wrap|Line Numbers
  1. +---+---+---+---+---+---+---+---+---+---+---+---+
  2. | W | o | r | l | d |   | H | e | l | l | o | ! |
  3. +---+---+---+---+---+---+---+---+---+---+---+---+
  4.                           ^                   ^
  5.                           |                   |
  6.                           i                   j
  7.  
I have just interchanged two blocks of memory; this was the original content
of the memory: B1 B2 B3 where B1 == Hello, B2 == <space> and B3 == World. The
result is this B3 B2 B1. The '!' sign wasn't used nor touched. If block B2
contains more than one element I had to reverse that block as well.

Even more important: no additional temporary memory was used except for a single
memory cell used by the swap() method itself. In the early days when memory was
measured in Kilobytes or even single bytes instead of Gigabytes this little trick
was a precious little gem and used a lot. But what keeps you from still using it?

In the following pseudo code I use the notation 'b' for a block B reversed.
The steps needed are:

1: reverse B1 B2 B3 yielding b3 b2 b1
2: reverse b3 yielding B3 b2 b1
3: reverse b2 yielding B3 B2 b1
4: reverse b1 yielding B3 B2 B1

Here's the code for it:

Expand|Select|Wrap|Line Numbers
  1. // assume ibeg <= iend <= jbeg <= jend
  2. public void move(Swapeable s, int ibeg, int iend, int jbeg, int jend) {
  3.     int ilen= iend-ibeg;
  4.     int jlen= jend-jbeg;
  5.  
  6.     reverse(s, ibeg, jend);
  7.     reverse(s, ibeg, ibeg+jlen);
  8.     reverse(s, ibeg+jlen, jend-ilen);
  9.     reverse(s, jend-ilen, jend);
  10. }
  11.  
I leave it to the reader to check all the index fiddling in that method. This
method interchanges two blocks of memory; the blocks don't need to be of the
same size; the first block needs to be to the left of the second block; it doesn't
take rocket science to extend this method so the two blocks can be located
anywhere in memory, i.e. the first block doesn't need to be located to the left
of the second block; (hint: write a little wrapper method).

Concluding remarks

Those two tricks I described here are proud members of my bag of (old) tricks
because they come to good use now and then although I do need to dust them off.
Maybe I'll describe a couple more (old) tricks in the near future. For now,
please understand the details of those two tricks; it's worth the time.

I hope we'll meet again here next week and

kind regards,

Jos
Sep 29 '07 #1
1 3873
Thanx alot it is realy helpping
Jun 24 '08 #2

Sign in to post your reply or Sign up for a free account.

Similar topics

822
by: Turamnvia Suouriviaskimatta | last post by:
I 'm following various posting in "comp.lang.ada, comp.lang.c++ , comp.realtime, comp.software-eng" groups regarding selection of a programming language of C, C++ or Ada for safety critical...
25
by: Dave Turner | last post by:
I know that its impossible to completely prevent somebody from ripping a site (or cracking software) if that person has the skills and the time/patience, but there are tricks that can be employed...
28
by: sowmiyakc18 | last post by:
Please clear my doubt. When do we declare a variable to be a register variable? What is its significance? What are the conditions to be adhered to when register variables are passed between...
0
by: travolta003 | last post by:
Windows XP tips and tricks. Learn how to bypass very common windows problems, speed up your system and make it more reliable with useful tips and tricks. http://windowsxpsp2pro.blogspot.com
0
by: travolta004 | last post by:
Windows XP tips and tricks. Learn how to bypass very common windows problems, speed up your system and make it more reliable with useful tips and tricks. http://windowsxpsp2pro.blogspot.com
68
bartonc
by: bartonc | last post by:
I've decide to compile a bunch of your favorite tips and tricks for the Articles section. I found a post yesterday by Chrisjc that is a perfect example. I copied his post over to create Dealing with...
2
bartonc
by: bartonc | last post by:
I've decide to compile a bunch of your favorite tips and tricks for the Articles section. Post your favorite tips and tricks here, in this thread, and I'll copy the best ones to a Tips and Tricks...
0
by: e.expelliarmus | last post by:
check this out buddies. kool website for: * hacking and anti hacking tricks * anti hackng tricks. * registry tweaks * orkut tricks * small virus * computer tricks and loads of different...
8
Frinavale
by: Frinavale | last post by:
Edit Many times we spend hours and hours trying to solve a problem. When we finally figure it out, we want to share it to keep others from suffering the same way! That's why we have this "Tips...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.