474,037 Members | 2,484 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Efficient sieving implementation?

Dear Reader,

I'm trying to implement the following short algorithm:

Given a number n, remove all the multiples of n
from the list of non-negative numbers from 1 through a limit k.

My solution is to take an array with all entries '1' and iterate
over it in steps of n, setting the current pointer to '0'.

Here's the code:

const LIMIT=k+1;
unsigned char Sieve[LIMIT];

memset(Sieve,1, LIMIT); // initialise all entries to '1'
for(i=n;i<LIMIT ;i+=n) {Sieve[i]=0;}

As k gets large this implementation is really slow for small n,
especially if you have to sieve more than one n. Is there a
more efficient way to implement this? Are there fast memory
operators like memset in c++ for this task?

Thanks for your help!

Best,
Oswald Kluge

Jul 24 '06 #1
15 1813

Oswald Kluge wrote:
Dear Reader,

I'm trying to implement the following short algorithm:

Given a number n, remove all the multiples of n
from the list of non-negative numbers from 1 through a limit k.

My solution is to take an array with all entries '1' and iterate
over it in steps of n, setting the current pointer to '0'.

Here's the code:

const LIMIT=k+1;
unsigned char Sieve[LIMIT];

memset(Sieve,1, LIMIT); // initialise all entries to '1'
for(i=n;i<LIMIT ;i+=n) {Sieve[i]=0;}

As k gets large this implementation is really slow for small n,
especially if you have to sieve more than one n. Is there a
more efficient way to implement this? Are there fast memory
operators like memset in c++ for this task?
Sure, instead of

if (Sieve[blah] == 0) { happy_dance(); }

Do

if (Sieve(blah) == 0) { happy_dance(); }

Where Sieve() is a function that returns the value desired.

If your domain is truly large than a table is just inefficient and
wasteful. If the table is really small then your concern is moot.

Tom

Jul 24 '06 #2

Oswald Kluge wrote:
Dear Reader,

I'm trying to implement the following short algorithm:

Given a number n, remove all the multiples of n
from the list of non-negative numbers from 1 through a limit k.

My solution is to take an array with all entries '1' and iterate
over it in steps of n, setting the current pointer to '0'.

Here's the code:

const LIMIT=k+1;
unsigned char Sieve[LIMIT];

memset(Sieve,1, LIMIT); // initialise all entries to '1'
for(i=n;i<LIMIT ;i+=n) {Sieve[i]=0;}

As k gets large this implementation is really slow for small n,
especially if you have to sieve more than one n. Is there a
more efficient way to implement this? Are there fast memory
operators like memset in c++ for this task?

Thanks for your help!

Best,
Oswald Kluge
It may be possible to improve this for a generalized case (it can be
done for specific cases of multiple "n"s), but it will certainly
involve more memory.

I know of nothing like memset that would be useful in this case. When
you get down to the machine level, any processor I've ever seen (and
I've seen many) will still need to do a loop like your for loop. A
good compiler will turn the for loop into efficient machine code - it's
a very simple bit of code.

Can you provide more information about the range of values for which
you are doing this, the number and types of "n"s, programmatic
constraints on cpu time and memory availability, and the overall
purpose? If you are doing this to find primes, etc, this is not likely
the best route.

Jeff

Jul 24 '06 #3
I'm trying to implement the following short algorithm:
>
Given a number n, remove all the multiples of n
from the list of non-negative numbers from 1 through a limit k.
Seems to be the "Sieve of Eratosthenes", or at least a very similar
algorithm ...
My solution is to take an array with all entries '1' and iterate
over it in steps of n, setting the current pointer to '0'.
That might be what you want (input and output). On the other hand, a
direct method (like Tom already suggested) can be much faster. And even
if you need this table-driven approach, this usual iteration is not the
only way to achieve this goal.
As k gets large this implementation is really slow for small n,
This is because memory is optimized for streaming access and such
pseudo-random pattern isn't cache-friendly.
especially if you have to sieve more than one n.
Instead of sieving for each n separately, try to combine several passes
for different n's and sieve in blocks of your processors cache size.
Depending on the number of n's, this can give you good performance
boost.

Christian

Jul 24 '06 #4
it is worth caching if you will call Sieve(x) on the same x several
times. Otherwise don't cache. If you are just not sure try this:
_______________ _______________ _____
const int K = 1000000;
unsigned char sieve[K];

const int N = 3;
int divs[N] = {3, 7, 11};

char div = 0, notdiv = 1, unknown = 2;

void init_Sieve(){ memset(Sieve, unknown, K); }

inline Sieve(int x)
{
if( sieve[x] == unknown)
{
for(int i=0; i<N; i++)
if(x%divs[i] == 0){
sieve[x] = div;
break;
}
}
return sieve[x];
}
_______________ _______________ _________
This way you get sure you don't calculate twice for the same x, with a
trade off -- another couple of CPU cycles for each call Sieve(x) BUT
never calculate Sieve(x) for any never-used x.

Jul 24 '06 #5
Can you provide more information about the range of values for which
you are doing this, the number and types of "n"s, programmatic
constraints on cpu time and memory availability, and the overall
purpose? If you are doing this to find primes, etc, this is not likely
the best route.

Jeff
The purpose of this program is to find coprime numbers. If n is prime
then
if you sieve out all multiples of n, then all remaining numbers will
have no
common divisors with n.

If n is composite, I always know what prime factors are occurring, so
after
sieving out all the prime factors you still get all the coprime
numbers.

The range of values starts at k=10^5. I got a pentium 2 GHz with 2GB
RAM.

Oswald

Jul 24 '06 #6

Oswald Kluge wrote:
Can you provide more information about the range of values for which
you are doing this, the number and types of "n"s, programmatic
constraints on cpu time and memory availability, and the overall
purpose? If you are doing this to find primes, etc, this is not likely
the best route.

Jeff

The purpose of this program is to find coprime numbers. If n is prime
then
if you sieve out all multiples of n, then all remaining numbers will
have no
common divisors with n.
Why not just compute the GCD of the two numbers?

Tom

Jul 24 '06 #7

Oswald Kluge wrote:
....snip...
The purpose of this program is to find coprime numbers. If n is prime
then
if you sieve out all multiples of n, then all remaining numbers will
have no
common divisors with n.

If n is composite, I always know what prime factors are occurring, so
after
sieving out all the prime factors you still get all the coprime
numbers.

The range of values starts at k=10^5. I got a pentium 2 GHz with 2GB
RAM.

Oswald
I was really looking for a range of values (i.e. both ends). If the
cardinality of the range is not too large, sieving might be worthwhile.
I'm uncertain why you would need to know all of the coprimes of n
within a given range.

If the range of values is not too large, and if you have to do a LOT of
sieving, it might be possible to set up a network in memory, wherein
each number in the range would be connected to its prime factors. To
eliminate numbers with a common factor to n, you would only have to
specify the factors of n itself. This concept would require a fair bit
of precomputation and would use a lot of memory for overhead, so it
would not be useful if your range was, say, hundreds of millions of
numbers.

Are you really certain that you need to know all of the coprimes of n
within a given range? If not, Tom's suggestion of calculating the GCD
of n and a specific other number is excellent.

Jeff

Jul 24 '06 #8
Christian Siebert wrote:
>I'm trying to implement the following short algorithm:

Given a number n, remove all the multiples of n
from the list of non-negative numbers from 1 through a limit k.
>As k gets large this implementation is really slow for small n,
This is because memory is optimized for streaming access and such
pseudo-random pattern isn't cache-friendly.
>especially if you have to sieve more than one n.
Instead of sieving for each n separately, try to combine several passes
for different n's and sieve in blocks of your processors cache size.
One way to do this, though it may be obvious to some, is to combine n's
into groups by taking the least common multiple. That gives you a pattern
you can repeat. For example, when doing a sieve with 2 and 3, you can do
the straightforward way:

for (i = 0; i < end; i += 2) { a[i] = 0; }
for (i = 0; i < end; i += 3) { a[i] = 0; }

Or instead, you can (mostly) combine these into one pass:

for (i = 0; i < end; i += 6)
{
/* multiples of 2 */
a[i + 0] = 0;
a[i + 2] = 0;
a[i + 4] = 0;

/* multiples of 3 */
a[i + 0] = 0;
a[i + 3] = 0;
}

This can be minimized by combining the redundant assignments:

for (i = 0; i < end; i += 6)
{
a[i + 0] = 0;
a[i + 2] = 0;
a[i + 3] = 0;
a[i + 4] = 0;
a[i + 5] = 0;
}

Now, here's what I think is the fun part. Once you've established
that pattern that repeats after 6 elements, you can, instead of
performing all the assignment statements separately, create a mask:

int lcm = 6;
bool mask[lcm];

for (i = 0; i < lcm; i++) { mask[i] = 1; }

for (i = 0; i < lcm; i += 2) { mask[i] = 0; }
for (i = 0; i < lcm; i += 3) { mask[i] = 0; }

and then apply the mask:

for (i = 0; i < end; i += lcm)
{
for (j = 0; j < lcm; j++)
{
a[i + j] &= mask[j];
}
}

The good thing about this, obviously, is that the inner loop in the
code just above can be optimized to DEATH. Your operating system
or some readily available library may in fact already provide a
routine to AND two regions of memory together efficiently. You
can duplicate the mask until it is some convenient size. You can
make its size a multiple of 8 or 16 or 32 if that helps you align
it in a useful way. And you can combine multiple different n's
together to produce one mask.

Of course, all this is a waste of time if the n's aren't relatively
small compared to k.

- Logan
Jul 25 '06 #9
Hi Logan,
Instead of sieving for each n separately, try to combine several passes
for different n's and sieve in blocks of your processors cache size.

One way to do this, though it may be obvious to some, is to combine n's
into groups by taking the least common multiple. That gives you a pattern
you can repeat.
nice that you think of this improvement too. I posted this idea some
months ago in conjunction with the QS:
http://groups.google.de/group/sci.cr...90676702d754ba
which gives you an additional speed-up. Though I meant the "more
common" optimization technique in the first place:

Instead of using the whole large sieve at once

memset(Sieve,1, LIMIT); // initialise all entries to '1'
for each n do
for (i=n; i<LIMIT; i+=n) { Sieve[i]=0; }
use_sieve(Sieve , LIMIT);

it is often better to split this large sieve into smaller blocks

start = 0;
while (start < LIMIT) {
memset(SieveBlo ck,1,CACHESIZE) ;
for each n do
for (i = ceil(start/n)*n; i < (start+CACHESIZ E); i+=n)
{ SieveBlock[i-start]=0; }
use_sieve_block (SieveBlock, start, CACHESIZE);
start += CACHESIZE; // proceed with next block
}
Now, here's what I think is the fun part. Once you've established
that pattern that repeats after 6 elements, you can, instead of
performing all the assignment statements separately, create a mask:
... and then apply the mask: ... & ...
Your operating system or some readily available library may in fact
already provide a routine to AND two regions of memory together efficiently.
Why should you waste an additional logical operation, when it is
possible to shift this work into the initialization step. Create the
"mask" like suggested, and then use memcpy() to repeatedly copy it to
your sieve memory...

Christian

Jul 25 '06 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
8177
by: Alex | last post by:
I'm looking for a fast way to split a string into individual tokens. The typical format of the string is token1|token2|token3|...|tokenN| and the number of tokens varies (which is why i use a vector to hold the tokens). My current implementation uses the following method to split up the string: vector<char*> tokenize(char* str, char delim) { vector<char> theString;
25
2975
by: CodeCracker | last post by:
Problem details below: I have few items(simple values or objects) to be put into an array and I implement it through a set rather than just an array of items. This is because every time I get a new item I have to look into the Set whether it is there or not. Depending on whether it is there or not I have to respond differntly. Since it is easy to search into a Set rather than a number of if statement which will compare each of the...
3
1940
by: Daniel Weinand | last post by:
hello ng, i have a problem and a imho an insufficient method of solution. strings should be sorted by specific text pattern and dispayed in groups. the strings are stored in a db and have the following layout: 1.0.0.0 1.1.0.0 1.1.1.0 1.1.2.0
22
7718
by: Curious | last post by:
Hi, I am searching for a data structure that stores key-value pairs in it. This data structure is to hold large amounts of key-value pairs, and so needs to be efficient both in insertion and deletion. Does anybody know of ready data structures that can be freely used under the .Net framwork. Thanks in Advance
7
1331
by: Kay Schluehr | last post by:
I want to manipulate a deeply nested list in a generic way at a determined place. Given a sequence of constant objects a1, a2, ..., aN and a variable x. Now construct a list from them recursively: L = ]...]] The value of x is the only one to be changed. With each value of x a new instance of L is needed so that we actually have a function: L = lambda x: ]...]]
35
1835
by: spekyuman | last post by:
Pointer and reference arguments provide C++ programmers with the ability to modify objects. What is more efficient, passing arguments via pointer or reference? Avoid the stereotypical urge of debating effective coding style. Instead, think of particular scenarios: passing a 512MB object to function. Think behind the scenes, from the stack and heap to the physical linking and build of the program.
28
3928
by: Mahesh | last post by:
Hi, I am looking for efficient string cancatination c code. I did it using ptr but my code i think is not efficient. Please help. Thanks a lot
82
3865
by: Bill David | last post by:
SUBJECT: How to make this program more efficient? In my program, a thread will check update from server periodically and generate a stl::map for other part of this program to read data from. Let's name the update method as doUpdate and stl::map read methods as getData and copyData. Since stl::map is not thread-safe, we should do synchronization by ourselves. A usable solution is to create a boost::mutex::scoped_lock object in all above...
0
10532
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10328
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
11591
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
11128
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
8685
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
7857
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
6639
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
2
4933
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3951
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.