Algorithm to break up a vector

Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (e.g.,
vector length: 45, number of subvectors: 7).

If I break up the vector based on (int)(vectorsize/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example would be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result in:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

Anyone have any ideas?

Thanks.

Nov 14 '05 #1


No Such Luck wrote:
Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

If you give us C code to work with, then we will point out
possible algorithmic shortcomings.
A request for a C algorithm is not topical here.
comp.programming might be a good starting point to ask for
an algorithm. Then you can implement it -- and if you have
problems because your program does not run as intended,
then you can isolate a minimal running program exhibiting
these problems and we will help you.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsize/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

Anyone have any ideas?


Yes. This sounds like an idiotic homework question. More natural
would be to give the maximum length of subvectors and to go for
7|7|7|6|6|6|6 instead of 6|7|6|7|6|7|6.

Hints:
    45/7 = 6
    45%7 = 3

Cheers
Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #2
No Such Luck wrote:
Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsize/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6


An easy way to proceed is (pseudocode):

    int vecsize = length of original vector;
    int subcount = number of sub-vectors desired;
    int subsizes[subcount];
    for ( ; subcount > 0; --subcount) {
        subsizes[subcount-1] =
            (2 * vecsize + subcount) / (2 * subcount);
        vecsize -= subsizes[subcount-1];
    }

(That mess in the middle is merely "vecsize / subcount,
rounded." You could use an ordinary integer division or
a "ceiling" without affecting the validity of the method,
but you'd get a different pattern of long and short
sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible. Its principal virtue is that it's
easy to see why it works, and hence hard to get wrong.
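
The pseudocode above can be turned into compilable C with little change. The following is only a sketch: the function name balanced_sizes and the caller-supplied output array are assumptions for illustration, and it assumes vecsize >= 0 and subcount >= 1.

```c
#include <assert.h>
#include <stddef.h>

/* Fill sizes[0..subcount-1] using the rounded-division loop from the
 * pseudocode. balanced_sizes is a made-up name, not from the thread.
 * Assumes vecsize >= 0, subcount >= 1, and room for subcount ints. */
void balanced_sizes(int vecsize, int subcount, int *sizes)
{
    assert(vecsize >= 0 && subcount >= 1 && sizes != NULL);

    for (; subcount > 0; --subcount) {
        /* vecsize / subcount rounded to nearest, halves rounding up */
        int size = (2 * vecsize + subcount) / (2 * subcount);

        sizes[subcount - 1] = size;
        vecsize -= size;    /* elements left for the remaining sub-vectors */
    }
}
```

Called as balanced_sizes(45, 7, sizes), this fills sizes with 6 7 6 7 6 7 6, the pattern asked for in the original question.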

<off-topic>

It's also an illustrative example of a divide-and-conquer
solution that *doesn't* beg for a recursive implementation.

</off-topic>

--
Er*********@sun.com

Nov 14 '05 #3

"Eric Sosman" <er*********@sun.com> wrote in message
news:cp**********@news1brm.Central.Sun.COM...
No Such Luck wrote:
Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsize/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6


An easy way to proceed is (pseudocode):

    int vecsize = length of original vector;
    int subcount = number of sub-vectors desired;
    int subsizes[subcount];
    for ( ; subcount > 0; --subcount) {
        subsizes[subcount-1] =
            (2 * vecsize + subcount) / (2 * subcount);
        vecsize -= subsizes[subcount-1];
    }

(That mess in the middle is merely "vecsize / subcount,
rounded." You could use an ordinary integer division or
a "ceiling" without affecting the validity of the method,
but you'd get a different pattern of long and short
sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible. Its principal virtue is that it's
easy to see why it works, and hence hard to get wrong.

<off-topic>

It's also an illustrative example of a divide-and-conquer
solution that *doesn't* beg for a recursive implementation.

</off-topic>


Thanks, Eric. I will give this implementation a whirl tomorrow. A divide and
conquer technique never occurred to me. I thought I would have to determine
the size of all the subvectors at the outset...
Nov 14 '05 #4

"No Such Luck" <no@such.luck> wrote in message
news:BZ********************@rcn.net...
Thanks, Eric. I will give this implementation a whirl tomorrow. A divide and conquer technique never occurred to me.


It should have. :-)

'Divide and conquer' is one of the cornerstones of programming,
in any language.

-Mike
Nov 14 '05 #5
On Thu, 09 Dec 2004 21:22:08 -0500, No Such Luck wrote:

....
The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

You can do this using modulo arithmetic and I give some example code
below. It doesn't use % to perform the modulo arithmetic but it could do.

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6


An easy way to proceed is (pseudocode):

    int vecsize = length of original vector;
    int subcount = number of sub-vectors desired;
    int subsizes[subcount];
    for ( ; subcount > 0; --subcount) {
        subsizes[subcount-1] =
            (2 * vecsize + subcount) / (2 * subcount);
        vecsize -= subsizes[subcount-1];
    }

(That mess in the middle is merely "vecsize / subcount, rounded." You
could use an ordinary integer division or a "ceiling" without affecting
the validity of the method, but you'd get a different pattern of long
and short sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible.

It doesn't distribute them evenly. For example with vecsize=45 and
subcount=14 I get subvectors from it like this:

3 4 3 4 3 4 3 3 3 3 3 3 3 3

Its principal virtue is that it's easy to see
why it works, and hence hard to get wrong.

I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the uneven
distribution indicates. Essentially it creates a non-linear
convergence.

<off-topic>

It's also an illustrative example of a divide-and-conquer
solution that *doesn't* beg for a recursive implementation.

If this qualifies as a divide and conquer algorithm it is a degenerate
one, more a lop one off the end and conquer. :-) You might just as well
say that a linear search (as well as many simple loops) is divide and
conquer because you can test the first (or last) element and then if
necessary do a linear search on the rest. Compare that to a binary search.

</off-topic>


Thanks, Eric. I will give this implementation a whirl tomorrow. A divide
and conquer technique never occurred to me. I thought I would have to
determine the size of all the subvectors in the outset...


Try this. It does produce an even (or as even as possible) distribution of
subvector sizes, although a small amount of bias can be controlled by the
initial value of sum. It doesn't write the results to an array but you can
do so easily if you want. With vecsize=45 and subcount=14 it produces

3 3 4 3 3 3 4 3 3 3 3 4 3 3

#include <stdio.h>

static void subsizes(int vecsize, int subcount)
{
    int basesize = vecsize / subcount;
    int bumps = vecsize % subcount;
    int sum = subcount/2; /* Try also 0, subcount-1, (subcount+1)/2 */
    int testtotal = 0; /* For checking only */
    int i;

    for (i = 0; i < subcount; i++) {
        int subsize = basesize;

        /* Each time the running sum wraps past subcount, emit a long one */
        if ((sum += bumps) >= subcount) {
            sum -= subcount;
            subsize++;
        }

        printf(" %d", subsize);
        testtotal += subsize;
    }

    printf("\n\ntesttotal=%d\n", testtotal);
}
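
As noted above, the same loop can write its results to an array instead of printing them. A minimal sketch of that variant follows; the name spread_sizes, the start_sum parameter (standing in for the initial value of sum) and the returned total are assumptions added for illustration. It assumes vecsize >= 0, subcount >= 1 and 0 <= start_sum < subcount.

```c
#include <assert.h>
#include <stddef.h>

/* Array-writing variant of the accumulator loop. spread_sizes is a
 * hypothetical name; start_sum plays the role of the initial value of
 * sum. Returns the total of all sizes written, for checking. */
int spread_sizes(int vecsize, int subcount, int start_sum, int *out)
{
    int basesize, bumps, sum, total = 0, i;

    assert(vecsize >= 0 && subcount >= 1 && out != NULL);
    assert(start_sum >= 0 && start_sum < subcount);

    basesize = vecsize / subcount;   /* length of a short sub-vector */
    bumps = vecsize % subcount;      /* how many get one extra element */
    sum = start_sum;

    for (i = 0; i < subcount; i++) {
        out[i] = basesize;
        sum += bumps;
        if (sum >= subcount) {       /* accumulator wrapped: long one here */
            sum -= subcount;
            out[i]++;
        }
        total += out[i];
    }
    return total;
}
```

With vecsize=45, subcount=14 and start_sum=7 (that is, subcount/2), the array comes out as 3 3 4 3 3 3 4 3 3 3 3 4 3 3 and the returned total is 45.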

Lawrence

Nov 14 '05 #6

Lawrence Kirby wrote:
On Thu, 09 Dec 2004 21:22:08 -0500, No Such Luck wrote:

...
The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

You can do this using modulo arithmetic and I give some example code
below. It doesn't use % to perform the modulo arithmetic but it could do.

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

An easy way to proceed is (pseudocode):

    int vecsize = length of original vector;
    int subcount = number of sub-vectors desired;
    int subsizes[subcount];
    for ( ; subcount > 0; --subcount) {
        subsizes[subcount-1] =
            (2 * vecsize + subcount) / (2 * subcount);
        vecsize -= subsizes[subcount-1];
    }

(That mess in the middle is merely "vecsize / subcount, rounded." You could use an ordinary integer division or a "ceiling" without affecting the validity of the method, but you'd get a different pattern of long and short sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible.

It doesn't distribute them evenly. For example with vecsize=45 and
subcount=14 I get subvectors from it like this:

3 4 3 4 3 4 3 3 3 3 3 3 3 3

Its principal virtue is that it's easy to see
why it works, and hence hard to get wrong.

I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the
uneven distribution indicates. Essentially it creates a non-linear
convergence.


It works perfectly. I have implemented it and tried it on thousands of
variations of vecsize and subcount, all with correct results. And it
doesn't matter that the distribution of subsizes is uneven. The only
stipulation is that the sizes of all the subvectors are either n or
n+1, and that their sum is the original vecsize.

Nov 14 '05 #7
On Wed, 15 Dec 2004 13:20:21 -0800, No Such Luck wrote:

....
I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the uneven
distribution indicates. Essentially it creates a non-linear
convergence.


It works perfectly. I have implemented it and tried in on thousands of
variations of vecsize and subcount, all with correct results. And it


I'm not saying it doesn't work, I'm pretty sure that it does. My question
is whether you can PROVE that it is correct. What you have done is good
verification but unless you have tested EVERY possible variation it isn't
proof.

doesn't matter that the distribution of subsizes is uneven. The only
stipulation is that the sizes of all the subvectors are either n or
n+1, and that their sum is the original vecsize.


In which case you can solve this very simply, for example

#include <stdio.h>

static void subsizes2(int vecsize, int subcount)
{
    int basesize = vecsize / subcount;
    int bumps = vecsize % subcount;
    int i;

    /* The first 'bumps' sub-vectors each get one extra element */
    for (i = 0; i < subcount; i++) {
        printf(" %d", basesize + (i < bumps));
    }

    putchar('\n');
}
Nov 14 '05 #8

Lawrence Kirby wrote:
On Wed, 15 Dec 2004 13:20:21 -0800, No Such Luck wrote:

...
I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the uneven
distribution indicates. Essentially it creates a non-linear
convergence.


It works perfectly. I have implemented it and tried it on thousands of
variations of vecsize and subcount, all with correct results. And it

I'm not saying it doesn't work, I'm pretty sure that it does. My question
is whether you can PROVE that it is correct. What you have done is good
verification but unless you have tested EVERY possible variation it isn't
proof.


Like I said, I've tested Eric's algorithm on hundreds of vecsizes
ranging from 100 to 1500, and subcounts ranging from 1 to 100 (i.e.,
thousands of trials). All with correct results (all verified
numerically, and some verified visually).

What proof are you looking for? Something like:

    for (vecsize = 1; vecsize < 1000000; vecsize++)
    {
        for (subcount = 1; subcount < 1000000; subcount++)
        {
            test_algorithm(vecsize, subcount);
        }
    }
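
The thread never shows what test_algorithm would contain. The sketch below is one guess at such a check, written against the requirement stated earlier (every size is n or n+1, and the sizes sum to vecsize). The names check_split and count_failures are hypothetical, and the loop inside check_split is the rounded-division method from the pseudocode. A sweep up to 1,000,000 in both dimensions would need on the order of 10^12 calls, so the sketch takes its upper bounds as parameters.

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 if the rounded-division method yields sizes that are all
 * q or q+1 (q = vecsize / subcount) and that sum to vecsize, else 0.
 * check_split is a hypothetical name. Assumes vecsize >= 0, subcount >= 1. */
int check_split(int vecsize, int subcount)
{
    int q = vecsize / subcount;
    int v = vecsize, s = subcount;
    int total = 0, ok = 1, i;
    int *sizes = malloc((size_t)subcount * sizeof *sizes);

    if (sizes == NULL)
        return 0;               /* treat allocation failure as a failed check */

    for (; s > 0; --s) {        /* the method under test */
        sizes[s - 1] = (2 * v + s) / (2 * s);
        v -= sizes[s - 1];
    }

    for (i = 0; i < subcount; i++) {
        if (sizes[i] != q && sizes[i] != q + 1)
            ok = 0;
        total += sizes[i];
    }

    free(sizes);
    return ok && total == vecsize;
}

/* Sweep 1..max_vecsize by 1..max_subcount and count failing pairs. */
int count_failures(int max_vecsize, int max_subcount)
{
    int failures = 0, v, s;

    for (v = 1; v <= max_vecsize; v++)
        for (s = 1; s <= max_subcount; s++)
            if (!check_split(v, s))
                failures++;
    return failures;
}
```

A sweep like this only covers the pairs it visits, so it is evidence and not a proof.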

Nov 14 '05 #9
On 16 Dec 2004 10:49:33 -0800, in comp.lang.c , "No Such Luck"
<no*********@hotmail.com> wrote:

Like I said, I've tested Eric's algorithm on hundreds of vecsizes
ranging from 100 to 1500, and subcounts ranging from 1 to 100 (i.e.,
thousands of trials). All with correct results (all verified
numerically, and some verified visually).

This isn't proof tho, it's empirical evidence. A similar huge body of
evidence exists to support Newton's laws. They're wrong.

What proof are you looking for?

You seem to be trying to prove an algo. I'd suggest a mathematical method.

Something like:

    for (vecsize = 1; vecsize < 1000000; vecsize++)
    {
        for (subcount = 1; subcount < 1000000; subcount++)
        {
            test_algorithm(vecsize, subcount);
        }
    }


That's still empirical.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>

Nov 14 '05 #10
On 9 Dec 2004 12:47:36 -0800, "No Such Luck" <no*********@hotmail.com>
wrote:
Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsize/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

Anyone have any ideas?


I must be missing something. If N is the number of elements in the
vector and M is the number of desired subvectors, divide M into N
getting the quotient and remainder. Let them be q and r. Then r of
the M subvectors have length q+1 and M-r have length q. Mix them up
any way you like. Since this is comp.lang.c it might be nice to have
some C code:

q = N/M;
r = N-q*M;
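
To actually slice the vector, each piece also needs a starting offset, not only a length. The sketch below extends the q and r computation above under one particular choice of "mix": the first r pieces take q+1 elements. The function name chunk_bounds and its parameters are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Start offset and length of piece k (0-based) when an N-element vector
 * is cut into M pieces and the first r pieces get q+1 elements.
 * chunk_bounds is a hypothetical name. Requires M > 0 and k < M. */
void chunk_bounds(size_t N, size_t M, size_t k, size_t *start, size_t *len)
{
    size_t q, r;

    assert(M > 0 && k < M && start != NULL && len != NULL);

    q = N / M;
    r = N - q * M;                      /* same value as N % M */

    *len = q + (k < r ? 1 : 0);
    /* k pieces of length q come before piece k, plus one extra element
     * for each of the long pieces among them (at most r). */
    *start = k * q + (k < r ? k : r);
}
```

For N=45 and M=7 this gives piece 0 as offset 0, length 7, and piece 6 as offset 39, length 6, so the last piece ends exactly at element 45.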
Richard Harter, cr*@tiac.net
http://home.tiac.net/~cri, http://www.varinoma.com
I started out in life with nothing
I still have most of it left
Nov 14 '05 #11
On Fri, 17 Dec 2004 10:03:43 +0000, Richard Harter wrote:

....
I must be missing something. If N is the number of elements in the
vector and M is the number of desired subvectors, divide M into N
getting the quotient and remainder. Let them be q and r. Then r of
the M subvectors have length q+1 and M-r have length q. Mix them up
any way you like. Since this is comp.lang.c it might be nice to have
some C code:

q = N/M;
r = N-q*M;


Both of the code examples I posted in effect use this method, although you
can of course write the 2nd as r = N%M;

Eric's method is interesting though because it uses a fundamentally
different approach.

Lawrence

Nov 14 '05 #12
On Thu, 16 Dec 2004 10:49:33 -0800, No Such Luck wrote:

....
I'm not saying it doesn't work, I'm pretty sure that it does. My question
is whether you can PROVE that it is correct. What you have done is good
verification but unless you have tested EVERY possible variation it isn't
proof.


Like I said, I've tested Eric's algorithm on hundreds of vecsizes
ranging from 100 to 1500, and subcounts ranging from 1 to 100 (i.e.,
thousands of trials). All with correct results (all verified
numerically, and some verified visually).

What proof are you looking for? Something like:


....

No, proof of the correctness of the algorithm. Maybe something
like the following outline:

Initially there are subcount subvectors to define over vecsize elements.
Each final subvector can have size S where S is either vecsize/subcount or
vecsize/subcount+1.

Let T be the total number of subvectors remaining
to allocate. Initially T=subcount

Let N be the number of subvectors of size vecsize/subcount+1
remaining to allocate. Initially N=vecsize%subcount.

Each iteration of the loop reduces T by 1 until it reaches 0, and reduces
N by 0 or 1 depending on a calculation in the loop. The algorithm is
correct if it maintains the relation 0 <= N <= T at every iteration. In
particular this means that N is 0 when the loop terminates.
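
One way to fill in the outline, offered as a sketch and not as the thread's own proof: write vecsize = q*subcount + r with 0 <= r < subcount. Then (2*vecsize + subcount) / (2*subcount) equals q + 1 when 2*r >= subcount and q otherwise. In the first case r >= 1, so removing q + 1 elements leaves q*(subcount-1) + (r-1). In the second case r < subcount/2, so removing q elements leaves q*(subcount-1) + r with r still below subcount-1 (or r = 0 when subcount is 1). Either way the quotient stays q and N stays within 0 <= N <= T. The code below tracks T and N alongside the loop so the invariant can be checked mechanically; the name invariant_holds is hypothetical.

```c
#include <assert.h>

/* Runs the rounded-division loop while tracking T (sub-vectors left to
 * allocate) and N (long sub-vectors left to allocate) from the outline.
 * Returns 1 if every size is base or base+1, 0 <= N <= T held after
 * each step, and N, T and the leftover element count all end at 0.
 * invariant_holds is a hypothetical name. */
int invariant_holds(int vecsize, int subcount)
{
    int base, T, N;

    assert(vecsize >= 0 && subcount >= 1);

    base = vecsize / subcount;
    T = subcount;
    N = vecsize % subcount;

    for (; subcount > 0; --subcount) {
        int size = (2 * vecsize + subcount) / (2 * subcount);

        if (size != base && size != base + 1)
            return 0;

        vecsize -= size;
        T -= 1;
        N -= size - base;       /* drops by 1 only when a long one is used */

        if (N < 0 || N > T)
            return 0;
    }
    return N == 0 && T == 0 && vecsize == 0;
}
```

Running this over a range of inputs only exercises the invariant; the algebra in the paragraph above is what carries the argument.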

Lawrence
Nov 14 '05 #13
