Algorithm to break up a vector

No Such Luck

Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsiz e/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

Anyone have any ideas?

Thanks.

Nov 14 '05 #1

Subscribe Reply

3159

Michael Mair

No Such Luck wrote:

Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.
If you give us C code to work with, then we will point out
possible algorithmic shortcomings.
A request for a C algorithm is not topical here.
comp.programmin g might be a good starting point to ask for
an algorithm. Then you can implement it -- and if you have
problems because your program does not run as intended,
then you can isolate a minimal running program exhibiting
these problems and we will help you.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsiz e/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

Anyone have any ideas?

Yes. This sounds like an idiotic homework question. More natural
would be to give the maximum length of subvectors and to go for
7|7|7|6|6|6|6 instead of 6|7|6|7|6|7|6.

Hints:
45/7 = 6
45%7 = 3
Cheers
Michael
--
E-Mail: Mine is a gmx dot de address.

Nov 14 '05 #2

Eric Sosman

No Such Luck wrote:

Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsiz e/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

An easy way to proceed is (pseudocode):

int vecsize = length of original vector;
int subcount = number of sub-vectors desired;
int subsizes[subcount];
for ( ; subcount > 0; --subcount) {
subsizes[subcount-1] =
(2 * vecsize + subcount) / (2 * subcount);
vecsize -= subsizes[subcount-1];
}

(That mess in the middle is merely "vecsize / subcount,
rounded." You could use an ordinary integer division or
a "ceiling" without affecting the validity of the method,
but you'd get a different pattern of long and short
sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible. Its principal virtue is that it's
easy to see why it works, and hence hard to get wrong.

<off-topic>

It's also an illustrative example of a divide-and-conquer
solution that *doesn't* beg for a recursive implementation.

</off-topic>

--
Er*********@sun .com

Nov 14 '05 #3

No Such Luck

"Eric Sosman" <er*********@su n.com> wrote in message
news:cp******** **@news1brm.Cen tral.Sun.COM...

No Such Luck wrote:
Hi All:

I'm not sure if this is the right place to ask this question, but I
couldn't find a more appropriate group. This is more of a theory
question regarding an algorithm implemented in C, not necessarily a C
language question.

I'm trying to break up a vector into an arbitrary number of subvectors,
equal (or as near to equal) in size as possible. My problem occurs when
the vector is not evenly divisible by the number of subvectors (i.e.,
vector length: 45, number of subvectors: 7)

If I break up the vector based on (int)(vectorsiz e/numsubvectors), not
all of the original vector data will be included. If I break up the
vector based on ceil(vectorsize/numsubvectors), the end of the vector
will be passed, resulting in a memory violation.

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:

Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

An easy way to proceed is (pseudocode):

int vecsize = length of original vector;
int subcount = number of sub-vectors desired;
int subsizes[subcount];
for ( ; subcount > 0; --subcount) {
subsizes[subcount-1] =
(2 * vecsize + subcount) / (2 * subcount);
vecsize -= subsizes[subcount-1];
}

(That mess in the middle is merely "vecsize / subcount,
rounded." You could use an ordinary integer division or
a "ceiling" without affecting the validity of the method,
but you'd get a different pattern of long and short
sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible. Its principal virtue is that it's
easy to see why it works, and hence hard to get wrong.

<off-topic>

It's also an illustrative example of a divide-and-conquer
solution that *doesn't* beg for a recursive implementation.

</off-topic>

Thanks, Eric. I will give this implementation a whirl tomorrow. A divide and
conquer technique never occurred to me. I thought I would have to determine
the size of all the subvectors in the outset...

Nov 14 '05 #4

Mike Wahler

"No Such Luck" <no@such.luck > wrote in message
news:BZ******** ************@rc n.net...

Thanks, Eric. I will give this implementation a whirl tomorrow. A divide and conquer technique never occurred to me.

It should have. :-)

'Divide and conquer' is one of the cornerstones of programming,
in any language.

-Mike

Nov 14 '05 #5

Lawrence Kirby

On Thu, 09 Dec 2004 21:22:08 -0500, No Such Luck wrote:

....

The obvious solution to the above example be to have 3 subvectors of
size 7, and 4 subvectors of size 6, but I'm having a hard time thinking
of a way to implement this for an arbitrary vector size and number of
subvectors. I know the mod(%) operator is probably required, and I know
for the above example that the algorithm I need will result:
You can do this using modulo arithmetic and I give some example code
below. It doesn't use % to perform the modulo arithmetic but it could do.
Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

An easy way to proceed is (pseudocode):

int vecsize = length of original vector; int subcount = number of
sub-vectors desired; int subsizes[subcount]; for ( ; subcount > 0;
--subcount) {
subsizes[subcount-1] =
(2 * vecsize + subcount) / (2 * subcount);
vecsize -= subsizes[subcount-1];
}
}
(That mess in the middle is merely "vecsize / subcount, rounded." You
could use an ordinary integer division or a "ceiling" without affecting
the validity of the method, but you'd get a different pattern of long
and short sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible.
It doesn't distribute them evenly. For example with vecsize=45 and
subcount=14 I get subvectors from it like this:

3 4 3 4 3 4 3 3 3 3 3 3 3 3
Its principal virtue is that it's easy to see
why it works, and hence hard to get wrong.
I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the uneven
distribution indicates. Essentially it creates a non-linear
convergence.
<off-topic>

It's also an illustrative example of a divide-and-conquer
solution that *doesn't* beg for a recursive implementation.
If this qualifies as a divide and conquer algorithm it is a degenerate
one, more a lop one off the end and conquer. :-) You might just as well
say that a linear search (as well as many simple loops) is divide and
conquer because you can test the first (or last) element and then if
necessary do a linear search on the rest. Compare that to a binary search.
</off-topic>

Thanks, Eric. I will give this implementation a whirl tomorrow. A divide
and conquer technique never occurred to me. I thought I would have to
determine the size of all the subvectors in the outset...

Try this. It does produce an even (or as even as possible) distribution of
subvector sizes, although a small amount of bias can be controlled by the
initial value of sum. It doesn't write the results to an array but you can
do so easily if you want. With vecsize=45 and subcount=14 it produces

3 3 4 3 3 3 4 3 3 3 3 4 3 3
static void subsizes(int vecsize, int subcount)
{
int basesize = vecsize / subcount;
int bumps = vecsize % subcount;
int sum = subcount/2; /* Try also 0, subcount-1, (subcount+1)/2 */
int testtotal = 0; /* For checking only */
int i;

for (i = 0; i < subcount; i++) {
int subsize = basesize;

if ((sum += bumps) >= subcount) {
sum -= subcount;
subsize++;
}

printf(" %d", subsize);
testtotal += subsize;
}

printf("\n\ntes ttotal=%d\n", testtotal);
}

Lawrence

Nov 14 '05 #6

No Such Luck

Lawrence Kirby wrote:

On Thu, 09 Dec 2004 21:22:08 -0500, No Such Luck wrote:

...
The obvious solution to the above example be to have 3 subvectors of size 7, and 4 subvectors of size 6, but I'm having a hard time thinking of a way to implement this for an arbitrary vector size and number of subvectors. I know the mod(%) operator is probably required, and I know for the above example that the algorithm I need will result:
You can do this using modulo arithmetic and I give some example code
below. It doesn't use % to perform the modulo arithmetic but it could do. Subvector 1: Size 6
Subvector 2: Size 7
Subvector 3: Size 6
Subvector 4: Size 7
Subvector 5: Size 6
Subvector 6: Size 7
Subvector 7: Size 6

An easy way to proceed is (pseudocode):

int vecsize = length of original vector; int subcount = number of
sub-vectors desired; int subsizes[subcount]; for ( ; subcount > 0; --subcount) {
subsizes[subcount-1] =
(2 * vecsize + subcount) / (2 * subcount);
vecsize -= subsizes[subcount-1];
}
}
(That mess in the middle is merely "vecsize / subcount, rounded." You could use an ordinary integer division or a "ceiling" without affecting the validity of the method, but you'd get a different pattern of long and short sub-vectors.)

This method will balance the sub-vector lengths as
evenly as is possible.
It doesn't distribute them evenly. For example with vecsize=45 and
subcount=14 I get subvectors from it like this:

3 4 3 4 3 4 3 3 3 3 3 3 3 3
Its principal virtue is that it's easy to see
why it works, and hence hard to get wrong.

I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the

uneven distribution indicates. Essentially it creates a non-linear
convergence.

It works perfectly. I have implemented it and tried in on thousands of
variations of vecsize and subcount, all with correct results. And it
doesn't matter that the distribution of subsizes is uneven. The only
stipulation is that the sizes of all the subvectors are either n or
n+1, and that their sum is the original vecsize.

Nov 14 '05 #7

Lawrence Kirby

On Wed, 15 Dec 2004 13:20:21 -0800, No Such Luck wrote:

....

I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the uneven
distribution indicates. Essentially it creates a non-linear
convergence.

It works perfectly. I have implemented it and tried in on thousands of
variations of vecsize and subcount, all with correct results. And it

I'm not saying it doesn't work, I'm pretty sure that it does. My question
is whether you can PROVE that it is correct. What you have done is good
verification but unless you have tested EVERY possible variation it isn't
proof.
doesn't matter that the distribution of subsizes is uneven. The only
stipulation is that the sizes of all the subvectors are either n or
n+1, and that their sum is the original vecsize.

In which case you can solve this very simply, for example

static void subsizes2(int vecsize, int subcount)
{
int basesize = vecsize / subcount;
int bumps = vecsize % subcount;
int i;

for (i = 0; i < subcount; i++) {
printf(" %d", basesize + (i < bumps));
}

putchar('\n');
}

Nov 14 '05 #8

No Such Luck

Lawrence Kirby wrote:

On Wed, 15 Dec 2004 13:20:21 -0800, No Such Luck wrote:

...
I can see that it probably works, but I haven't proved it to my
satisfaction yet. This isn't as simple as it first looks as the uneven
distribution indicates. Essentially it creates a non-linear
convergence.

It works perfectly. I have implemented it and tried in on thousands of variations of vecsize and subcount, all with correct results. And

it
I'm not saying it doesn't work, I'm pretty sure that it does. My question is whether you can PROVE that it is correct. What you have done is good verification but unless you have tested EVERY possible variation it isn't proof.

Like I said, I've tested Eric's algorithm on hundreds of vecsizes
ranging from 100 to 1500, and subcounts ranging from 1 to 100 (i.e.,
thousands of trials). All with correct results (all verified
numerically, and some verified visually).

What proof are you looking for? Something like:

for (vecsize = 1; vecsize < 1000000; vecsize++)
{
for (subcount = 1; soubcount < 1000000; subcount++)
{
test_algorithm( vecsize, subcount);
}
}

Nov 14 '05 #9

Mark McIntyre

On 16 Dec 2004 10:49:33 -0800, in comp.lang.c , "No Such Luck"
<no*********@ho tmail.com> wrote:

Like I said, I've tested Eric's algorithm on hundreds of vecsizes
ranging from 100 to 1500, and subcounts ranging from 1 to 100 (i.e.,
thousands of trials). All with correct results (all verified
numerically, and some verified visually).
This isn't proof tho, its empirical evidence. A similar huge body of
evidence exists to support Newton's laws. They're wrong.
What proof are you looking for?
You seem to be trying to prove an algo. I'd suggest a mathematical method.
Something like:

for (vecsize = 1; vecsize < 1000000; vecsize++)
{
for (subcount = 1; soubcount < 1000000; subcount++)
{
test_algorithm( vecsize, subcount);
}
}

Thats still empirical.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt >

----== Posted via Newsfeeds.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =----

Nov 14 '05 #10

Similar topics

1614

data structure/algorithm problem

by: nan.li.g | last post by:

Hello, all, This is probably not so much related to C++ (Sorry in advance). But I did get this question during a C++ job interview. You are asked to design some data structure and algorith for the following function: List getNames ( List nameList, String prefix ); What it does is: given a name list and a prefix, return all the names in the name list that start with prefix.

C / C++

808

STL transform algorithm

by: Patrick Guio | last post by:

Dear all, I am trying to use the std::transform algorithm to to the following vector< vector<char> >::iterator ik = keys.begin(); // key list iterator vector< vector<char> >::iterator is = ik; // save begin vector<char>::const_iterator i = ik->begin(); vector<char>::const_iterator e = ik->end(); vector<char>::iterator s = is->begin();

C / C++

4340

STL algorithm VS Java loop

by: Allerdyce.John | last post by:

Hi, I am trying to compare the amount of work between using STL algorithm VS a plain Java loop. Let's say the class Rect has 2 attributes: area, and areaPerCent. In Java, I just write a plain for loop with a list: public static void calculateAreaPerCent(List rectList, float containerArea) {

C / C++

4551

find_if algorithm on multimaps

by: devel | last post by:

Hello, Could someone tell me why the find_if statement applied to my multimap dictionnary is doesn't compile? Does this algorithm doesn't work on a multimap? I don't understand why the adjacent_find compile and not the find_if statement? By the way my first intention was to write: find_if (j, dictionnary.end(), bind1st (not_equal_to <string> (), *j));

C / C++

2577

About the find() is STL algorithm

by: Draw | last post by:

Hi All, Just a thought, about the find() algorithm in the C++ STL. I read that the find algorithm can take a range of iterators. If it does not find the element it is looking for in that range it returns the iterator to the last element in the range, not to the last element in the container, to the last element in the range. That being said, how can we tell if find() has been successful in finding the element we need? Its easy when we...

C / C++

2148

Algorithm template specialization problem

by: Sherrie Laraurens | last post by:

Hi all, I'm trying to write a generic algorithm routine that will take begin and end iterators of a container, iterate through the range and perform a "calculation" of sorts. The trouble is that the algorithm will behave differently for the two different types. I've considered calling the algorithm foo_A and foo_B, but I don't like that approach because it will blow out in naming complexity down the track.

C / C++

4891

use gale shapley's algorithm to match companies and persons

by: nabh4u | last post by:

i have a program where i have to use gale shapley's algorithm to match companies and persons. i create preference lists for both companies and persons. i am almost done but i am not getting the output. i should get the output with the persons who were hired and the companies who hire them. please help me as i have a very short deadline to submit this program. /----match.h--------------------------------------------------------------/ ...

C / C++

4092

Best algorithm/data structure combination for word lookup?

by: pj | last post by:

Hi, I 'm currently writing a program that performs transliteration (i.e., converts greek text written using the english alphabet to "pure" greek text using the greek alphabet) as part of my thesis. I have decided to add the capability to convert words using some sort of lookup algorithm as a sidekick to the "normal" conversion algorithm, and here it starts getting interesting. I want to find an algorithm that satisfies these...

C / C++

6073

removing elements from vector<int> using <algorithm>

by: arnuld | last post by:

WANTED: /* C++ Primer - 4/e * * Exercise: 9.26 * STATEMENT * Using the following definition of ia, copy ia into a vector and into a list. Use the single iterator form of erase to remove the elements with odd values from your list * and the even values from your vector.

C / C++

8991

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...

General

8830

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...

Windows Server

9544

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...

C / C++

9324

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...

Windows Server

8243

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...

Career Advice

6796

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...

Microsoft Access / VBA

6074

Couldn’t get equations in html when convert word .docx file to html file in C#.

by: conductexam | last post by:

I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...

C# / C Sharp

2783

How to add payments to a PHP MySQL app.

by: muto222 | last post by:

How can i add a mobile payment intergratation into php mysql website.

PHP

2215

Comprehensive Guide to Website Development in Toronto: Expert Insights from BSMN Consultancy

by: bsmnconsultancy | last post by:

In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

General