sorting K+1 vectors with NlogN comparisons not (K+1)NlogN

Pratyush

Hi,

Suppose there is a vector of objects of class A, i.e., std::vector<A>
vec_A(N); The class A satisifies all the STL vector requirements.

Now I wish to add some attributes for each of the objects in the
vector vec_A. Suppose there are K attributes to be added. For each of
the attributes I define K vectors of appropriate types. Say, the
attributes have types type1, type2, ..., typeK. So I define
std::vector<type1> attr1(vec_A.size());
std::vector<type2> attr2(vec_A.size());
....
std::vector<typeK> attrK(vec_A.size());

The important condition is that the object at location i in vec_A has
attributes at location i in attr1, attr2, ..., attrK.

Now suppose I need to sort the vector vec_A on basis of some of the
attributes. How can it be efficiently (both in terms of time and
memory) done so that the above mentioned condition is satisfied?

One approach that I used is to have copies of the basis attributes
vectors. Then sort vec_A, attr1, attr2, ..., attrK using the copies.
But this involves on average (K+1)*NlogN comparisons of the basis
attributes. Additionally I am creating copies of the basis attributes.
So this approach seems quite unsatisfactory. Is it possible to have
only NlogN comparisons? Best if no copies of the basis attributes are
involved.

Of course, one can define class A to have the attributes as its
fields. But then its not general and flexible enough. In that case,
one loses the flexibilty of having one attribute, two attributes and
so on.

One example of such a need can be the case where class A represent a
vertex of a graph, and the attributes are different properties of a
vertex.
Thanks & regards,
Pratyush

Jul 19 '05 #1

Subscribe Post Reply

3388

tom_usenet

On 10 Nov 2003 00:24:27 -0800, pk****@noida.atrenta.com (Pratyush)
wrote:

Hi,

Suppose there is a vector of objects of class A, i.e., std::vector<A>
vec_A(N); The class A satisifies all the STL vector requirements.

Now I wish to add some attributes for each of the objects in the
vector vec_A. Suppose there are K attributes to be added. For each of
the attributes I define K vectors of appropriate types. Say, the
attributes have types type1, type2, ..., typeK. So I define
std::vector<type1> attr1(vec_A.size());
std::vector<type2> attr2(vec_A.size());
...
std::vector<typeK> attrK(vec_A.size());

The important condition is that the object at location i in vec_A has
attributes at location i in attr1, attr2, ..., attrK.
Parallel vectors are rarely a good idea!

Now suppose I need to sort the vector vec_A on basis of some of the
attributes. How can it be efficiently (both in terms of time and
memory) done so that the above mentioned condition is satisfied?
Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
sort that, using a sorting criteria that looks up elements in the A
vector. e.g.

struct IndexSort
{
private:
//...
//probably begin iterators for the relevent attibute vectors.
//or references to those containers
public:
typedef bool result_type;
typedef int first_argument_type;
typedef int second_argument_type;

IndexSort(/*probably begin iterators for the relevent attibute
vectors*/)
{}

bool operator()(first_argument_type lhs,
second_argument_type rhs) const
{
//lhs and rhs give the index in the vectors to compare.
return true; //perform comparison
}
};

std::sort(indexvec.begin(), indexvec.end(), IndexSort(/*args*/));

Now you have a vector like:

(10, 99, 2, 7, ...)

You now use this to rearrange all of your parallel vectors into the
correct order. e.g.

template <class It, class RanIt>
void index_shuffle(It begin, It end, RanIt toSort)
{
using std::iter_swap;
RanIt it = toSort;
while(begin != end)
{
iter_swap(it, toSort + *begin);
++begin;
++it;
}
}

calling it on each:
index_shuffle(indexvec.begin(), indexvec.end(), vec_A.begin());
index_shuffle(indexvec.begin(), indexvec.end(), attr1.begin());
....
index_shuffle(indexvec.begin(), indexvec.end(), attrK.begin());

One approach that I used is to have copies of the basis attributes
vectors. Then sort vec_A, attr1, attr2, ..., attrK using the copies.
But this involves on average (K+1)*NlogN comparisons of the basis
attributes. Additionally I am creating copies of the basis attributes.
So this approach seems quite unsatisfactory. Is it possible to have
only NlogN comparisons? Best if no copies of the basis attributes are
involved.
Sort a vector of indices then, applying the reordering to each vector
in turn. NlogN comparisons as required.

Of course, one can define class A to have the attributes as its
fields. But then its not general and flexible enough. In that case,
one loses the flexibilty of having one attribute, two attributes and
so on.

One example of such a need can be the case where class A represent a
vertex of a graph, and the attributes are different properties of a
vertex.

There are many ways to give flexible attribute lists. At the very
least you have:
struct AHolder
{
A a;
type1 attr1;
//...
typek attrk;
};

A runtime map<string, type>, or compile time template parameters
(typelist, tuple, etc.) are alternatives.

Incidently, you can find a graphs library here:

http://www.boost.org/libs/graph/doc/..._contents.html

Tom

Jul 19 '05 #2

David Rubin

tom_usenet wrote:

[snip - how to sort many parallel vectors?]

Now suppose I need to sort the vector vec_A on basis of some of the
attributes. How can it be efficiently (both in terms of time and
memory) done so that the above mentioned condition is satisfied?
Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
sort that, using a sorting criteria that looks up elements in the A
vector. e.g.

[snip] Now you have a vector like:

(10, 99, 2, 7, ...)

You now use this to rearrange all of your parallel vectors into the
correct order. e.g.

Once you have this vector (v), you don't need to rearrange the others.
The ith element in this vector refers to element vec_A[v[i]] with
attributes attr*[v[i]].

/david

--
Andre, a simple peasant, had only one thing on his mind as he crept
along the East wall: 'Andre, creep... Andre, creep... Andre, creep.'
-- unknown

Jul 19 '05 #3

Pratyush

David Rubin <bo***********@nomail.com> wrote in message news:<3F***************@nomail.com>...

tom_usenet wrote:

[snip - how to sort many parallel vectors?]
Now suppose I need to sort the vector vec_A on basis of some of the
attributes. How can it be efficiently (both in terms of time and
memory) done so that the above mentioned condition is satisfied?

Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
sort that, using a sorting criteria that looks up elements in the A
vector. e.g.

[snip]
Now you have a vector like:

(10, 99, 2, 7, ...)

You now use this to rearrange all of your parallel vectors into the
correct order. e.g.

Once you have this vector (v), you don't need to rearrange the others.
The ith element in this vector refers to element vec_A[v[i]] with
attributes attr*[v[i]].

/david

Thanks a lot guys. I think if the usage is such that we need to access
the attributes too often then its better to rearrange the vectors. In
this case, instead of two deferences we will be doing one. Otherwise
we can use indexing as suggested by David.

Thanks & regards,
Pratyush

Jul 19 '05 #4

Pratyush

tom_usenet <to********@hotmail.com> wrote in message news:<60********************************@4ax.com>. ..

On 10 Nov 2003 00:24:27 -0800, pk****@noida.atrenta.com (Pratyush)
wrote:
Hi, [snip]
Now suppose I need to sort the vector vec_A on basis of some of the
attributes. How can it be efficiently (both in terms of time and
memory) done so that the above mentioned condition is satisfied?
Create a vector containing (0, 1, 2, 3, 4, ..., vec_A.size() - 1). Now
sort that, using a sorting criteria that looks up elements in the A
vector. e.g.

struct IndexSort
{

[snip] };

std::sort(indexvec.begin(), indexvec.end(), IndexSort(/*args*/));

Now you have a vector like:

(10, 99, 2, 7, ...)

You now use this to rearrange all of your parallel vectors into the
correct order. e.g.

template <class It, class RanIt>
void index_shuffle(It begin, It end, RanIt toSort)
{
using std::iter_swap;
RanIt it = toSort;
while(begin != end)
{
iter_swap(it, toSort + *begin);
++begin;
++it;
}
}
It seems that this index_shuffle algorithm is not correct. For
example, consider the following:

int vi[] = {5,3};

// vector containing (0,1)
vector<int> indexvec(2); indexvec[0] = 0; indexvec[1] = 1;

// sort using index sort mechanism as suggested by tom_usenet
std::sort(indexvec.begin(), indexvec.end(), IndexSort(/*args*/))

//now indexvec contains (1,0)

index_shuffle(indexvec.begin(), indexvec.end(), vi);

// this step will produce vi containing (5,3) and not (3,5)...
//
// basically the while-loop in index-shuffle will loop twice.
// in the first loop it will do iter_swap(vi, vi+1).
// in the second loop it will do iter_swap(vi+1, vi).
But anyways thanks for the pointer. I now need to write the correct
index_shuffle algorithm. By sorting the indices, we are creating a
permutation of those indices. It seems one needs to take into account
the cycles in a permutation while performing the index_shuffle.

[snip]

Jul 19 '05 #5

tom_usenet

On 11 Nov 2003 01:02:02 -0800, pk****@noida.atrenta.com (Pratyush)
wrote:

It seems that this index_shuffle algorithm is not correct. For
example, consider the following:

int vi[] = {5,3};

// vector containing (0,1)
vector<int> indexvec(2); indexvec[0] = 0; indexvec[1] = 1;

// sort using index sort mechanism as suggested by tom_usenet
std::sort(indexvec.begin(), indexvec.end(), IndexSort(/*args*/))

//now indexvec contains (1,0)

index_shuffle(indexvec.begin(), indexvec.end(), vi);

// this step will produce vi containing (5,3) and not (3,5)...
//
// basically the while-loop in index-shuffle will loop twice.
// in the first loop it will do iter_swap(vi, vi+1).
// in the second loop it will do iter_swap(vi+1, vi).
Yes, you're right, I'm afraid I didn't put much thought into it.

But anyways thanks for the pointer. I now need to write the correct
index_shuffle algorithm. By sorting the indices, we are creating a
permutation of those indices. It seems one needs to take into account
the cycles in a permutation while performing the index_shuffle.

This can't be done in place without making it at least O(n log n), and
more likely O(n*n). You have to find the disjoint cycles and then
rotate them (this takes me back to discrete maths and combinatorics!).
So I suggest you do a copy algorithm:

#include <algorithm>

template <class FwdIt1, class RanIt, class FwdIt2>
void index_shuffle_copy(FwdIt1 indexBegin, FwdIt1 indexEnd, RanIt
source, FwdIt2 output)
{
using std::iter_swap;
while(indexBegin != indexEnd)
{
*output = *(source + *indexBegin);
++indexBegin;
++output;
}
}

Unfortunately this annoyingly makes you copy everything. You can of
course do a swap at the end to minimize copying.

Tom

Jul 19 '05 #6

Similar topics

puzzle

by: Darius | last post by:

ok here is a puzzle. there is an integer array whose length is odd, and all the numbers in the array appear exactly two times except one. Find the number in O(n) time. Try to do it without using...

C / C++

sorting with expensive compares?

by: Dan Stromberg | last post by:

Hi folks. Python appears to have a good sort method, but when sorting array elements that are very large, and hence have very expensive compares, is there some sort of already-available sort...

Python

Sorting lists containing big objects.

by: ALiX | last post by:

I am using a std::list<MyClass> where objects of type MyClass can be big in size. Can using std::list<>::sort result in objects inside the list being copied around or does the list merely reassign...

C / C++

Fast STL data structure

by: tim.lino | last post by:

Hello, I would like to use C++ STL to store a set of Object's which is as follows: class Object { public: int value; ......

C / C++

A question: Is 200,000 element array worth sorting and search?

by: mike-yue | last post by:

The topic comes from a question: Would you rather wait for the results of a quicksort, a linear search, or a bubble sort on a 200000 element array? 1Quicksort 2Linear Search 3Bubble Sort ...

C / C++

Using a link list over an array.

by: pereges | last post by:

Hi, I am wondering which of the two data structures (link list or array) would be better in my situation. I have to create a list of rays for my ray tracing program. the data structure of ray...

C / C++

python 3: sorting with a comparison function

by: Thomas Heller | last post by:

Does Python 3 have no way anymore to sort with a comparison function? Both .sort() and sorted() seem to accept only 'key' and 'reverse' arguments, the 'cmp' argument seems to be gone. Can that...

Python

sorting list of complex numbers

by: skip | last post by:

The thread on sorting in Python 3 got me to thinking. How could I sort a list of complex numbers using key? As expected: Traceback (most recent call last): File "<stdin>", line 1, in...

Python

Sorting lists

by: asc | last post by:

Hi all, I have a problem and I'm not sure whether sort() can help me. I understand that if I have a list; say L = I can use L.sort() and I will then have; L = But my problem is this. I have a...

Python

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++