473,883 Members | 1,786 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Vector question

Hi, everyone.

Given a vector<int>, what is the fastest way to find out whether there is a
repeated element in it? The result is just "true" or "false". Thanks.

Kitty
Jul 22 '05 #1
16 2754
In article <41**********@r ain.i-cable.com>, Kitty <No spam> wrote:

Given a vector<int>, what is the fastest way to find out whether there is a
repeated element in it? The result is just "true" or "false". Thanks.


I'd construct an empty set<int>, then walk through the vector, processing
each element in turn. If the element is in the set already, stop and
return 'true'; otherwise insert the element into the set and continue.
If you reach the end of the vector without finding any duplicates, return
'false'.

--
Jon Bell <jt*******@pres by.edu> Presbyterian College
Dept. of Physics and Computer Science Clinton, South Carolina USA
Jul 22 '05 #2
"Kitty" <No spam> wrote in message news:41******** **@rain.i-cable.com...
Given a vector<int>, what is the fastest way to find out whether there is a repeated element in it? The result is just "true" or "false". Thanks.


Sounds like an interview question. There are many answers, each with
tradeoffs. Jon's method uses additional memory, but is fast time wise (time
is N*O(lg(N), space is O(N)), and easy to write too -- that's important too
because software that is easier to write and maintain pays off long term,
though this rarely comes up in interviews (though it should). You can also
not use additional memory, but sacrifice time (time would be O(N^2), space
is O(1)). And each method has its own varations, such as set or hashtable,
type of hash function if a hash, set or sorted vector, vector or list, etc,
etc.
Jul 22 '05 #3
In a similar problem, I use a different approach due to the fact that
jon's solution is not an option as it consumes too much memmory.

I use a modified qsort and in the comparation part if two sucessive
elements are the same, I return true. This is slow with small lists but
fast with huge datasets (30.000+ as in my application takes about 100ms)
but require no additional memmory!

/Casper

Kitty wrote:
Hi, everyone.

Given a vector<int>, what is the fastest way to find out whether there is a
repeated element in it? The result is just "true" or "false". Thanks.

Kitty

Jul 22 '05 #4
"Casper" <ca****@jbr.d k> wrote in message news:GFeYc.7339 6
In a similar problem, I use a different approach due to the fact that
jon's solution is not an option as it consumes too much memmory.

I use a modified qsort and in the comparation part if two sucessive
elements are the same, I return true. This is slow with small lists but
fast with huge datasets (30.000+ as in my application takes about 100ms)
but require no additional memmory!


A good solution, I think the fastest possible without using extra space.
But it changes the original vector. Can you think of a way to do it without
changing the original?
Jul 22 '05 #5
"Kitty" <No spam> wrote:
Hi, everyone.

Given a vector<int>, what is the fastest way to find out whether there is
a repeated element in it? The result is just "true" or "false". Thanks.

Kitty


Here are three obvious ways of using standard algorithms. These solutions
have the advantage that they are easy to understand. The second one could
avoid copying if it was allowed to modify the sequence in the first place.

#include <vector>
#include <set>
#include <iostream>

template < typename ConstIter >
bool has_dublicates_ a ( ConstIter from, ConstIter to ) {
typedef typename std::iterator_t raits< ConstIter >::value_type value_type;
typedef typename std::set< value_type > set_type;
typedef typename set_type::size_ type size_type;

size_type i = 0;
set_type s;
for ( ConstIter iter = from; iter != to; ++iter ) {
++ i;
s.insert( *iter );
// this test could also be done after the loop:
if ( s.size() != i ) {
return( true );
}
}
return( false );
}

template < typename ConstIter >
bool has_dublicates_ b ( ConstIter from, ConstIter to ) {
typedef typename std::iterator_t raits< ConstIter >::value_type value_type;
typedef typename std::vector< value_type > vector_type;

vector_type v ( from, to );
std::sort( v.begin(), v.end() );
return( std::adjacent_f ind( v.begin(), v.end() ) != v.end() );
}

int main( int argn, char ** args ){
std::vector< int > a;
std::vector< int > b;
for ( int i = 0; i < 10; ++i ) {
a.push_back( i );
b.push_back( i ); b.push_back( i );
}

std::cout << has_dublicates_ a( a.begin(), a.end() )
<< " "
<< has_dublicates_ a( b.begin(), b.end() ) << "\n";
std::cout << has_dublicates_ b( a.begin(), a.end() )
<< " "
<< has_dublicates_ b( b.begin(), b.end() ) << "\n";
}

As for speed, I would suggest a method based on the "partition by exchange"
idea underlying quicksort, possibly modified like introsort to avoid O(N^2)
worst case runtime. But that idea has been mentioned in another posting
already.
Best

Kai-Uwe Bux
Jul 22 '05 #6

"Kitty" <No spam> wrote in message news:41******** **@rain.i-cable.com...
Hi, everyone.

Given a vector<int>, what is the fastest way to find out whether there is a
repeated element in it? The result is just "true" or "false". Thanks.

Kitty

See the thread titled "If vector contains only different elements" at
http://groups.google.com/groups?thre...0uni-berlin.de
--
Alex Vinokur
http://mathforum.org/library/view/10978.html
http://sourceforge.net/users/alexvn
Jul 22 '05 #7
Siemel Naran wrote:
"Casper" <ca****@jbr.d k> wrote in message news:GFeYc.7339 6
In a similar problem, I use a different approach due to the fact that
jon's solution is not an option as it consumes too much memmory.

I use a modified qsort and in the comparation part if two sucessive
elements are the same, I return true. This is slow with small lists but
fast with huge datasets (30.000+ as in my application takes about 100ms)
but require no additional memmory!
A good solution, I think the fastest possible without using extra space.


I agree that this is probably the fastest solution using comparisons. Given
that the question was for vector<int>, there could be a linear time
solution. Here is a (messy) draft of something like that. The idea is to do
partition by exchange based on the even/odd distinction (note that
dublicates have the same parity). So in one pass, we put all the even
elements to the left of all the odd elements. During the same pass, shift
down by one bit. Now, we return true if there is a dublicate in the left
segment or in the right segment. Farther down in the recursion, we can
sometimes return true just based on the length of the segment: if we have
shifted so many times that all values are in [0..15], then any segment of
length 17 is bound to contain dublicates. The run time for the worst case
should be O( N * bit_length<int> ).
#include <vector>
#include <iostream>
#include <kubux/sequenceio>

typedef std::vector< unsigned int > Uvector;

void print_range ( u_int* from, u_int* to ) {
std::cerr << "[ ";
while ( from <= to ) {
std::cerr << *from << " ";
++from;
}
std::cerr << "]";
}

bool has_dublicates_ helper ( u_int* from, u_int* to, u_int max ) {
// WARNING, range is closed: [from,to]
print_range( from, to );
if ( to <= from ) {
return( false );
}
if ( max < to - from ) {
return( true );
}
u_int* low = from;
u_int* high = to;
while ( true ) {
while ( ( low < high ) && ( *low % 2 == 0 ) ) {
*low >>= 1;
++low;
}
while ( ( low < high ) && ( *high % 2 != 0 ) ) {
*high >>= 1;
--high;
}
// either ( low == high ) or ( *high is even )
if ( low < high ) {
// *low is odd and *high is even
std::swap( *low, *high );
} else {
break;
}
}
std::cerr << std::endl;
print_range( from, to );
std::cerr << std::endl;
// low == high;
if ( *low % 2 == 0 ) {
*low >>= 1;
return( has_dublicates_ helper( from, low, max >> 1 )
||
has_dublicates_ helper( high+1, to, max >>1 ) );
} else {

*low >>= 1;
return( has_dublicates_ helper( from, low-1, max >> 1 )
||
has_dublicates_ helper( high, to, max >>1 ) );
}
}

bool has_dublicates ( Uvector u_vect ) {
// this makes a copy
return( has_dublicates_ helper( &u_vect[0], &u_vect[ u_vect.size()-1 ],
std::numeric_li mits< unsigned int >::max() ) );
}

int main( int argn, char ** args ){
Uvector a;
Uvector b;
for ( int i = 0; i < 10; ++i ) {
a.push_back( i );
b.push_back( i ); b.push_back( i );
}

std::cout << has_dublicates( a )
<< " "
<< has_dublicates( b )
<< "\n";
}

But it changes the original vector. Can you think of a way to do it
without changing the original?


Same here, I do not see how to avoid that.
Best

Kai-Uwe Bux
Jul 22 '05 #8

"Siemel Naran" <Si*********@RE MOVE.att.net> wrote in message
news:GS******** *************@b gtnsc05-news.ops.worldn et.att.net...
"Casper" <ca****@jbr.d k> wrote in message news:GFeYc.7339 6
In a similar problem, I use a different approach due to the fact that
jon's solution is not an option as it consumes too much memmory.

I use a modified qsort and in the comparation part if two sucessive
elements are the same, I return true. This is slow with small lists but
fast with huge datasets (30.000+ as in my application takes about 100ms)
but require no additional memmory!
A good solution, I think the fastest possible without using extra space.
But it changes the original vector. Can you think of a way to do it

without changing the original?
If time is not an issue, perhaps this will do?

bool has_duplicates( vector<int> const& i_Ints)
{
for (
std::vector<int >::const_iterat or cit = i_Ints.begin();
cit != i_Ints.end();
)
{
int i = *cit++;
std::vector<int >::const_iterat or jit = cit;
while (jit != i_Ints.end())
if (*jit++ == i)
return true;
}
return false;
}
cheers,
Conrad Weyns

Jul 22 '05 #9
The most important thing is to keep elements in the same positions after
processing. That is sorting-like algorithm is not allowed.
"Kitty" <No spam> ¦b¶l¥ó news:41******** **@rain.i-cable.com ¤¤¼¶¼g...
Hi, everyone.

Given a vector<int>, what is the fastest way to find out whether there is a repeated element in it? The result is just "true" or "false". Thanks.

Kitty

Jul 22 '05 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
11437
by: Jessica | last post by:
Hi, I do not have a lot of experience with STL and I hope some of you might be able to help me on this seemingly elementary question. I have a vector of doubles (v1). I am trying to copy the values to a 2D vector, of which every vector has the same length. I tried the following but I get a "System.NullReferenceException" error when I ran it.
10
7086
by: Stefan Höhne | last post by:
Hi, as I recon, std::vector::clear()'s semantics changed from MS VC++ 6.0 to MS' DOT.NET - compiler. In the 6.0 version the capacity() of the vector did not change with the call to clear(), in DOT.NET the capacity() is reduced to 0.
12
3171
by: No Such Luck | last post by:
Hi All: I'm not sure if this is the right place to ask this question, but I couldn't find a more appropriate group. This is more of a theory question regarding an algorithm implemented in C, not necessarily a C language question. I'm trying to break up a vector into an arbitrary number of subvectors, equal (or as near to equal) in size as possible. My problem occurs when the vector is not evenly divisible by the number of subvectors...
24
2975
by: toton | last post by:
Hi, I want to have a vector like class with some additional functionality (cosmetic one). So can I inherit a vector class to add the addition function like, CorresVector : public vector<Corres>{ public: void addCorres(Corres& c); //it do little more than push_back function. }
2
3336
by: danielhdez14142 | last post by:
Some time ago, I had a segment of code like vector<vector<int example; f(example); and inside f, I defined vector<int>'s and used push_back to get them inside example. I got a segmentation fault which I resolved by doing vector<vector<int example; example.push_back(vector<int>());
9
3766
by: Jess | last post by:
Hello, I tried to clear a vector "v" using "v.clear()". If "v" contains those objects that are non-built-in (e.g. string), then "clear()" can indeed remove all contents. However, if "v" contains built-in types (e.g. int), then "clear()" doesn't remove anything at all. Why does "clear()" have this behaviour? Also, when I copy one vector "v1" from another vector "v2", with "v1" longer than "v2" (e.g. "v1" has 2 elements and "v2" has...
7
3904
by: nw | last post by:
Hi, We've been having a discussion at work and I'm wondering if anyone here would care to offer an opinion or alternative solution. Aparently in the C programming HPC community it is common to allocate multidimentional arrays like so: int *v_base = (int *) malloc(1000000*sizeof(int)); int **v = (int **) malloc(1000*sizeof(int *));
6
11630
by: jmsanchezdiaz | last post by:
CPP question: if i had a struct like "struct str { int a; int b };" and a vector "std::vector < str test;" and wanted to push_back a struct, would i have to define the struct, fill it, and then push_back it, or could i pushback the two ints directly somehow? Thanks for all.
13
1935
by: prasadmpatil | last post by:
I am new STL programming. I have a query regarding vectors. If I am iterating over a vector using a iterator, but do some operations that modify the size of the vector. Will the iterator recognize this? I wrote the following program to test this out. #include <fstream> #include <iostream> #include <string>
6
7393
by: Mr. K.V.B.L. | last post by:
I want to start a map with keys but an empty vector<string>. Not sure what the syntax is here. Something like: map<string, vector<string MapVector; MapVector.insert(make_pair("string1", new vector<string>)); MapVector.insert(make_pair("string2", new vector<string>)); MapVector.insert(make_pair("string3", new vector<string>));
0
9942
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9792
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
11142
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10743
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
7971
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
5797
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5991
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
4220
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3233
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.