435,300 Members | 2,044 Online
Need help? Post your question and get tips & solutions from a community of 435,300 IT Pros & Developers. It's quick & easy.

# PLS HELP w/ Interview Question!!

 P: n/a I've been looking for a job for a while now, and have run into this interview question twice now... and have stupidly kind of blown it twice... (although I've gotten better)... time to finally figure this thing out... basically the interview question is: given an unsorted listed such as: 3,1,3,7,1,2,4,4,3 find the FIRST UNIQUE number, in this case 7... and of course the list can be millions long... the first time I got this question, my solution was something like: 1) declare a second array A2 with n elements 2) loop through original array A1 and insert the element in its sorted position in A2, if I find a match eliminate, mark all copies of that element as "duplicate" 3) loop through original array A1 again and find the first number that is not listed as a duplicate in A2 I think at the time of that interview, I pointed out that this wasnt a very good algorithm as it would be something like O(2*(n^2)), but it will get the job done (I did get offered that job actually, but stupidly turned it down)... I then offered up another solution at that interview which I thought was better: 1) build a copy of A1 that has the original position of each element 2) sort A1 3) loop through sorted A1 deleting any duplicate items 4) now A1 will only contain unique items... so loop through it again looking for the smallest position that was marked in step 1 this was kind of the algorithm I started out with in interview #2 (which I really wanted, but didnt get)...as the interviewer asked me to give him the run time of that algorithm which I said was: step #1: n step #2: n log n step #3: n step #4: n resulting in 3n + n log n... this was a lot better then the original algorithm, but still not too great because of the 3n term... I got my rejection from the 2nd place yesterday, so I kind of was up all night trying to figure out how to get it faster... I came up with something like this: 1) loop through A1 building a customized AVL tree... an AVL tree insertion is O(log n), so building this tree should take O(n log n).. the customized part is... each node inserted will bring along its original position in the array (0..n)... if I run across a duplicate node, I will mark the position of the node thats already in the tree as "-1" and dump the node I'm trying to insert so basically I've spent O(n log n) to combine steps 1, 2 & 3 of my last solution... basically taking 2n + n log n down to O(n log n)... 2) but now I need to visit every node in the tree to find out which one has the lowest position which is O(n) obviously ignoring any nodes I marked as -1. so this final "optimized" solution I came up with last night would still require O(n + n log n). Is there any way to optimize this further? or a better algorithm all together? I'm thinking perhaps to keep track of the first unique node as I'm creating the tree? but I couldn't nail down an idea that would work... I'm sort of invisioning the tree having a member called m_nPos initialized to 0... and as I'm building the tree, if I find a duplicate element at m_nPos, to increment it. So basically I insert element 0, then later on, I find a duplicate for it, so the first unique item can't possibly be at zero, it has to be at least at position 1. But then as I did a few test cases, I poke holes in this idea very easily. Any ideas? Nov 18 '06 #1
18 Replies

 P: n/a basically the interview question is: given an unsorted listed such as: 3,1,3,7,1,2,4,4,3 Where there any other constraints? Nov 18 '06 #2

 P: n/a >basically the interview question is: given an unsorted listed such as:3,1,3,7,1,2,4,4,3 Where there any other constraints? Nope, just given an unsorted list of n numbers (n could be millions), find the first unique number. Obviously you can't use a built in "FindFirstUnique()" number function, the exercise was to write that function. I thought there may be some tricks with using a database, but thought the overhead of the database would kill any performance gains in the algorithm. Obviously the point is to find the first unique number in the shortest time possible... the algorithm I described in the original post was O (n + n log n)... I'm wondering if it can be done faster. Nov 18 '06 #3

 P: n/a No idea for solving that. The AVL tree is unclear for me, but a brute-force solution is simple, though time-consuming Nov 18 '06 #4

 P: n/a Nobody wrote: I've been looking for a job for a while now, and have run into this interview question twice now... and have stupidly kind of blown it twice... (although I've gotten better)... time to finally figure this thing out... basically the interview question is: given an unsorted listed such as: 3,1,3,7,1,2,4,4,3 Here's the first method that springs to mind for me: Create a vector of bools initialised to false the same size as the list, and then run through the list, setting the item in the vector at the position of the number encountered to true, and removing ones that are already true. Then simply look at the first element of the vector. Bingo. One pass, right? I bet I've made some glaring error in process or efficiency. Nov 18 '06 #5

 P: n/a Wayne Marsh wrote: Nobody wrote: >I've been looking for a job for a while now, and have run into thisinterview question twice now... and have stupidly kind of blown ittwice... (although I've gotten better)... time to finally figure thisthing out...basically the interview question is: given an unsorted listed such as:3,1,3,7,1,2,4,4,3 Here's the first method that springs to mind for me: Create a vector of bools initialised to false the same size as the list, and then run through the list, setting the item in the vector at the position of the number encountered to true, and removing ones that are already true. Then simply look at the first element of the vector. Bingo. One pass, right? I bet I've made some glaring error in process or efficiency. This doesn't even make sense at all. Sorry. Nov 18 '06 #6

 P: n/a Nobody: given an unsorted listed such as: 3,1,3,7,1,2,4,4,3 find the FIRST UNIQUE number, in this case 7... and of course the list can be millions long... I've only recently begun reading a book on the C++ Standard Library, so I'm not very familiar with all the built-in iterator routines and so forth, so I can't provide an optimal solution. However, I could give you a inefficient solution that gets the job done by simply putting thought into it: (This code won't be Grade A because I'm only starting to get the hang of iterators and so forth. Expect mistakes.) #include template std::size_t QuantOccur(Iterator begin,Iterator const end,T const &obj) { size_t quant = 0; while (end!=begin) if (*begin++==obj) ++quant; return quant; } template Iterator FindFirstUnique(Iterator begin,Iterator const end) { for (;end!=begin;++begin) if (1==QuantOccur(begin,end,*begin)) break; return begin; } #include #include int main() { int arr[] = {3,1,3,7,1,2,4,4,3}; int i = *FindFirstUnique(arr,arr+sizeof arr/sizeof*arr); std::cout << i << std::endl; } For all I know, there could be a Standard Library facility that does this with a single line of code! -- Frederick Gotham Nov 18 '06 #7

 P: n/a Frederick Gotham: size_t quant = 0; Once again my retard compiler failed to spot the error. Nov 18 '06 #8

 P: n/a

 P: n/a "Wayne Marsh" Nobody wrote: >>I've been looking for a job for a while now, and have run into thisinterview question twice now... and have stupidly kind of blown ittwice... (although I've gotten better)... time to finally figure thisthing out...basically the interview question is: given an unsorted listed such as:3,1,3,7,1,2,4,4,3 Here's the first method that springs to mind for me:Create a vector of bools initialised to false the same size as the list,and then run through the list, setting the item in the vector at theposition of the number encountered to true, and removing ones that arealready true. Then simply look at the first element of the vector. Bingo.One pass, right?I bet I've made some glaring error in process or efficiency. This doesn't even make sense at all. Sorry. Probably not... you'd have to be able to map the bools to the items and the position which would start getting to the order of my original first pass 2*(n^2) algorithm. Same thing with a hash table, you'd need the original order somehow and you can't really remove the first and second item, because then when a 3rd item comes along, you wont know its a duplicate. Nov 18 '06 #10

 P: n/a "Frederick Gotham" given an unsorted listed such as:3,1,3,7,1,2,4,4,3find the FIRST UNIQUE number, in this case 7... and of course the listcan be millions long... I've only recently begun reading a book on the C++ Standard Library, so I'm not very familiar with all the built-in iterator routines and so forth, so I can't provide an optimal solution. However, I could give you a inefficient solution that gets the job done by simply putting thought into it: (This code won't be Grade A because I'm only starting to get the hang of iterators and so forth. Expect mistakes.) #include template std::size_t QuantOccur(Iterator begin,Iterator const end,T const &obj) { size_t quant = 0; while (end!=begin) if (*begin++==obj) ++quant; return quant; } template Iterator FindFirstUnique(Iterator begin,Iterator const end) { for (;end!=begin;++begin) if (1==QuantOccur(begin,end,*begin)) break; return begin; } #include #include int main() { int arr[] = {3,1,3,7,1,2,4,4,3}; int i = *FindFirstUnique(arr,arr+sizeof arr/sizeof*arr); std::cout << i << std::endl; } For all I know, there could be a Standard Library facility that does this with a single line of code! -- Frederick Gotham If I understand the algorithm, you go through the list and get the occurence count of each item, and stop on the first one where count = 1. Hmm... Big-O wise.. its something like for 0 .. n // loop through list for 0 .. n // get occurence count for outer loop So the runtime here appears to be n^2. Its much cleaner and simpler then my original 2*(n^2) algorithm.. Hehe.... no to sound picky here, but I didn't get the job because my algorithm was apperently not good enough... so I'm trying to nail this down further then even the AVL tree approach with O(n + n log n)... I'm wondering now if building an AVL tree is really n log n... since the outer n is constant, but the inner n is increasing... Nov 18 '06 #11

 P: n/a * Nobody: I've been looking for a job for a while now, and have run into this interview question twice now... and have stupidly kind of blown it twice... (although I've gotten better)... time to finally figure this thing out... basically the interview question is: given an unsorted listed such as: 3,1,3,7,1,2,4,4,3 find the FIRST UNIQUE number, in this case 7... and of course the list can be millions long... [snip] Any ideas? Yeah, post in [comp.programming], it's off-topic in clc++, and don't post questions that have better than 50% chance of being HOMEWORK. Hints: (1) you can't do it in less than linear time, (2) it's trivial to do it in linear time with linear memory consumption, (3) with more stringent conditions it becomes a challenge just to find out whether it's doable. Follow-ups to [comp.programming]. -- A: Because it messes up the order in which people normally read text. Q: Why is it such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? Nov 18 '06 #12

 P: n/a "Alf P. Steinbach" * Nobody: >I've been looking for a job for a while now, and have run into thisinterview question twice now... and have stupidly kind of blown ittwice... (although I've gotten better)... time to finally figure thisthing out...basically the interview question is: given an unsorted listed such as:3,1,3,7,1,2,4,4,3find the FIRST UNIQUE number, in this case 7... and of course the listcan be millions long... [snip] >Any ideas? Yeah, post in [comp.programming], it's off-topic in clc++, and don't post questions that have better than 50% chance of being HOMEWORK. Hints: (1) you can't do it in less than linear time, (2) it's trivial to do it in linear time with linear memory consumption, (3) with more stringent conditions it becomes a challenge just to find out whether it's doable. Follow-ups to [comp.programming]. Thanks, I've taken it there. By the way, I've been out of school for 12 years now ;). Nov 18 '06 #13

 P: n/a "Nobody"

 P: n/a "Wayne Marsh"

 P: n/a Nobody wrote: .... Any ideas? You could use an "index" sort and this would be O(n) - this only works if the elements are "indexable" i.e. a limited range otherwise you're limited to nlogn. i.e. using an index sort: a) create a "tally" vectorsetting first to true and second to each consecutive iterator of the element list b) create a "pointers" initialized to point to each element in flags c) use a special compare routine that compares two * pair if the (*second) are equal then set the first to false on both operands. d) scan through the "flags" vector looking for the first element with first as "true". This uses a property of sort algorithms (except for index sorts) that is tricky to describe but I'll give it a go. While performing any sort on any list of elements, any element that is duplicated must be compared to at least one other duplicated element. Hence if you instrument the compare operator of a sort algorithm, you can eliminate duplicate elements. The corollary is that removing duplicates from a list of elements need not be more complex that sorting. It's probably not any easier than sorting either. G Here is an implementation of the second algorithm: // ======== FirstUniqueDupeEliminator ================================= /** * Helper for FirstUnique(). * */ template < typename w_iterator > class FirstUniqueDupeEliminator { public: typedef std::pair< bool, w_iterator Element; std::vector< Element m_values; std::vector< Element * m_values_ptrs; w_iterator m_result; void Remove( Element * i_listiter ) { i_listiter->first = false; } bool operator()( Element * i_rhs, Element * i_lhs ) { // if we're pointing to the same element // we don't do anything special if ( i_lhs == i_rhs ) { return false; } if ( *(i_lhs->second) < *(i_rhs->second) ) { return true; } if ( *(i_rhs->second) < *(i_lhs->second) ) { return false; } // oops they're the same eliminate the elements from the list // Remove( i_rhs ); Remove( i_lhs ); return false; } FirstUniqueDupeEliminator( const w_iterator & i_begin, const w_iterator & i_end ) { if ( i_begin == i_end ) { m_result = i_end; return; } w_iterator l_tmp = i_begin; while ( l_tmp != i_end ) { m_values.push_back( Element( true, l_tmp ) ); ++ l_tmp; } m_values_ptrs.resize( m_values.size() ); for ( std::size_t i = 0; i < m_values.size(); ++i ) { m_values_ptrs[ i ] = & m_values[ i ]; } std::sort( m_values_ptrs.begin(), m_values_ptrs.end(), *this ); for ( std::size_t i = 0; i < m_values.size(); ++i ) { if ( m_values[ i ].first ) { m_result = m_values[ i ].second; return; } } m_result = i_end; } operator w_iterator() const { return m_result; } }; // ======== FirstUnique =============================================== /** * FirstUnique returns the first unique element in a container * * @param i_begin The first element in the continer * @param i_end The last element in the container * * @return The iterator to the first element in the * range that is unique or i_end if no elements are unique */ template < typename w_iterator > w_iterator FirstUnique( const w_iterator & i_begin, const w_iterator & i_end ) { return FirstUniqueDupeEliminator( i_begin, i_end ); } const int array[] = { 3,1,3,7,1,2,4,4,3 }; int main() { const int * l_val = FirstUnique( &array[0], at::ArrayEnd( array ) ); } Nov 19 '06 #16

 P: n/a "Andrew Koenig" Hehe... that was kind of the point of my question. In the interview, Igot it down to O(3n + n log n), later on at home I got it down to O(n + nlog n). I'm trying to see if there is a faster way. If someone told me that an algorithm was O(3n+n log n), I would be wondering whether that person really understood what big-O notations means, because O(3n+n log n) means exactly the same as O(n log n). Yeah, that was another one of my mistakes during the interview :(... I also stumbled on the fact that a quick sort was n log n and not log n... I asked a friend who is very good at algorithms, and he mistakenly blurted out log n as well til I pointed it out to him :). Honestly, I didn't really study up on algorithms per say, because I was going in for a user interface type position and focused on that... you live and learn I guess... Nov 19 '06 #17

 P: n/a "Gianni Mariani" Any ideas? You could use an "index" sort and this would be O(n) - this only works if the elements are "indexable" i.e. a limited range otherwise you're limited to nlogn. i.e. using an index sort: a) create a "tally" vectorsetting first to true and second to each consecutive iterator of the element list b) create a "pointers" initialized to point to each element in flags c) use a special compare routine that compares two * pair if the (*second) are equal then set the first to false on both operands. d) scan through the "flags" vector looking for the first element with first as "true". Thanks for the response and the code Gianni. I took the thread over to comp.lang.algorithms as a poster requested. The AVL tree solution is actually O (n log n) as well, but as per Alf, he pointed out that there is a trivial O (n) solution. I was totally stumped til someone mentioned the key word: hash table. basically the O (n) "trivial" solution was to loop through the list like so: for (0..n) { if (hash.Find(element[n]) is NOT found) hash[element[n]] = 1; else hash[element[n]] = hash[element[n]] + 1; } that'll build a hash table of the occurance count of each item in the array. then loop through like this to find the first unique element... for (0..n) { if (hash.Find(element[n]) == 1) FOUND IT } that ends up being O(2n) = O(n). no sorting or deletions needed :)... another poster on this thread actually came close to this solution using ref counts, but he basically looped through the list to count them O (n^2) for each element rather then build the table. hash tables would have solved in O(n) a few of the other questions they asked me as well. I obviously knew about hash tables going in, but didn't have it on the tip of my brain that operations were considered O(1). And I don't know why I only used a hash table on the "2nd pass" while they were grilling me... But now I'm starting to see the wide array of uses for them :). Nov 19 '06 #18

 P: n/a "Nobody" const size_t len = 9; int values[len] = {3,1,3,7,1,2,4,4,3}; int notuniquevalues[len]; // array to remember already found not uniques size_t ni = 0; // count of already found not unique values bool stillunique[len] = {true, true, true, true, true, true, true, true, true}; int compare_count = 0; bool IsUnique(int value, size_t pos, size_t len) { bool retval = false; if(pos == len) { retval = true; for(int i = 0; i < ni; i++) { compare_count++; if(value == notuniquevalues[i]) { retval = false; break; } } } else { if(stillunique[pos]) { compare_count++; if(value == values[pos]) { stillunique[pos] = false; notuniquevalues[ni++] = value; } else { retval = IsUnique(value, pos+1, len); } } else { retval = IsUnique(value, pos+1, len); } } return retval; } int main(int argc, char** argv) { size_t pos =0; while(pos < len) { if(stillunique[pos]) { if(IsUnique(values[pos], pos+1, len)) { cout << "Found first unique " << values[pos] << " at position " << pos << endl; cout << "Count of compares: " << compare_count << endl; break; } } pos++; } return 0; } Nov 28 '06 #19

### This discussion thread is closed

Replies have been disabled for this discussion.