473,412 Members | 4,127 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,412 software developers and data experts.

Looking for List comparison algorithm

I have a requirement to compare two lists, generating a third which is a
combination of both.

With the first list as the reference list and the second as the test
list, I want the each of the result list's elements to contain a moniker
denoting whether that element:

1..Exists only in the reference list

or

2..Exists in boths lists

or

3...Exists only in the test list

In particular, I would like to use the most efficient algorithm possible
using the least amount of iterations and the least amount of
comparisons.

Any ideas?

Ben Fidge
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 15 '05 #1
8 19775
Ben Fidge <be*******@btopenworld.com> wrote:
I have a requirement to compare two lists, generating a third which is a
combination of both.

With the first list as the reference list and the second as the test
list, I want the each of the result list's elements to contain a moniker
denoting whether that element:

1..Exists only in the reference list

or

2..Exists in boths lists

or

3...Exists only in the test list

In particular, I would like to use the most efficient algorithm possible
using the least amount of iterations and the least amount of
comparisons.

Any ideas?


Do the elements have any kind of natural ordering? Does the order of
the final list matter? How big are the lists? Can you put the lists
into maps (mapping each element to itself to make lookup easier)? Do
you know whether two elements which are in both lists will actually be
references to the same object, or will they just display object
equality (i.e. calling Object.Equals)?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #2
Hi Jon,

Thanks for your response. Here's my reply to your questions:

Q1 - "Do the elements have any kind of natural ordering?"
A1 - They are sorted alphabetically in ascending order by name.

Q2 - "Does the order of the final list matter?"
A2 - It would nice to have them sorted alphabetically at the end. This
can be done afterwards however.

Q3 - "How big are the lists?"
A3 - They could potentially contain a couple of thousand items each,
although several hundred is more likely.

Q4 - "Can you put the lists into maps (mapping each element to itself to
make lookup easier)?"
A4 - The ref and test lists are readonly collections of readonly
objects. The result list contains objects that reference either the ref
item, test item or both.

Q5 - "Do you know whether two elements which are in both lists will
actually be references to the same object, or will they just display
object equality (i.e. calling Object.Equals)?"
A5 - The objects in the ref and test lists are distinct and are compared
by the string names.

Thanks for your help,

Ben
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 15 '05 #3
Ben Fidge <be*******@btopenworld.com> wrote:
Thanks for your response. Here's my reply to your questions:

Q1 - "Do the elements have any kind of natural ordering?"
A1 - They are sorted alphabetically in ascending order by name.

Q2 - "Does the order of the final list matter?"
A2 - It would nice to have them sorted alphabetically at the end. This
can be done afterwards however.

Q3 - "How big are the lists?"
A3 - They could potentially contain a couple of thousand items each,
although several hundred is more likely.

Q4 - "Can you put the lists into maps (mapping each element to itself to
make lookup easier)?"
A4 - The ref and test lists are readonly collections of readonly
objects. The result list contains objects that reference either the ref
item, test item or both.

Q5 - "Do you know whether two elements which are in both lists will
actually be references to the same object, or will they just display
object equality (i.e. calling Object.Equals)?"
A5 - The objects in the ref and test lists are distinct and are compared
by the string names.


Right. The lists being sorted makes things a *lot* easier. Basically,
you keep to indexes - one into the reference list and one into the test
list.

If you think about the lists being like this:

Test Ref

A
B
C
D
E E
F
G
H

etc an algorithm should present itself - you fetch the first entry of
each list, and find which one is "lower" - then keep adding entries
from that list until you find one which is equal to or greater than the
first entry of the other list - at which point you swap, etc.

That's a little bit vague, but hopefully gives you some idea of how to
proceed. If not, let me know and I'll write up some code for you.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #4


Hi Jon,

Thanks for your response. Here's my reply to your questions:

Q1 - "Do the elements have any kind of natural ordering?"
A1 - They are sorted alphabetically in ascending order by name.

Q2 - "Does the order of the final list matter?"
A2 - It would nice to have them sorted alphabetically at the end. This
can be done afterwards however.

Q3 - "How big are the lists?"
A3 - They could potentially contain a couple of thousand items each,
although several hundred is more likely.

Q4 - "Can you put the lists into maps (mapping each element to itself to
make lookup easier)?"
A4 - The ref and test lists are readonly collections of readonly
objects. The result list contains objects that reference either the ref
item, test item or both.

Q5 - "Do you know whether two elements which are in both lists will
actually be references to the same object, or will they just display
object equality (i.e. calling Object.Equals)?"
A5 - The objects in the ref and test lists are distinct and are compared
by the string names.

Thanks for your help,

Ben
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 15 '05 #5
Jon,

Thanks a lot. That's very similar to what I concluded last night while
pondering on the subject. I didn't manage to get it to work before
giving up, so with your infinite wisdom at hand, I'll have another stab
tonight.

Thanks

Ben
*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 15 '05 #6
Jon,

I would be very interested to see how you approach this as my implementation
is returning duplicate entries in the result list. I found an example in
Java at the following site:

http://www.cs.sjsu.edu/faculty/fecte...ListMerge.html

with the actual Java code as follows:

http://www.cs.sjsu.edu/faculty/fecte...mainCS46B.html

If you have time, could I take you up on the offer of some sample code.

Many thanks,

Ben

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.c om...
Ben Fidge <be*******@btopenworld.com> wrote:
Thanks for your response. Here's my reply to your questions:

Q1 - "Do the elements have any kind of natural ordering?"
A1 - They are sorted alphabetically in ascending order by name.

Q2 - "Does the order of the final list matter?"
A2 - It would nice to have them sorted alphabetically at the end. This
can be done afterwards however.

Q3 - "How big are the lists?"
A3 - They could potentially contain a couple of thousand items each,
although several hundred is more likely.

Q4 - "Can you put the lists into maps (mapping each element to itself to
make lookup easier)?"
A4 - The ref and test lists are readonly collections of readonly
objects. The result list contains objects that reference either the ref
item, test item or both.

Q5 - "Do you know whether two elements which are in both lists will
actually be references to the same object, or will they just display
object equality (i.e. calling Object.Equals)?"
A5 - The objects in the ref and test lists are distinct and are compared
by the string names.


Right. The lists being sorted makes things a *lot* easier. Basically,
you keep to indexes - one into the reference list and one into the test
list.

If you think about the lists being like this:

Test Ref

A
B
C
D
E E
F
G
H

etc an algorithm should present itself - you fetch the first entry of
each list, and find which one is "lower" - then keep adding entries
from that list until you find one which is equal to or greater than the
first entry of the other list - at which point you swap, etc.

That's a little bit vague, but hopefully gives you some idea of how to
proceed. If not, let me know and I'll write up some code for you.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too

Nov 15 '05 #7
Ben Fidge <be*******@btopenworld.com> wrote:
I would be very interested to see how you approach this as my implementation
is returning duplicate entries in the result list. I found an example in
Java at the following site:

http://www.cs.sjsu.edu/faculty/fecte...ListMerge.html

with the actual Java code as follows:

http://www.cs.sjsu.edu/faculty/fecte...mainCS46B.html

If you have time, could I take you up on the offer of some sample code.


Sure. This example doesn't keep track of which element came from where, but
that's very easy to do by making each element of the new list an object
containing the original element and an enum value.

Let me know if you have any problems with the code below:

using System;
using System.Collections;

public class Test
{
public static void Main()
{
ArrayList a = new ArrayList (new object[]
{"ant", "attack", "bee",
"car", "frog",
"joke", "rage", "tree"});
ArrayList b = new ArrayList (new object[]
{"bar", "car", "ear",
"got", "joke", "smoke"});

IList merged = Merge (a, b);

foreach (string x in merged)
{
Console.WriteLine (x);
}
}

static IList Merge (IList a, IList b)
{
IList ret = new ArrayList();

IEnumerator enumA = a.GetEnumerator();
IEnumerator enumB = b.GetEnumerator();

enumA.Reset();
enumB.Reset();

bool aValid, bValid;
IComparable aElt = GetNext (enumA, out aValid);
IComparable bElt = GetNext (enumB, out bValid);

while (aValid && bValid)
{
int comp = aElt.CompareTo (bElt);
if (comp < 0)
{
ret.Add (aElt);
aElt = GetNext (enumA, out aValid);
}
else if (comp==0)
{
// No need to add both aElt and bElt,
// as they're the same
ret.Add (aElt);
aElt = GetNext (enumA, out aValid);
bElt = GetNext (enumB, out bValid);
}
else
{
ret.Add (bElt);
bElt = GetNext (enumB, out bValid);
}
}

while (aValid)
{
ret.Add (aElt);
aElt = GetNext (enumA, out aValid);
}

while (bValid)
{
ret.Add (bElt);
bElt = GetNext (enumB, out bValid);
}
return ret;
}

static IComparable GetNext (IEnumerator enumerator,
out bool valid)
{
valid = enumerator.MoveNext();
return valid ? (IComparable) enumerator.Current : null;
}
}

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #8
Jon,

Thanks a lot, that certainly clarified things for me. I made a few
tweaks however and managed to combine the two trailing while loops into
the main loop.

This was done by replacing the "and" in the loop termination condition
with an "or". Then having an if..if..else in the middle.

So..

iRefIndex = 1; // We are using COM collections
iRefCount = RefCollection.Count;
iTestIndex = 1;
iTestCount = TestCollection.Count;

while (iRefIndex <= iRefCount || iTestIndex <= iTestCount) {
if (iRefIndex > iRefCount) {
// We have an extra item in TestCollection
iTestIndex++;
}
else if (iTestIndex > iTestCount) {
// We have an extra item in RefCollection
iRefIndex++;
}
else {
// Do comparison logic here
string sRefItem = RefCollection.Item(iRefIndex);
string sTestItem = TestCollection.Item(iTestIndex);
int iCompare = sRefItem.CompareTo(sTestItem);

if (iCompare == 0) {
// We have a matching pair
iRefIndex++;
iTestIndex++;
}
else if (iCompare < 0) {
// We have added an item in RefCollection
iRefIndex++;
}
else {
// We have added an item in TestCollection
iTestIndex++
}
}
}

Simply really once it becomes clear. Thanks for your help Jon, please
remain on standby for my next algorithm 'angover!

Ben

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!
Nov 15 '05 #9

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

41
by: Xah Lee | last post by:
here's another interesting algorithmic exercise, again from part of a larger program in the previous series. Here's the original Perl documentation: =pod merge($pairings) takes a list of...
11
by: velthuijsen | last post by:
I tried taking a list and pass it through std::sort like the following: sort(Unsorted.begin(), Unsorted.end()); I got an error back stating that the list iterator doesn't have a binary...
39
by: n00m | last post by:
Given a list of N arbitrarily permutated integers from set {1..N}. Need to find the ordering numbers of each integer in the LONGEST increasing sequence to which this number belongs. Sample: ...
2
by: barnesc | last post by:
>barnesc at engr.orst.edu wrote: > > So my question is: are there any other *practical* applications of a > > B-tree based list/set/dict ? In other words, is this module totally > > worth coding,...
20
by: William Stacey [MVP] | last post by:
int list = {1,2,3,4,5,6}; Function to randomize the list? Cheers! -- William Stacey, MVP
3
by: 2obvious | last post by:
I have a DataGrid containing a TextBox control and a CustomValidator in each row. The CustomValidator fires a function that compares all TextBoxes for equality. The algorithm for comparison is...
4
by: Amit Bhatia | last post by:
User-Agent: OSXnews 2.081 Xref: number1.nntp.dca.giganews.com comp.lang.c++:824790 Hi, I have a list of objects (instantiations of class A) in a class B: class B{ //other stuff;...
15
by: desktop | last post by:
If I have a sorted std::list with 1.000.000 elements it takes 1.000.000 operations to find element with value = 1.000.000 (need to iterator through the whole list). In comparison, if I have a...
12
by: Godzilla | last post by:
Hello, I'm trying to find a way to convert an integer (8-bits long for starters) and converting them to a list, e.g.: num = 255 numList = with the first element of the list being the...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.