Best way to compare two sets of data

I've got a situation where I have a set of data, and later take another
snapshot to obtain a second set of data. There will be one or more
changes in the second set of data and I need to be able to tell which
items from the first set are missing from the second set and which items
were added to the second set.

Can anyone recommend an algorithm for this, or a collection class in C#
that may be of help?

My first inclination is to take two arrays and just start looping
through them one at a time, but it seems like this is something that would
have to be done all the time and that there would be a more efficient
algorithm for doing so.

In the end, I'd like to be able to say that between that time and this
time, "x" and "y" were added, and "z" was removed.

Thanks!
Nov 17 '05 #1
Terry,

Is this stored in a DataSet? If it is, then you can use the GetChanges
method on the DataSet to create another DataSet that has only the changes
that have occurred since the last time AcceptChanges was called (or since
the DataSet was created).
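
A minimal sketch of that idea, assuming the items live in a DataTable with a string key column. The table name "Items", the column name "Key", and the class and method names are made up for illustration. Note that GetChanges reports edits made through the DataSet itself, so it fits best when the second snapshot is applied as adds and deletes to the first.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DataSetChangesSketch
{
    // Fills 'added' and 'removed' with the keys changed since AcceptChanges.
    public static void Run(List<string> added, List<string> removed)
    {
        DataSet ds = new DataSet();
        DataTable items = ds.Tables.Add("Items");          // hypothetical table
        items.Columns.Add("Key", typeof(string));          // hypothetical column
        items.PrimaryKey = new DataColumn[] { items.Columns["Key"] };

        // First snapshot: load the rows and mark them as the baseline.
        items.Rows.Add("a");
        items.Rows.Add("z");
        ds.AcceptChanges();

        // Later edits: "x" appears and "z" disappears.
        items.Rows.Add("x");
        items.Rows.Find("z").Delete();

        // GetChanges returns null when no row is in the requested state.
        DataSet addedSet = ds.GetChanges(DataRowState.Added);
        if (addedSet != null)
            foreach (DataRow row in addedSet.Tables["Items"].Rows)
                added.Add((string)row["Key"]);

        DataSet deletedSet = ds.GetChanges(DataRowState.Deleted);
        if (deletedSet != null)
            foreach (DataRow row in deletedSet.Tables["Items"].Rows)
                // Deleted rows only expose their original values.
                removed.Add((string)row["Key", DataRowVersion.Original]);
    }
}
```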

Hope this helps.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com
Nov 17 '05 #2
No. Currently the data is coming back as a simple array. These aren't
large sets of data (64 items max, and most of the time fewer than 20), so
I was thinking that "DataSet" would be kind of heavyweight for this.
I just noticed "SortedList" which may be useful.
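
As a rough sketch of why sorted keys could help: if both snapshots are sorted the same way and contain no duplicate keys, one merge-style pass finds the removals and additions. The names SortedDiff and Compare are hypothetical, and string keys with ordinal ordering are an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class SortedDiff
{
    // Walks two sorted, duplicate-free key arrays in a single pass.
    // Both arrays must be sorted with ordinal comparison beforehand.
    public static void Compare(string[] before, string[] after,
                               List<string> removed, List<string> added)
    {
        int i = 0, j = 0;
        while (i < before.Length && j < after.Length)
        {
            int c = string.CompareOrdinal(before[i], after[j]);
            if (c == 0) { i++; j++; }                  // key is in both snapshots
            else if (c < 0) removed.Add(before[i++]);  // only in the first snapshot
            else added.Add(after[j++]);                // only in the second snapshot
        }
        // Whatever is left over on either side has no counterpart.
        while (i < before.Length) removed.Add(before[i++]);
        while (j < after.Length) added.Add(after[j++]);
    }
}
```

Unsorted input could be prepared with Array.Sort(keys, StringComparer.Ordinal). For a few dozen items the difference from a nested loop is unlikely to matter much.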

Or is DataSet not as big as I'm assuming it is? By "big" I mean a lot
of overhead for a simple operation?

Terry
Nov 17 '05 #3
Terry,

Well, that's something you will have to decide for yourself (whether or
not it is too big). Obviously, there is going to be some overhead, but only
you can determine if that overhead is tolerable. For this kind of
functionality, I would say it is, but then again, I don't know the scope of
its use.

If you don't want to use a data set, then you could easily determine
which elements changed in between the two iterations. The problem stems
from how you want to indicate there was a change. For example, if the
element at index 14 was deleted, is everything else shifted down or not?
Does the position matter? What if the array has more than one element with
the same value in it?

These questions pile up pretty quickly, which is why I opt for the data
set =)
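
To illustrate the duplicate-value question only: if position does not matter but the same value can appear more than once, one option is to compare occurrence counts instead of membership. This is a sketch under that assumption, with hypothetical names (MultisetDiff, CountChanges) and string values.

```csharp
using System;
using System.Collections.Generic;

public static class MultisetDiff
{
    // Returns value -> (occurrences in 'after') minus (occurrences in 'before').
    // Positive means copies were added, negative means copies were removed.
    // Values whose count did not change are left out.
    public static Dictionary<string, int> CountChanges(
        IEnumerable<string> before, IEnumerable<string> after)
    {
        var delta = new Dictionary<string, int>();
        foreach (string s in after)
        {
            int n;
            delta.TryGetValue(s, out n);   // n stays 0 when s is not present yet
            delta[s] = n + 1;
        }
        foreach (string s in before)
        {
            int n;
            delta.TryGetValue(s, out n);
            delta[s] = n - 1;
        }

        // Collect the unchanged values first, then drop them,
        // so the dictionary is not modified while being enumerated.
        var unchanged = new List<string>();
        foreach (KeyValuePair<string, int> kv in delta)
            if (kv.Value == 0) unchanged.Add(kv.Key);
        foreach (string s in unchanged)
            delta.Remove(s);

        return delta;
    }
}
```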
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com
Nov 17 '05 #4
How's this for a solution? This might be good enough, unless anyone
sees a better way of handling this?

private void processDifferences(SortedList ar1, SortedList ar2)
{
    ArrayList itemsRemoved = new ArrayList();
    ArrayList itemsAdded = new ArrayList();
    foreach (DictionaryEntry de in ar1)
    {
        // If it's not in second set it was removed
        if (!ar2.ContainsKey(de.Key))
        {
            itemsRemoved.Add(de.Key);
        }
        else // Otherwise it is still there, remove it from ar2
        {
            ar2.Remove(de.Key);
        }

        // Everything that's left in ar2 was added
        itemsAdded.AddRange(ar2);
    }
}
Nov 17 '05 #5
Terry,

Can your lists have multiple entries of the same value? If so, then
that might not work (unless you don't care which duplicate value is
retained).
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com
Nov 17 '05 #7
No, they can't have duplicate values. The keys must be unique, and
since I'm checking for the existence of keys, wouldn't that take care of it?
Even if two values were the same, the keys would mean that they are
two different items. The values can really be anything, but the keys
are how things are tracked, so it should be ok (I think :-)

Also, I realized I need to move that "AddRange()" call outside of the
"foreach" loop. :-X
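
For reference, a sketch of how the method might look with that change applied. Beyond moving AddRange, it makes a few choices that are assumptions rather than anything stated in the thread: it works on a clone so the caller's second list is left alone, it adds only the keys (so both result lists hold keys), and it hands results back through caller-supplied lists. The class name SnapshotDiff is hypothetical.

```csharp
using System;
using System.Collections;

public static class SnapshotDiff
{
    // Compares two keyed snapshots and reports which keys disappeared
    // and which keys are new.
    public static void ProcessDifferences(SortedList first, SortedList second,
                                          ArrayList itemsRemoved, ArrayList itemsAdded)
    {
        // Work on a copy so the caller's second snapshot is not modified.
        SortedList remaining = (SortedList)second.Clone();

        foreach (DictionaryEntry de in first)
        {
            if (remaining.ContainsKey(de.Key))
            {
                // Still present: cross it off the list of candidates.
                remaining.Remove(de.Key);
            }
            else
            {
                // Not in the second snapshot, so it was removed.
                itemsRemoved.Add(de.Key);
            }
        }

        // Only after the loop: whatever was never crossed off was added.
        itemsAdded.AddRange(remaining.Keys);
    }
}
```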

Terry
Nov 17 '05 #8
