Best way to compare two sets of data

I've got a situation where I have a set of data, and later take another
snapshot to obtain a second set of data. There will be one or more
changes in the second set, and I need to be able to tell which items
from the first set are missing from the second set and which items
were added to the second set.

Can anyone recommend an algorithm for this, or a collection class in C#
that may be of help?

My first inclination is to take two arrays and just start looping
through them one at a time, but it seems like this is something that
would have to be done all the time and that there would be a more
efficient algorithm for doing so.

In the end, I'd like to be able to say that between that time and this
time, "x" and "y" were added, and "z" was removed.
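
For readers on later framework versions, one way to get exactly that "added" and "removed" answer is a pair of set differences. The sketch below is only an illustration: it assumes the items are unique and comparable by value, it relies on HashSet<T> (which arrived in .NET 3.5, after this thread was written), and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class SnapshotDiff
{
    // Hypothetical helper: reports which items exist only in the second
    // snapshot (added) and which exist only in the first (removed).
    public static void Compare<T>(IEnumerable<T> before, IEnumerable<T> after,
                                  out HashSet<T> added, out HashSet<T> removed)
    {
        added = new HashSet<T>(after);
        added.ExceptWith(before);    // keep what is in the second snapshot only

        removed = new HashSet<T>(before);
        removed.ExceptWith(after);   // keep what is in the first snapshot only
    }
}
```

Both input sequences are left untouched, and each lookup is a hash probe rather than a scan, although for sets of a few dozen items a plain nested loop would likely be just as acceptable.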

Thanks!
Nov 17 '05 #1
Terry,

Is this stored in a DataSet? If it is, then you can use the GetChanges
method on the DataSet to create another DataSet that has only the changes
that have occurred since the last time AcceptChanges was called (or since
the DataSet was created).
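
A rough sketch of what that could look like follows. It uses a single DataTable rather than a whole DataSet to stay short, and the one-column schema, the "Key" column name, and the helper names are all assumptions made for illustration. Note that GetChanges only sees edits made through the table itself, so the second snapshot would have to be applied to the table as adds and deletes before it could be read back this way.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DataSetDiffSketch
{
    // Builds a one-column table keyed on "Key" (hypothetical schema) and
    // marks its contents as the accepted baseline.
    public static DataTable BuildTable(IEnumerable<string> keys)
    {
        DataTable table = new DataTable("Items");
        DataColumn keyColumn = table.Columns.Add("Key", typeof(string));
        table.PrimaryKey = new DataColumn[] { keyColumn };
        foreach (string key in keys)
            table.Rows.Add(new object[] { key });
        table.AcceptChanges();
        return table;
    }

    // Lists the keys of rows in the given state (Added or Deleted) since
    // the last AcceptChanges call.
    public static List<string> KeysInState(DataTable table, DataRowState state)
    {
        List<string> keys = new List<string>();
        DataTable changes = table.GetChanges(state); // null when nothing matches
        if (changes == null)
            return keys;

        // Deleted rows only expose their original values.
        DataRowVersion version = state == DataRowState.Deleted
            ? DataRowVersion.Original
            : DataRowVersion.Current;
        foreach (DataRow row in changes.Rows)
            keys.Add((string)row["Key", version]);
        return keys;
    }
}
```

After building the table from the first snapshot, adding a row for "x" and calling Delete() on the row found for "z" would make KeysInState report "x" under DataRowState.Added and "z" under DataRowState.Deleted.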

Hope this helps.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Nov 17 '05 #2
No. Currently the data is coming back as a simple array. These aren't
large sets of data (64 items at most, and usually fewer than 20), so I
was thinking that a DataSet would be rather heavyweight for this.
I just noticed SortedList, which may be useful.

Or is DataSet not as big as I'm assuming it is? By "big" I mean a lot
of overhead for a simple operation.

Terry
Nov 17 '05 #3
Terry,

Well, that's something you will have to decide for yourself (whether or
not it is too big). Obviously, there is going to be some overhead, but only
you can determine whether that overhead is tolerable. For this kind of
functionality, I would say it is, but then again, I don't know the scope of
its use.

If you don't want to use a data set, then you could easily determine
which elements changed in between the two iterations. The problem stems
from how you want to indicate there was a change. For example, if the
element at index 14 was deleted, is everything else shifted down or not?
Does the position matter? What if the array has more than one element with
the same value in it?
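
To make the duplicate-value question concrete, here is one possible sketch that ignores position and compares how many times each value occurs in the two arrays. The class and method names are hypothetical, and it assumes the values are non-null and usable as dictionary keys.

```csharp
using System;
using System.Collections.Generic;

public static class MultisetDiff
{
    // Maps each value to (occurrences in after) minus (occurrences in before).
    // Values whose count did not change are left out of the result.
    public static Dictionary<T, int> CountDelta<T>(IEnumerable<T> before, IEnumerable<T> after)
    {
        Dictionary<T, int> counts = new Dictionary<T, int>();
        foreach (T item in after) Bump(counts, item, +1);
        foreach (T item in before) Bump(counts, item, -1);

        Dictionary<T, int> delta = new Dictionary<T, int>();
        foreach (KeyValuePair<T, int> pair in counts)
        {
            if (pair.Value != 0)
                delta[pair.Key] = pair.Value;
        }
        return delta;
    }

    private static void Bump<T>(Dictionary<T, int> counts, T item, int step)
    {
        int current;
        counts.TryGetValue(item, out current); // current is 0 when item is new
        counts[item] = current + step;
    }
}
```

A positive entry means that many copies were added and a negative entry means that many were removed. Whether a shifted position should also count as a change is a separate decision that this sketch does not address.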

These questions pile up pretty quickly, which is why I opt for the data
set =)
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Nov 17 '05 #4
How's this for a solution? This might be good enough, unless anyone
sees a better way of handling this?

    private void processDifferences(SortedList ar1, SortedList ar2)
    {
        ArrayList itemsRemoved = new ArrayList();
        ArrayList itemsAdded = new ArrayList();
        foreach (DictionaryEntry de in ar1)
        {
            // If it's not in second set it was removed
            if (!ar2.ContainsKey(de.Key))
            {
                itemsRemoved.Add(de.Key);
            }
            else // Otherwise it is still there, remove it from ar2
            {
                ar2.Remove(de.Key);
            }

            // Everything that's left in ar2 was added
            itemsAdded.AddRange(ar2);
        }
    }
Nov 17 '05 #5
Terry,

Can your lists have multiple entries of the same value? If so, then
that might not work (unless you don't care which duplicate value is
retained).
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Nov 17 '05 #7
No, they can't have duplicate values. The keys must be unique, and
since I'm checking for the existence of keys, wouldn't that take care of it?
Even if two values were the same, the keys would mean that they are
two different items. The values can really be anything, but the keys
are how things are tracked, so it should be OK (I think :-)

Also, I realized I need to move that AddRange() call outside of the
foreach loop. :-X

Terry
Nov 17 '05 #8
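
For reference, the fix Terry mentions in his last post might look roughly like the sketch below. Beyond moving the AddRange call out of the loop, it makes three changes that are assumptions rather than anything stated in the thread: it works on a copy so the caller's second list is not emptied, it collects keys on both sides instead of whole DictionaryEntry values, and it takes the result lists as parameters so the caller can read them.

```csharp
using System;
using System.Collections;

public static class SortedListDiff
{
    // Fills itemsRemoved and itemsAdded with the keys that differ between
    // the first and second snapshot. Neither input list is modified.
    public static void ProcessDifferences(SortedList first, SortedList second,
                                          ArrayList itemsRemoved, ArrayList itemsAdded)
    {
        // Shallow copy, so removing matched keys does not touch the caller's list.
        SortedList remaining = (SortedList)second.Clone();

        foreach (DictionaryEntry de in first)
        {
            if (remaining.ContainsKey(de.Key))
                remaining.Remove(de.Key);   // still present, so not a change
            else
                itemsRemoved.Add(de.Key);   // missing from the second set
        }

        // Only after the loop: whatever is left exists in the second set alone.
        itemsAdded.AddRange(remaining.Keys);
    }
}
```

With the first list keyed on "a" and "z" and the second on "a", "x" and "y", this would report "z" as removed and "x" and "y" as added.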