Dataset. Most efficient approach.

bob
Hi,
My app needs to read a text file, compare the data to that already
stored in the DB and generate a text file of the differences.

The UI displays the text file data and the db data in a series of
Datagridviews.
Pick a row in the text file master table dgv and the other dgvs move
to the appropriate records so you can eyeball the differences before
generating the difference file.

I have used a dataset that contains two 'arms':
the text file arm and the database arm.
BindingSources are used to make the dataset relationships available
to the code.

Works OK but when I feed it the full text file it takes a long time to
do the comparison work.
Profiling shows the most expensive part is moving through the dataset
with bindingSource.MoveNext().

So I figured that maybe using the threadpool to do the iteration and
comparison would speed things up.

Sort of:
while (position + 1 < bindingSource.Count)
{
    // assign all the binding sources and the dataset to a helper class object 'u'
    ThreadPool.QueueUserWorkItem(new WaitCallback(MyComparisionFunction), u);
    MasterBindingSource.MoveNext();
    position++;
}

However, the comparison code fails when it tries to get one of the
data relationships back from one of the binding sources: "Relation not
found".
IntelliSense indicates it is there, but you can't argue with the
executing code.
Maybe it is lost in the cast from object to the helper class.
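
The loop above hands every work item the same binding sources and then calls MoveNext() while earlier items may still be running, so each item can see a different current row from the one it was queued for. One way around that, sketched here under assumptions, is to copy the values a comparison needs into a small immutable object before queuing it. `RowPair`, `ParallelCompare` and the single string columns are hypothetical stand-ins for the real tables, and the sketch does not claim to explain the "Relation not found" error; it only shows work items that never touch a BindingSource or DataSet.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical immutable snapshot of one text-file row and its database counterpart.
public sealed class RowPair
{
    public readonly string Key;
    public readonly string FileValue;
    public readonly string DbValue;

    public RowPair(string key, string fileValue, string dbValue)
    {
        Key = key;
        FileValue = fileValue;
        DbValue = dbValue;
    }
}

public static class ParallelCompare
{
    // Compares every pair on the thread pool and returns the keys that differ.
    public static List<string> FindDifferences(IList<RowPair> pairs)
    {
        var differences = new List<string>();
        if (pairs.Count == 0) return differences;

        using (var done = new CountdownEvent(pairs.Count))
        {
            foreach (RowPair pair in pairs)
            {
                // Each work item receives its own snapshot, so nothing it reads
                // changes while the loop moves on to the next row.
                ThreadPool.QueueUserWorkItem(state =>
                {
                    var p = (RowPair)state;
                    try
                    {
                        if (!string.Equals(p.FileValue, p.DbValue, StringComparison.Ordinal))
                        {
                            // The result list is shared, so guard it with a lock.
                            lock (differences) { differences.Add(p.Key); }
                        }
                    }
                    finally
                    {
                        done.Signal();
                    }
                }, pair);
            }
            // Block until every queued comparison has signalled completion.
            done.Wait();
        }

        differences.Sort(StringComparer.Ordinal);
        return differences;
    }
}
```

For a comparison as cheap as a string test, the queuing overhead may well outweigh any gain, so it is worth profiling a plain loop over the same snapshots first.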

Anyway I started to get the feeling that maybe my whole approach is
inefficient.

So before I go completely off into the weeds what is the general
opinion on how to handle this task?

I am starting to think that two collections of widgets may be the
way to go, then use tree views instead of dgvs.

textfile -Collection A.
db -Collection B. (one B widget for each widget in A)
for (int i = 0; i < A.Count; i++)
{
    if (A.Items[i] != B.Items[i])
        AddtoDiffFile(A.Items[i]);
}
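
Two things are worth noting about the loop above. It relies on A and B being in the same order with one B widget per A widget, and in C# `!=` on two class instances compares references unless the operator is overloaded, so two widgets holding identical data would still count as different. A minimal sketch that matches widgets by key and compares their data instead, assuming each widget can be reduced to a key and a data string (`Widget`, `Key`, `Data` and `WidgetDiff` are hypothetical names):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type; Key and Data stand in for the real columns.
public sealed class Widget
{
    public string Key { get; }
    public string Data { get; }

    public Widget(string key, string data)
    {
        Key = key;
        Data = data;
    }
}

public static class WidgetDiff
{
    // Returns the file widgets that are missing from the database list
    // or whose data differs from the database widget with the same key.
    public static List<Widget> FindDifferences(IEnumerable<Widget> fromFile, IEnumerable<Widget> fromDb)
    {
        // Index the database widgets once so each lookup is a hash probe
        // rather than a scan or a positional match.
        var dbByKey = new Dictionary<string, Widget>(StringComparer.Ordinal);
        foreach (Widget w in fromDb) dbByKey[w.Key] = w;

        var differences = new List<Widget>();
        foreach (Widget w in fromFile)
        {
            Widget match;
            if (!dbByKey.TryGetValue(w.Key, out match) || match.Data != w.Data)
                differences.Add(w);
        }
        return differences;
    }
}
```

The result keeps the order of the text-file collection, so it can be written straight to the difference file.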

Thanks
Bob

Jul 6 '07 #1
Bob,

Are you using a database server for this? You might want to consider
uploading the contents of the text file into a temp table on the server and
then finding the differences using a query. I would say that it has the
potential to be much faster than doing all the parsing and comparison
on the client side.

This would especially be the case if the text file is a delimited or
positional file of some sort, as you can use the bulk loader on SQL Server
(if that is what you are using) to import the data.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com
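
A rough sketch of the query half of this suggestion, assuming the text file has already been loaded into a staging table (on SQL Server that could be done with BULK INSERT, the bcp utility, or the SqlBulkCopy class). The table and column names `StagingRecords`, `LiveRecords`, `RecordKey` and `RecordData` are made up for illustration, as is the `ServerSideDiff` class; the code uses only the provider-neutral `System.Data.Common` types and expects an already opened connection.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;

public static class ServerSideDiff
{
    // Hypothetical table and column names. EXCEPT returns the staging rows
    // that have no identical row in the live table.
    public const string DifferenceQuery =
        "SELECT RecordKey, RecordData FROM StagingRecords " +
        "EXCEPT " +
        "SELECT RecordKey, RecordData FROM LiveRecords";

    // Runs the query on an already opened connection and returns one
    // tab-separated line per differing row.
    public static List<string> ReadDifferences(DbConnection connection)
    {
        var lines = new List<string>();
        using (DbCommand command = connection.CreateCommand())
        {
            command.CommandText = DifferenceQuery;
            using (DbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    lines.Add(reader.GetValue(0) + "\t" + reader.GetValue(1));
                }
            }
        }
        return lines;
    }
}
```

The returned lines could be written directly to the difference file, or bound to a single grid if the differences still need to be reviewed by eye.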

Jul 6 '07 #2
bob

Hi Nicholas,
Good thought.
I'll give it a go.
Thanks
Bob

Jul 7 '07 #3
I've never found datasets to be very efficient in using resources. I
prefer to use a custom list like a List<T>, but there are better
generics such as DataBindingList<T> (but I think these are all custom).

I can't understand Nicholas' solution as you specifically say the
differences need to be eye-balled by the client.

How big are these datasets anyway? You could do it entirely on the
client by downloading the data to javascript arrays or objects and
posting back only the differences using AJAX.
Jul 7 '07 #4
On Sat, 07 Jul 2007 08:55:34 GMT, mark4asp <ma******@gmail.com> wrote:
> Datagridviews
> javascript ... AJAX.
- Oops. Sorry. I didn't fully read your post. But I still don't like
using datasets.
Jul 7 '07 #5
bob
Hi Mark,
My take on Nicholas's solution is that the database engine is more
efficient at set-based comparisons than client code iterating through
the dataset. The result set could be displayed, giving an ordinary list
of the differences.
On the client side I too lean towards lists of custom objects.

This project started life as a quick and dirty proof of concept.
The main concern was to see if the resultant 'difference' files would
be accepted by another app that maintains the database further
downstream.

The idea of the client being able to view the differences was based on
the notion that there wouldn't be too many differences. It was a bit
of a shock to find a large disparity between the two sets.
So the point-and-click advantages of the linked Datagridviews, while
pretty, are not much use:
30000 parent records, approx 70% differences.

Thanks for your thoughts.
Regards
Bob
Jul 8 '07 #6
