Dataset. Most efficient approach.

bob
Hi,
My app needs to read a text file, compare the data to that already
stored in the DB and generate a text file of the differences.

The UI displays the text file data and the DB data in a series of
DataGridViews.
Pick a row in the text file master table DGV and the other DGVs move
to the appropriate records so you can eyeball the differences before
generating the difference file.

I have used a dataset that contains two 'arms':
the text file arm and the database arm.
BindingSources are used to make the dataset relationships available
to the code.

Works OK, but when I feed it the full text file it takes a long time to
do the comparison work.
Profiling shows the most expensive part is moving through the dataset
with BindingSource.MoveNext.

So I figured that maybe using the ThreadPool to do the iteration and
comparison would speed things up.

Sort of:
while (position + 1 < bindingSource.Count)
{
    // assign all the BindingSources and the dataset to a helper class object 'u'
    ThreadPool.QueueUserWorkItem(new WaitCallback(MyComparisionFunction), u);
    MasterBindingSource.MoveNext();
    position++;
}

However, the comparison code fails when it tries to get one of the
data relationships back from one of the BindingSources: "Relation not
found".
IntelliSense indicates it is there, but you can't argue with the
executing code.
Maybe it is lost in the casting from object to the helper class.
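
A minimal sketch of walking the parent and child rows directly, without any BindingSource, which avoids both the MoveNext cost and any dependence on a shared "current" position. One plausible explanation for the failure (an assumption, not something confirmed here) is that every queued work item shares the same BindingSource objects while the loop keeps moving them; BindingSource is also a Windows Forms component and is not designed to be used from several threads at once. The table, column, and relation names below (FileMaster, FileDetail, MasterDetail) are hypothetical stand-ins for the real schema.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RowWalk
{
    // Hypothetical schema: FileMaster(Id, Name) 1..n FileDetail(MasterId, Value).
    public static DataSet BuildSample()
    {
        var ds = new DataSet();
        DataTable master = ds.Tables.Add("FileMaster");
        master.Columns.Add("Id", typeof(int));
        master.Columns.Add("Name", typeof(string));
        DataTable detail = ds.Tables.Add("FileDetail");
        detail.Columns.Add("MasterId", typeof(int));
        detail.Columns.Add("Value", typeof(string));
        ds.Relations.Add("MasterDetail", master.Columns["Id"], detail.Columns["MasterId"]);

        // Parents first, so the foreign key created by the relation is satisfied.
        master.Rows.Add(1, "alpha");
        master.Rows.Add(2, "beta");
        detail.Rows.Add(1, "a1");
        detail.Rows.Add(1, "a2");
        detail.Rows.Add(2, "b1");
        return ds;
    }

    // Visits every master row and fetches its children through the relation,
    // returning the child count per master Id.
    public static Dictionary<int, int> CountChildren(DataSet ds)
    {
        var counts = new Dictionary<int, int>();
        foreach (DataRow parent in ds.Tables["FileMaster"].Rows)
        {
            DataRow[] children = parent.GetChildRows("MasterDetail");
            counts[(int)parent["Id"]] = children.Length;
        }
        return counts;
    }
}
```

The comparison logic would go where the child count is taken; the BindingSources can stay bound to the grids purely for display.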

Anyway I started to get the feeling that maybe my whole approach is
inefficient.

So before I go completely off into the weeds what is the general
opinion on how to handle this task?

I am starting to think that two collections of widgets may be the
way to go. Then use tree views instead of DGVs.

textfile -> Collection A.
db -> Collection B (one B widget for each widget in A).
for (int i = 0; i < A.Count; i++)
{
    if (A.Items[i] != B.Items[i])
        AddToDiffFile(A.Items[i]);
}
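
A rough sketch of how that two-collection comparison might look in C#. Two details are worth noting: `!=` on class instances compares references unless the operator is overloaded, so value equality has to be defined explicitly, and matching by position only works if A and B are guaranteed to line up, so this version matches by key instead. The `Widget` type and its `Id` and `Name` fields are hypothetical, since the real record layout is not given.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record type standing in for one parsed line or one DB row.
public sealed class Widget : IEquatable<Widget>
{
    public int Id { get; }
    public string Name { get; }

    public Widget(int id, string name) { Id = id; Name = name; }

    // Value equality: two widgets are equal when all compared fields match.
    public bool Equals(Widget other)
    {
        return !ReferenceEquals(other, null) && Id == other.Id && Name == other.Name;
    }

    public override bool Equals(object obj) { return Equals(obj as Widget); }

    public override int GetHashCode()
    {
        return Id.GetHashCode() ^ (Name ?? "").GetHashCode();
    }
}

public static class Differ
{
    // Returns the file-side widgets that are missing from the DB side
    // or whose fields differ from the DB widget with the same Id.
    public static List<Widget> FindDifferences(
        IEnumerable<Widget> fromFile, IEnumerable<Widget> fromDb)
    {
        var dbById = new Dictionary<int, Widget>();
        foreach (Widget w in fromDb)
            dbById[w.Id] = w;

        var diffs = new List<Widget>();
        foreach (Widget w in fromFile)
        {
            Widget match;
            if (!dbById.TryGetValue(w.Id, out match) || !w.Equals(match))
                diffs.Add(w);
        }
        return diffs;
    }
}
```

The dictionary lookup keeps the whole pass linear in the number of records, which matters more than threading for a file with tens of thousands of parent rows.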

Thanks
Bob

Jul 6 '07 #1
Bob,

Are you using a database server for this? You might want to consider
uploading the contents of the text file into a temp table on the server and
then finding the differences using a query. I would say that it has the
potential of being much faster than doing all the parsing and comparison
on the client side.

This would especially be the case if the text file is a delimited or
positional file of some sort, as you can use the bulk loader on SQL Server
(if that is what you are using) to import the data.
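
As a rough illustration of the query side, the difference could be expressed as a set operation once the file rows sit in a staging table. The names used here (`#FileStaging`, `dbo.Widgets`, `Id`, `Name`) are hypothetical, and loading the staging table (for example with `SqlBulkCopy` or `BULK INSERT`) is left out. So that the sketch runs without a server, it only holds the SQL text and shows the same set difference in memory with LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetDiff
{
    // Hypothetical T-SQL: rows present in the staged file data
    // that have no identical row in the live table.
    public const string DiffQuery =
        "SELECT Id, Name FROM #FileStaging " +
        "EXCEPT " +
        "SELECT Id, Name FROM dbo.Widgets;";

    // In-memory analogue of the EXCEPT above, for illustration only.
    // Tuple<int, string> compares by value, so identical rows cancel out.
    public static List<Tuple<int, string>> InMemoryExcept(
        IEnumerable<Tuple<int, string>> staged,
        IEnumerable<Tuple<int, string>> live)
    {
        return staged.Except(live).ToList();
    }
}
```

On the server the same comparison runs inside the database engine, so only the differing rows need to travel back to the client.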
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com
Jul 6 '07 #2
bob

Hi Nicholas,
Good thought.
I'll give it a go.
Thanks
Bob

Jul 7 '07 #3
I've never found datasets to be very efficient in using resources. I
prefer to use a custom list like a List<T>, but there are better
generics such as DataBindingList<T> (but I think these are all custom).

I can't understand Nicholas' solution, as you specifically say the
differences need to be eyeballed by the client.

How big are these datasets anyway? You could do it entirely on the
client by downloading the data to JavaScript arrays or objects and
posting back only the differences using AJAX.
Jul 7 '07 #4
On Sat, 07 Jul 2007 08:55:34 GMT, mark4asp <ma******@gmail.com> wrote:
>Datagridviews
>javascript ... AJAX.
- Oops. Sorry. I didn't fully read your post. But I still don't like
using datasets.
Jul 7 '07 #5
bob
Hi Mark,
My take on Nicholas's solution is that the database engine is more
efficient at set-based comparisons than client code iterating through
the dataset. The result set could be displayed, giving an ordinary list
of the differences.
On the client side I too lean towards lists of custom objects.

This project started life as a quick and dirty proof of concept.
The main concern was to see if the resultant 'difference' files would
be accepted by another app that maintains the database further
downstream.

The idea of the client being able to view the differences was based on
the notion that there wouldn't be too many differences. It was a bit
of a shock to find a large disparity between the two sets.
So the point-and-click advantages of the linked DataGridViews, while
pretty, are not much use.
30,000 parent records, approx. 70% differences.

Thanks for your thoughts.
Regards
Bob
Jul 8 '07 #6
