Bri <no*@here.com> wrote in news:fiZZf.9217$nf7.1185@pd7tw1no:
David W. Fenton wrote: Bri <no*@here.com> wrote in news:m6VZf.9870$gO.808@pd7tw3no:
So, to avoid this confusion I reran the tests with 10 loops for
all three methods on the smallest table using the unique long
field index: FindFirst (10) - 0.2500000
Query (10) - 0.0195313
Seek (10) - 0.0078125
Are you changing the searched-for value for each loop? Or just
searching for the same value 10 times? Or doing a .MoveFirst
before the find operation?
Each loop creates the recordset, finds the record (except the
query where the query does the finding), assigns the value to a
variable, closes and sets the recordset to nothing. I am only
doing multiple loops so that there is something to measure for
the Seek and the query, which can do a single loop in less than
the measurable resolution of the Single returned by Timer().
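(That loop-and-divide approach generalizes. Here's a minimal sketch of the same technique in Python rather than VBA, purely to illustrate amortizing sub-resolution operations over N iterations -- the function name and loop count are my own, not from the test code above:)

```python
# Loop-and-divide timing: when one operation finishes inside the
# timer's resolution, time N repetitions and divide by N.
import time

def time_per_call(fn, loops=10):
    start = time.perf_counter()
    for _ in range(loops):
        fn()
    return (time.perf_counter() - start) / loops

# usage: per_call = time_per_call(lambda: sum(range(1000)), loops=10)
```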
OK. I tried to look up your original code, and it was too far back
in the references chain for me to easily find it.
It does suggest to me even more strongly that perhaps there's an
issue with Access's data caching that's causing the FindFirst to be
so much more inefficient.
OK, after writing that, I finally just got around to doing a test on
my app that uses the persistent totals recordset. Here's what I
found out:
Recordset initialized: 24.112
First FindFirst: 1.655
Next FindFirst: 0.282
Next : 1.471
Next : 0.39
Next : 1.348
Next : 0.173
Next : 1.333
Next : 1.386
Next : 1.723
Next : 1.5
Next : 0.365
Next : 1.39
There is no pattern. My guess is that the value is proportional to
the distance traversed in the recordset, which is in FK order (while
the form calling the recordset navigation is in a different order;
the FK is a long, BTW).
I'm surprised it's taking this long (it's not enough to matter in
the UI for users, though -- other processes can take much longer,
e.g., the de-duping routines, which can be turned off though they
are on by default; I tested with them off, of course).
The recordset that's being navigated has 367,983 records in it
(summed on the foreign key that is the PK of the table that is the
base data source for the form it's being called from, though that
form is not in PK order, but in name order; that's irrelevant, of
course).
The table that the recordset summarizes has 531,488 records.
Each call to the navigation will attempt to find a matching record
with one FindFirst. If none are found, it stops and returns to the
original context. If a first record is found, it does a FindNext
until there are no more matches. On average, just eyeballing my test
data, there was only one matching record in most cases (0 or 2 or
more in a handful).
I can see no proportional relationship at all between the number of
records matched in the FindFirst/FindNext operations and the time
taken. There isn't any really clear relationship with location in
the recordset, either.
Just eyeballing it, jumping to a high FK number, then jumping to a
low FK number is no slower or faster than jumping between nearby FK
values. There is also no apparent speedup from caching, since
jumping to a record, then to another and back to the original can be
longer or shorter, in no obvious pattern.
So, I'm rather puzzled about all of this.
In a recordset with 368K records, the index *has* to be in use or
the search times would be much longer, and linear to the location in
the recordset. I can't really test without the index, as it would
require deleting RI on a production database. I'd do that if it were
a smaller file, but it's 328MB and I just don't want to muck around
with that.
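To put a rough number on why the index must be in play, compare comparison counts (a back-of-envelope sketch only, assuming a balanced-tree index and an average half-table scan without one -- not a measurement of Jet internals):

```python
# With ~368K records, an indexed lookup needs on the order of log2(N)
# comparisons, while an unindexed search averages about N/2.
# (Assumes a balanced-tree index; purely illustrative.)
import math

N = 367983
indexed = math.ceil(math.log2(N))   # ~19 comparisons
unindexed = N // 2                  # ~184K comparisons on average
print(indexed, unindexed)
```

Four orders of magnitude apart, which is why sub-second finds on a recordset this size all but prove the index is being used.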
Just for completeness, I'm running this on a Windows Terminal Server
with all data local in Access 2000 SR1 with SP8.
I realized that the data file hasn't been recently compacted,
either, so since it was Sunday afternoon and I knew I was the only
logged on, I compacted and re-ran the tests:
Initialize Recordset: 29.586
First FindFirst: 0.436
Next FindFirst: 1.631
Next : 0.184
Next : 1.336
Next : 0.179
Next : 1.378
Next : 1.703
Next : 0.095
Aha! A pattern *has* emerged. The lower FK values take less time,
which shows that the recordset pointer is being reset to the start,
so there's a linear relationship between the magnitude of the FK
value and the time it takes to find it. Values in the 500K range are
taking about 1.5 seconds, while values in the 100K range are taking
.5 seconds or less.
In any event, my results are not very consistent with yours, seems
to me.
But I think it does show that SEEK is more efficient in part because
it just repositions the record pointer, rather than always returning
to the beginning of the index and scanning through it.
Of course, that's exactly what the Help file says it does. D'oh!
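The difference can be modeled crudely: if FindFirst always restarts at the beginning of the index, its cost tracks the target's absolute position, while a Seek-style reposition tracks only the distance from the current pointer. A toy Python model of just those two cost functions (not DAO, and not Jet's actual implementation):

```python
# Toy model: a find that restarts at the start of the index costs
# roughly the target's position; repositioning from the current
# pointer (Seek-style) costs roughly the jump distance.

def cost_restart(index, target):
    """FindFirst-style: comparisons scanning from record 0."""
    return index.index(target) + 1

def cost_reposition(index, pointer_pos, target):
    """Seek-style: distance the pointer moves along the index."""
    return abs(index.index(target) - pointer_pos)

index = list(range(0, 600000, 10))   # a sorted "index" of FK values

# Restart cost grows with the magnitude of the FK value:
print(cost_restart(index, 521140))   # high FK -> expensive
print(cost_restart(index, 105360))   # low FK -> cheap

# Repositioning between two nearby high values stays cheap:
p = index.index(520610)
print(cost_reposition(index, p, 521270))
```

This matches the pattern in the timings above: high-FK jumps cost ~1.5 seconds with FindFirst regardless of where the pointer was, while nearby jumps via pointer movement are nearly free.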
I wonder if it would be more efficient to use FindNext, and if
NoMatch, then try FindPrevious? Statistically speaking, surely this
would reduce the distance traversed by an average of 50% (I think),
assuming there is no performance penalty for checking .NoMatch.
Instead of:
rs.FindFirst strCriteria
If rs.NoMatch Then GoTo ExitHere
' ...then rs.FindNext strCriteria in a loop until NoMatch
I would instead do this:
rs.FindNext strCriteria
If rs.NoMatch Then
    rs.FindPrevious strCriteria
    If rs.NoMatch Then GoTo ExitHere
End If
' ...then rs.FindNext strCriteria in a loop until NoMatch
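As a sanity check on that 50% guess: if search cost is proportional to index distance traversed and the searched-for values are uniformly random, the expected saving works out to about one third, not half (E|X-Y| = N/3 from the current position, versus E[X] = N/2 from a restart). A quick simulation, purely as a model under those two assumptions:

```python
# Assumptions (mine, for illustration): (1) search cost is
# proportional to index distance traversed; (2) targets are
# uniformly random over the recordset. Then the average restart
# cost is N/2 and the average from-current-position cost is N/3 --
# about a 1/3 saving, not 50%.
import random

random.seed(1)
N = 368000            # roughly the size of the summary recordset
trials = 200000

restart = 0.0         # FindFirst: always traverse from record 0
from_cur = 0.0        # FindNext/FindPrevious: traverse from current pos
pos = 0
for _ in range(trials):
    target = random.randrange(N)
    restart += target
    from_cur += abs(target - pos)
    pos = target

print(round(restart / trials))    # ~ N/2
print(round(from_cur / trials))   # ~ N/3
```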
I was going to quit working on this at this point, but now I've just
got to test this!
Well, here are the results avoiding FindFirst -- it's actually quite
encouraging (though I'm not sure where the negative result is coming
from -- an inaccuracy in GetTickCount?):
FK      Amt (Count)   FindNext/FindPrevious   FindFirst
237552 600 (9) 0.39 0.398
521142 150 (1) 1.128 1.389
191276 1,000 (1) 1.553 0.202
521538 25 (1) 1.203 1.334
520956 75 (2) 0.359 1.394
593669 100 (1) 0.329 1.766
579508 100 (1) 0.036 1.767
105368 0 (0) 1.796 2.208
520611 20 (2) 0.375 1.464
521274 70 (3) 0.002 1.474
522271 50 (1) -0.012 1.468
574826 100 (1) 0.239 1.724
173916 700 (2) 1.793 0.098
148957 0 (0) 1.826 1.82
190670 2,000 (2) 0.077 0.165
177070 600 (2) 1.801 0.109
216224 400 (2) 0.147 0.256
179895 150 (1) 1.757 0.116
563124 1,000 (1) 1.618 1.67
522265 25 (1) 0.337 1.48
Totals: 16.754 22.302
Averages: 1.595619 2.124
So it looks like about a 25% reduction in the total time (16.75 vs.
22.3 seconds), simply because there's a linear relationship between
position in the index being searched and the amount of time the
search takes.
[Note that the COUNT column does not indicate the number of
recordset repositionings: any single record of the summary
recordset might be a total of more than one record from the source
table, because the summary recordset is grouped on the FK given and
subgrouped on a second foreign key. So for any DonorID (the FK
that's being navigated) there is a subgroup on FundID, and that
subgroup can have one or more records totalled in it even though
it is only one record in the summary query. This has some effect on
the timings listed here, since any subsequent records are jumped to
with FindNext. But as you can see, there is no apparent difference
between the cases with 0 matching records and any other count, so I
don't expect this has any major effect on the elapsed time --
nearby-record navigation is very fast (e.g., the jump from 520611 to
521274 took .002 seconds), and for adjacent records the time would
surely be less than a millisecond. I could rewrite my code to test
this, though, if anyone thinks it's a major issue that makes my
results too flawed to be dependable.]
The only time the FindNext/FindPrevious approach takes longer is
when you need to fall back to FindPrevious. This could be optimized
by caching the previously searched value and choosing FindFirst or
FindNext based on whether the newly requested value is less or
greater than the previous one.
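That caching idea can be sketched with a toy model (Python, with a fake list-backed recordset standing in for DAO -- every name here is illustrative, not a real API, and scan cost is just a comparison count standing in for time):

```python
# Sketch of the "cache the last searched value" optimization:
# choose a restart (FindFirst-style) only when the new target is
# behind the cached one; otherwise continue forward (FindNext-style).

class FakeRecordset:
    def __init__(self, keys):
        self.keys = sorted(keys)
        self.pos = 0
        self.no_match = False
        self.cost = 0            # comparisons made, standing in for time

    def find_first(self, key):
        self.pos = 0             # restart at the beginning of the index
        self._scan_forward(key)

    def find_next(self, key):
        self.pos += 1            # continue from the current pointer
        self._scan_forward(key)

    def _scan_forward(self, key):
        while self.pos < len(self.keys):
            self.cost += 1
            if self.keys[self.pos] == key:
                self.no_match = False
                return
            self.pos += 1
        self.no_match = True

class Navigator:
    """Caches the last key and picks find_first or find_next."""
    def __init__(self, rs):
        self.rs = rs
        self.last_key = None

    def jump_to(self, key):
        if self.last_key is not None and key > self.last_key:
            self.rs.find_next(key)    # target is ahead of the pointer
        else:
            self.rs.find_first(key)   # behind (or unknown): restart
        self.last_key = key
        return not self.rs.no_match

targets = (100, 600, 650, 50, 900)

rs = FakeRecordset(range(1000))
nav = Navigator(rs)
for k in targets:
    nav.jump_to(k)
conditional_cost = rs.cost

rs2 = FakeRecordset(range(1000))
for k in targets:
    rs2.find_first(k)
always_first_cost = rs2.cost

print(conditional_cost, always_first_cost)  # conditional wins
```

In this model the conditional strategy only pays the restart penalty on the one backward jump (to 50), which is the same shape as the Second Test results below.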
Well, it turns out doing that results in a fairly impressive
performance improvement (the test results from above are repeated in
the columns under FIRST TEST):
                        |---First Test----|  |---------Second Test----------|
FK     Amount (Count)   FindNext/Prev  FindFirst   FindNext/Prev  Conditional  FindFirst
237552 600 (9) 0.39 0.398 0.405 0.412 0.4
521142 150 (1) 1.128 1.389 1.038 1.035 1.557
191276 1,000 (1) 1.553 0.202 1.555 1.209 0.202
521538 25 (1) 1.203 1.334 1.242 1.239 1.312
520956 75 (2) 0.359 1.394 0.351 0.014 1.377
593669 100 (1) 0.329 1.766 0.301 0.336 1.729
579508 100 (1) 0.036 1.767 0.047 0.039 1.716
105368 0 (0) 1.796 2.208 1.806 1.784 1.813
520611 20 (2) 0.375 1.464 0.384 0.315 1.454
521274 70 (3) 0.002 1.474 0.006 0.03 1.506
522271 50 (1) -0.012 1.468 0.006 -0.014 1.448
574826 100 (1) 0.239 1.724 0.249 0.254 1.78
173916 700 (2) 1.793 0.098 1.78 1.727 0.078
148957 0 (0) 1.826 1.82 1.835 0.089 1.808
190670 2,000 (2) 0.077 0.165 0.094 0.064 0.164
177070 600 (2) 1.801 0.109 1.737 0.067 0.103
216224 400 (2) 0.147 0.256 0.564 0.128 0.248
179895 150 (1) 1.757 0.116 1.748 0.139 0.141
563124 1,000 (1) 1.618 1.67 1.609 1.596 1.658
522265 25 (1) 0.337 1.48 0.343 0.224 1.472
Totals: 16.754 22.302 17.1 10.687 21.966
Averages: 1.595619 2.124 1.628571 1.0178095 2.092
As you see, FindFirst took around 22 seconds total for these 20
navigation operations, whereas conditionally choosing which to use
based on the previous value cut that in half, almost exactly, at
just under 11 seconds.
Well, this has been quite an eye opener.
Obviously, if SEEK were an option, I'd use it, but since this
particular operation is on a summary query, SEEK wouldn't be
available. But it does show that you can definitely get significant
improvement out of the DAO Find operations by writing your code
carefully.
--
David W. Fenton
http://www.dfenton.com/
usenet at dfenton dot com
http://www.dfenton.com/DFA/