On Thu, 17 Jan 2008 18:44:44 -0800, Jesse McGrew <jm*****@gmail.com> wrote:
>>Why should a page file be slower than any other disk file?
>An in-memory collection whose contents are being paged to and from the
>disk by the OS will have worse performance than a collection designed
>to operate off the disk, as soon as you do any kind of search on it.
Why? There's no a priori reason to believe this is true, even though it's
true that _some_ in-memory collections may not be as efficient as a
disk-based database.
>A collection that's designed to run off the disk will probably have an
>indexing system so it doesn't have to load the entire file to find a
>single element. But searching through a massive memory-based
>collection will cause many pages to be swapped in, possibly causing
>other useful pages to be swapped out and lowering performance down the
>road.
You're assuming that the in-memory structure would not have a similar
indexing mechanism.
Now, I don't know the implementation of DataTable. But as a general
concept, there's absolutely no reason it couldn't be indexed in basically
the same way as a database. Conversely, if a database implements (for
example) an index as a simple sorted array that uses a binary search, it's
going to have the exact same liability that an in-memory structure paged
to the disk using the same indexing scheme would have.
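As a sketch of that general point (Python here, and every name in it is hypothetical, since neither of us has shown DataTable's actual internals): nothing stops an in-memory collection from keeping the same kind of compact sorted index a database keeps, so a lookup is O(log n) comparisons over a small key array rather than a scan of every row.

```python
import bisect

# Hypothetical in-memory "table": full rows plus a compact sorted key
# index, mirroring what a disk-based database's index buys you.
rows = [{"id": i * 2, "name": f"row{i}"} for i in range(100_000)]

# The index is just the keys, kept sorted. Rows here happen to be
# sorted by id already, so the row position is the index position.
keys = [r["id"] for r in rows]

def find(key):
    # Binary search over the compact key array -- O(log n) comparisons,
    # never touching the bulky row data until the final lookup.
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return rows[i]
    return None

print(find(1234)["name"])  # prints: row617
print(find(3))             # prints: None
```

Whether those O(log n) probes are cheap or expensive then depends on where the key array lives and how hot its pages are, not on whether the structure is labeled "in-memory" or "database".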
>For example, a binary search on a memory-based collection might end up
>having to load half the file into memory, one page at a time, while a
>disk-based collection could keep all the necessary indexing data in a
>single page that never gets swapped out.
If that's a concern, why wouldn't someone just keep a similar "index only"
data section in the in-memory implementation?
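Some back-of-envelope arithmetic shows why such a compact index section helps either way (all sizes below are assumed for illustration, not measured from any real DataTable or database):

```python
# Assumed sizes, for illustration only: a million rows, 256 bytes per
# full row, 8 bytes per key entry in a compact "index only" section.
N = 1_000_000
ROW_BYTES = 256
KEY_BYTES = 8

data_mb = N * ROW_BYTES / 2**20   # full row data
index_mb = N * KEY_BYTES / 2**20  # compact key index

print(f"data: {data_mb:.0f} MB, index: {index_mb:.1f} MB")
# prints: data: 244 MB, index: 7.6 MB
```

A ~8 MB index can stay resident (or stay in the OS page cache) even when 244 MB of row data can't, and that's true whether the rows live in a paged heap or in a database file.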
It seems to me that if all you know is that one implementation is a
disk-based database and the other is an in-memory data structure, that's
not nearly enough information to tell you which will perform better.
Pete