Caching Queries in DataTable

jehugaleahsa

Hello:

I am writing a cute little class that will cache queries against a
database. Currently, I am implementing this by storing the command
text, parameter values and generated DataRows.

First of all, is there something that does this already?

Second, right now I use a struct called Result that looks like this:

private struct Result
{
public object[] ParameterValues;
public DataRow[] Rows;
}

I am looking up past queries by storing the queries in a
Dictionary<string, List<Result>>, where the string key is the command
text and the List<Resultvalue is a collection of all the parameter
values and the generated DataRows that have been queried thus far.

Is there a way to improve the performance of this search? I am looking
for ideas. Currently, I have to loop though all the Results in the
List<Resultand compare against each of the parameter values. Is
there a way to optimize this algorithm while keeping things generic? I
can't even go against a primary key, since this class is meant to work
with anything.

Essentially, this class is worthless if it takes longer to find the
cached results than it would be to just rehit the database.

Thanks,
Travis

Aug 14 '08 #1

Subscribe Post Reply

3610

Pavel Minaev

On Aug 14, 7:32*am, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

Hello:

I am writing a cute little class that will cache queries against a
database. Currently, I am implementing this by storing the command
text, parameter values and generated DataRows.

First of all, is there something that does this already?

If you're doing it solely for performance reasons, then yes. The
database itself is likely to cache query results.

Second, right now I use a struct called Result that looks like this:

private struct Result
{
* * public object[] ParameterValues;
* * public DataRow[] Rows;

}

I am looking up past queries by storing the queries in a
Dictionary<string, List<Result>>, where the string key is the command
text and the List<Resultvalue is a collection of all the parameter
values and the generated DataRows that have been queried thus far.

Is there a way to improve the performance of this search? I am looking
for ideas. Currently, I have to loop though all the Results in the
List<Resultand compare against each of the parameter values. Is
there a way to optimize this algorithm while keeping things generic? I
can't even go against a primary key, since this class is meant to work
with anything.

Since you're effectively looking up by CommandText and Parameters, why
not aggregate them together into a single class, and use that as a key
of your Dictionary? Then your search will be just a single hashtable
lookup, without the following linear scan (though GetHashCode() would
be somewhat slower).

Aug 14 '08 #2

Ignacio Machin ( .NET/ C# MVP )

On Aug 13, 11:32*pm, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

Hello:

I am writing a cute little class that will cache queries against a
database. Currently, I am implementing this by storing the command
text, parameter values and generated DataRows.

First of all, is there something that does this already?

Second, right now I use a struct called Result that looks like this:

private struct Result
{
* * public object[] ParameterValues;
* * public DataRow[] Rows;

}

I am looking up past queries by storing the queries in a
Dictionary<string, List<Result>>, where the string key is the command
text and the List<Resultvalue is a collection of all the parameter
values and the generated DataRows that have been queried thus far.

Is there a way to improve the performance of this search? I am looking
for ideas. Currently, I have to loop though all the Results in the
List<Resultand compare against each of the parameter values. Is
there a way to optimize this algorithm while keeping things generic? I
can't even go against a primary key, since this class is meant to work
with anything.

Essentially, this class is worthless if it takes longer to find the
cached results than it would be to just rehit the database.

Thanks,
Travis

honestly, unless that you have a serious performance hit by accessing
the DB you should go to it.

if your approach is faster depend of several things like how many
combinations of command/parameters you will have, how fast is your DB
connection, etc

Aug 14 '08 #3

jehugaleahsa

On Aug 14, 2:15 am, Pavel Minaev <int...@gmail.comwrote:

On Aug 14, 7:32 am, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

Hello:

I am writing a cute little class that will cache queries against a
database. Currently, I am implementing this by storing the command
text, parameter values and generated DataRows.

First of all, is there something that does this already?

If you're doing it solely for performance reasons, then yes. The
database itself is likely to cache query results.

Second, right now I use a struct called Result that looks like this:

private struct Result
{
public object[] ParameterValues;
public DataRow[] Rows;

}

I am looking up past queries by storing the queries in a
Dictionary<string, List<Result>>, where the string key is the command
text and the List<Resultvalue is a collection of all the parameter
values and the generated DataRows that have been queried thus far.

Is there a way to improve the performance of this search? I am looking
for ideas. Currently, I have to loop though all the Results in the
List<Resultand compare against each of the parameter values. Is
there a way to optimize this algorithm while keeping things generic? I
can't even go against a primary key, since this class is meant to work
with anything.

Since you're effectively looking up by CommandText and Parameters, why
not aggregate them together into a single class, and use that as a key
of your Dictionary? Then your search will be just a single hashtable
lookup, without the following linear scan (though GetHashCode() would
be somewhat slower).- Hide quoted text -

- Show quoted text -

How do I implement such a class? How can I be sure that my hash codes
don't overlap? I mean, I don't know how many parameter values there
may be. Won't this be an issue?

Or is it safe to iterate over each parameter and the command text and
just xor the hash codes?

Aug 14 '08 #4

Alvin Bruney [ASP.NET MVP]

You have a design issue here that you aren't addressing. Caching the queries
brings no conceivable advantage - your rdbms already does that for you by
preserving and optimizing the query. Also, query construction is simply a
matter of string concatenation which should, all things considered, rival a
hash table look up and load.

You should cache the results of the query since that is where the expense
lies. Opening a connection and transferring data is what is expensive.

--

Regards,
Alvin Bruney [MVP ASP.NET]

[Shameless Author plug]
Download OWC Black Book, 2nd Edition
Exclusively on www.lulu.com/owc $15.00
Need a free copy of VSTS 2008 w/ MSDN Premium?
http://msmvps.com/blogs/alvin/Default.aspx
-------------------------------------------------------
<je**********@gmail.comwrote in message
news:fd**********************************@d1g2000h sg.googlegroups.com...

On Aug 14, 2:15 am, Pavel Minaev <int...@gmail.comwrote:
>On Aug 14, 7:32 am, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

Hello:

I am writing a cute little class that will cache queries against a
database. Currently, I am implementing this by storing the command
text, parameter values and generated DataRows.

First of all, is there something that does this already?

If you're doing it solely for performance reasons, then yes. The
database itself is likely to cache query results.

Second, right now I use a struct called Result that looks like this:

private struct Result
{
public object[] ParameterValues;
public DataRow[] Rows;

}

I am looking up past queries by storing the queries in a
Dictionary<string, List<Result>>, where the string key is the command
text and the List<Resultvalue is a collection of all the parameter
values and the generated DataRows that have been queried thus far.

Is there a way to improve the performance of this search? I am looking
for ideas. Currently, I have to loop though all the Results in the
List<Resultand compare against each of the parameter values. Is
there a way to optimize this algorithm while keeping things generic? I
can't even go against a primary key, since this class is meant to work
with anything.

Since you're effectively looking up by CommandText and Parameters, why
not aggregate them together into a single class, and use that as a key
of your Dictionary? Then your search will be just a single hashtable
lookup, without the following linear scan (though GetHashCode() would
be somewhat slower).- Hide quoted text -

- Show quoted text -

How do I implement such a class? How can I be sure that my hash codes
don't overlap? I mean, I don't know how many parameter values there
may be. Won't this be an issue?

Or is it safe to iterate over each parameter and the command text and
just xor the hash codes?

Aug 14 '08 #5

jehugaleahsa

On Aug 14, 4:57*pm, "Alvin Bruney [ASP.NET MVP]" <vapor dan using hot
male spam filterwrote:

You have a design issue here that you aren't addressing. Caching the queries
brings no conceivable advantage - your rdbms already does that for you by
preserving and optimizing the query. Also, query construction is simply a
matter of string concatenation which should, all things considered, rivala
hash table look up and load.

You should cache the results of the query since that is where the expense
lies. Opening a connection and transferring data is what is expensive.

--

Regards,
Alvin Bruney [MVP ASP.NET]

I am caching the result of the query in a DataTable. But, how do you
know which results belong to which query? I could dump down everything
and then work entirely in my DataTable, but that is not a generic
solution. The table could be huge or I don't want to spend a
considerable time building the table at start up.

If I can reduce the time it takes to find older results, then I should
be able use a little more memory and save a hit or two to the database.

Aug 15 '08 #6

jehugaleahsa

An interesting thing to note: I went ahead and took Pavel Minaev's
suggestion to make a Hashable class. It was a great idea, once I
realized clashing didn't matter (just implement Equals). When running
a few examples, I was able to get some statistics. The cacher really
only helps if there is a moderate number of parameters (which is
usually the case) and the parameter values fall within a moderately-
sized range (which is also usual). The memory usage was actually very
low. I might enact a expiration policy.

So really, the value of the cacher hinges on the rate at which
parameter values repeat. It makes sense since the cacher can't cache
something unless it has already been queried. Duh.

Aug 15 '08 #7

Obaum1

On Aug 14, 6:32*am, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

Hello:

I am writing a cute little class that will cache queries against a
database. Currently, I am implementing this by storing the command
text, parameter values and generated DataRows.

First of all, is there something that does this already?

Second, right now I use a struct called Result that looks like this:

private struct Result
{
* * public object[] ParameterValues;
* * public DataRow[] Rows;

}

I am looking up past queries by storing the queries in a
Dictionary<string, List<Result>>, where the string key is the command
text and the List<Resultvalue is a collection of all the parameter
values and the generated DataRows that have been queried thus far.

Is there a way to improve the performance of this search? I am looking
for ideas. Currently, I have to loop though all the Results in the
List<Resultand compare against each of the parameter values. Is
there a way to optimize this algorithm while keeping things generic? I
can't even go against a primary key, since this class is meant to work
with anything.

Essentially, this class is worthless if it takes longer to find the
cached results than it would be to just rehit the database.

Thanks,
Travis

Hello

If i understand right you need Linq to something
the select of the linq is very fast
I need more details

Aug 15 '08 #8

Pavel Minaev

On Aug 15, 6:11*am, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

An interesting thing to note: I went ahead and took Pavel Minaev's
suggestion to make a Hashable class. It was a great idea, once I
realized clashing didn't matter (just implement Equals).

Erm... you _do_ need to implement GetHashCode() if you provide your
own Equals(). Yes, your GetHashCode() can return equal values for non-
equal objects without breaking anything - but it should never return
non-equal values for equal objects, otherwise any hash-based container
will break! And if you override Equals() to use data equality, but
leave the default GetHashCode(), then that is precisely what will
happen...

Aug 15 '08 #9

jehugaleahsa

On Aug 15, 12:58*pm, Pavel Minaev <int...@gmail.comwrote:

On Aug 15, 6:11*am, "jehugalea...@gmail.com" <jehugalea...@gmail.com>
wrote:

An interesting thing to note: I went ahead and took Pavel Minaev's
suggestion to make a Hashable class. It was a great idea, once I
realized clashing didn't matter (just implement Equals).

Erm... you _do_ need to implement GetHashCode() if you provide your
own Equals(). Yes, your GetHashCode() can return equal values for non-
equal objects without breaking anything - but it should never return
non-equal values for equal objects, otherwise any hash-based container
will break! And if you override Equals() to use data equality, but
leave the default GetHashCode(), then that is precisely what will
happen...

I went ahead and did some additional analysis. The class itself is
really only useful in a very small number of cases. The class comes
with the requirement that data doesn't change a lot, since keeping a
local copy around will eventually lead outdated data. The class also
have to have an extremely limited number of inputs. The values of
parameters must cluster. As the values approach being truly random,
the range of values must be fractions smaller than the number of
reads. Also, the number of parameters also have to be small because as
each value is added, the probability of hitting a cached query grows
substantially. Say parameter one usually tends to fall between 0-99
and the next parameter falls between 0-9. The likelihood of a value
being the cache in a random environment approaches 99 * 9. Take into
account that missed caches still require a hash table hit. From my
analysis, half the processing time is performed by the caching
algorithm performing searches (which is amortized constant), rather
than simply hitting the database and allowing it to perform the
majority of the work, that is in a random environment. Finally,
deleted records need to be handled appropriately. It doesn't make
sense to return a deleted row, so the cacher probably should remove
them before returning the results. This requires a linear search every
time results are returned, since the cache keeps track of the DataRow
directly and doesn't have another means of keeping their values in
tact. The only other work around is to create an expression for
searching within the DataTable that would yield the same results.
However, creating such an expression would probably have even more of
a negative affect on runtime performance (as well as only be partially
implementable).

I think what I am getting at is this: it probably makes more sense to
use an Identity Map (a class for caching results by primary key) when
possible and to just hit the database directly otherwise. With the
primary key, seaches can be performed against the DataTable, which
means it will probably be fast and deal with deleted DataRows
automatically. Of course, I would like to do some more research on a
cacher that works with foreign keys, since that seems slightly more
useful.

Thanks for everyone's input.

Aug 18 '08 #10

Similar topics

caching the sql queries

by: JZ | last post by:

Is there any equivalent of caching sql queries for Python like ADOdb for PHP? (For those who did not hear about ADODb and its feature : http://phplens.com/adodb/caching.of.recordsets.html) -- JZ

Python

R: caching the sql queries

by: Luca Calderano | last post by:

The Only way is using Com S.G.A S.p.A. Nucleo Sistemi Informativi Luca Calderano -----Messaggio originale----- Da: python-list-admin@python.org Per conto di JZ

Python

I: caching the sql queries

by: Luca Calderano | last post by:

I MEAN YOU CAN ACCESS ADO TROUGHT COM AS EXPLAINED HERE: import win32com.client import locale from os import sys adCmdStoredProc = 4 # Evaluates CommandText as a stored procedure...

Python

Caching a dataset

by: Craig G | last post by:

im having my first play about with caching a dataset but seem to be having some problems in doing so!! basically i have a dataset which i add to the cache as follows Cache.Insert("myDS",...

ASP.NET

Page output caching for database driven pages?

by: Ben Fidge | last post by:

What are best practices for page output caching on pages that are dynamically generated from database tables. Our site has left-hand navigation that is comprised of dynamically generated menus...

ASP.NET

how to cache part of a page without using fragment caching

by: sam | last post by:

Hi, I know fragment caching where you can use a cached user control in a non-cached page (or the opposite) But without using a user control, how can I cache just a part of my page (two...

ASP.NET

Caching query results: PHP caching vs MySQL caching

by: Hermann | last post by:

My site is a bit slow showing the main page so I thought caching query result in PHP will improve performace. Then I read MySQL documentation and saw that MySQL does have a caching feature. So......

PHP

Adding Queries within the DataSet Designer (VS2005)

by: daokfella | last post by:

Hi all, I added a dataset to my project and dragged over a table from server explorer. I modified the dataadapter to use existing stored procedures for the Get, Insert, Update and Delete. I've...

ASP.NET

Data caching question

by: Gary W. Smith | last post by:

I had a couple questions about data caching. We have a site that gets a huge amount of traffic to a few specific pages that have a lot of data on them (300k hits/hour during peak, about 6-10 data...

ASP.NET

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

How to turn on java script in a villaon keypad mobile phone

by: Charles Arthur | last post by:

How do i turn on java script on a villaon, callus and itel keypad mobile phone

Java

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++