Bytes | Software Development & Data Engineering Community

Lookup caching

Hello,

I implemented that crazy idea and it seems to be working... in its
current hacked state it can still pass the test suite (excluding
the tests that don't like self-generated output on stdout
from Python), and the stats after the quicktest are IMO
impressive:

LOAD_GLOBAL = 13666473
globals miss = 58988
builtins = 8246184
builtins miss = 32001

LOAD_GLOBAL is the total number of times the pseudocode
instruction was executed.

globals miss is the number of times the actual lookup
in globals had to be performed. Note that if the lookup
wasn't done because the name was known to be absent
from globals, it's still considered a cache hit, not a miss.

builtins is the total number of times the builtins dict
had to be searched (because the name was not in globals).

builtins miss is the number of real builtins searches
that were performed (in the other cases the lookup cache for
builtins found the answer, either positive or negative).
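For reference, those counters work out to very high hit rates; a quick back-of-the-envelope check in Python:

```python
# Hit rates implied by the counters quoted above.
load_global = 13666473    # total LOAD_GLOBAL executions
globals_miss = 58988      # real lookups performed in globals
builtins_total = 8246184  # times builtins had to be consulted
builtins_miss = 32001     # real lookups performed in builtins

globals_hit_rate = 1 - globals_miss / load_global
builtins_hit_rate = 1 - builtins_miss / builtins_total

print(f"globals cache hit rate:  {globals_hit_rate:.2%}")   # ~99.57%
print(f"builtins cache hit rate: {builtins_hit_rate:.2%}")  # ~99.61%
```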

To me it seems a promising idea; I've still got to clean up
the code and make serious speed tests (the "make test"
itself seems a lot more "fluid", but it could be just
self-hypnosis :-D ).

The LOAD_GLOBAL code is actually simpler because I
resorted to a regular non-inlined lookup in case of
a cache miss. There's no reason to do that, however...

Also the same approach could be used for other lookups
that get the name from co->co_names.
Andrea
Dec 10 '06 #1
At Saturday 9/12/2006 23:04, Andrea Griffini wrote:
>I implemented that crazy idea and it seems to be working... in its
>current hacked state it can still pass the test suite (excluding
What crazy idea? And what is this supposed to do?
--
Gabriel Genellina
Softlab SRL

Dec 11 '06 #2
Gabriel Genellina wrote:
At Saturday 9/12/2006 23:04, Andrea Griffini wrote:
>I implemented that crazy idea and it seems to be working... in its
>current hacked state it can still pass the test suite (excluding

What crazy idea? And what is this supposed to do?
The idea is to avoid looking up constants several times
in dictionaries that didn't change (e.g. builtins).

Reading a bit about other optimization proposals I didn't
find a similar one, so I decided to invest some spare time
in it. The idea is:

1) Add a "timestamp" to dictionaries, so that when a dictionary
is changed the timestamp gets updated.

2) Store a cached lookup for constants; the cached lookup
is stored as a timestamp value and a naked pointer to
the result. The algorithm for the lookup of a given
constant is:

if (<<cached_timestamp>> == d->timestamp)
{
    x = <<cached_value>>;
}
else
{
    x = PyDict_GetItem(d, key);
    <<cached_timestamp>> = d->timestamp;
    <<cached_value>> = x;
}

Using a naked pointer is safe because it will be used
only if the dictionary wasn't touched, hence the value
is surely still alive.

The original place I thought of for storing the
cached lookup was the bytecode; however, after reading
the Python sources I resorted instead to a dedicated space
inside the code object. The code for LOAD_GLOBAL uses
something like

if (co->co_cachedtstamps[oparg] == d->timestamp)
...

i.e. I used an array indexed by the index of the co_name
used for lookups.
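To make the scheme concrete, here is a pure-Python model of one cache slot (all names here are hypothetical; the real patch is C code in the interpreter, with one slot per co_names entry):

```python
class VersionedDict(dict):
    """A dict that bumps a timestamp on every mutation (toy model)."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.timestamp = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.timestamp += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.timestamp += 1


class CachedLookup:
    """One cache slot, like one entry of the per-code-object arrays."""
    def __init__(self, key):
        self.key = key
        self.cached_ts = -1        # never matches a real timestamp
        self.cached_value = None

    def lookup(self, d):
        if self.cached_ts == d.timestamp:
            return self.cached_value       # hit: dict untouched
        # miss: do the real lookup and refresh the slot
        # (None here models a negative result, also cached)
        self.cached_ts = d.timestamp
        self.cached_value = d.get(self.key)
        return self.cached_value
```

Note that, as in the description above, a negative result (name absent) is cached too, and stays valid until the dict's timestamp changes.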

The patched code is currently working; however, I found
that while the hit/miss ratios are impressive (as I
expected), the speedup is simply absent. Moreover, there
is no difference at all between paying for the timestamp
handling and NOT using the cached lookups, or instead
paying AND using the cached lookups (!).
Absurdly, Python on my PC runs faster if, in addition
to the cached lookup code, I also leave in place the
hit/miss statistics (a few static int increments and
a static atexit-ed output function).
Also, it made a lot of difference where the
timestamp was placed inside the dictobject structure...

In addition to the unimpressive results (my patched
Python is now just a bit *slower* than the original :-D)
there is also another complication. LOAD_GLOBAL
actually needs TWO lookups, so I used two cached results
(one for globals, one for builtins).
The ideal solution in this case, however, would IMO be
to have two timestamps and one cached value instead...
(if neither dict was touched since the last lookup, then the
result will be the cached one).
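That "two timestamps, one value" variant could be modeled like this (again a hypothetical Python sketch; the timestamps are passed in explicitly to keep the example self-contained):

```python
class TwoStampCache:
    """One cached value guarded by two dict timestamps:
    the value is reused only if NEITHER globals nor builtins
    changed since it was stored."""
    def __init__(self, name):
        self.name = name
        self.g_ts = None      # timestamp of globals at store time
        self.b_ts = None      # timestamp of builtins at store time
        self.value = None

    def load(self, globs, g_ts, blts, b_ts):
        if (g_ts, b_ts) == (self.g_ts, self.b_ts):
            return self.value                      # hit: one compare pair
        # miss: the usual two-step LOAD_GLOBAL resolution
        # (a name absent from both raises KeyError, mirroring NameError)
        self.g_ts, self.b_ts = g_ts, b_ts
        self.value = globs[self.name] if self.name in globs else blts[self.name]
        return self.value
```

The attraction is that the fast path stores only one pointer per name while still being invalidated correctly by a change to either dict.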

The complication is that a lot of lookups are done by
LOAD_ATTR instead, and thinking about the ideal solution
for new-style classes made my brain explode (MRO, descriptors
and stuff...). It would be simple to do something for
classic classes, but would that be worth it (I mean...
aren't those going to disappear?).

Probably something can be done to use caches for
LOAD_ATTR on modules (to speed up a bit things like
math.sin or mod1.mod2.mod3.func).

Any suggestion is welcome...

Andrea
Dec 11 '06 #3

Andrea Griffini wrote:
>1) Add a "timestamp" to dictionaries, so that when a dictionary
>is changed the timestamp gets updated.
[snip]
What are you using for the timestamp? Are you calling a function to
read a timer?

If so, you could try something that's "cheaper", like a modification
counter instead, i.e. a counter that's incremented each time the dict is
modified.

Dec 12 '06 #4
MRAB wrote:

...
What are you using for the timestamp? Are you calling a function to
read a timer?
For the timestamp I used a static variable; to update the timestamp for
a dictionary I used

d->timestamp = ++global_dict_timestamp;

I'm using a single counter for all dicts, so that when checking the
validity of a cached value I'm checking at the same time that the dict
is the same dict that was used and that it wasn't touched since the
lookup result was stored.
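The point of the single shared counter can be shown in a small Python model (hypothetical names; the real code is the one-line C statement above):

```python
_global_dict_timestamp = 0   # one counter shared by ALL dicts

class StampedDict(dict):
    """Every mutation of any dict draws a fresh, globally unique number,
    so a matching timestamp proves both 'same dict' and 'unmodified'."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._touch()

    def _touch(self):
        global _global_dict_timestamp
        _global_dict_timestamp += 1
        self.timestamp = _global_dict_timestamp

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self._touch()

    def __delitem__(self, key):
        super().__delitem__(key)
        self._touch()

g = StampedDict()
b = StampedDict()
g["x"] = 1
# No two dicts can ever carry the same timestamp value, so a stale
# cache entry can never accidentally match a different dict.
assert g.timestamp != b.timestamp
```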

Using this approach and tweaking the LOAD_GLOBAL double lookup to
use a "two timestamps, one value" cache I got an 8%-10% speedup
(depending on the compiler options when building Python) in a real
application of mine, and about 5% in a small test.

I've yet to try more complex applications (the biggest real
application I have requires a lot of external modules, however,
so testing the speed gain with that will require a lot more work,
or just downgrading the Python version to 2.4).

Also, I'm using a 32-bit int for the timestamp... I wonder if I should
listen to the paranoid voice in my head that is crying for 64 bits instead.
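A quick back-of-the-envelope suggests the paranoia has a point; with a single counter shared by all dicts, the update rate is whatever the whole interpreter can manage (the rate below is a made-up round number):

```python
# How long until the shared counter wraps, at a hypothetical
# 10 million dict mutations per second?
updates_per_second = 10_000_000

wrap_32 = 2**32 / updates_per_second   # seconds: minutes, not years
wrap_64 = 2**64 / updates_per_second   # seconds: effectively never

print(f"32-bit counter wraps after about {wrap_32:.0f} s")
print(f"64-bit counter wraps after about "
      f"{wrap_64 / (3600 * 24 * 365):.0f} years")
```

After a 32-bit wraparound a stale cache slot could in principle match a reused timestamp value, so the 64-bit version removes that failure mode outright.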
Andrea
Dec 12 '06 #5
