Serious problem with Shelve

Hi - this mysterious behavior with shelve is just about to kill me. I
hope someone here can shed some light. First of all, I have this piece
of code which uses shelve to save instances of some class I define. It
works perfectly on an old machine (PII-400) running Python 2.2.1 under
RedHat Linux 8.0. When I try to run it under Python for Windows ME on a
P-4 1.4 GHz, however, it keeps crashing on reading from the shelved file
the second time I try to access it. The Windows machine was originally
running Python 1.5.2, so I upgraded to 2.2.3, thinking that would solve
the problem, but it didn't!

This is what the error looks like:
    tmprec = myrecs[key]
  File "D:\PROGRAMS\PYTHON22\lib\shelve.py", line 70, in __getitem__
    f = StringIO(self.dict[key])
KeyError: A_G_08631616188

Notes:
Here's what my program does (it is too much code to include here).
I have 4 related modules: one containing the class definitions (in all
other modules I use from classfile import ___); the second module builds
the shelve file by parsing a large text file containing the data,
building classes; the third re-opens the file later to do reading and
writing operations; and the 4th module is a GUI controller that simply
calls the appropriate functions from the other 2 modules.

The main breakdown occurs in module 3. Significantly, I initially had
this module set up as a script in which everything was done on the
module level, and it was working fine (apparently). The problems
started appearing when I wrapped code inside functions (I need to do
that since I want to call it from other modules, and I have about 4000
lines of code altogether!). I spent painstaking hours trying to isolate
the problem - I pass the open shelve file as a parameter to all the
functions that need it, and I close it properly using try: finally
statements after every use. I also make sure all the keys that go in
there are unique.

What module 3 does is a series of short reads and writes to the shelve
file. First I test if a particular key is in there - if it is not, I
add an item, if it is, I read the existing item, update it, then write
it back like this:

tmprec = myrecs[key]   # Read a particular instance from the shelve file
tmprec.field = 1       # Update one field
#del myrecs[key]       # Commented-out lines are things I tried while debugging
#myrecs.sync()
myrecs[key] = tmprec   # Then write it back to the shelve file
#myrecs.sync()

This one function appears to be the guilty party. When I comment it
out the crash stops. However it is a vital function for my program and
I need to do it. Note that deleting the original item before rewriting
it helped reduce the frequency of crashes, but didn't eliminate them
completely. The other possibility (which is why I unsuccessfully tried
the .sync() lines) is that it has to do with the timing of writing to
disk. The library reference is vague about this, saying that shelve is
incapable of simultaneous reads and writes, so the file shouldn't be
opened twice for write. However it does not say whether this implies we
cannot read and write like this in quick succession.
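
Editor's sketch: the read, update, write-back cycle described above, written against the modern Python 3 `shelve` module rather than the 2.2 version used in this thread. The function names `mark_record` and `demo_update` are hypothetical, and a plain dict stands in for the poster's class instances. It is an illustration of the pattern under those assumptions, not a reconstruction of the original program.

```python
import os
import shelve
import tempfile


def mark_record(db, key):
    """Add the record if it is missing; otherwise read, update, and write it back."""
    if key in db:
        rec = db[key]        # unpickles a copy of the stored object
        rec["field"] = 1     # changes only the in-memory copy
        db[key] = rec        # reassignment is what stores the change
    else:
        db[key] = {"field": 0}
    db.sync()                # ask the backend to flush, where it supports that


def demo_update():
    """Run two passes against a throwaway shelf and return the stored field."""
    path = os.path.join(tempfile.mkdtemp(), "myrecs")
    key = "A_G_0863161618"
    db = shelve.open(path)
    try:
        mark_record(db, key)  # first pass adds the record
        mark_record(db, key)  # second pass updates it
    finally:
        db.close()
    db = shelve.open(path)    # reopen to confirm the update reached the file
    try:
        return db[key]["field"]
    finally:
        db.close()
```

Sequential reads and writes through a single open handle, as here, are ordinary shelve usage; the documented restriction concerns the same file being open for writing more than once at the same time.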

More details:
* The first run of module 3 after creating the shelve file doesn't
crash, although I suspect it is doing something funny.
* The second time I get that error above, keeping in mind I am supposed
to have a key in there called "A_G_0863161618" (without the extra '8' at
the end), so the database is already corrupted. So the key
'A_G_08631616188' is in myshelvefile.keys(), the original is no more,
yet NEITHER can be accessed using myshelvefile[key]!
* After creation, the shelve file size is only 71 kB. After running
module 3 - which is supposed to mostly read and not really change the
file much - the size jumps to 110 kB!
* If I open the file in a text editor, I notice all sorts of things that
are not supposed to be there (like directory paths, etc), indicating it
is corrupted. I do not see those things when I open the file on the
good (Linux) machine.
* I did a scandisk to ensure the disk is OK and it is.
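
Editor's sketch: one way to check for the symptom in the list above, where a key shows up in keys() but cannot be read back. This assumes Python 3 `shelve`; `find_unreadable_keys` and `demo_scan` are hypothetical helper names, and the demo runs against a fresh, healthy file, so it only shows the mechanics of the scan.

```python
import os
import shelve
import tempfile


def find_unreadable_keys(db):
    """Return the keys that keys() lists but that cannot be read back."""
    bad = []
    for key in list(db.keys()):
        try:
            db[key]
        except Exception:     # broad on purpose: a damaged file can fail in many ways
            bad.append(key)
    return bad


def demo_scan():
    """Scan a newly written shelf; a healthy file should report no bad keys."""
    path = os.path.join(tempfile.mkdtemp(), "check")
    db = shelve.open(path)
    try:
        db["A_G_0863161618"] = {"field": 1}
        return find_unreadable_keys(db)
    finally:
        db.close()
```

Running such a scan right after the file is built, and again after each later pass, would narrow down which step introduces the damage.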
Jul 18 '05 #1
On Mon, 2003-08-18 at 03:04, Rami A. Kishek wrote:
Hi - this mysterious behavior with shelve is just about to kill me. I
hope someone here can shed some light. First of all, I have this piece
of code which uses shelve to save instances of some class I define. It
works perfectly on an old machine (PII-400) running Python 2.2.1 under
RedHat Linux 8.0. When I try to run it under Python for windows ME on a
P-4 1.4 GHz, however, it keeps crashing on reading from the shelved file
the second time I try to access it. The Windows machine was originally
running python 1.5.2, so I upgraded to 2.2.3, thinking that would solve
the problem, but it didn't!


In Python 2.2 or earlier, by default, shelve uses the Berkeley database
1.8 libraries, which we have found to be seriously broken on all
platforms we have tried them on. Upgrading to a later version of the
Berkeley libraries and using the pybsddb module fixed the mysterious,
inconsistent crashes and segfaults we were seeing with shelve (and which
were also driving us crazy). The easiest way to upgrade is to move to
Python 2.3, which includes these later versions, but you can also
easily install them under earlier version of Python (at least under
2.2).
--

Tim C


Jul 18 '05 #2
Well - I installed Python 2.3, but it still doesn't work. My program now
crashes on the first pass. After deleting the old databases and
creating new ones, I opened them for read and this is what I get:

    self.revs = shelve.open(os.path.join(tgtdir, dbfn))
  File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 231, in open
    return DbfilenameShelf(filename, flag, protocol, writeback, binary)
  File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 212, in __init__
    Shelf.__init__(self, anydbm.open(filename, flag), protocol, writeback, binary)
  File "D:\PROGRAMS\PYTHON23\lib\anydbm.py", line 82, in open
    mod = __import__(result)
ImportError: No module named bsddb185

I will try enclosing that import bsddb185 in anydbm.py in try: except:,
though I hate messing around with source files, and there may be many
more such problems. Python developers, be aware of this glitch.
Jul 18 '05 #3
On Tue, 19 Aug 2003, Rami A. Kishek wrote:
  File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 231, in open
    return DbfilenameShelf(filename, flag, protocol, writeback, binary)
  File "D:\PROGRAMS\PYTHON23\lib\shelve.py", line 212, in __init__
    Shelf.__init__(self, anydbm.open(filename, flag), protocol, writeback, binary)
  File "D:\PROGRAMS\PYTHON23\lib\anydbm.py", line 80, in open
    raise error, "db type could not be determined"
error: db type could not be determined

Incidentally, on the other machine I mentioned (the one on which shelve
worked perfectly with 2.2.3) shelve still works perfectly after
upgrading to 2.3. Since that is a Linux 2 machine, I figure perhaps it
is using a different db like gdbm or something ...


Your shelve file is in DB v1.85 format. Commenting out the lines in
whichdb.py didn't do anything except deny the shelve module information
about what the format actually _is_.

You'll need to find/build a v1.85 compatible module to read the shelve
then write it out in a later format.

--
Andrew I MacIntyre "These thoughts are mine alone..."
E-mail: an*****@bullseye.apana.org.au (pref) | Snail: PO Box 370
an*****@pcug.org.au (alt) | Belconnen ACT 2616
Web: http://www.andymac.org/ | Australia

Jul 18 '05 #4

Rami> Well - I installed Python 2.3, but it still doesn't. My program
Rami> now crashes on the first pass. After deleting the old databases
Rami> and creating new ones, I opened them for read and this is what I
Rami> get:

How did you create those new databases, using an older version of Python
perhaps? What's happening is that whichdb.whichdb() determined that the
file you passed into anydbm.open() was an old hash style database, which can
only be opened in Python 2.3 by the old v 1.85 library, which is only
exposed through the bsddb185 module.

Rami> I will try enclosing that import bsddb185 in anydbm.py in try:
Rami> except:, though I hate messing around with source files, and there
Rami> may be many more such problems. Python developers, be aware of
Rami> this glitch.

That won't work. What's anydbm.open() going to use to open the file?

Can you explain how the files were created? (Sorry if you explained
already. I'm just coming to this thread.)

If you have Python 2.1 or 2.2 lying around with a bsddb module which can
read the file in question, use Tools/scripts/db2pickle.py to convert the
file to a pickle, then with Python 2.3, run Tools/scripts/pickle2db.py to
convert the pickle back to a db file, using the new bsddb. Those two
scripts are in the Python 2.3 distribution, but not the Python 2.2
distribution. They should work with Python 2.1 or 2.2, however. This
problem is exactly why I wrote them.

Synopsis:

python2.2 db2pickle.py olddbfile pickle.pck
python2.3 pickle2db.py pickle.pck newdbfile
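
Editor's sketch: the idea behind the two-step conversion, written with the Python 3 `dbm` and `pickle` modules. This is not the source of db2pickle.py or pickle2db.py; `db_to_pickle`, `pickle_to_db` and `demo_roundtrip` are hypothetical names. In the situation discussed in this thread, the dump step would run under the interpreter that can still read the old file and the load step under the newer one; here both run in one process purely to show the round trip.

```python
import dbm
import os
import pickle
import tempfile


def db_to_pickle(db_path, pickle_path):
    """Dump every key/value pair of a dbm file into a plain pickle file."""
    db = dbm.open(db_path, "r")
    try:
        with open(pickle_path, "wb") as out:
            for key in db.keys():
                pickle.dump((key, db[key]), out)
    finally:
        db.close()


def pickle_to_db(pickle_path, db_path):
    """Rebuild a fresh dbm file from the pairs written by db_to_pickle()."""
    db = dbm.open(db_path, "n")   # "n" always creates a new, empty database
    try:
        with open(pickle_path, "rb") as src:
            while True:
                try:
                    key, value = pickle.load(src)
                except EOFError:  # no more pairs in the pickle file
                    break
                db[key] = value
    finally:
        db.close()


def demo_roundtrip():
    """Create a small database, convert it, and return the new file's contents."""
    work = tempfile.mkdtemp()
    old_path = os.path.join(work, "olddbfile")
    new_path = os.path.join(work, "newdbfile")
    pck_path = os.path.join(work, "pickle.pck")
    db = dbm.open(old_path, "c")
    db[b"A_G_0863161618"] = b"payload"
    db.close()
    db_to_pickle(old_path, pck_path)
    pickle_to_db(pck_path, new_path)
    db = dbm.open(new_path, "r")
    try:
        return dict((key, db[key]) for key in db.keys())
    finally:
        db.close()
```

The pickle file acts as a neutral intermediate format, so the reading and writing sides never need to share a database library.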

Skip

Jul 18 '05 #5

Rami> Incidentally, on the other machine I mentioned (the one on which
Rami> shelve worked perfectly with 2.2.3) shelve still works perfectly
Rami> after upgrading to 2.3. Since that is a Linux 2 machine, I figure
Rami> perhaps it is using a different db like gdbm or something ...

Try this using python 2.2.3 and python 2.3:

import os
import whichdb
print whichdb.whichdb(os.path.join(tgtdir, dbfn))

and see what it prints. That will keep you from guessing about the nature
of the file.
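
Editor's sketch: the same check on a current interpreter, where the standalone `whichdb` module has become the function `dbm.whichdb`. The helper names `describe_db` and `demo_whichdb` are hypothetical, and the demo uses the pure-Python `dbm.dumb` backend only because it is always available.

```python
import dbm
import dbm.dumb
import os
import tempfile


def describe_db(path):
    """Translate the three kinds of dbm.whichdb() result into a readable string."""
    kind = dbm.whichdb(path)
    if kind is None:
        return "missing or unreadable"   # the file could not be opened at all
    if kind == "":
        return "unrecognized format"     # the file exists but the format is unknown
    return kind                          # name of the module that can open it


def demo_whichdb():
    """Create a dbm.dumb database and describe it, plus a path that does not exist."""
    base = os.path.join(tempfile.mkdtemp(), "example")
    db = dbm.dumb.open(base, "c")
    db[b"A_G_0863161618"] = b"payload"
    db.close()
    return describe_db(base), describe_db(base + "_absent")
```

Comparing the reported module name on both machines shows directly whether they are reading the same on-disk format.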

Skip

Jul 18 '05 #6
Thanks. With your help, I figured out one of the databases accessed WAS
created with an older Python, so I simply cleaned up that one and now
everything works!

Jul 18 '05 #7
