persistent deque (continued)

castironpi

Some time ago, I was asking about the feasibility of a persistent
deque, a double-ended queue.

It runs into the typical space allocation problems. If you're storing
a pickle, you have to allocate and fragment the file you've opened,
since pickles can be variable-length strings; i.e. if the new data is
too long, blank out its entry, and grow the file. If you're storing a
data-type, you lose Python's dynamic-type advantages, as well as its
long integers, as they can be any length. If you change the object in
the deque, such as when using any mutable type, you have to update the
container too.

Does anyone have any experience storing pickles (I am aware of the
similarities to shelf) to a database?
Can the file system facilitate the variable-length string problem?
How feasible is a length-of-length - length - data solution to the
unbounded integer problem?
Is there any alternative to completely re-pickling a large (say 1k
pickled) object you only change slightly?
What other issues are there?
Is a hybrid-storage type possible, that stores the contents of its
Python-allocated memory block to disk, including reference count, even
if it's a lot slower? The object could not contain any references to
objects not allocated on disk.

A first approach is for the file to look like this:

00 data 01 data 02
01 data 03 data 04
02 data 05 data 06

Append would add:

03 data 07 data 08

AppendLeft would add:

-01 data 09 data 0a

Pop would remove 03, PopLeft would remove -01. You would need a
length-and-head index to make 'rotate' available. Remove would run a
worst-case risk of renumbering half of the indices stored, plus a
rotate.

Jul 21 '08 #1

Subscribe Post Reply

2379

Larry Bates

castironpi wrote:

Some time ago, I was asking about the feasibility of a persistent
deque, a double-ended queue.

It runs into the typical space allocation problems. If you're storing
a pickle, you have to allocate and fragment the file you've opened,
since pickles can be variable-length strings; i.e. if the new data is
too long, blank out its entry, and grow the file. If you're storing a
data-type, you lose Python's dynamic-type advantages, as well as its
long integers, as they can be any length. If you change the object in
the deque, such as when using any mutable type, you have to update the
container too.

Does anyone have any experience storing pickles (I am aware of the
similarities to shelf) to a database?
Can the file system facilitate the variable-length string problem?
How feasible is a length-of-length - length - data solution to the
unbounded integer problem?
Is there any alternative to completely re-pickling a large (say 1k
pickled) object you only change slightly?
What other issues are there?
Is a hybrid-storage type possible, that stores the contents of its
Python-allocated memory block to disk, including reference count, even
if it's a lot slower? The object could not contain any references to
objects not allocated on disk.

A first approach is for the file to look like this:

00 data 01 data 02
01 data 03 data 04
02 data 05 data 06

Append would add:

03 data 07 data 08

AppendLeft would add:

-01 data 09 data 0a

Pop would remove 03, PopLeft would remove -01. You would need a
length-and-head index to make 'rotate' available. Remove would run a
worst-case risk of renumbering half of the indices stored, plus a
rotate.

It is so much easier to implement this using a database table that IMHO most
people would go that route.

-Larry

Jul 21 '08 #2

M.-A. Lemburg

On 2008-07-21 21:08, castironpi wrote:

Some time ago, I was asking about the feasibility of a persistent
deque, a double-ended queue.

You might want to have a look at mxBeeBase:

http://www.egenix.com/products/python/mxBase/mxBeeBase/

Using the integer index you could probably write an on-disk
dequeue.

The B*Tree indexes available in mxBeeBase keep the indexes sorted,
so you'd only have to keep incrementing the index as you add new
values and keep a record of the highest and lowest index currently
in use.

Details are left as exercise for the interested reader ;-)

It runs into the typical space allocation problems. If you're storing
a pickle, you have to allocate and fragment the file you've opened,
since pickles can be variable-length strings; i.e. if the new data is
too long, blank out its entry, and grow the file. If you're storing a
data-type, you lose Python's dynamic-type advantages, as well as its
long integers, as they can be any length. If you change the object in
the deque, such as when using any mutable type, you have to update the
container too.

Does anyone have any experience storing pickles (I am aware of the
similarities to shelf) to a database?
Can the file system facilitate the variable-length string problem?
How feasible is a length-of-length - length - data solution to the
unbounded integer problem?
Is there any alternative to completely re-pickling a large (say 1k
pickled) object you only change slightly?
What other issues are there?
Is a hybrid-storage type possible, that stores the contents of its
Python-allocated memory block to disk, including reference count, even
if it's a lot slower? The object could not contain any references to
objects not allocated on disk.

A first approach is for the file to look like this:

00 data 01 data 02
01 data 03 data 04
02 data 05 data 06

Append would add:

03 data 07 data 08

AppendLeft would add:

-01 data 09 data 0a

Pop would remove 03, PopLeft would remove -01. You would need a
length-and-head index to make 'rotate' available. Remove would run a
worst-case risk of renumbering half of the indices stored, plus a
rotate.
--
http://mail.python.org/mailman/listinfo/python-list

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Jul 21 2008)

>>Python/Zope Consulting and Support ... http://www.egenix.com/
mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

__________________________________________________ ______________________

:::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free ! ::::
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

Jul 21 '08 #3

Raymond Hettinger

On Jul 21, 12:08*pm, castironpi <castiro...@gmail.comwrote:

Some time ago, I was asking about the feasibility of a persistent
deque, a double-ended queue.

It runs into the typical space allocation problems. *

Try starting with a dict-based implementation of a double-ended queue
( http://aspn.activestate.com/ASPN/Coo.../Recipe/259179 )
and then replace the dict with a shelf. Presto, you've got a
persistent deque.

Raymond

Jul 21 '08 #4

castironpi

On Jul 21, 5:20*pm, Raymond Hettinger <pyt...@rcn.comwrote:

On Jul 21, 12:08*pm, castironpi <castiro...@gmail.comwrote:

Some time ago, I was asking about the feasibility of a persistent
deque, a double-ended queue.

It runs into the typical space allocation problems. *

Try starting with a dict-based implementation of a double-ended queue
(http://aspn.activestate.com/ASPN/Coo.../Recipe/259179)
and then replace the dict with a shelf. *Presto, you've got a
persistent deque.

Raymond

We have the cookbook recipe Raymond brings up, and an automatic
pickler that inhahe brought up in May.

I've initiated the approach to the solution I have in mind, but the
dilemma comes early and isn't pleasant. I can either (i) recompile
the whole interpreter, replacing malloc with a custom function that
allocates from disk; or (ii) duplicate every class I want to be
persistable, making every change to the allocations by hand.

Disk-resident objects can clearly not contain references to any
objects not on disk (arguably at finalization-stage only), and
references to ones that are need to be wrapped by offset, and 'made
live' at initialization-time, so that references to the mapped file
resume succeeding after termination and restart--- the mapped memory
segment won't be the same from run to run.

Further, if I can't hijack the existing arena-pool model native in
Python, it would take writing a dedicated memory manager.

Assuming the whole memory isn't moved to disk, which is feasible, the
usage would look like this, to ensure that 'disk-resident' objects
don't refer to 'live' ones.

listA= perst( [ perst( 0 ), perst( 1 ), perst( "abc" ) ] )

to create the list [ 0, 1, "abc" ]. Modifications, e.g. append "def",
would look like this:

listA.append( perst( "def" ) )

and would execute the addition directly to mapped memory. The memory
file would then have five objects: 0, 1, "abc", "def", and this list.

It is not clear that normal assignments would still work, self.b= c,
even for disk-resident c, since on a later run, 'self.b' would refer
to a different place in memory. It might take either trapping updates
to self.__dict__, or something deeper at the level of the ast and
Assm't statements. Or an extra level of indirection in accessing it.

>>self.b

<C instance at index 12 in mem-mapped file 'data.dat'>

The indirection would consist of an offset-finder and a delegation
layer.

>>self.b.append( perst( set( ) ) )

It would first have to look-up 'self.b', as opposed to accessing it
directly by memory address.

What I want from the group is to know if I've assessed the options for
a model of persistence accurately. Or if I haven't depicted what I'm
going for clearly. It seems enormous and if it is this hard, I doubt
I'll produce anything other than a few correct persistent types.

Jul 27 '08 #5

Similar topics

queue, deque, priority_queue

by: newsock | last post by:

What differences between queue, deque and priority_queue? And under what situations choose one of them to use? Thanks for your help!

C / C++

how to store different types of data in deque?

by: Jenny | last post by:

Hi, I have a class foo which will construct some objects in my code. some of the objects store int values into the data deque, while others store float values to the deque. template <class...

C / C++

how much memory does an empty STL deque occupy?

by: R.Z. | last post by:

i was wondering whether it pays off in terms of memory use to maintain lots of empty deques (it would be convenient for my algorithms but memory use is more important). and does the size of a deque...

C / C++

std::deque< class > member variable within class - A little help please

by: Dan Trowbridge | last post by:

He everyone, I am just getting started with .NET and I am having a porting problem. I get and error in code that lookssomething like this (really stripped down but you get the idea)... class...

.NET Framework

Popping from the middle of a deque + deque rotation speed

by: Russell Warren | last post by:

Does anyone have an easier/faster/better way of popping from the middle of a deque than this? class mydeque(deque): def popmiddle(self, pos): self.rotate(-pos) ret = self.popleft()...

Python

should we prefer deque over vector

by: yancheng.cheok | last post by:

after reading http://www.codeproject.com/vcpp/stl/vector_vs_deque.asp, i realize that deque has its own speed advantage over vector in all the aspect (except for memory deallocation). does it...

C / C++

Program with deque using Visual C++

by: Homeworkboy | last post by:

Can anyone help me with this program? Look at the bottom of this program for help methods: /*1. Make a program that uses numbers from 1 to 100 including each ends which puts the even...

C / C++

A deque containing different types of objects (with a common baseclass)

by: Juha Nieminen | last post by:

I'm sure this is not a new idea, but I have never heard about it before. I'm wondering if this could work: Assume that you have a common base class and a bunch of classes derived from it, and...

C / C++

persistent deque

by: castironpi | last post by:

I'd like a persistent deque, such that the instance operations all commit atomicly to file system.

Python

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

The easy way to turn off automatic updates for Windows 10/11

by: Hystou | last post by:

Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

Windows Server

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA