
Database vs Data Structure?

Hi,

I'm working on a web application where each user will be creating
several "projects" in their account, each with 1,000-50,000 objects.
Each object will consist of a unique name, an id, and some metadata.

The number of objects will grow and shrink as the user works with
their project.

I'm trying to decide whether to store the objects in the database
(each object gets its own row) or to use some sort of data-structure
(maybe nested dictionaries or a custom class) and store the pickled
data-structure in a single row in the database (then unpickle the data
and query in memory).

A few requirements:
- Fast/scalable (web app)
- Able to query objects based on name and id
- Will play nicely with versioning (undo/redo)

Any input on the best way to go?

Thanks!
Erik
Jun 27 '08 #1
7 Replies


On Apr 17, 9:30 pm, erikcw <erikwickst...@gmail.com> wrote:
> Hi,
>
> I'm working on a web application where each user will be creating
> several "projects" in their account, each with 1,000-50,000 objects.
> Each object will consist of a unique name, an id, and some metadata.
>
> The number of objects will grow and shrink as the user works with
> their project.
>
> I'm trying to decide whether to store the objects in the database
> (each object gets its own row) or to use some sort of data-structure
> (maybe nested dictionaries or a custom class) and store the pickled
> data-structure in a single row in the database (then unpickle the data
> and query in memory).
>
> A few requirements:
> -Fast/scalable (web app)
> -able to query objects based on name and id.
> -will play nicely with versioning (undo/redo)
>
> Any input on the best way to go?
>
> Thanks!
> Erik

When you change an object, what will you do?

1) Changes on disk only.
2) Changes in memory only & flush.

Databases cache a binary in memory, which I find underrated.
Jun 27 '08 #2

I V
On Thu, 17 Apr 2008 19:30:33 -0700, erikcw wrote:
> use some sort of data-structure (maybe
> nested dictionaries or a custom class) and store the pickled
> data-structure in a single row in the database (then unpickle the data
> and query in memory).

Why would you want to do this? I don't see what you would hope to gain by
doing this, over just using a database.
Jun 27 '08 #3

erikcw wrote:
> Hi,
>
> I'm working on a web application where each user will be creating
> several "projects" in their account, each with 1,000-50,000 objects.
> Each object will consist of a unique name, an id, and some metadata.
>
> The number of objects will grow and shrink as the user works with
> their project.
>
> I'm trying to decide whether to store the objects in the database
> (each object gets its own row) or to use some sort of data-structure
> (maybe nested dictionaries or a custom class) and store the pickled
> data-structure in a single row in the database (then unpickle the data
> and query in memory).

Yuck.

Fighting against the tool won't buy you much - except for
interoperability and maintenance headaches. Either use your relational
database properly, or switch to an object database - like ZODB or Durus - if
you're OK with the implications (no interoperability, no simple query
language, and possibly poor performance if your app does heavy data
processing).

> A few requirements:
> -Fast/scalable (web app)
> -able to query objects based on name and id.
> -will play nicely with versioning (undo/redo)

Versioning is a somewhat orthogonal problem.

> Any input on the best way to go?

My very humble opinion - based on several years of working experience
with both ZODB and many RDBMSs - is quite clear: use an RDBMS and use
it properly.
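
For readers who want to see what "one row per object" could look like, here is a minimal sketch. Nothing in it comes from the thread itself: it assumes SQLite through Python's standard sqlite3 module, and the table name, column names, and the two helper functions are hypothetical. The point is only that lookups by id and by name are served by indexes, so nothing has to be unpickled to find one object.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE objects (
        id         INTEGER PRIMARY KEY,   -- indexed automatically
        project_id INTEGER NOT NULL,
        name       TEXT    NOT NULL,
        metadata   TEXT,
        UNIQUE (project_id, name)         -- unique name per project, also an index
    )
    """
)

# Fill one hypothetical project with 50,000 objects, the upper bound
# mentioned in the original question.
rows = ((1, "obj-%d" % i, "{}") for i in range(50000))
conn.executemany(
    "INSERT INTO objects (project_id, name, metadata) VALUES (?, ?, ?)", rows
)
conn.commit()


def find_by_id(conn, object_id):
    """Return the row with this id, or None if there is no such row."""
    return conn.execute(
        "SELECT id, project_id, name, metadata FROM objects WHERE id = ?",
        (object_id,),
    ).fetchone()


def find_by_name(conn, project_id, name):
    """Return the row with this name inside one project, or None."""
    return conn.execute(
        "SELECT id, project_id, name, metadata FROM objects "
        "WHERE project_id = ? AND name = ?",
        (project_id, name),
    ).fetchone()
```

Adding or deleting an object is then a single INSERT or DELETE on one row, rather than rewriting a pickled blob that holds the whole project.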
Jun 27 '08 #4

On Apr 18, 12:23 am, I V <ivle...@gmail.com> wrote:
> On Thu, 17 Apr 2008 19:30:33 -0700, erikcw wrote:
> > use some sort of data-structure (maybe
> > nested dictionaries or a custom class) and store the pickled
> > data-structure in a single row in the database (then unpickle the data
> > and query in memory).
>
> Why would you want to do this? I don't see what you would hope to gain by
> doing this, over just using a database.

Are databases truly another language from Python, fundamentally?
Jun 27 '08 #5

> My very humble opinion - based on several years of working experience
> with both ZODB and many RDBMSs - is quite clear: use an RDBMS and use
> it properly.

Yes, somewhere down the line you will want
to get a report of all the customers in Ohio,
ordered by county and zip code, who have a
"rabbit cage" project -- and if you just pickle
everything you will end up traversing the entire
database, possibly multiple times
to find it. A little old fashioned
database design up front can save you a lot of pain.
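
The difference described above can be sketched in a few lines. This is an illustration under assumptions, not code from the thread: it uses SQLite via Python's sqlite3 module, and the tables, columns, sample customers, and the two report functions are all made up. The same report is produced twice, once by letting the database filter and sort, and once by unpickling every stored record and doing the work in Python.

```python
import pickle
import sqlite3

# Hypothetical sample data: (name, state, county, zip, project titles)
CUSTOMERS = [
    ("Ann", "OH", "Franklin", "43004", ["rabbit cage"]),
    ("Bob", "OH", "Athens", "45701", ["bird house"]),
    ("Cy", "NY", "Kings", "11201", ["rabbit cage"]),
    ("Dee", "OH", "Franklin", "43002", ["rabbit cage", "bird house"]),
    ("Eve", "OH", "Athens", "45701", ["rabbit cage"]),
]

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, state TEXT,
                            county TEXT, zip TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY,
                           customer_id INTEGER REFERENCES customers(id),
                           title TEXT);
    CREATE TABLE pickled_customers (id INTEGER PRIMARY KEY, data BLOB);
    """
)

# Store every customer twice: as relational rows and as one pickled blob.
for name, state, county, zip_code, titles in CUSTOMERS:
    cur = conn.execute(
        "INSERT INTO customers (name, state, county, zip) VALUES (?, ?, ?, ?)",
        (name, state, county, zip_code),
    )
    conn.executemany(
        "INSERT INTO projects (customer_id, title) VALUES (?, ?)",
        [(cur.lastrowid, title) for title in titles],
    )
    record = {"name": name, "state": state, "county": county,
              "zip": zip_code, "projects": titles}
    conn.execute("INSERT INTO pickled_customers (data) VALUES (?)",
                 (pickle.dumps(record),))
conn.commit()


def report_sql(conn):
    """The database filters, joins, and sorts; only matching rows come back."""
    return conn.execute(
        """
        SELECT DISTINCT c.name, c.county, c.zip
        FROM customers AS c
        JOIN projects AS p ON p.customer_id = c.id
        WHERE c.state = 'OH' AND p.title = 'rabbit cage'
        ORDER BY c.county, c.zip
        """
    ).fetchall()


def report_pickled(conn):
    """Every blob must be fetched and unpickled before it can be tested."""
    hits = []
    for (data,) in conn.execute("SELECT data FROM pickled_customers"):
        record = pickle.loads(data)
        if record["state"] == "OH" and "rabbit cage" in record["projects"]:
            hits.append(record)
    hits.sort(key=lambda r: (r["county"], r["zip"]))
    return [(r["name"], r["county"], r["zip"]) for r in hits]
```

Both functions return the same rows, but report_pickled touches every record no matter how few match, which is the full traversal the post warns about.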

-- Aaron Watters

===
http://www.xfeedme.com/nucular/pydis...?FREETEXT=ouch
Jun 27 '08 #6

ca********@gmail.com wrote:
> On Apr 18, 12:23 am, I V <ivle...@gmail.com> wrote:
> > On Thu, 17 Apr 2008 19:30:33 -0700, erikcw wrote:
> > > use some sort of data-structure (maybe
> > > nested dictionaries or a custom class) and store the pickled
> > > data-structure in a single row in the database (then unpickle the data
> > > and query in memory).
> > Why would you want to do this? I don't see what you would hope to gain by
> > doing this, over just using a database.
>
> Are databases truly another language from Python, fundamentally?

Yes. A fair amount of study went into them. Databases are about
information that survives over an extended period of time (months
or years, not hours).

Classic qualities for a database that don't normally apply to Python
(all properties of a "transaction" -- a bundled set of changes):
* Atomicity:
    A transaction either is fully applied or not applied at all.
* Consistency:
    Transactions applied to a database with invariants preserve
    those invariants (things like balance sheet totals).
* Isolation:
    Each transaction happens as if it were happening at its own
    moment in time -- you don't worry about other transactions
    interleaved with your transaction.
* Durability:
    Once a transaction actually makes it into the database, it stays
    there and doesn't magically fail a long time later.
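
As a small illustration of atomicity and consistency, here is a sketch of a bank-style transfer. It is an assumption-laden example rather than anything from this post: it uses SQLite through Python's sqlite3 module, and the accounts table, its CHECK constraint, and the transfer and balances helpers are hypothetical. The credit is deliberately applied before the debit so that a failing debit shows the earlier change being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " accountnum TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # the invariant
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 150), ("B", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as a single transaction."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes it.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE accountnum = ?",
            (amount, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE accountnum = ?",
            (amount, src),
        )


def balances(conn):
    """Return {accountnum: balance} for every account."""
    return dict(conn.execute("SELECT accountnum, balance FROM accounts"))


transfer(conn, "A", "B", 100)      # succeeds: A has 50, B has 100

try:
    transfer(conn, "A", "B", 100)  # the debit would make A negative
except sqlite3.IntegrityError:
    pass                           # the credit to B was rolled back as well

print(balances(conn))              # {'A': 50, 'B': 100}
```

A plain Python dictionary updated in two steps has no equivalent of that rollback: an exception between the steps would leave the credit in place.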

-Scott David Daniels
Sc***********@Acm.Org
Jun 27 '08 #7

On Apr 19, 10:56 pm, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> On Sat, 19 Apr 2008 11:27:20 -0700, Scott David Daniels
> <Scott.Dani...@Acm.Org> declaimed the following in comp.lang.python:
>
> Hijacking as with the gmail kill filter I had to apply...
>
> > castiro...@gmail.com wrote:
> > > Are databases truly another language from Python, fundamentally?

> Databases predate Python by decades... Though getting hardware fast
> enough to implement the current darling -- relational -- did take a few
> years.
>
> In my college days, database textbooks introduced: Hierarchical (IBM
> IMS, I believe, was the archetype used, though there were many others);
> DBTG (Data Base Task Group) Network (the DBMS on the Xerox Sigma-6 at my
> campus was a network model); and then gave Relational as an
> experimental/theoretical format. About two years after I graduated, the
> revised versions of the textbooks started with Relational, and then
> listed hierarchical and network as "historical" formats.
>
> In hierarchical and network, one had to explicitly code for the way
> the data was stored... In simple form: hierarchical required one to
> access from a top-level record, which then had "fields" comprising
> related data (and could have multiple occurrences).
>
> Invoice: has customer number, name, address, etc. and a "field" for
> line items... The line items were a subtree: item number, description,
> quantity, price, extended price...
>
> Network extended the hierarchical model by allowing access to the
> "subtrees" from multiple different types of parent trees.

> Relational started life as a theory of "how to view the data --
> independent of how it is stored" -- comprising relations (which are NOT
> the links between tables). In relational theory the terms equate as:
>
>     Common/Lay    Theory
>     table         relation
>     column        domain
>     row/record    tuple
>
> "relation" meant that all the data in each tuple was related to the
> others. The SQL "relationship operators" that are used to link separate
> tables are not where "relational database" comes from.
>
> SQL started life as a query language -- also independent of how the
> data is stored. However, it fit into relational theory easily... Maybe
> because it sort of combines relational algebra and relational calculus.
> > Classic qualities for a database that don't normally apply to Python
> > (all properties of a "transaction" -- bundled set of changes):
>
> Examples might have been useful <G>
>
> > * Atomicity:
> >     A transaction either is fully applied or not applied at all.
>
> Well... self-explanatory...
>
> > * Consistency:
> >     Transactions applied to a database with invariants preserve
> >     those invariants (things like balance sheet totals).
>
> One of the key ones...
>
>     update accounts set
>         balance = balance - 100
>     where accountnum = 'from account';
>     update accounts set
>         balance = balance + 100
>     where accountnum = 'to account';
>
> A failure between the two update statements MUST ensure that no
> changes were made to the database... Otherwise, one would lose 100 into
> the vapor. (This example does link back to the "A" and is more on the
> user side -- the code needs to specify that both updates are part of the
> same transaction).
> > * Isolation:
> >     Each transaction happens as if it were happening at its own
> >     moment in time -- you don't worry about other transactions
> >     interleaved with your transaction.
>
> Though how various RDBMSs implement this feature gets confusing. One
> has everything from locking the entire database (basically meaning that
> "losing" transactions don't get applied at all and the code has to
> reexecute the transaction logic) down to those that can lock on
> individual records -- so overlapping transactions that don't need those
> records complete with no failures.
>
> > * Durability:
> >     Once a transaction actually makes it into the database, it stays
> >     there and doesn't magically fail a long time later.
>
> Assuming a disk failure is not "magic" and one doesn't have a recent
> backup <G>

> --
> Wulfraed    Dennis Lee Bieber    KD6MOG
> wlfr...@ix.netcom.com    wulfr...@bestiaria.com
> HTTP://wlfraed.home.netcom.com/
> (Bestiaria Support Staff: web-a...@bestiaria.com)
> HTTP://www.bestiaria.com/

I'm holding the premise that money can be made different ways, also
and as technique is scarce, and exploration in programming is a non-
negative utility. I have a soft-coded script I can show, I'm just not
in the space program.
Jun 27 '08 #8
