473,398 Members | 2,525 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,398 software developers and data experts.

Can I use PostgreSQL to develop a self-organizing database system?

Hi,

at NC State University, my students and I are working on a project called
"self-organizing databases," please see description below. I would like to
use an open-source database system for implementation and would really
appreciate your opinion on whether PostgreSQL is suitable for the project.
In general, I am very impressed by the quality of PostgreSQL code and
documentation, as well as by the support of the developer community.

For the project, I need a cost-based query optimizer with exhaustive join
enumeration and use of statistics on stored relations; PostgreSQL has that.
I also need the ability to process SQL queries with aggregation, extensive
indexing capabilities, view mechanisms, and possibly integrity constraints;
it seems that PostgreSQL has all that. We will modify the query optimizer to
incorporate rewriting queries using views, and we will create
view-generating and view-manipulating modules.

Please let me know if you have comments.

Sincerely,

Rada Chirkova

======================
Self-Organizing Databases

The goal of this project is to develop new effective methods to improve the
performance of sets of frequent and important queries on large relational
databases at all times, which could improve the efficiency of user
interactions with data-management systems. Solving the problem will have the
most effect in query optimization, data warehousing, and information
integration, which are important research topics with direct practical
applications.

The project focuses on the methodology of evaluating queries using views;
views are relations that are defined by auxiliary queries and can be used to
rewrite and answer user queries. One way to improve query performance is
precompute and store (i.e., materialize) views.

To truly optimize query performance, it is critical to materialize the
"right" views. The current focus of the project is on demonstrating that, by
designing and materializing views, it is possible to ensure optimal or
near-optimal performance of frequent and important queries, for common and
important query types. We consider this problem in the broader context of
designing self-organizing databases: A self-organizing database periodically
determines, without human intervention, a representative set of frequent and
important queries on the data, and incrementally designs and precomputes the
optimal (or near-optimal) views for that representative query workload. As
the representative query workload and the stored data change over time,
self-organizing databases adapt to the changes by changing the set of
materialized views that are used to improve the query-answering performance
in the database.

For building self-organizing databases, we consider an end-to-end solution –
that is, we consider all aspects of handling and using views, including:
· designing and materializing views and indexes to improve query
performance;
· exploring the effects of materialized views on the process of query
optimization;
· adapting view design to the changing query workload, including the process
of retiring views that are no longer useful;
· developing methods for auomatically updating existing materialized views
over time, to reflect the changes in the stored data;
· developing methods to collect database statistics to reliably estimate the
sizes of the views the system considers for materialization;
· analyzing the use of system resources and allocating an appropriate amount
of resources to view management in the system.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 11 '05 #1
10 3223
"Rada Chirkova" <ch******@csc.ncsu.edu> writes:
at NC State University, my students and I are working on a project called
"self-organizing databases," please see description below. I would like to
use an open-source database system for implementation and would really
appreciate your opinion on whether PostgreSQL is suitable for the project.


FWIW, I think Postgres will do very well as a base for your project.
But some might say I'm biased ;-)

Postgres was originally designed and built as a research tool, and it
still bears obvious traces of that heritage, even though the current
developers are almost single-mindedly focused on turning it into a
production tool. You might find this entertaining:
http://archives.postgresql.org/pgsql...6/msg00085.php

I'm not personally convinced that materialized views would be a great
thing to add to Postgres, but I look forward to being proven wrong.
Please keep us posted on your results.

regards, tom lane

PS: BTW, pgsql-hackers is probably a better list than pgsql-general for
discussing research work with Postgres.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 11 '05 #2
> Please let me know if you have comments.

The project focuses on the methodology of evaluating queries using views;
views are relations that are defined by auxiliary queries and can be used to
rewrite and answer user queries. One way to improve query performance is
precompute and store (i.e., materialize) views. self-organizing -> automagically materializing the "right" views


I also use views to simplify application code by moving
complicated queries onto the backend. I wonder what, if any,
place this aspect of using views has in your project.

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #3
On Mon, 2003-09-08 at 15:22, Karsten Hilbert wrote:
[snip]
I also use views to simplify application code by moving
complicated queries onto the backend. I wonder what, if any,


Are you referring to Access' tendency to do the joins on the client?

--
-----------------------------------------------------------------
Ron Johnson, Jr. ro***********@cox.net
Jefferson, LA USA

"What other evidence do you have that they are terrorists, other
than that they trained in these (terrorist training) camps?"
17-Sep-2002 Katie Couric to an FBI agent regarding the 5 men
arrested near Buffalo NY
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #4

Ron Johnson <ro***********@cox.net> writes:
On Mon, 2003-09-08 at 15:22, Karsten Hilbert wrote:
I also use views to simplify application code by moving
complicated queries onto the backend. I wonder what, if any,


Are you referring to Access' tendency to do the joins on the client?


No, I think he doesn't mean move execution to the backend, just lexically
hoisting the complex repeated code out of the front-end code.

A self-organizing materialized view system might be able to use normal
non-materialized views as hints to help it find useful views to materialize.

It also might help the programmer avoid implementing functionally equivalent
but non-identical forms of the same view in different queries, defeating the
automated recognition of similar views.

Materialized views are a godsend for DSS systems, and self-organized
materialized views sound too good to be true. Is this really feasible?
It sounds awfully ambitious considering postgres doesn't even have
materialized views yet.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #5
On Mon, Sep 08, 2003 at 05:55:43PM -0400, Greg Stark wrote:
Materialized views are a godsend for DSS systems, and self-organized
materialized views sound too good to be true. Is this really feasible?
It sounds awfully ambitious considering postgres doesn't even have
materialized views yet.


There is also the question of how would one actually use self-organizing
views... what would be the syntax for them? Should the parser recognize
that some join you are giving to it is a previously self-organized view
and contract the sentence?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Puedes vivir solo una vez, pero si lo haces bien, una vez es suficiente"

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 11 '05 #6
>> I also use views to simplify application code by moving
complicated queries onto the backend. I wonder what, if any,

Are you referring to Access' tendency to do the joins on the client?

No. Usually, when I need some query quite often that joins and
subselects a lot of tables I set up a view covering the
constant parts of that query. Thus I can write much simpler
queries in the frontend code selecting from the view. This may
trade efficiency for convenience at times.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 11 '05 #7
Alvaro Herrera <al******@dcc.uchile.cl> writes:
On Mon, Sep 08, 2003 at 05:55:43PM -0400, Greg Stark wrote:
Materialized views are a godsend for DSS systems, and self-organized
materialized views sound too good to be true. Is this really feasible?
It sounds awfully ambitious considering postgres doesn't even have
materialized views yet.


There is also the question of how would one actually use self-organizing
views... what would be the syntax for them? Should the parser recognize
that some join you are giving to it is a previously self-organized view
and contract the sentence?


In an ideal world the database would just magically notice that I do a lot of
queries like: "SELECT count(*) FROM foo WHERE bar = ?"
and decide to materialize: "SELECT count(*),bar FROM foo GROUP BY bar"
(and figure out where to put the index.)

Then it would optimize all my queries to be a simple indexed lookup.

Of course it would also have to add rewrite rules or triggers to update the
materialized view on every update/insert/delete. So the code to decide when to
create the materialized view would have to keep statistics on these queries as
well to decide when it's worthwhile to create the materialized view... Nobody
said it was easy :)

--
greg
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 11 '05 #8
> Materialized views are a godsend for DSS systems, and self-organized
materialized views sound too good to be true. Is this really feasible?
It sounds awfully ambitious considering postgres doesn't even have
materialized views yet.

The concept of self-organization can be tested regardless of
whether those views are materialized or just written
automatically like common views, I suppose. This won't, of
course, bring about the self-organizing efficiency but that's
just a matter of adding materialized views in general and
connecting both ends, unless I am mistaken.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 11 '05 #9

If anyone knows where I can get a self-organizing desk, please let me
know. :-)

---------------------------------------------------------------------------

Tom Lane wrote:
"Rada Chirkova" <ch******@csc.ncsu.edu> writes:
at NC State University, my students and I are working on a project called
"self-organizing databases," please see description below. I would like to
use an open-source database system for implementation and would really
appreciate your opinion on whether PostgreSQL is suitable for the project.


FWIW, I think Postgres will do very well as a base for your project.
But some might say I'm biased ;-)

Postgres was originally designed and built as a research tool, and it
still bears obvious traces of that heritage, even though the current
developers are almost single-mindedly focused on turning it into a
production tool. You might find this entertaining:
http://archives.postgresql.org/pgsql...6/msg00085.php

I'm not personally convinced that materialized views would be a great
thing to add to Postgres, but I look forward to being proven wrong.
Please keep us posted on your results.

regards, tom lane

PS: BTW, pgsql-hackers is probably a better list than pgsql-general for
discussing research work with Postgres.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match


--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #10

If anyone knows where I can get a self-organizing desk, please let me
know. :-)

---------------------------------------------------------------------------

Tom Lane wrote:
"Rada Chirkova" <ch******@csc.ncsu.edu> writes:
at NC State University, my students and I are working on a project called
"self-organizing databases," please see description below. I would like to
use an open-source database system for implementation and would really
appreciate your opinion on whether PostgreSQL is suitable for the project.


FWIW, I think Postgres will do very well as a base for your project.
But some might say I'm biased ;-)

Postgres was originally designed and built as a research tool, and it
still bears obvious traces of that heritage, even though the current
developers are almost single-mindedly focused on turning it into a
production tool. You might find this entertaining:
http://archives.postgresql.org/pgsql...6/msg00085.php

I'm not personally convinced that materialized views would be a great
thing to add to Postgres, but I look forward to being proven wrong.
Please keep us posted on your results.

regards, tom lane

PS: BTW, pgsql-hackers is probably a better list than pgsql-general for
discussing research work with Postgres.

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match


--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 11 '05 #11

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

17
by: Sarah Tanembaum | last post by:
Beside its an opensource and supported by community, what's the fundamental differences between PostgreSQL and those high-price commercial database (and some are bloated such as Oracle) from...
4
by: Ryan Mack | last post by:
First, I want to confirm my understanding that the "without fee" clause in the PostgreSQL license means that a party may not sell PostgreSQL or works derived from the PostgreSQL source code. ...
0
by: greg | last post by:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This is a PGP-signed copy of the checksums for PostgreSQL version 7.4. The latest copy of the checksums for this and other versions, as well...
3
by: Bob.Henkel | last post by:
I write this to tell you why we won't use postgresql even though we wish we could at a large company. Don't get me wrong I love postgresql in many ways and for many reasons , but fact is fact. If...
2
by: Peter | last post by:
Can i simply use Web Matrix to develop ASP.NET applications professionally, & not need to worry about my self with licensing issues? I just need to be able to develop professionally for my...
14
by: Manuel Tejada | last post by:
Hi My box: RedHat 9.0, Pentium III Recently I upgraded from PostgreSQL 7.3.2 to PostgreSQL 7.4.1. The PostgreSQL 7.3.2's rpms were installed from RehHat CDs The PostgreSQL 7.4.1's rpms I used...
1
by: Brandon Craig Rhodes | last post by:
I now have SSL encryption working between psql and my PostgreSQL server, but want to enable client verification of the server's SSL certificate (and eventually vice-versa). My problem is that...
2
by: agus liem | last post by:
Hai.. I introduce myself, my name : Agus, from Indonesia. Im develop application using POwerbuilder 7 and postgresql 7.3. I have serveral error, and cannot solve yet. Anybody help me? This is...
0
by: Greg Sabino Mullane | last post by:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 This is a PGP-signed copy of the checksums for following PostgreSQL versions: 7.4.5 7.4.4 7.3.7
1
by: Devrim GUNDUZ | last post by:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 - --------------------------------------------------------------------- PostgreSQL RPM Set Update 2004-10-26 Version(s): 7.3.8, 7.4.6
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.