
Improving date(timestamp) searches

P: 2
Let's say I have a table, data_tbl, with the following columns:

object_id of type integer
mytimestamp of type timestamp with time zone
data_1 of type integer
data_2 of type integer

And each object within this table gets a new record every second. So if there are 4 different object_ids, then 4 new records are inserted each second.

Now, my problem is bad performance on a particular query. Given an object_id, this query retrieves the x most recent distinct dates associated with that object_id. For example:

select date(mytimestamp)
from data_tbl
where object_id = 1
group by date(mytimestamp)
order by date(mytimestamp) desc
limit 5;

This query takes on the order of 2-6 seconds. I've already created an index on (object_id, mytimestamp), but that doesn't seem to help since the query is actually filtering on date(mytimestamp). And my table grows by roughly 350K rows every day.
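For reference, the existing index would have been created roughly like this (the index name below is my own choice; only the columns are as described above):

create index data_tbl_object_id_mytimestamp_idx  -- index name assumed
    on data_tbl (object_id, mytimestamp);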

Any suggestions on performance tuning would be greatly appreciated!
Jun 25 '07 #1
3 Replies


Expert 100+
P: 534
I'm sure the index you created must be helpful.
On my underpowered Windows machine with PostgreSQL 8.2.3 and about 2.5 million records, this query takes anywhere from 20 to 150 milliseconds.
select distinct date(mytimestamp) from data_tbl
where object_id = 1 order by 1 desc
Having more information would help in finding a good solution for you:
What version of Postgresql are you running?
What platform?
How many records are there on the table?
Did you try to run "explain analyze" on your query?
Do you ever run "vacuum analyze" on the database? (newer versions do it automatically) A sketch of both commands is below.
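As a minimal sketch, using the table and query from the original post:

explain analyze
select date(mytimestamp)
from data_tbl
where object_id = 1
group by date(mytimestamp)
order by date(mytimestamp) desc
limit 5;

vacuum analyze data_tbl;

The "explain analyze" output will show whether the (object_id, mytimestamp) index is being used at all.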

On a different note, with 350 thousand new records every day this table is bound to become a drag.
Do you need to keep all this data?
How often do you run this query?
Are you interested in querying old records, those created days or months ago?
Sometimes the answer is to have a separate reporting table which holds just a subset of all the historical data.
Such a table can be updated periodically; a sketch is below.
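Just as an illustration (the reporting table and its column names here are made up, not part of your schema), a table keeping one row per object per day could look like this:

create table report_tbl (
    object_id integer,
    myday     date,
    primary key (object_id, myday)
);

-- refreshed periodically (e.g. from cron), adding only the days not yet recorded
insert into report_tbl (object_id, myday)
select object_id, date(mytimestamp) from data_tbl
group by object_id, date(mytimestamp)
except
select object_id, myday from report_tbl;

Your distinct-dates query then becomes a cheap lookup against report_tbl instead of a scan over millions of rows.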

I guess my point is that more information will be really helpful.
Jun 26 '07 #2

P: 2
I am running PostgreSQL 8.2 under Red Hat 5 on an extremely fast server (I don't know the exact specs, but I know it's powerful), so it's not a hardware issue.

The table currently has 4.3 million records and will probably continue to grow by 100K to 350K records every day.

The query is tied to a web front-end that lets users download CSV data based on object_id and timestamp, so it runs pretty much whenever a user decides to download data.

You mentioned that people often keep a separate table for historical data. How does that actually work, and do you have any references on good table design for such an application? I'm a semi-newbie when it comes to database schemas.
Jun 26 '07 #3

Expert 100+
P: 534
I still don't know enough about how your table is being accessed, etc., but here's just one idea.

You can have either a RULE or a TRIGGER on INSERT.
The code there would insert a copy of each new record into a different table, say "data_history_tbl", where no record is ever deleted.
Once a day, or once a week, depending on your usage pattern, you delete all old records from data_tbl.
"Old" being older than 12 hours, or 1 day, or whatever you define. A rough sketch of the trigger variant follows.

This way the main table never grows too big, while the other table, which I suppose you don't need to query too often, holds all the historical data in case you ever need it.

If you don't need the historical data then just trim your table once a day, leaving just a handful of the latest records.
Jul 1 '07 #4
