473,888 Members | 1,636 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

"Interestin g" query planning for timestamp ranges in where clause?

I have a table with a few million rows of temperature data keyed
by timestamp. I want to group these rows by timestamp intervals
(e.g. every 32 seconds), compute aggregate functions on the columns,
and ultimately feed the result into a graph-drawing web thingy.
I'm trying a few different ways to get what seems to be the same data,
and seeing some odd behavior from the query planner.

The table looks like this:

ilt=# \d+ temp
Table "public.tem p"
Column | Type | Modifiers | Description
--------+--------------------------------+-----------+-------------
t | timestamp(0) without time zone | not null |
t1 | double precision | |
t2 | double precision | |
t3 | double precision | |
t4 | double precision | |
t5 | double precision | |
t6 | double precision | |
t7 | double precision | |
t8 | double precision | |
Indexes:
"temp_pkey" primary key, btree (t)

ilt=# select count(*) from temp;
count
---------
3316004
(1 row)

Time: 18144.953 ms


I have a function for computing rounded timestamps which uses abstime
(any better suggestions gladly appreciated...) :

ilt=# \df+ time_bucket
List of functions
Result data type | Schema | Name | Argument data types | Owner | Language | Source code | Description
------------------+--------+-------------+-----------------------------------+----------+----------+-----------------------------------------------------+-------------
abstime | public | time_bucket | timestamp with time zone, integer | zblaxell | sql | select abstime(int4(ex tract(epoch from $1)/$2)*$2); |
(1 row)

Now suppose I want a table of the last 24 hours' data at 32 second
intervals. I might try a WHERE clause with now() and now() - interval
'1 day':

ilt=# explain analyze select time_bucket(t, 32), avg(t1), avg(t2), avg(t3), avg(t4), avg(t5), avg(t6), avg(t7), avg(t8) from temp where t between now() - interval '1 day' and now() group by time_bucket(t, 32) order by time_bucket(t, 32) desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=210842.95 ..212603.60 rows=30620 width=72) (actual time=39835.357. .41150.006 rows=2697 loops=1)
-> Sort (cost=210842.95 ..210919.50 rows=30620 width=72) (actual time=39835.064. .40002.349 rows=85360 loops=1)
Sort Key: ((int4((date_pa rt('epoch'::tex t, (t)::timestamp with time zone) / 32::double precision)) * 32))::abstime
-> Seq Scan on "temp" (cost=0.00..207 797.15 rows=30620 width=72) (actual time=35275.231. .38323.524 rows=85360 loops=1)
Filter: (((t)::timestam p with time zone >= (now() - '1 day'::interval) ) AND ((t)::timestamp with time zone <= now()))
Total runtime: 41165.330 ms
(6 rows)

41 seconds is a long time for a 86400-row query, especially if I want to
draw a graph once per sampling interval. This is way too slow, so I gave
up on this for a while, and I spent a day playing around with other stuff.
During that day I accidentally discovered that there's a very different
way to do this query. Actually, I think it's equivalent, but the query
planner disagrees:

ilt=# explain analyze select time_bucket(t, 32), avg(t1), avg(t2), avg(t3), avg(t4), avg(t5), avg(t6), avg(t7), avg(t8) from temp where t between time_bucket(now () - interval '1 day', 1) and time_bucket(now (), 1) group by time_bucket(t, 32) order by time_bucket(t, 32) desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=5197.30.. 6958.64 rows=30632 width=72) (actual time=4170.763.. 5611.031 rows=2697 loops=1)
-> Sort (cost=5197.30.. 5273.88 rows=30632 width=72) (actual time=4169.819.. 4336.096 rows=85360 loops=1)
Sort Key: ((int4((date_pa rt('epoch'::tex t, (t)::timestamp with time zone) / 32::double precision)) * 32))::abstime
-> Index Scan using temp_pkey on "temp" (cost=0.00..215 0.53 rows=30632 width=72) (actual time=0.278..209 0.791 rows=85360 loops=1)
Index Cond: ((t >= (((int4((date_p art('epoch'::te xt, (now() - '1 day'::interval) ) / 1::double precision)) * 1))::abstime):: timestamp without time zone) AND (t <= (((int4((date_p art('epoch'::te xt, now()) / 1::double precision)) * 1))::abstime):: timestamp without time zone))
Total runtime: 5639.385 ms
(6 rows)

Another query that is slightly faster uses timestamps that are constants
supplied by the application. I did two variations, one using the 'today'
and 'yesterday' keywords, and one with literal dates. There wasn't
significant difference between those, and they look like this:

ilt=# select now();
now
-------------------------------
2004-06-15 22:41:46.507174-04
(1 row)

ilt=# explain analyze select time_bucket(t, 32), avg(t1), avg(t2), avg(t3), avg(t4), avg(t5), avg(t6), avg(t7), avg(t8) from temp where t between 'yesterday 22:41:46' and 'today 22:41:46' group by time_bucket(t, 32) order by time_bucket(t, 32) desc;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=34083.64. .43162.37 rows=157891 width=72) (actual time=3533.097.. 4738.877 rows=2697 loops=1)
-> Sort (cost=34083.64. .34478.37 rows=157891 width=72) (actual time=3532.455.. 3679.959 rows=85362 loops=1)
Sort Key: ((int4((date_pa rt('epoch'::tex t, (t)::timestamp with time zone) / 32::double precision)) * 32))::abstime
-> Index Scan using temp_pkey on "temp" (cost=0.00..594 0.88 rows=157891 width=72) (actual time=0.227..190 7.830 rows=85362 loops=1)
Index Cond: ((t >= '2004-06-14 22:41:46'::time stamp without time zone) AND (t <= '2004-06-15 22:41:46'::time stamp without time zone))
Total runtime: 4755.107 ms
(6 rows)


Generally the last form is slightly faster than the second one,
but that's reasonable assuming extra computation overhead to
calculate the functions in the WHERE clause.

My question is: what makes the first of those queries so much slower
than the other two, and more than two times slower than a full table
scan? The only thing I can think of that distinguishes the cases is the
lack of fractional component in the timestamps for the two fast variants,
but I can't imagine _why_ this would blow up the planner like this.

--
Zygo Blaxell (Laptop) <zb******@feedm e.hungrycats.or g>
GPG = D13D 6651 F446 9787 600B AD1E CCF3 6F93 2823 44AD

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #1
0 2603

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

23
5702
by: ian justice | last post by:
Before i post actual code, as i need a speedyish reply. Can i first ask if anyone knows off the top of their head, if there is a likely obvious cause to the following problem. For the moment i've reduced my form request to a simple text string entry, instead of my desired optional parameters. As i have been stuck with a single unfathomable glitch for over a year. Basically, if i enter queries such as ; "select * from table" "select * from...
3
2206
by: Warren Oates | last post by:
I ran into an interesting gotcha with unix timestamps. I've got a page where the user inputs the date with drop-down boxes (easy to deal with), that my script sees as (say) $d $m $y, not necessarily in that order, but you get the idea, nicely formatted with leading zeroes and so on. Then I run it through checkdate($m, $d, $y) just to make sure the user doesn't think there's 31 days in September or whatnot.
36
6432
by: Andrea Griffini | last post by:
I did it. I proposed python as the main language for our next CAD/CAM software because I think that it has all the potential needed for it. I'm not sure yet if the decision will get through, but something I'll need in this case is some experience-based set of rules about how to use python in this context. For example... is defining readonly attributes in classes worth the hassle ? Does duck-typing scale well in complex
4
8983
by: Starbuck | last post by:
OK, first let me say that I am no DB person. But I have a user here who keeps getting this error whenever she does, whatever it is she does, with databases... A google search takes me to various forums where I am forced to sign up before I can read any answers. Interesting note here is that the guy in the office next
16
2122
by: WindAndWaves | last post by:
Hi there I have $initstartdate = date("d-m-Y"); in my code How can I get it to be date() + 1 or 7 for that matter. Because my server is in the US and I am in New Zealand, they are always a day behind....
6
1683
by: Rennie deGraaf | last post by:
In the last few days, I have discovered two "interesting" behaviours of the C language, both of which are apparently correct. Could someone please explain the reasoning behind them? 1. The operators '^', '&', and '|' have lower precedance than '==', '!=', '>=", etc. I discovered this when the statement "if (array1 ^ array2 == 0xff)" failed to do what I expected. To me, it doesn't make any sense to give the bitwise operators lower...
2
720
by: Zygo Blaxell | last post by:
I have a table with a few million rows of temperature data keyed by timestamp. I want to group these rows by timestamp intervals (e.g. every 32 seconds), compute aggregate functions on the columns, and ultimately feed the result into a graph-drawing web thingy. I'm trying a few different ways to get what seems to be the same data, and seeing some odd behavior from the query planner. The table looks like this:
10
15479
by: David Garamond | last post by:
The Postgres manual says: The AT TIME ZONE construct allows conversions of time stamps to different time zones. I'd guess most people would think what's meant here is something like "unit conversion", and that the timestamp value stays the same (much like 2 feet becomes 24 inches when it's being "converted"). But: # SELECT NOW() = NOW() AT TIME ZONE 'UTC';
20
3500
by: Wes Groleau | last post by:
I was doing update statements in SQL Server 2000. I have a table with over 16 million rows. It came from several hundred delimited text files, and two of the columns are file ID (int) and Line # (int) Structure is X12 (835). For those unfamiliar with that, each file has one to many BPR lines; each BPR line has zero to many CLP lines, each of those has zero to many
19
10774
by: Daniel Pitts | last post by:
I have std::vector<Base *bases; I'd like to do something like: std::for_each(bases.begin(), bases.end(), operator delete); Is it possible without writing an adapter? Is there a better way? Is there an existing adapter? Thanks, Daniel.
0
10777
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10882
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10438
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9597
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
7148
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5817
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
6014
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
4642
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
4243
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.