By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
464,358 Members | 1,238 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 464,358 IT Pros & Developers. It's quick & easy.

select statement against pg_stats returns inconsistent data

P: n/a
The select statements return different data for
most_commons_vals depending on whether n_distinct is
included in the select clause or not.

I only seem to get the behavior below against int8
columns - but I haven't interated through every
conceivable data type either.

Is this expected behavior or perhaps a bug?

Regards,

Shelby Cain

================================================== =======

c1scain=# select version();
version

---------------------------------------------------------------------------------------
PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC
gcc (GCC) 3.3.1 (cygming special)
(1 row)

c1scain=# create table test_table (lastname
varchar(20), firstname varchar(20), userid int8,
testid int8);
CREATE TABLE
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015123 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015124 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015125 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015126 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015127 1
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 5
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 10
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 20
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 40
c1scain=# analyze test_table;
ANALYZE
c1scain=# select distinct userid from test_table;
userid
--------
211
2641
4333
7642
8053
(5 rows)
c1scain=# select distinct testid from test_table;
testid
--------
73
834
1399
2315
4511
(5 rows)
c1scain=# select tablename, attname, most_common_vals
from pg_stats where tablename = 'test_table';
tablename | attname | most_common_vals
------------+-----------+---------------------------
test_table | lastname | {cain}
test_table | firstname | {shelby}
test_table | userid | {211,2641,4333,7642,8053}
test_table | testid | {73,834,1399,2315,4511}
(4 rows)
c1scain=# select tablename, attname, n_distinct,
most_common_vals from pg_stats where tablename =
'test_table';
tablename | attname | n_distinct |
most_common_vals
------------+-----------+------------+------------------------------------------------------
test_table | lastname | 1 | {cain}
test_table | firstname | 1 | {shelby}
test_table | userid | 5 |
{211,18610093293568,32822140076032,34587371634688, 0}
test_table | testid | 5 |
{73,6008659247104,9942849290240,19374597472256,0}
__________________________________
Do you Yahoo!?
Yahoo! Mail SpamGuard - Read only the mail you want.
http://antispam.yahoo.com/tools

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #1
Share this Question
Share on Google+
3 Replies

P: n/a
Shelby Cain <al******@yahoo.com> writes:
The select statements return different data for
most_commons_vals depending on whether n_distinct is
included in the select clause or not.
I only seem to get the behavior below against int8
columns - but I haven't interated through every
conceivable data type either.


Hoo, I'm surprised no one noticed this during 7.4 development/testing.
The problem applies for any datatype that requires double alignment,
which includes int8, float8, and timestamp as well as most of the
geometric types. pg_statistic is declared as using type "anyarray",
and this type really needs to be marked as requiring double alignment
so that arrays of double-aligned datatypes will come out correctly.

The correct source fix is a one-line change in pg_type.h, but this will
not propagate into existing databases without an initdb. It looks like
what you'd need to do to fix an existing database is

-- clear out broken data in pg_statistic
DELETE FROM pg_statistic;
-- this should update 1 row:
UPDATE pg_type SET typalign = 'd' WHERE oid = 2277;
-- this should update 6 rows:
UPDATE pg_attribute SET attalign = 'd' WHERE atttypid = 2277;
-- might be a good idea to start a fresh backend at this point
-- repopulate pg_statistic
ANALYZE;

Ugh :-(

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #2

P: n/a
Joe Conway <ma**@joeconway.com> writes:
anyarray has been defined this way since 7.3 -- any concerns there?


I don't think so --- we weren't trying to use it as an actual column
datatype back then.

7.4 has a problem though :-( ... this is one of the "damn I wish we'd
caught that before release" ones, since it can't easily be fixed without
initdb. Reminds me that I need to get to work on making pg_upgrade
viable again.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 22 '05 #3

P: n/a

--- Tom Lane <tg*@sss.pgh.pa.us> wrote:
Hoo, I'm surprised no one noticed this during 7.4
development/testing.
The problem applies for any datatype that requires
double alignment,
which includes int8, float8, and timestamp as well
as most of the
geometric types. pg_statistic is declared as using
type "anyarray",
and this type really needs to be marked as requiring
double alignment
so that arrays of double-aligned datatypes will come
out correctly.

The correct source fix is a one-line change in
pg_type.h, but this will
not propagate into existing databases without an
initdb. It looks like
what you'd need to do to fix an existing database is

-- clear out broken data in pg_statistic
DELETE FROM pg_statistic;
-- this should update 1 row:
UPDATE pg_type SET typalign = 'd' WHERE oid = 2277;
-- this should update 6 rows:
UPDATE pg_attribute SET attalign = 'd' WHERE
atttypid = 2277;
-- might be a good idea to start a fresh backend at
this point
-- repopulate pg_statistic
ANALYZE;

Ugh :-(

regards, tom lane

Works for me. Thanks!

__________________________________
Do you Yahoo!?
Get better spam protection with Yahoo! Mail.
http://antispam.yahoo.com/tools

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #4

This discussion thread is closed

Replies have been disabled for this discussion.