473,378 Members | 1,175 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,378 software developers and data experts.

select statement against pg_stats returns inconsistent data

The select statements return different data for
most_commons_vals depending on whether n_distinct is
included in the select clause or not.

I only seem to get the behavior below against int8
columns - but I haven't interated through every
conceivable data type either.

Is this expected behavior or perhaps a bug?

Regards,

Shelby Cain

================================================== =======

c1scain=# select version();
version

---------------------------------------------------------------------------------------
PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC
gcc (GCC) 3.3.1 (cygming special)
(1 row)

c1scain=# create table test_table (lastname
varchar(20), firstname varchar(20), userid int8,
testid int8);
CREATE TABLE
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015123 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015124 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015125 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015126 1
c1scain=# insert into test_table values ('cain',
'shelby', random()*10000, random()*10000);
INSERT 13015127 1
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 5
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 10
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 20
c1scain=# insert into test_table (select * from
test_table);
INSERT 0 40
c1scain=# analyze test_table;
ANALYZE
c1scain=# select distinct userid from test_table;
userid
--------
211
2641
4333
7642
8053
(5 rows)
c1scain=# select distinct testid from test_table;
testid
--------
73
834
1399
2315
4511
(5 rows)
c1scain=# select tablename, attname, most_common_vals
from pg_stats where tablename = 'test_table';
tablename | attname | most_common_vals
------------+-----------+---------------------------
test_table | lastname | {cain}
test_table | firstname | {shelby}
test_table | userid | {211,2641,4333,7642,8053}
test_table | testid | {73,834,1399,2315,4511}
(4 rows)
c1scain=# select tablename, attname, n_distinct,
most_common_vals from pg_stats where tablename =
'test_table';
tablename | attname | n_distinct |
most_common_vals
------------+-----------+------------+------------------------------------------------------
test_table | lastname | 1 | {cain}
test_table | firstname | 1 | {shelby}
test_table | userid | 5 |
{211,18610093293568,32822140076032,34587371634688, 0}
test_table | testid | 5 |
{73,6008659247104,9942849290240,19374597472256,0}
__________________________________
Do you Yahoo!?
Yahoo! Mail SpamGuard - Read only the mail you want.
http://antispam.yahoo.com/tools

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 22 '05 #1
3 2322
Shelby Cain <al******@yahoo.com> writes:
The select statements return different data for
most_commons_vals depending on whether n_distinct is
included in the select clause or not.
I only seem to get the behavior below against int8
columns - but I haven't interated through every
conceivable data type either.


Hoo, I'm surprised no one noticed this during 7.4 development/testing.
The problem applies for any datatype that requires double alignment,
which includes int8, float8, and timestamp as well as most of the
geometric types. pg_statistic is declared as using type "anyarray",
and this type really needs to be marked as requiring double alignment
so that arrays of double-aligned datatypes will come out correctly.

The correct source fix is a one-line change in pg_type.h, but this will
not propagate into existing databases without an initdb. It looks like
what you'd need to do to fix an existing database is

-- clear out broken data in pg_statistic
DELETE FROM pg_statistic;
-- this should update 1 row:
UPDATE pg_type SET typalign = 'd' WHERE oid = 2277;
-- this should update 6 rows:
UPDATE pg_attribute SET attalign = 'd' WHERE atttypid = 2277;
-- might be a good idea to start a fresh backend at this point
-- repopulate pg_statistic
ANALYZE;

Ugh :-(

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 22 '05 #2
Joe Conway <ma**@joeconway.com> writes:
anyarray has been defined this way since 7.3 -- any concerns there?


I don't think so --- we weren't trying to use it as an actual column
datatype back then.

7.4 has a problem though :-( ... this is one of the "damn I wish we'd
caught that before release" ones, since it can't easily be fixed without
initdb. Reminds me that I need to get to work on making pg_upgrade
viable again.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org

Nov 22 '05 #3

--- Tom Lane <tg*@sss.pgh.pa.us> wrote:
Hoo, I'm surprised no one noticed this during 7.4
development/testing.
The problem applies for any datatype that requires
double alignment,
which includes int8, float8, and timestamp as well
as most of the
geometric types. pg_statistic is declared as using
type "anyarray",
and this type really needs to be marked as requiring
double alignment
so that arrays of double-aligned datatypes will come
out correctly.

The correct source fix is a one-line change in
pg_type.h, but this will
not propagate into existing databases without an
initdb. It looks like
what you'd need to do to fix an existing database is

-- clear out broken data in pg_statistic
DELETE FROM pg_statistic;
-- this should update 1 row:
UPDATE pg_type SET typalign = 'd' WHERE oid = 2277;
-- this should update 6 rows:
UPDATE pg_attribute SET attalign = 'd' WHERE
atttypid = 2277;
-- might be a good idea to start a fresh backend at
this point
-- repopulate pg_statistic
ANALYZE;

Ugh :-(

regards, tom lane

Works for me. Thanks!

__________________________________
Do you Yahoo!?
Get better spam protection with Yahoo! Mail.
http://antispam.yahoo.com/tools

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)

Nov 22 '05 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Jan | last post by:
I store sql-commands in a database table. In the first step I get the sql command out of the database table with embedded sql. In the second step I try to execute the command, which i got from the...
3
by: Hendry Taylor | last post by:
I have a problem where if I issue a select * from against a database it returns no data, but if I select column from it returns the data. Why would the * not be working as a wildcard?
12
by: TP | last post by:
Here is my problem. I need to display a table about which I have no information except the table name. Using metadata I can somehow show the column names and record values. But my table has 1...
3
by: Tcs | last post by:
My backend is DB2 on our AS/400. While I do HAVE DB2 PE for my PC, I haven't loaded it yet. I'm still using MS Access. And no, I don't believe this is an Access question. (But who knows? I...
6
by: Terentius Neo | last post by:
Is it possible to combine (in DB2 UDB 8.1) a stored procedure and a select statement? I mean something like this: Select c.number, call procedure( c.number ) as list from table c With best...
10
by: Larry Woods | last post by:
Look at this code: Dim i As Integer = 0 Dim j As Integer = 5 If (j / i = 7) Or i = 0 Then MessageBox.Show("In If") Else MessageBox.Show("In Else") End If
6
by: bill.mckinstry | last post by:
I have simple Select Query using two small tables. Table A includes Statement ID & Date. Table B includes Statement ID, Loan ID & Payment Amount. The Query selects payments and amounts from Table...
5
by: OldBirdman | last post by:
I am attempting to assign a .RowSource to a ListBox using an SQL statement. I want to build the SQL statement "On-the-Fly", based on values in several controls on the form. I would like to restrict...
4
by: DaveL | last post by:
how to uppercase a table columen in datatable select //dt is a Datatable i Need to uppercase control.text and table column data i dont know how to set loginname to upper case if...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.