ma***************@canada.com (Anthony) writes:
Our Database is having errors. We are currently using PostgreSQL to
store 2.5 Million records per day. The average addition to our primary
table is 4.5 Gigs of data.
We are doing this on a dual Opteron 244 system with 1 TeraByte of HDD
space. The drives are 250 Gig Western Digital. The Raid Controller is
LSI Logic MegaRaid 150-6.
We are getting an error after about 4-5 days worth of data being put
into the system.
*******************************************************
ERROR: invalid page header in block 59305 of relation
"item_info_2004_04_leaf_category_1"
*******************************************************
Our Base Server Configuration is as follows.
PostgreSQL Version= 7.4.2
x86_64-PC-Linux-GNU
Compiled with GCC 3.3.3
XFS File System
Running on Gentoo Linux 3.3.3 Propolice-3.3-7
Any help on how to solve this problem would be extremely appreciated.
Even the potential that Tom Lane might respond to this is worth it.
May I point you to the pg_filedump utility?
<http://sources.redhat.com/rhdb/utilities.html>
It can give you a fair idea of just where the system is blowing up.
I experienced what sounds like the same problem with a system that was
fairly similarly appointed with hardware, albeit with a few
conspicuous differences...
1. PostgreSQL 7.4.1
2. FreeBSD 4.9
3. Berkeley FFS with soft updates
4. Quad-Xeon, 8GB RAM (only using 4GB of it :-()
5. AMI MegaRaid controller...
6. Slightly less disk; 12x74GB SCSI drives
[root@hathi scsi]# cat /proc/scsi/megaraid/1
LSI Logic MegaRAID 1.74 254 commands 16 targs 7 chans 7 luns
What I found in looking at the page with the "invalid page header" was
that it was full of ASCII NUL values.
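
For anyone who wants to make the same check by hand, here is a rough
sketch in Python of how one might locate a block inside a relation file
and test whether it is entirely NULs. It assumes the default 8192-byte
page size and the default 1 GB file segments; a build with other
settings would need different constants. The function names are made up
for illustration, and it should be pointed at a copy of the file or run
with the postmaster stopped.

```python
BLOCK_SIZE = 8192  # assumed: PostgreSQL's default page size
SEGMENT_BLOCKS = (1024 ** 3) // BLOCK_SIZE  # assumed: default 1 GB segments


def locate_block(block_no, block_size=BLOCK_SIZE, segment_blocks=SEGMENT_BLOCKS):
    """Return (segment_number, byte_offset_within_segment) for a block number."""
    segment, block_in_segment = divmod(block_no, segment_blocks)
    return segment, block_in_segment * block_size


def block_is_all_nul(path, block_in_file, block_size=BLOCK_SIZE):
    """Read one page from a relation file and report whether every byte is NUL."""
    with open(path, "rb") as f:
        f.seek(block_in_file * block_size)
        page = f.read(block_size)
    if len(page) != block_size:
        raise ValueError("short read: block lies past the end of the file")
    # bytes(n) is n zero bytes, so this is true only for an all-NUL page.
    return page == bytes(block_size)
```

Under those assumptions, locate_block(59305) puts the block from the
error message in the first segment of the relation's file, a bit under
halfway into it. A page that comes back all NULs never got a valid
header written to it, or had it overwritten afterwards, which is
consistent with the storage path rather than the database being at
fault.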
We had previously had quite a bit of trouble with a different box with
the same hardware configuration running Red Hat Linux 7.3, although
when I replaced the 2.4.18 Linux kernel with 2.6.2, those problems
evaporated.
The only thing that we have been able to point to on the box in
question is a hardware problem. With the disks being RAIDed, the most
likely culprits seem to come down to three things:
1. Perhaps the controller is "glitched;"
2. Perhaps the controller driver is "glitched;"
3. Perhaps there is a RAM problem.
Notice that the list of suspects doesn't include any that actually
relate to database software.
Your best bet is to look for hardware problems.
--
(reverse (concatenate 'string "gro.gultn" "@" "enworbbc"))
http://cbbrowne.com/info/linuxxian.html
Never take life seriously. Nobody gets out alive anyway.