
table size/record limit

I am designing something that may be the size of yahoo, google, ebay, etc.

Just ONE many to many table could possibly have the following
characteristics:

3,600,000,000 records
each record is 9 fields of INT4/DATE

Other tables will have about 5 million records of about the same size.

There are lots of scenarios that would lessen this.

BUT, is Postgres on Linux, possibly requiring a 64-bit system, capable of
this? There would also be 4-5 indexes on that table.
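
For a rough sense of scale, the figures in the question can be turned into a byte count. This is only a back-of-the-envelope sketch: INT4 and DATE are both 4-byte types in PostgreSQL, but the per-row overhead used here is an assumed illustrative number that depends on the PostgreSQL version, and page overhead and the 4-5 indexes are not counted at all.

```python
# Back-of-the-envelope size estimate for the big many-to-many table.
ROWS = 3_600_000_000        # from the question
FIELDS_PER_ROW = 9          # from the question
BYTES_PER_FIELD = 4         # INT4 and DATE are 4 bytes each in PostgreSQL
ROW_OVERHEAD_BYTES = 28     # ASSUMPTION: tuple header + line pointer, version dependent
INT4_MAX = 2**31 - 1        # largest value a signed 4-byte integer can hold


def table_bytes(rows, fields, field_bytes, overhead=0):
    """Estimated heap size in bytes, ignoring page-level overhead and indexes."""
    return rows * (fields * field_bytes + overhead)


GIB = 1024 ** 3
raw = table_bytes(ROWS, FIELDS_PER_ROW, BYTES_PER_FIELD)
with_overhead = table_bytes(ROWS, FIELDS_PER_ROW, BYTES_PER_FIELD, ROW_OVERHEAD_BYTES)

print(f"raw payload:   {raw / GIB:.1f} GiB")
print(f"with overhead: {with_overhead / GIB:.1f} GiB")
# The row count itself no longer fits in a signed INT4 value.
print("row count exceeds INT4 range:", ROWS > INT4_MAX)
```

One side effect worth noting: since 3,600,000,000 is larger than the signed INT4 maximum, any single INT4 column meant to number every row of this table would not be wide enough.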

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Nov 23 '05 #1
9 Replies


Hi,

On Thu, 21.10.2004 at 1:30, Dennis Gearon wrote:
> I am designing something that may be the size of yahoo, google, ebay, etc.
>
> Just ONE many to many table could possibly have the following
> characteristics:
>
> 3,600,000,000 records
> each record is 9 fields of INT4/DATE
>
> Other tables will have about 5 million records of about the same size.
>
> There are lots of scenarios here to lessen this.
>
> BUT, is postgres on linux, maybe necessarily a 64 bit system, capable of
> this? And there'd be 4-5 indexes on that table.


Sure. Why not? 3 to 5 million records is not really a problem.
We had bigger tables with historic commercial transactions
(even on an old dual PIII/1000) with fine performance.
I bet however, yahoo, google at least are much bigger :-)

Regards
Tino


Nov 23 '05 #2

Google probably is much bigger, and on mainframes, and probably Oracle or DB2.

But the table I am worried about is the one sized >= 3.6 GIGA records.

Nov 23 '05 #3

Dennis Gearon wrote:
> Google probably is much bigger, and on mainframes, and probably Oracle
> or DB2.

Google uses a Linux cluster and their database is HUGE. I do not know
which database they use. I bet they built their own specifically for
what they do.

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL

Nov 23 '05 #4

Actually, now that I think about it, they use a special table type in which the INDEX is also the DATUM. The data can be recovered from the index entry itself: go down the index, then decode the indexed value, and voila, a whole step is saved. I have no idea which engines offer this table type, however.

Nov 23 '05 #5

On Wed, 2004-10-20 at 23:01 -0700, Joshua D. Drake wrote:
> Dennis Gearon wrote:
> > Google probably is much bigger, and on mainframes, and probably Oracle
> > or DB2.
>
> Google uses a Linux cluster and their database is HUGE. I do not know
> which database they use. I bet they built their own specifically for
> what they do.


...actually, I heard they were running it off a flat-file database on seven
386 machines in some guy's garage over a DSL connection. I could be wrong
though. ;-)

-Robby

--
/***************************************
* Robby Russell | Owner.Developer.Geek
* PLANET ARGON | www.planetargon.com
* Portland, OR | ro***@planetargon.com
* 503.351.4730 | blog.planetargon.com
* PHP/PostgreSQL Hosting & Development
****************************************/

Nov 23 '05 #6

On 21 Oct 2004, at 01:30, Dennis Gearon wrote:
> I am designing something that may be the size of yahoo, google, ebay,
> etc.

Grrr. Geek wet-dream.

> Just ONE many to many table could possibly have the following
> characteristics:
>
> 3,600,000,000 records
> each record is 9 fields of INT4/DATE

I don't do this myself (my data is only 3 gig, and most of that is in
blobs), but people have repeatedly reported such sizes on this list.

Check
http://archives.postgresql.org/pgsql...1/msg00188.php

... but the best you can do is just to try it out. With a few commands
in the 'psql' query tool you can easily populate a ridiculously large
database (run "insert into foo select * from foo" a few times).

In a few hours you'll have some feel for it.
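
To see how quickly that self-insert grows a table, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for a real PostgreSQL session (the statement is the same one quoted above), and `doublings_needed` is a hypothetical helper introduced only for this illustration.

```python
import sqlite3


def doublings_needed(seed_rows, target_rows):
    """Count how many times a table must be doubled to reach target_rows."""
    count = 0
    rows = seed_rows
    while rows < target_rows:
        rows *= 2
        count += 1
    return count


# Small-scale demonstration of the doubling statement.
# ASSUMPTION: SQLite in memory stands in for a PostgreSQL session in psql.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (a INTEGER, b INTEGER)")
conn.execute("INSERT INTO foo VALUES (1, 2)")
for _ in range(10):
    # Each pass copies every existing row back into the table.
    conn.execute("INSERT INTO foo SELECT * FROM foo")
row_count = conn.execute("SELECT count(*) FROM foo").fetchone()[0]

print(row_count)                             # 1024 rows after 10 doublings
print(doublings_needed(1, 3_600_000_000))    # 32 doublings from a single row
```

Starting from a single row, 32 doublings are enough to pass 3,600,000,000 rows; each pass takes roughly twice as long as the previous one, so the last few dominate the total time.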
> Other tables will have about 5 million records of about the same size.
>
> There are lots of scenarios here to lessen this.


What you'll have to worry about most is the access pattern, and update
frequency.

There's a lot of info out there. You may need any of the following:
- clustering (the 'slony' project seems to be popular around here)
- concurrency of updating
- connection pooling, maybe via Apache or some Java thingy
- securing yourself against hardware errors

This list is a goldmine of discussions. Search the archives for
discussions and pointers. Search interfaces at

http://archives.postgresql.org/pgsql-general/
http://archives.postgresql.org/pgsql-admin/

..... or download the list archive mbox files into your mail-program and
use that (which is what I do).

d.
--
David Helgason,
Business Development et al.,
Over the Edge I/S (http://otee.dk)
Direct line +45 2620 0663
Main line +45 3264 5049


Nov 23 '05 #7

Dennis Gearon wrote:
> I am designing something that may be the size of yahoo, google, ebay, etc.
>
> Just ONE many to many table could possibly have the following
> characteristics:
>
> 3,600,000,000 records


This is a really huge monster one, and if you don't partition that
table in some way I think you'll have nightmares with it...

Regards
Gaetano Mendola

Nov 23 '05 #8


Dennis Gearon wrote:
| Gaetano Mendola wrote:
|
|> Dennis Gearon wrote:
|>
|>> I am designing something that may be the size of yahoo, google, ebay,
|>> etc.
|>>
|>> Just ONE many to many table could possibly have the following
|>> characteristics:
|>>
|>> 3,600,000,000 records
|>
|> This is a really huge monster one, and if you don't partition that
|> table in some way I think you'll have nightmares with it...
|>
|> Regards
|> Gaetano Mendola
|>
| thanks for the input, Gaetano.

By "partition in some way" I don't only mean splitting it into more tables. You
can use some of the tools available in Postgres and continue to see this table
as one, while it is implemented behind the scenes with several tables.
One useful and impressive way is to use inheritance in order to obtain
a horizontal partition (rows split across tables):

0) Decide on a partition policy (based on a timestamp, for example)
1) Create an empty base table with the name that you want to see as "public"
2) Create the partitions using the empty table as the base table
3) Create a rule on the base table so that an insert or an update on it is
   performed as an insert or an update on the right table (using the
   partition policy from step 0)

In this way you are able to vacuum each partition, reindex each partition, and
so on in a more feasible way. I cannot imagine running VACUUM FULL or REINDEX
on a 3,600,000,000-record table...
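
The steps above might be sketched as follows. This is an illustration under stated assumptions, not a tested setup: the table name `link`, the DATE column `created_on`, and the one-table-per-month policy are all hypothetical, the base table from step 1 is assumed to exist already, only the insert rule is shown, and the generated SQL has not been run against any PostgreSQL server here.

```python
from datetime import date

BASE_TABLE = "link"          # hypothetical name of the empty base table (step 1)
DATE_COLUMN = "created_on"   # hypothetical DATE column used as the partition key


def month_bounds(day):
    """Return (first day of this month, first day of next month)."""
    start = day.replace(day=1)
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return start, end


def partition_name(day):
    """Step 0: the partition policy, one child table per calendar month."""
    return f"{BASE_TABLE}_{day.year:04d}_{day.month:02d}"


def partition_ddl(day):
    """Steps 2 and 3: a child table inheriting from the base, plus an insert rule."""
    start, end = month_bounds(day)
    child = partition_name(day)
    check = f"{DATE_COLUMN} >= DATE '{start}' AND {DATE_COLUMN} < DATE '{end}'"
    route = f"NEW.{DATE_COLUMN} >= DATE '{start}' AND NEW.{DATE_COLUMN} < DATE '{end}'"
    return [
        # Step 2: the child holds only rows whose date falls in its month.
        f"CREATE TABLE {child} (CHECK ({check})) INHERITS ({BASE_TABLE});",
        # Step 3: redirect matching inserts on the base table to the child.
        f"CREATE RULE {child}_insert AS ON INSERT TO {BASE_TABLE} "
        f"WHERE {route} DO INSTEAD INSERT INTO {child} VALUES (NEW.*);",
    ]


for statement in partition_ddl(date(2004, 10, 21)):
    print(statement)
```

With one child table per month, maintenance commands such as VACUUM or REINDEX can be issued against a single child at a time instead of the whole data set.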

Regards
Gaetano Mendola

Nov 23 '05 #9

Great Idea! When I get that far, I will try it.

Nov 23 '05 #10
