Large Databases


What is the linux and/or postgres limitation for very
large databases, if any? We are looking at 6T-20T.
My understanding is that if the hardware supports it,
then it can be done in postgres. But can hardware
support that?

--elein
============================================================
el***@varlena.com Varlena, LLC www.varlena.com

PostgreSQL Consulting, Support & Training

PostgreSQL General Bits http://www.varlena.com/GeneralBits/
=============================================================
I have always depended on the [QA] of strangers.
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #1
elein wrote:
> What is the linux and/or postgres limitation for very
> large databases, if any? We are looking at 6T-20T.
> My understanding is that if the hardware supports it,
> then it can be done in postgres. But can hardware
> support that?


I've never had the pleasure of actually seeing one of those, but IBM,
for example, sells storage servers that can hold up to 224 SCSI
drives. The largest drive now is 146G, so that gives you about 32T. With
some RAIDing you are looking at 16T of storage in one box (and a huge
box it is: it actually occupies a whole 90U rack). You can use more of
them, I guess, since version 8 has tablespaces. Be prepared to spend
some half a mil for this beast, and we are talking US dollars here (list
price is $760k, but who would pay list price, eh?). Beware, nuclear
power plant not included. If you get one of those, let us know how
Postgres hums on it :)
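
As a rough check on the arithmetic above, here is a minimal Python sketch. The drive count and drive size are the figures quoted in the post; reading "some RAIDing" as plain mirroring (half the raw capacity usable) is an assumption, as are the function names.

```python
# Back-of-the-envelope capacity check for the figures quoted above.
# Assumption: sizes are decimal (1 TB = 1000 GB), as drive vendors quote them.

def raw_capacity_tb(drives: int, drive_gb: int) -> float:
    """Raw capacity in decimal terabytes."""
    return drives * drive_gb / 1000


def usable_capacity_tb(raw_tb: float, usable_fraction: float) -> float:
    """Usable capacity after RAID overhead.

    usable_fraction is an assumption: 0.5 models simple mirroring.
    """
    return raw_tb * usable_fraction


raw = raw_capacity_tb(224, 146)
usable = usable_capacity_tb(raw, 0.5)
print(f"raw: {raw:.1f} TB, mirrored: {usable:.1f} TB")
# prints "raw: 32.7 TB, mirrored: 16.4 TB"
```

Other RAID levels would give a different usable fraction, so the 16T figure should be read as the mirrored case only.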

--
Michal Taborsky
http://www.taborsky.cz

Nov 23 '05 #2
elein wrote:
> What is the linux and/or postgres limitation for very
> large databases, if any? We are looking at 6T-20T.
> My understanding is that if the hardware supports it,
> then it can be done in postgres. But can hardware
> support that?


I've recently been going through a project to support what will become a
5 to 6 TB Postgres database (initially it will be about 300GB after
conversion from the source system). A few significant things I've
learned along the way:

1) The Linux 2.4 kernel has a block device size limit of 2 TB.

2) The Linux 2.6 kernel supports a *huge* block device size -- I don't
have the number in front of me, but IIRC it was in the petabyte range.

3) xfs, jfs, and ext3 can all handle more than the 6 TB we needed them to
handle.

4) One of the leading SAN vendors initially claimed to be able to
support our desire to have a single 6TB volume. We found that when
pushed hard, we would get disk corruption (archives are down, but see
HACKERS on 8/21/04 for a message I posted on the topic). Now we are
being told that they don't support the linux 2.6 kernel, and
therefore don't support > 2TB volumes.
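
A 2 TB ceiling like the one in point 1 is what falls out if block addresses are stored as 32-bit sector numbers with 512-byte sectors. Assuming that is where the 2.4 limit comes from (the post does not say), the arithmetic looks like this; the names are illustrative only.

```python
SECTOR_BYTES = 512  # assumption: traditional 512-byte sectors
TIB = 2 ** 40       # binary terabyte


def max_device_bytes(sector_index_bits: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest device addressable with a sector index of the given width."""
    return (2 ** sector_index_bits) * sector_bytes


limit_32 = max_device_bytes(32)
print(limit_32 // TIB, "TiB addressable with a 32-bit sector index")
# prints "2 TiB addressable with a 32-bit sector index"
```

A wider sector index raises this ceiling by many orders of magnitude, which would fit the much larger limit recalled for 2.6 in point 2, though filesystem and driver limits may apply well before the theoretical one.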

So the choices seem to be:
a) Use symlinks or Postgres 8.0.0beta tablespaces to split your data
across multiple 2 TB volumes.

b) Use NFS mounted NAS.

We are already a big NetApp shop, so NFS mounted NAS is the direction
we'll likely take. It appears (from their online docs) that NetApp can
have individual volumes up to 16 TB. We should be confirming that with
them in the next day or two.
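
Option (a) can be sketched roughly as follows. The script works out how many 2 TB volumes a database of a given size needs and prints one CREATE TABLESPACE statement per volume. The mount points (/mnt/pgvolNN/pgdata) and tablespace names are hypothetical, and the sizes are treated as decimal terabytes; CREATE TABLESPACE ... LOCATION is the 8.0 syntax, and each directory must already exist and be owned by the postgres user.

```python
from typing import List

TB = 10 ** 12  # decimal terabyte (assumption about how the sizes are quoted)


def volumes_needed(db_bytes: int, volume_bytes: int = 2 * TB) -> int:
    """Number of volumes needed to hold db_bytes (ceiling division)."""
    return -(-db_bytes // volume_bytes)


def tablespace_sql(count: int, mount_pattern: str = "/mnt/pgvol{:02d}/pgdata") -> List[str]:
    """One CREATE TABLESPACE statement per volume; names and paths are hypothetical."""
    return [
        f"CREATE TABLESPACE vol{i:02d} LOCATION '{mount_pattern.format(i)}';"
        for i in range(1, count + 1)
    ]


statements = tablespace_sql(volumes_needed(6 * TB))
for stmt in statements:
    print(stmt)
```

For the 6 TB case this prints three statements. In practice you would leave headroom on each volume, and then place individual tables or indexes with the TABLESPACE clause of CREATE TABLE or CREATE INDEX.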

HTH,

Joe


Nov 23 '05 #3
I thought NFS was not recommended. Did I misunderstand this,
or is there some kind of limitation to using different kinds(?)
of NFS?

Thank you for the excellent info.

--elein

On Tue, Aug 31, 2004 at 01:54:41PM -0700, Joe Conway wrote:
> [...]
> So the choices seem to be:
> a) Use symlinks or Postgres 8.0.0beta tablespaces to split your data
> across multiple 2 TB volumes.
>
> b) Use NFS mounted NAS.
>
> We are already a big NetApp shop, so NFS mounted NAS is the direction
> we'll likely take.
> [...]



Nov 23 '05 #4
elein wrote:
> I thought NFS was not recommended. Did I misunderstand this
> or is there some kind of limitation to using different kinds(?)
> of NFS.


I've seen that sentiment voiced over and over. And a few years ago, I
would have joined in.

But the fact is *many* large Oracle installations now run over NFS to
NAS. When it was first suggested to us, our Oracle DBAs said "no way".
But when we were forced to try it due to hardware failure (on our
attached fibre channel array) a few years ago, we found it to be
*faster* than the locally attached array, much more flexible, and very
robust. Our Oracle DBAs would never give it up at this point.

I suppose there *may* be some fundamental technical difference that
makes Postgres less reliable than Oracle when using NFS, but I'm not
sure what it would be -- if anyone knows of one, please speak up ;-).
Early testing on NFS mounted NAS has been favorable, i.e. at least the
data does not get corrupted as it did on the SAN. And like I said, our
only other option appears to be spreading the data over multiple
volumes, which is a route we'd rather not take.

Joe


Nov 23 '05 #5
On Tue, 2004-08-31 at 15:07, Joe Conway wrote:
> I suppose there *may* be some fundamental technical difference that
> makes Postgres less reliable than Oracle when using NFS, but I'm not
> sure what it would be -- if anyone knows of one, please speak up ;-).
> Early testing on NFS mounted NAS has been favorable, i.e. at least the
> data does not get corrupted as it did on the SAN. And like I said, our
> only other option appears to be spreading the data over multiple
> volumes, which is a route we'd rather not take.


I have been doing a *lot* of testing of PG 7.4 over NFS with a couple of
EMC Celerras and have had excellent results thus far.

My best NFS results were within about 15% of the speed of my best SAN
results.

However, my results changed drastically under the 2.6 kernel, when the
NFS results stayed about the same as 2.4, but the SAN jumped about 50%
in transactions per second.
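
To put those percentages side by side, here is a small sketch that normalizes the 2.4 SAN result to 100 transactions per second. That baseline is made up purely for illustration, and "within about 15%" is read as NFS being exactly 15% slower, so the output is indicative only.

```python
SAN_24 = 100.0                # hypothetical baseline: SAN on 2.4, in tps
NFS_24 = SAN_24 * (1 - 0.15)  # "within about 15%", read as exactly 15% slower
SAN_26 = SAN_24 * 1.50        # SAN "jumped about 50%" under 2.6
NFS_26 = NFS_24               # NFS "stayed about the same as 2.4"


def gap(nfs_tps: float, san_tps: float) -> float:
    """Fraction by which NFS throughput trails SAN throughput."""
    return 1 - nfs_tps / san_tps


print(f"2.4 kernel: NFS trails SAN by {gap(NFS_24, SAN_24):.0%}")  # 15%
print(f"2.6 kernel: NFS trails SAN by {gap(NFS_26, SAN_26):.0%}")  # 43%
```

Under these assumptions, the gap widens from roughly 15% to roughly 43% when moving from 2.4 to 2.6.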


Nov 23 '05 #6
Cott Lang wrote:
> My best NFS results were within about 15% of the speed of my best SAN
> results.

Good info, and consistent with what I've seen.

> However, my results changed drastically under the 2.6 kernel, when the
> NFS results stayed about the same as 2.4, but the SAN jumped about 50%
> in transactions per second.


Very interesting. Whose SAN are you using that supports the 2.6 kernel?

Thanks,

Joe


Nov 23 '05 #7
On Tue, 2004-08-31 at 20:37, Joe Conway wrote:
> > However, my results changed drastically under the 2.6 kernel, when the
> > NFS results stayed about the same as 2.4, but the SAN jumped about 50%
> > in transactions per second.
>
> Very interesting. Whose SAN are you using that supports the 2.6 kernel?


I'm using EMC Clariions, but they do not officially support using the
2.6 kernel. Rumor (from them) has it that it will be supported in
October. I tested it because I felt like it would be useful knowledge
moving forward. :)


Nov 23 '05 #8
