
Linux ready for high-volume databases?

On Mon, 2003-08-25 at 16:28, Gregory S. Williamson wrote:
One of our sysadmins sent this link ... wondering if there is any comment on it from the world of actual users of Linux and a database.

<http://story.news.yahoo.com/news?tmpl=story&cid=1738&ncid=738&e=9&u=/zd/20030825/tc_zd/55311>


"Weak points include lack of available tools, ease of use and ease
of installation"

Sounds like he needs point-and-drool tools...

On the other hand, could even a beefy Linux 2.4 *today* system handle
a 24x7 500GB db that must process 6-8M OLTP-style transactions per
day, while also getting hit by report queries?

Don't think of this as a troll, because I really don't know, even
though I do know that MVS, OpenVMS & Solaris can. (I won't even
ask about toys like Windows and FreeBSD.)

--
-----------------------------------------------------------------
Ron Johnson, Jr. ro***********@cox.net
Jefferson, LA USA

"Knowledge should be free for all."
Harcourt Fenton Mudd, Star Trek:TOS, "I, Mudd"

Nov 11 '05
Dennis Gearon <ge*****@fireserve.net> writes:
With the low cost of disks, it might be a good idea to just copy to disks
that one can put back in.


Uh, sure, using hardware RAID 1 and breaking one set of drives out of the
mirror to perform the backup is an old trick. And for small databases, backups
are easy that way. Just store a few dozen copies of the pg_dump output on your
live disks for local backups and burn CD-Rs for offsite backups.

But when you have hundreds of gigabytes of data and you want to be able to
keep multiple snapshots of your database both on-site and off-site... No, you
can't just buy another hard drive and call it a business continuity plan.
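(For reference, the split-mirror trick looks roughly like this with Linux
software RAID via mdadm -- a sketch only; the poster is talking about hardware
RAID, and all device names and paths here are assumptions. Note too that a
copy of a live cluster taken this way is at best crash-consistent; stop the
postmaster first if you want a clean copy.)

    # Drop one half of a RAID-1 mirror, back it up, then re-add it.
    # /dev/md0 and /dev/sdb1 are hypothetical device names.
    mdadm --manage /dev/md0 --fail /dev/sdb1
    mdadm --manage /dev/md0 --remove /dev/sdb1
    # Assumes the member is directly mountable (old-style RAID
    # superblock at the end of the device).
    mount -o ro /dev/sdb1 /mnt/snapshot
    tar czf /backup/pgdata-snapshot.tar.gz -C /mnt/snapshot .
    umount /mnt/snapshot
    mdadm --manage /dev/md0 --add /dev/sdb1   # resync the mirror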

As it turns out my current project will be quite small, so I may well be
adopting the first approach. I'm thinking of taking a pg_dump regularly
(nightly if I can get away with doing it that infrequently), keeping the past
n dumps, and burning a CD with those dumps.

This doesn't provide what online backups do: recovery to the minute of the
crash. And I get nervous having only logical pg_dump output, with no backups
of the actual blocks on disk. But is that what everybody does?
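(A minimal sketch of that nightly rotation as a cron job -- database name,
paths, and retention count are all assumptions:)

    #!/bin/sh
    # Nightly pg_dump rotation: keep only the newest $KEEP dumps.
    DB=mydb                     # hypothetical database name
    DIR=/var/backups/pgsql      # hypothetical backup directory
    KEEP=7                      # how many past dumps to retain

    pg_dump "$DB" | gzip > "$DIR/$DB-$(date +%Y%m%d).sql.gz"

    # Delete everything but the newest $KEEP dumps (GNU userland assumed).
    ls -1t "$DIR/$DB"-*.sql.gz | tail -n +$((KEEP + 1)) | xargs -r rm --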

--
greg

Nov 11 '05 #11
On 26 Aug 2003 at 9:15, Al Hulaton wrote:
After seeing this article yesterday, I did a bit of research. One _big_ reason
why SourceForge/VA/OSDN is moving from PostgreSQL to IBM/WebSphere/DB2 is that
the resulting product will be jointly marketed by SourceForge and IBM's
zillions of salespeople. So not only do they get a shiny new db, they also get
back-end revenue.

"The companies will jointly market and sell the software as part of the
commercial agreement. "-- 4th paragraph, last sentence.
http://www.eweek.com/print_article/0...a=30025,00.asp


<From vague memory somewhere from some article>

One of the technical reasons SourceForge went to DB2 was that DB2 had
clustering. PostgreSQL could not scale beyond a single machine, and it had
scaling limitations on a single machine as well.

Note that this was done quite a while back. Today PostgreSQL might be as
scalable as SourceForge requires, but they needed that scalability then and
had to move.

</From vague memory somewhere from some article>

<rant>
However, if DB clustering was the problem, I personally would have split the
data across two machines in two different databases and had an app consolidate
that data. The effort of rewriting the app could have been well compensated by
avoiding the performance hit SourceForge took immediately after the move.

But that's me..
</rant>

Bye
Shridhar

--
There are always alternatives. -- Spock, "The Galileo Seven", stardate 2822.3

Nov 11 '05 #12
On Tue, 2003-08-26 at 23:35, Greg Stark wrote:
Dennis Gearon <ge*****@fireserve.net> writes: [snip]
This doesn't provide what online backups do: recovery to the minute of the
crash. And I get nervous having only logical pg_dump output, with no backups
of the actual blocks on disk. But is that what everybody does?


Gak!! It can never be guaranteed that the "actual blocks on disk"
are transactionally consistent. Thus, the pg_dump output is sufficient.

However, there is still the large problem of PITR. Unless you double
your h/w and run Postgresql-R, you cannot guarantee recovery to an
exact point in time if there is a hardware failure that destroys the
database.

Therefore, you can only restore databases to the time that the
last pg_dump was taken.

--
-----------------------------------------------------------------
Ron Johnson, Jr. ro***********@cox.net
Jefferson, LA USA

The difference between Rock&Roll and Country Music?
Old Rockers still on tour are pathetic, but old Country singers
are still great.

Nov 11 '05 #13
>>>>> "GS" == Greg Stark <gs*****@mit.ed u> writes:

GS> Vivek Khera <kh***@kcilink.com> writes:
I run a 24x7x365 db on FreeBSD which has *never* crashed in the 3
years it has been in production. Only downtime was the upgrade from
PG 7.1 to 7.2 and once for a switchover from RAID5 to RAID10.


GS> I would be interested to know what backup strategy you use for
GS> this. Without online backups this means that if you had crashed
GS> you would have lost data up to the last pg_dump you took? Had you
GS> done tests to see how long it would have taken to restore from the
GS> pg_dump?

Currently it is pg_dump. Once the new server is online this week,
we'll be using eRServer to keep a 'hot spare' slave ready for quick
switchover.

Both systems use RAID10 hardware arrays for the database.

Restore from dump takes about an hour for the data, and then the rest
of eternity (something like 18 hours last time I did it) for index
generation.

The pg_dump process takes about 52 minutes across the network.
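(For concreteness, the restore being timed is essentially the following --
a sketch with hypothetical names; with a plain-SQL dump, the CREATE INDEX
statements at the end of the file are what eat the ~18 hours:)

    # Recreate the database and replay the dump; index builds dominate.
    createdb mydb
    gunzip -c /backup/mydb-20030827.sql.gz | psql mydb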

GS> Oh, it's a really small database. That helps a lot with the backup
GS> problems of 24x7 operation. Still I would be interested.

Well, perhaps, but it is big enough and pounded on enough
(read/insert/update very often) that it saturates the disk. I have
memory to spare according to the system stats.

I personally *really* wonder how people run DBs that are much larger
and have high rate of read/insert/update across large tables with RI
checks and all that normal good stuff. The tuning recommendations I
have been through are insufficient to really help for my load. Perhaps
my current server hardware just isn't up to it.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.com Rockville, MD +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


Nov 11 '05 #14
>>>>> "GS" == Greg Stark <gs*****@mit.ed u> writes:

GS> the first approach. I'm thinking taking a pg_dump regularly
GS> (nightly if I can get away with doing it that infrequently)
GS> keeping the past n dumps, and burning a CD with those dumps.

Basically what I do. I burn a set of CDs from one of my dumps once a
week, and keep the rest online for a few days. I'm really getting
close to splurging for a DVD writer since my dumps are way too big for
a single CD.

GS> This doesn't provide what online backups do, of recovery to the
GS> minute of the crash. And I get nervous having only logical pg_dump
GS> output, no backups of the actual blocks on disk. But is that what
GS> everybody does?

Well, if you want backups of the blocks on disk, then you need to shut
down the postmaster so that you get a consistent copy. You can't copy
the table files "live" this way.

So, yes, having the pg_dump is pretty much your safest bet for a
consistent dump. Using a replicated slave with, e.g., eRServer is
another way, but that requires more hardware.
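(A sketch of that cold file-level copy -- data directory location, backup
path, and logfile are assumptions:)

    # Cold backup: stop the postmaster, copy the cluster, restart.
    pg_ctl -D /usr/local/pgsql/data stop -m fast
    tar czf /backup/pgdata-cold.tar.gz -C /usr/local/pgsql data
    pg_ctl -D /usr/local/pgsql/data start -l /usr/local/pgsql/logfile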

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.com Rockville, MD +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


Nov 11 '05 #15
>>>>> "AH" == Alvaro Herrera <al******@dcc.u chile.cl> writes:

AH> On Wed, Aug 27, 2003 at 12:21:53PM -0400, Vivek Khera wrote:
>>>>> "GS" == Greg Stark <gs*****@mit.ed u> writes:

GS> Vivek Khera <kh***@kcilink.com> writes:

GS> Oh, it's a really small database. That helps a lot with the backup
GS> problems of 24x7 operation. Still I would be interested.
Well, perhaps, but it is big enough and pounded on enough
(read/insert/update very often) that it saturates the disk. I have
memory to spare according to the system stats.


AH> Well, was it really a 27 MB database, or was it a typo and you meant 27
AH> GB? The latter doesn't fit in my "really small database" category...

Yes, typo. It is 27GB *not* just 27MB. Heck, I could do 27MB on
my solid state drive at an incredible speed ;-)


Nov 11 '05 #16
>>>>> "GS" == Greg Stark <gs*****@mit.ed u> writes:
The DB is currently about 27Mb on disk (including indexes) and
processes several million inserts and updates daily, and a few million
deletes once every two weeks.


GS> Oh, it's a really small database. That helps a lot with the backup
GS> problems of 24x7 operation. Still I would be interested.

Ok... so I re-read my post. I meant 27Gb on disk. Duh. Sorry for the
confusion!
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: kh***@kciLink.com Rockville, MD +1-240-453-8497
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/


Nov 11 '05 #17
Hello all:

I'm building a web-based app that is purely a query tool: no data can be
added or edited. Postgres is the back end.

What steps do I need to take to make the setup as fast as possible for
read-only access? Are there any default settings I can disable because I
don't need them, and gain some speed that way? AFAIK there's no way to turn
transactions off, but what about something like fsync? Will I get a
performance boost by turning that off?

I'm aware of the "standard" pgsql optimizations and I'll do my best to put
those in place. I'm wondering whether there's anything extra I can do, that
might not normally be "safe", but might become so in a read-only
environment.

All the data will be scrubbed out every night and refreshed from the
original source. Should I be running a VACUUM ANALYZE after each refresh?
Any other optimizations or hints I can pass along to the query processor
that reflect the fact that the data will NEVER change between VACUUM passes?
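(To make the question concrete, the nightly cycle being described is roughly
the following -- a sketch only, with hypothetical names and paths; fsync = off
is defensible here only because the data can always be rebuilt from the
original source:)

    # postgresql.conf fragment (value choice is a guess):
    #   fsync = off        # a crash just means re-running the refresh

    # Nightly refresh:
    psql mydb -f /etc/nightly/reload.sql   # truncate and re-load the tables
    psql mydb -c "VACUUM ANALYZE"          # rebuild planner statistics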

Thanks for any thoughts or advice.

-- sgl
=========================================================
Steve Lane

Vice President
The Moyer Group
14 North Peoria St Suite 2H
Chicago, IL 60607

Voice: (312) 433-2421 Email: sl***@moyergroup.com
Fax: (312) 850-3930 Web: http://www.moyergroup.com
=========================================================

Nov 12 '05 #18

Steve Lane <sl***@moyergroup.com> writes:
Ron Johnson <ro***********@cox.net> writes:
Dennis Gearon <ge*****@fireserve.net> writes:

This doesn't provide what online backups do: recovery to the minute of the
crash. And I get nervous having only logical pg_dump output, with no backups
of the actual blocks on disk. But is that what everybody does?


Gak!! It can never be guaranteed that the "actual blocks on disk"
are transactionally consistent. Thus, the pg_dump output is sufficient.

Hello all:

I'm building a web-based app that is purely a query tool: no data can be
added or edited. Postgres is the back end.


What does this have to do with online backups vs. pg_dump?
Please don't follow up to threads with unrelated questions.

In any case, you're far more likely to see answers if you post a message
properly, as your message won't show up buried inside old threads in people's
mail user agents.

--
greg

Nov 12 '05 #19
