What is WAL used for?

I'm just trying to figure out the terminology used on this board and
wanted to know what WAL is and what role it plays in PostgreSQL.

Thanks
Nov 12 '05 #1
WAL is write-ahead logging. Basically, before the database actually
performs an operation, it writes to a log what it's about to do. Then it
goes and does it. This ensures data consistency. Let's say the computer
is powered off suddenly. There are several points at which that could
happen:

1) before a write - in this case the database would be fine with or
without write-ahead logging.

2) during a write - without write-ahead logging, if the machine is powered
off during a write, the database has no way of knowing what was being
written or what remained to be written. With Postgres, this is further
broken down into two possibilities:

 * The power-off occurred while it was writing to the log - in this
case, the incomplete log entry is ignored on restart, so the operation is
effectively rolled back. The database is unaffected because the data
was never written to the database proper.

* The power-off occurred after writing to the log, while writing to
disk - in this case, Postgres can simply read from the log what was
supposed to be written, and complete the write.

3) after a write - again, this does not affect Postgres either with or
without WAL.
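
To make the two "during a write" cases concrete, here is a minimal Python sketch of the idea. It is not how PostgreSQL actually lays out its log: the record format (a CRC32 checksum, a space, a JSON payload, a newline), the file name wal.log and the functions log_write, recover and demo are all made up for illustration. Each change is appended to the log and forced to disk first; on restart the log is replayed, and a half-written record at the end is simply ignored.

```python
import json
import os
import tempfile
import zlib


def log_write(wal_path, key, value):
    """Append one checksummed record to the log and force it to disk."""
    payload = json.dumps({"key": key, "value": value}).encode()
    line = b"%08x %s\n" % (zlib.crc32(payload), payload)
    with open(wal_path, "ab") as wal:
        wal.write(line)
        wal.flush()
        os.fsync(wal.fileno())  # durable before any data file would be touched


def recover(wal_path):
    """Rebuild the table by replaying every complete record in the log."""
    table = {}
    if not os.path.exists(wal_path):
        return table
    with open(wal_path, "rb") as wal:
        for line in wal:
            if not line.endswith(b"\n"):
                break  # torn final record: power was lost while logging
            crc, _, payload = line[:-1].partition(b" ")
            try:
                stored = int(crc, 16)
            except ValueError:
                break  # checksum field is garbage
            if stored != zlib.crc32(payload):
                break  # payload does not match its checksum
            record = json.loads(payload)
            table[record["key"]] = record["value"]  # redo the logged change
    return table


def demo():
    """Log two changes, simulate a torn third record, then recover."""
    with tempfile.TemporaryDirectory() as tmp:
        wal_path = os.path.join(tmp, "wal.log")
        log_write(wal_path, "a", 1)
        log_write(wal_path, "b", 2)
        with open(wal_path, "ab") as wal:
            wal.write(b'deadbeef {"key": "c"')  # cut off mid-record
        return recover(wal_path)
```

Calling demo() replays the two complete records and skips the torn one, so it returns {'a': 1, 'b': 2}.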

In addition, WAL increases PostgreSQL's efficiency, because it can delay
random-access writes to disk, and just do sequential writes to the log for
a long time. This reduces the amount of head seeking the disks are doing.
If you store your WAL files on a different disk, you get even more speed
advantages.
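
A rough way to picture the seek savings, as a toy model only: treat the disk as a line of numbered blocks and add up how far the head travels. The block numbers below are invented for illustration, and real drives, caches and schedulers are far more complicated than this.

```python
def head_travel(block_positions):
    """Total distance the head moves when visiting blocks in the given order."""
    return sum(abs(b - a) for a, b in zip(block_positions, block_positions[1:]))


# Hypothetical block numbers touched by six updates scattered over the tables.
scattered = [900, 15, 640, 72, 988, 301]

# The same six updates appended to the log land in consecutive blocks.
sequential = list(range(6))

scattered_travel = head_travel(scattered)
sequential_travel = head_travel(sequential)
```

In this made-up example the scattered writes move the head across 3681 blocks, while the sequential log writes move it across 5.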

Jon

On Tue, 25 Nov 2003, Relaxin wrote:
> I'm just trying to figure out the terminology that is used on this board and
> wanted to know what is WAL and what role does it play in PostgreSQL?
>
> Thanks

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 12 '05 #2
Jonathan,

Could you tell me what the real impact of "fsync=false" is on the WAL and on the
database in the same catastrophic scenario?

Thierry Missimilly

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 12 '05 #3
> Could you tell me what is the real impact of "fsync=false" on the WAL and on the
> database in the same catastrophic scenario?
I am not certain on this point, but I believe fsync=false messes up the
whole thing. The nice thing about WAL is that fsync is no longer as much
of a slowdown, because PG rarely has to do random-access writes to the
disk.
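
For what it's worth, here is a small Python sketch of the difference the setting makes at the system-call level. It is only an illustration: append_record and its fsync parameter are hypothetical names chosen to mirror the setting, not anything from the PostgreSQL source. Without the os.fsync call, the function can return while the record is still sitting in the operating system's cache, so a power failure at that moment could lose a record the database already believes is safely logged.

```python
import os
import tempfile


def append_record(path, record, fsync=True):
    """Append a record to a log file.

    With fsync=False the bytes may still be in the OS cache on return.
    """
    with open(path, "ab") as log:
        log.write(record)
        log.flush()  # hand the bytes to the operating system
        if fsync:
            os.fsync(log.fileno())  # wait until the OS reports them written out


def demo():
    """Write one record each way and read the file back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")
        append_record(path, b"one\n", fsync=True)
        append_record(path, b"two\n", fsync=False)
        with open(path, "rb") as log:
            return log.read()
```

Both calls look identical while the machine stays up, which is why the risk only shows up after a crash.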

Jon

Nov 12 '05 #4
Jon,

I have run a small benchmark with pgbench on my 2-processor 2.4 Gb machine with 4 GB RAM
running Linux RH 9.0.
The database size is 700 MB, so it can be loaded in memory.
Postgres 7.4 is on disk sda (Root disk)
Meta Data are on disk sdb
bench data are on disk sdc

When pgbench is running, I can see with the top tool that the CPUs are 53% in I/O wait,
mainly because postgres is writing blocks to the sdb disk. The transactions per
second (tps) are 222.

By setting "fsync=false", the CPU I/O wait decreases to 0.6% and the resulting tps is
466.

So, should I conclude that even if the whole database is in memory, the TPS is
slowed down by the WAL mechanism, which waits for the log to be written to disk?
And is the main way to increase the TPS, while preserving the consistency of data in case
of a crash, to increase the I/O throughput of the Postgres WAL disk by creating a RAID0 on
a fibre channel subsystem? (I will test that as soon as possible.)
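
Putting the two figures side by side, as a back-of-the-envelope calculation only: the per-transaction times below assume the transactions run strictly one after another, which a pgbench run with several clients would not do.

```python
tps_fsync_on = 222   # measured with fsync enabled
tps_fsync_off = 466  # measured with fsync=false

# How many times faster the run without fsync was.
speedup = tps_fsync_off / tps_fsync_on

# Average time per transaction in milliseconds, if run one at a time.
ms_per_txn_on = 1000 / tps_fsync_on
ms_per_txn_off = 1000 / tps_fsync_off

# Extra time each transaction spends when the log is forced to disk.
extra_ms_per_txn = ms_per_txn_on - ms_per_txn_off
```

That works out to roughly a 2.1x speedup, or about 2.4 ms of extra waiting per transaction with fsync enabled.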

Regards,
Thierry

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Nov 12 '05 #5
On Fri, 28 Nov 2003 15:19:36 +0100, Thierry Missimilly wrote:
> I have tried a little bench with pgbench on my 2 proc 2.4 Gb with 4 GB RAM
> and Linux RH 9.0.
> ...

Which filesystem in which mode? Yes, that's relevant and in fact the
make-or-break factor here, at least from the POV of the hard drive.
I guess RH9 uses ext3 in journaled mode by default, which does data as
well as metadata journaling. Retry your benchmarks with both ext2 and ext3
in data=writeback mode; both results should be much closer to each other.

> So, should I conclude that even if the whole database is in memory, the
> TPS result is slowed down by the WAL mechanism which waits for writing the

No, you need to take the working of your filesystem into account. As soon
as data journaling comes into play, it is normal and in fact unavoidable
that performance drops, because everything is written effectively twice -
once into the log, once into the file, and to do so the drive has to move.
WAL with ext3's data journaling is quite unnecessary because the WAL
sort of IS the database's journal.

Holger
--
A: Maybe because some people are too annoyed by top-posting.
Q: Why do I not get an answer to my question(s)?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 12 '05 #6
Holger Hoffstaette wrote:
> No, you need to take the working of your filesystem into account. As soon
> as data journaling comes into play, it is normal and in fact unavoidable
> that performance drops, because everything is written effectively twice -
> once into the log, once into the file, and to do so the drive has to move.
> WAL with ext3's data journaling is quite unnecessary because the WAL
> sort of IS the database's journal.

That seems logically right, but in practice it may not hold. I've found that
for my apps, data=journal performs better. When I was picking filesystems, I
did a whole bunch of Googling and there were quite a few people who also
said data=journal performed faster for their Postgres or DB config.
Here's one explanation I found:

"If the database is seeking all over the filesystem and then running
fsync(), then ext3 in data=journal mode can make a huge difference,
because all the dirty data is written out *linearly* to the journal, for
later asynchronous writeback. This can offer 10x speedups or more."

Nov 12 '05 #7
> WAL with ext3's data journaling is quite unnecessary because the WAL
> sort of IS the database's journal.
I believe you are mistaken. ext3 data journalling only does the
filesystem. It has no concept of the structure of the database itself.
WAL is still necessary to keep consistency on the table itself.

Nov 12 '05 #8
Jonathan Bartlett wrote:
> > WAL with ext3's data journaling is quite unnecessary because the WAL
> > sort of IS the database's journal.
>
> I believe you are mistaken. ext3 data journalling only does the
> filesystem. It has no concept of the structure of the database itself.
> WAL is still necessary to keep consistency on the table itself.

What he means is that PostgreSQL doesn't need the file contents restored
to a pristine state on crash recovery, just the directory structure; WAL can
recreate the file contents.

--
Bruce Momjian | http://candle.pha.pa.us
pg***@candle.ph a.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 12 '05 #9


Holger Hoffstaette wrote:
> On Fri, 28 Nov 2003 15:19:36 +0100, Thierry Missimilly wrote:
> > I have tried a little bench with pgbench on my 2 proc 2.4 Gb with 4 GB RAM
> > and Linux RH 9.0.
> > ...
>
> Which filesystem in which mode? Yes, that's relevant and in fact the
> make-or-break factor here, at least from the POV of the hard drive.
> I guess RH9 uses ext3 in journaled mode by default, which does data as
> well as metadata journaling. Retry your benchmarks with both ext2 and ext3
> in data=writeback mode; both results should be much closer to each other.

You are right, my filesystem types are ext3.

With the data=writeback mode, I increase the TPS by 18% and decrease the I/O
wait from 54% to 30%.
I did not change my filesystem to ext2, as I would have to delete the partition
and recreate the whole database. Furthermore, I have understood that a journaled
filesystem allows a better and faster fsck after a power-off crash, and that it is
not redundant with WAL crash recovery.
I think that "journaling" is at the filesystem level and WAL is above it, at the
database level. What happens if the xlog filesystem is damaged by a power
failure? All the data consistency work done by PG will be lost. I hope that the
data stored in the FS journal can prevent that.

Thierry Missimilly

Nov 12 '05 #10
