Write-Ahead Logging (WAL) is a standard method for ensuring data integrity. A detailed description can be
found in most (if not all) books about transaction processing. Briefly, WAL's central concept is that
changes to data files (where tables and indexes reside) must be written only after those changes have been
logged, that is, after log records describing the changes have been flushed to permanent storage. If we
follow this procedure, we do not need to flush data pages to disk on every transaction commit, because we
know that in the event of a crash we will be able to recover the database using the log: any changes that
have not been applied to the data pages can be redone from the log records. (This is roll-forward recovery,
also known as REDO.)
Using WAL results in a significantly reduced number of disk writes, because only the log file needs to be
flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the
transaction. The log file is written sequentially, and so the cost of syncing the log is much less than the
cost of flushing the data pages. This is especially true for servers handling many small transactions
touching different parts of the data store. Furthermore, when the server is processing many small concurrent
transactions, one fsync of the log file may suffice to commit many transactions.
With most installations and packages, 16 MB is the size of the wal segments. Unless your transaction rate is
through the roof, 16 MB size is good enough. You can change this size by adjusting the –with-wal-segsize
configure option.
So, as long as you keep making changes to the data in your database, WAL files will keep building up in your
pg_wal directory. (pg_wal directory is used to store WAL files in PostgreSQL 10. In the previous versions,
the directory was called pg_xlog). You can, of course, control the number of these files you want to store
by using wal_keep_segments. Or you can control the total size of WAL files by using max_wal_size.
As new records are written, they are appended to WAL logs. Its position is defined by a Log Sequence Number.
The Log Sequence Number (LSN) is a unique identifier in the transaction log. It represents a position in the
WAL stream. That is, as records are added to the Postgres WAL log, their insert positions are described by
the Log Sequence Number. pg_lsn is the datatype in which a Log Sequence Number is returned.
WAL Configuration
-
Shared Buffers. Inside the postgresql.conf file, there is a parameter called shared_buffers. They
are called ‘shared’ buffers because all of the background servers can access them. This parameter
determines the amount of memory allocated to PostgreSQL for caching data. Dealing with memory is faster
than reading from or writing to the files and hence it improves performance. A back-end process requests
the buffer manager for a page and the buffer manager either loads the required page from persistent
storage or returns it from the buffer pool if it is available. Now the back-end process can access the
page.
-
Dirty Pages. When data is modified in memory, the page in which this data is stored is called a
‘dirty page’, as long as the modifications are not written to disk. So if there is a page that is
different in the shared buffer and the disk, it is called a dirty page. The buffer manager flushes the
dirty pages to storage with the assistance of two subsystems called – checkpointer and background
writer.
-
Checkpointer. As the name suggests, checkpointer creates checkpoints in the Write-Ahead Log.
These checkpoints are used for recovery. A checkpoint indicates that all the information prior to the
checkpoint has been updated. So at every checkpoint, dirty pages are flushed to disk. How often a
checkpoint is begun depends on checkpoint_segments and checkpoint_timeout.
The integer, checkpoint_segments indicates the maximum number of log segments between two checkpoints.
The default value is 3 segments, where each segment is usually 16 MB. This value can be adjusted in the
postgresql.conf file.
On the other hand, checkpoint_timeout is an integer that indicates the maximum time between two
checkpoints. The default time is 5 minutes/ 300 seconds. This value can also be adjusted in the
postgresql.conf file.
-
Background Writer. The background writer is a subsystem that assists the checkpointer. What it
does is, move some of the modified data to the disk offloading the checkpointer. You should however be
careful before tuning it as it increases the I/O load, especially in cases where there is some page that
is being dirtied repeatedly. In such a case the background writer would write it repeatedly within a
checkpoint interval.
Performance of Postgres WAL
PostgreSQL WAL files are written one by one during the regular work of server. The functionality of HDD also
works conveniently, but this kind of workload is different from the data accessing load. It is always
recommended to store Postgres WAL files on a different physical disk mounted to the file system of your
server.
- Synchronous Commit: With this commit, you will not be able to continue our work until all the WAL files
are stored on the disk. With synchronous commit, you can get 900 transactions per second.
- Asynchronous Commit: It allows faster completion of the transaction at the cost of data loss. With this
commit, it returns success as soon as the transaction is completed logically, even before the WAL
records it, they make their way to disk. You get 1500 transactions per second with asynchronous commit
mode.
Benefits of Postgre WAL
- As only log files are flushed to disk during transnational commit, it reduces the number of disk
writes.
- The cost of syncing your log file is less as the log files are written sequentially.
- It adds data page consistency .
- Postgres WAL offers on-line backup and point-in-time recovery.