Chapter 8: Transactions, MVCC, WALs, and Checkpoints
Managing multiple operations using transactions in PostgreSQL.
Transactions #
A transaction is an atomic unit of work that either succeeds or fails.
Transactions are a key feature of any database system and are what allows a database to implement ACID properties:
- Atomicity - Transactions are composed of multiple statements. Atomicity guarantees that each transaction is treated as a single unit.
- Consistency - Ensures that the database is in a consistent state from one to another.
- Isolation - Ensures concurrent actions leave the database in the same state as if they were executed sequentially.
- Durability - Guarantees that once a transaction has been committed, it will remain committed, even if a case of system failure.
In PostgreSQL, a transaction is implicitly provided in each statement.
All transactions in PostgreSQL are assigned a unique number, called the
transaction identifier, or xid
for short.
PostgreSQL stores the xid
in a certain tuple within the tuple itself.
To inspect the current transaction xid
, query the txid_current()
function.
SELECT txid_current();
There is also the special xmin
hidden column available in all tables that can be used
to determine which transaction created the tuples:
SELECT xmin, * FROM my_table;
Implicit Transactions #
By default, each statements are wrapped within a transaction implicitly.
INSERT
statements for example will have their own unique transaction block.
INSERT INTO tags(tag) VALUES ('JavaScript');
INSERT INTO tags(tag) VALUES ('C');
INSERT INTO tags(tag) VALUES ('Python');
INSERT INTO tags(tag) VALUES ('Go');
INSERT INTO tags(tag) VALUES ('PHP');
Querying the tags
table with the xmin
column should derive a unique xid
per row.
SELECT xmin, tag from tags;
Explicit Transactions #
To explicitly define a transaction block for the next following statements, use the
BEGIN
and COMMIT
block to start and store the statements respectively.
BEGIN; -- Starts the transaction
INSERT INTO tags(tag) VALUES ('C#');
INSERT INTO tags(tag) VALUES ('C++');
COMMIT; -- Ends the transaction and commits the changes in storage.
Querying the tags
table now should provide the same xid
for the C#
and C++
rows.
SELECT xmin, tag from tags;
Another keyword that can be used within a transaction block is the ROLLBACK
keyword,
which ends the transaction block without committing the statements to storage.
BEGIN; -- Starts the transaction
INSERT INTO tags(tag) VALUES ('C#');
INSERT INTO tags(tag) VALUES ('C++');
ROLLBACK; -- Ends the transaction and throws away the changes.
Note: Errors that occur within any transactions will cause it to rollback the changes.
xid
Wrap Around Problem #
Since each transaction is identified with a transaction identifier xid
which is
incremented per transaction.
The xid
is used with a modulo operation, so that for any current xid
value there are
only transactions available.
Once this limit is reached, the xid
value wraps around, which then introduces a problem
within PostgreSQL where some tuple (or tuples) has another matching xid
value, which
PostgreSQL does not like.
A warning will be provided along with the action to take if the xid
value is nearing its
wraparound state.
WARNING: database "somedb" must be vacuumed within 177009986
transactions
HINT: To avoid a database shutdown, execute a database-wide VACUUM in
"somedb".
The PostgreSQL tool named VACUUM
will freeze old tuples so as to prevent the side effects
of the xid
wraparound.
Multi-Version Concurrency Control (MVCC) #
MVCC in PostgreSQL is a multiversion model where each SQL statement sees a snapshot of data as it was some time ago, regardless of the current state of the underlying data.
This prevents statements from viewing inconsistent data produced by concurrent transactions performing updates on the same data, providing transaction isolation for each database session.
MVCC minimizes lock contention in order to allow for reasonable performance in multiuser environments.
The main advantage of using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading.
Learn more in PostgreSQL: Introduction to MVCC.