PostgreSQL Update Internals
I recently covered the internals of a row in PostgreSQL, but that was just the storage piece. I got more curious and decided that I would look into what happens when a row gets updated. There are a lot of complexities to data, after all, and it’s nice to know how our database is going to be affected by updates.
Getting Set Up
I started by using the customer table from the pagila sample database. Rather than come up with a set of sample data, I figured it would be easy to work within an existing set of data. The first trick was to find a customer to update. Since the goal is to look at an existing row, update it, and then see what happens to the row, we’ll need to be able to locate the row again. This is actually pretty easy to do. The first thing I did was retrieve the ctid
along with the rest of the data in the row. I did this by running:SELECT ctid, * FROM customer ORDER BY ctid LIMIT 10 ;
This gives us the primary key of a customer to mess with as well as the location of the row on disk. We’re going to be looking at the customer with acustomer_id
of 1: Mary Smith. Using that select
statement, we can see that Mary Smith’s data lives on page 0 and in row 1
Updating a Row
Now that we know who we’re going to update, we can go ahead and mess around with the data. We can take a look at the row on the disk using theget_raw_page
function to examine page 0 of the customer table. Mary Smith’s data is at the end of the page.
Why is Mary’s data the first row in the table but the last entry on the page? PostgreSQL starts writing data from the end of the page but writes item identifiers from the beginning of the page.
We already know that Mary’s row is in page 0, position 1 because of the ctid
we retrieved in our first query. Let’s see what happens when we update some of Mary’s data. Open up a connection to PostgreSQL using your favorite interactive querying tool. I use psql
on the command prompt, but there are plenty of great tools out there.```
BEGIN TRANSACTION
UPDATE customer
SET email = ‘[email protected]’
WHERE customer_id = 1 ;
Don’t commit the transaction yet! When we go to look for Mary’s data using the first `select` ordered by `ctid`, we won’t see her data anywhere. [![](http://facility9.com/wp-content/uploads/2011/04/The-First-10-Rows-After-the-Update-620x95.jpg)](http://facility9.com/wp-content/uploads/2011/04/The-First-10-Rows-After-the-Update.jpg) Where did her data go? Interestingly enough, it’s in two places right now because we haven’t committed the transaction. In the current query window, run the following command:
SELECT ctid, xmin, xmax, * FROM customer WHERE customer_id = 1;
[![](http://facility9.com/wp-content/uploads/2011/04/The-Row-After-Update-Running-Transaction.jpg)](http://facility9.com/wp-content/uploads/2011/04/The-Row-After-Update-Running-Transaction.jpg) After running this, we can see that the customer’s row has moved off of page 0 and is now on page 8 in slot 2. The other interesting thing to note is that the`xmin` value has changed. Transactions with a transaction id lower than `xmin`won’t be able to see the row. In another query window, run the previous select again. You’ll see that the row is still there with all of the original data present; the email address hasn’t changed. We can also see that both the `xmin` and `xmax` columns now have values. This shows us the range of transactions where this row is valid. [![](http://facility9.com/wp-content/uploads/2011/04/The-Row-After-Upate-Different-Transaction.jpg)](http://facility9.com/wp-content/uploads/2011/04/The-Row-After-Upate-Different-Transaction.jpg) Astute readers will have noticed that the row is on disk in two places at the same time. We’re going to dig into this in a minute, but for now go ahead and commit that first transaction. This is important because we want to look at what’s going on with the row _after_ the update is complete. Looking at rows during the update process is interesting, but the after effects are much more interesting. [![](http://facility9.com/wp-content/uploads/2011/04/customer_page_0_after_update.jpg)](http://facility9.com/wp-content/uploads/2011/04/customer_page_0_after_update.jpg) Looking at page 0 of the customer table, we can see that the original row is still present. It hasn’t been deleted yet. However, PostgreSQL has marked the row as being “old” by setting the `xmax` value as well as setting the `t_ctid` value to `00 00 00 08 00 02`. This tells us that if we look on page 8 in position 2 we’ll find the newest version of the data that corresponds to Mary Smith. Eventually this old row (commonly called a dead tuple) will be cleaned up by the vacuum process. [![](http://facility9.com/wp-content/uploads/2011/04/customer_page_8_after_update.jpg)](http://facility9.com/wp-content/uploads/2011/04/customer_page_8_after_update.jpg) If we update the row again, we’ll see that it moves to a new position on the page, from (8,2) to (8,3). If we did back in and look at the row, we’ll see that the`t_ctid` value in Mary Smith’s record at page 8, slot 2 is updated from `00 00 00 00 00 00` to `00 00 00 08 00 03`. We can even see the original row in the hex dump from page 8. We can see the same information much more legibly by using the `heap_page_items` function:
select * from heap_page_items(get_raw_page(‘customer’, 8));
```There are three rows listed on the page. The row with lp
1 is the row that was originally on this page before we started messing around with Mary Smith’s email address. lp
2 is the first update to Mary’s email address. Looking at t_infomask2
on row 2 we can immediately see two things… I lied, I can’t immediately see anything apart from some large number. But, once I applied the bitmap deciphering technology that I call “swear repeatedly”, I was able to determine that this row was HEAP_HOT_UPDATED
and contains 10 attributes. Refer to htup.h
for more info about the bitmap meanings.
The HOTness
PostgreSQL has a unique feature called heap only tuples (HOT for short). The HOT mechanism is designed to minimize the load on the database server in certain update conditions:
- A tuple is repeatedly updated
- The updates do not change indexed columns
For definition purposes, an “indexed column” includes any columns in the index definition, whether they are directly indexes or are used in a partial-index predicate. If your index definition mentions it, it counts. In our case, there are no indexes on the email
column of the customer
table. The updates we’ve done are going to be HOT updates since they don’t touch any indexed columns. Whenever we update a new row, PostgreSQL is going to write a new version of the row and update the t_ctid
column in the most current row. When we read from an index, PostgreSQL is going to read from the index and then follow the t_ctid
chains to find the current version of the row. This lets us avoid additional hits to disk when we’re updating rows. PostgreSQL just updates the row pointers. The indexes will still point to the original row, which points to the most current version of the row. We potentially take an extra hit on read, but we save on write. To verify this, we can look at the index page contents using thebt_page_items
function:SELECT * FROM bt_page_items('idx_last_name', 2) ORDER BY ctid DESC;
We can find our record by moving through the different pages of the index. I found the row on page 2. We can locate our index row by matching up thectid
from earlier runs. Looking at that row, we can see that it points to thectid
of a row with a forwarding ctid
. PostgreSQL hasn’t changed the index at all. Instead, when we do a look up based on idx_last_name
, we’ll read from the index, locate any tuples with a last name of ‘SMITH’, and then look for those rows in the heap page. When we get to the heap page, we’ll find that the tuple has been updated. We’ll follow the update chain until we get to the most recent tuple and return that data. If you want to find out more about the workings of the Heap Only Tuples feature of PostgreSQL, check out the README
.