Locking in CockroachDB, Part 1 - Locking Rows and Tables with SELECT FOR UPDATE

Locking rows in a database is a common way to prevent other transactions from modifying the data you are working on. But a transaction can hold onto rows for too long, or you can lock a whole table accidentally. These locks can be brief or indefinite if something has gone very wrong.

I have written about locking and timeouts in CockroachDB in previous posts -

Ideally, you want to lock as little as possible, and for as short a time as possible. This will help prevent other transactions from being blocked.

Locking rows vs locking the table

It is possible to lock as little as a row, or as much as an entire table, depending on the query you run and how you have configured your indexes.

Locking a row

If you perform a SELECT...FOR UPDATE statement, with the WHERE that matches an index, then you will only lock the rows that match the WHERE clause.

Say you have a table with 1,000 rows and the primary key is product_id. You run BEGIN; SELECT * FROM products WHERE product_id=1 FOR UPDATE, then only the row with product_id=1 will be locked because the primary key is indexed.

Other transactions can still read and write to the other 999 rows as long as the WHERE clause uses an index, and the rows they are trying to access are already not locked.

If the WHERE clause does not use an index, then the database will have to scan the entire table, and the lock on the single row will prevent the query from running.

Locking the entire table

If you perform a BEGIN; SELECT...FOR UPDATE statement, with a WHERE clause that does not use an index, then the entire table will be locked, even if only one row is being selected.

Examples

Here is a products table with the following rows -

 product_id |             name             | product_category | price |   code
-------------+------------------------------+------------------+-------+-----------
           1 | Rustic Steel Soap            |                4 | 84.77 | 98808106
           2 | Ergonomic Concrete Salad     |                4 | 66.71 | 40877471
           3 | Incredible Steel Bacon       |                2 | 39.41 | 30460492
           4 | Refined Concrete Bacon       |                1 | 63.69 | 77385970
           5 | Ergonomic Cotton Keyboard    |                2 | 51.27 | 60617194
           6 | Small Cotton Chicken         |                4 | 63.42 | 33429625
           7 | Licensed Concrete Salad      |                2 |  4.18 | 14572098
           8 | Incredible Steel Keyboard    |                4 | 52.95 | 22365590
           9 | Licensed Rubber Soap         |                2 | 67.85 | 91893635
          10 | Handmade Plastic Shirt       |                4 | 52.71 | 61433458

product_id is the primary key, and there is also an index on product_category.

Example 1 - locking a single row

If you run the following query, only the row with product_id=1 will be locked -

BEGIN; SELECT * FROM products WHERE product_id = 1 FOR UPDATE; -- don't commit or rollback yet

To see the locked rows, run the following query in another session -

select * from crdb_internal.cluster_locks;
It will report that a single row is locked.

Other queries can still read and write the other 9 rows.

You can still run the following query, and it will work fine* -

SELECT * FROM products WHERE product_category < 2;

You can also run the following query, and it will work fine* -

SELECT * FROM products WHERE product_id > 2;

The following might* not proceed until the first transaction is committed or rolled back -

SELECT * FROM products WHERE product_category = 2;

* This is where the Cockroach Labs teams would need to explain it, but my guess is that the query doesn’t always use the index. It can be forced to, and I have seen the query work in that case.

Here’s how to force the query to use the index -

SELECT * FROM products@products_product_category_idx WHERE product_category = 2;

Example 2 - locking the entire table

With the following query, the entire table will be locked, even though only one row is being selected -

BEGIN; SELECT * FROM products WHERE price > 50 FOR UPDATE; -- don't commit or rollback

To see what is locked, run the following query in another session -

select * from crdb_internal.cluster_locks;

It will report that all 10 rows in the table are locked. That’s it, the whole table is locked. This is because the WHERE clause does not use an index, so the database has to scan the entire table to find the rows that match the WHERE clause.

Any query on the table will have to wait for this lock to be released.

In the next post, I will give more details on looking at the locks, and how to forcibly release them if needed.

comments powered by Disqus

Related