Prevent Locking Issues For Updates On Counters

Counters, e.g. for page views or comment likes, can only be updated one after another. For large applications, you must work around this limitation to allow parallel updates.

When your application frequently updates counters, e.g. page views or comment likes, you will encounter performance problems and maybe even deadlocks. Every time a row is updated, it is locked until the query or transaction finishes. Therefore, no parallel updates are possible and every query has to wait until all prior ones have finished. But you can work around this concurrency limitation by randomly spreading each counter across as many rows as you want to reduce the locking probability.

Usage

MySQL

INSERT INTO pageviews_counts (
  url, fanout, count
) VALUES (
  '/home', FLOOR(RAND() * 100), 1
) ON DUPLICATE KEY UPDATE count = count + VALUES(count);

PostgreSQL

INSERT INTO pageviews_counts (
  url, fanout, count
) VALUES (
  '/home', FLOOR(RANDOM() * 100), 1
) ON CONFLICT (url, fanout) DO UPDATE SET
  count = pageviews_counts.count + excluded.count;
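
Since each counter is now spread across many rows, reading the current total means summing the slots. A minimal sketch, assuming all counts for a URL still live in pageviews_counts:

SELECT SUM(count) AS count
FROM pageviews_counts
WHERE url = '/home';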

Detailed Explanation

Many applications update counters in their database, e.g. the total number of page views, the likes on a tweet or the aggregated number of product reviews, so they don't have to recalculate those values repeatedly. But for high-traffic applications, these counter columns lead to performance problems: The database locks the row of the updated counter value for the duration of the query (or transaction), and sometimes you even run into deadlocks. When you get a lot of page views or a tweet goes viral, no concurrent updates to the counter are possible because of the lock automatically acquired by the database. The response time of your application will increase dramatically because the updates are executed one after another, each waiting for all earlier update queries to finish.

Clever database tricks can quickly solve this lock contention problem for frequently updated rows. As the only problem is many queries updating a single counter concurrently, that counter can be duplicated across many rows so that each query updates a different one. The approach is simple:

  • Create a new table that stores many counter rows for each tracked value (e.g. each page URL), replacing the single counter with a fanout column that distributes the locks. The fanout must be part of a unique key so that one URL can have many rows:
    CREATE TABLE pageviews_counts (
      url varchar(255) NOT NULL,
      fanout smallint NOT NULL,
      count int NOT NULL
    );
    CREATE UNIQUE INDEX pageviews_slots ON pageviews_counts (
      url, fanout
    );
  • Replace the update statement with an insert statement. To avoid inserting an endless number of new rows into that table, the fanout value is limited to a fixed range (e.g. 100 slots per counter), and an existing row is incremented when its slot was already created. With 100 slots, the probability of lock contention is reduced by a factor of 100.
    -- Before:
    UPDATE pageviews SET count = count + 1 WHERE url = '/home';
    
    -- New:
    INSERT INTO pageviews_counts (
      url, fanout, count
    ) VALUES (
      '/home', FLOOR(RAND() * 100), 1
    ) ON DUPLICATE KEY UPDATE count = count + VALUES(count);
  • Periodically move the summarized counts from the pageviews_counts table to the pageviews table, as sketched below.
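
A minimal sketch of such a flush job in PostgreSQL syntax, assuming the original pageviews table has a unique url column and a count column (its exact schema is not shown above). Deleting and summing the slots in a single statement ensures increments that arrive during the flush are not lost between reading and resetting the counters:

WITH flushed AS (
  DELETE FROM pageviews_counts
  RETURNING url, count
)
INSERT INTO pageviews (url, count)
SELECT url, SUM(count)
FROM flushed
GROUP BY url
ON CONFLICT (url) DO UPDATE SET count = pageviews.count + excluded.count;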

This approach is a significant improvement over directly updating the counter value. However, there can still be lock contention when two queries upsert the same row. One way to reduce this probability is to increase the fanout to distribute the counting over more rows. Alternatively, you can cooperate with the database's locking system by using a transaction and only updating a counter row that is currently not locked:

  • You ask the database for an existing counter row that is currently not locked by any other query or transaction:
    SELECT fanout
    FROM pageviews_counts
    WHERE url = '/home'
    LIMIT 1
    FOR UPDATE SKIP LOCKED;
  • When a row is returned, its fanout value (here: 4) is used in a simple UPDATE query:
    UPDATE pageviews_counts
    SET count = count + 1
    WHERE url = '/home' AND fanout = 4;
  • When no row is returned, either all existing fanout slots are locked or no row exists yet for this URL. You have to fall back to the standard approach:
    INSERT INTO pageviews_counts (
      url, fanout, count
    ) VALUES (
      '/home', FLOOR(RAND() * 100), 1
    ) ON DUPLICATE KEY UPDATE count = count + VALUES(count);
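
Putting these steps together: the SELECT ... FOR UPDATE and the subsequent UPDATE must run within the same transaction, as the row lock acquired by FOR UPDATE is only held until the transaction ends. A minimal sketch of the whole flow in PostgreSQL syntax (the branch between the two outcomes happens in application code):

BEGIN;

-- Step 1: try to claim a counter row no other transaction has locked.
SELECT fanout
FROM pageviews_counts
WHERE url = '/home'
LIMIT 1
FOR UPDATE SKIP LOCKED;

-- Step 2a: a row was returned (e.g. fanout = 4), so increment it.
UPDATE pageviews_counts
SET count = count + 1
WHERE url = '/home' AND fanout = 4;

-- Step 2b: no row was returned, so run the random upsert
-- from above instead of the UPDATE.

COMMIT;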

Disclaimer: You don't have to use your main database for everything. For this example, a Redis server would also have been a valid choice, as it can easily handle tens of thousands of increments per second without any optimization.
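
For comparison, a minimal sketch of the same counter in Redis (the key naming is just an example): a single atomic command with no row locks involved.

INCR pageviews:/home

The total can be read back with GET pageviews:/home and periodically moved to the main database, similar to the flush step described above.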
