Avoid Race Conditions in Django

django Dec 27, 2020

In this article, we will discuss how to avoid race conditions while building a Django application.

What are race conditions?

A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of the possible behaviors is undesirable. - Wikipedia

Race conditions can occur in any concurrent system, especially in distributed software systems, and they can be difficult to reproduce or debug. For example, suppose a pizza delivery company regularly sends discount coupons to its customers. Customers can redeem a coupon code through the website or a mobile application. The expected behavior is that each coupon code can be redeemed only once. But if the application does not handle race conditions, a customer can send several requests simultaneously and redeem the same coupon code more than once. Someone could exploit this bug deliberately, and the company would lose money.
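The coupon scenario can be sketched in plain Python as a check-then-act race. This is only an illustration under stated assumptions: the coupon record, the redeem() function, and the request ids are hypothetical, and a threading.Barrier is used to force the unlucky timing that would otherwise happen only occasionally.

```python
import threading

# Hypothetical in-memory coupon record standing in for a database row.
coupon = {"code": "PIZZA10", "redeemed": False}
discounts_granted = []

# The barrier forces both requests to finish the check before either one acts.
both_checked = threading.Barrier(2)

def redeem(request_id):
    # Check: has this coupon been used already?
    if not coupon["redeemed"]:
        both_checked.wait()
        # Act: mark the coupon as used and grant the discount.
        coupon["redeemed"] = True
        discounts_granted.append(request_id)

threads = [threading.Thread(target=redeem, args=(i,)) for i in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(discounts_granted))  # 2: the same coupon was redeemed twice
```

Both requests see redeemed as False because neither has written yet, so the "only once" rule is broken even though each request checked it.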

This tweet somewhat summarises a race condition 😂 -

Knock knock
Race condition
Who's there?
— I Am Devloper (@iamdevloper) November 11, 2013

Now let us see how we can avoid race conditions in Django applications. Sometimes we need to update a field based on its current value, for example incrementing a counter. The obvious way to achieve this is to do something like this -

>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold += 1
>>> product.save()

Here the value of number_sold is read into Python memory, incremented there, and written back. If the old value of number_sold is 10, the value written back will be 11.

If multiple threads or processes run this code at the same time, we have introduced a race condition bug in the system: two requests can both read 10 and both write back 11, so one sale is lost. How likely this is depends on the scale at which the application is running (threads, processes, and servers), the network latency between the application server and the database server, the file system IOPS, network bandwidth, and various other factors.
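The read-modify-write snippet above can lose an update, and the effect can be reproduced deterministically by interleaving two "requests" by hand. The sketch below uses the standard library's sqlite3 module instead of Django so that it runs on its own; the product table and the helper functions are assumptions made for illustration.

```python
import sqlite3

NAME = "Venezuelan Beaver Cheese"

# Hypothetical table mirroring the Product model used in this article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, number_sold INTEGER)")
conn.execute("INSERT INTO product VALUES (?, 10)", (NAME,))

def read_number_sold():
    row = conn.execute(
        "SELECT number_sold FROM product WHERE name = ?", (NAME,)
    ).fetchone()
    return row[0]

def write_number_sold(value):
    conn.execute(
        "UPDATE product SET number_sold = ? WHERE name = ?", (value, NAME)
    )

# Both requests read before either one writes.
seen_by_a = read_number_sold()  # 10
seen_by_b = read_number_sold()  # 10

# Each request adds 1 to the value it saw and writes the result back.
write_number_sold(seen_by_a + 1)
write_number_sold(seen_by_b + 1)

lost_update_result = read_number_sold()
print(lost_update_result)  # 11, although two sales were recorded
```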

An attacker can manipulate some of the factors mentioned above. Suppose they find out the exact region where an application is deployed in the cloud (using a few OSINT tools). They can spin up their own cloud instance in the same region and send the simultaneous requests from there. Because cloud instances in the same region communicate over a very low-latency network, the requests arrive closer together, which makes a race condition bug easier to exploit.

Getting back to the application side, a more secure way of achieving the above example is to use F() expressions in Django. An F() expression lets us update a field's value at the database level instead of at the Python level. Because the database is responsible for updating the field, the process becomes more robust: the new value is computed from the value stored in the database at the moment save() or update() is executed, not from a value that was read into Python earlier. Something like this -

>>> from django.db.models import F
>>> product = Product.objects.get(name='Venezuelan Beaver Cheese')
>>> product.number_sold = F('number_sold') + 1
>>> product.save()
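With the F() expression, the arithmetic ends up inside the UPDATE statement itself, roughly SET number_sold = number_sold + 1. To show why that avoids the lost update, here is a sketch of the SQL-level behavior using the standard library's sqlite3 module rather than Django; the table and the increment_in_database() helper are assumptions for illustration.

```python
import sqlite3

NAME = "Venezuelan Beaver Cheese"

# Hypothetical table mirroring the Product model used in this article.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, number_sold INTEGER)")
conn.execute("INSERT INTO product VALUES (?, 10)", (NAME,))

def increment_in_database():
    # The database reads and writes the column in a single statement,
    # so no stale value held in Python can be written back.
    conn.execute(
        "UPDATE product SET number_sold = number_sold + 1 WHERE name = ?",
        (NAME,),
    )

increment_in_database()  # request A
increment_in_database()  # request B

atomic_result = conn.execute(
    "SELECT number_sold FROM product WHERE name = ?", (NAME,)
).fetchone()[0]
print(atomic_result)  # 12: both sales are counted
```

In Django, the same increment can also be written as a single query without loading the object first, using Product.objects.filter(name='Venezuelan Beaver Cheese').update(number_sold=F('number_sold') + 1).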

Another way to achieve this is by using select_for_update(), which locks the selected rows until the end of the transaction by generating a SELECT ... FOR UPDATE SQL statement on supported databases. For example,

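>>> from django.db import transaction
>>> with transaction.atomic():
...     # The rows are locked when the queryset is evaluated inside the
...     # transaction and released when the atomic block exits.
...     products = Product.objects.select_for_update().filter(category='Cheese')
...     for product in products:
...         pass  # ...[perform operations]...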

I hope this article sheds some light on building more secure Django applications by avoiding race condition bugs.
