TIL: Partial indexes in PostgreSQL

| 2 min read

Introduction

This is a first post in the series "Today I Learned", where I share small bits that I learned recently. Today I would like to share how you can improve performance of your PostgreSQL database with partial indexes. We will go over what partial indexes are and when they might be useful. At the end, we will also see how they can be quickly used in a Django application. Let's dive in!

What is a partial index in PostgreSQL?

In PostgreSQL, a partial index is a type of an index that is created only on a subset of rows instead of on all rows in the table. In order to create a partial index, we need to provide an additional conditional expression when creating it. The created index will only include items that satisfy the provided conditional expression. Let's see how we can define such an index.

Let's assume that we have a very simple table designed to track settings for users in our application. The schema for it looks roughly like this:

CREATE TABLE settings (
id int,
user_id int,
active boolean,
);

After understanding the access patterns and the data, we notice that most of the settings are inactive and we mostly run queries for retrieving active settings. Let's create a partial index that will only include settings that are turned on:

CREATE INDEX settings_active_index on settings (active)
WHERE active = true

And with this simple snippet, we have created our first partial index. These type of indexes are especially useful in situations where you know that a lot of the queries you need to run operate only on a small subset of the rows, based on specific conditions.

Benefits of partial indexes in PostgreSQL

Okay, okay, we've created our partial index, but you might be wondering, why would I even create one in the first place? That's a great question and in most cases you won't need them. However, there are several reasons for using partial indexes. The biggest benefit is reducing size of the index, which translates to less disk space being consumed by your database. This also directly leads to improved performance of queries that can take advantage of the index. Additionally, as the index only includes a subset of rows, less update operations will require updates of the index, which will also improve performance of update operations.

How to use a partial index in Django application?

Recently, I've been working a little bit with Django and I want to share how you can take advantage of partial indexes in Django very easily.

Let's assume that we have a very model for our settings:

class Settings(models.Model):
user_id = models.IntegerField(required=True)
active = models.BooleanField(default=False)

We can create a partial index for it by just providing an additional condition when creating an index:

class Settings(models.Model):
user_id = models.IntegerField(required=True)
active = models.BooleanField(default=False)

class Meta:
indexes = [
models.Index(
fields=['active', 'user_id'],
condition=models.Q(active=True),
name='settings_active_user_id_idx',
)
]

Now our model has a partial index on active and user_id fields, but only for rows that have active set to True.

Summary

Today we learned what are partial indexes in PostgreSQL and how we can easily use them in our Django applications. While you probably won't use them very often, it's still worth knowing about them as in some specific cases they can prove to be really useful and improve peformance of key database queries. They are especially effective in situations where you have a big table and the queries only operate on a relatively small subset of the rows.

Thanks for reading and until next time!