How to Write Custom Migrations in Django

May 6, 2022 · Updated June 10, 2026 · 10 min read

A custom migration is a Django migration you write by hand instead of letting makemigrations generate it. You reach for one whenever a schema change also needs data moved or transformed, when you need raw SQL the ORM can't express, or when you have to reshape migration history without touching the database. The three building blocks are RunPython (data migrations in Python), RunSQL (raw SQL), and SeparateDatabaseAndState (decoupling what Django records from what runs on the database) — plus custom Operation classes for reusable logic. Everything below targets Django 5.x.

This guide is about authoring migration operations yourself. If instead you need to bootstrap migrations for a database that already has tables but no history, that's a different job — see our companion guide on creating initial Django migrations for an existing database.

When makemigrations isn't enough

makemigrations is excellent at schema diffs (new models, added fields, changed options), and it writes the DDL to match. It is deliberately blind to your data. It will create a column but never populate it; its rename detection asks a yes/no question and can guess wrong; and it cannot emit SQL that has no model-level equivalent, such as a trigger or a materialized view. Any time the change involves existing rows or engine-specific DDL, you write a custom migration.

Start from an empty migration

Custom migrations begin from an empty stub that you fill in:

python manage.py makemigrations --empty --name backfill_full_name people

This writes a numbered file like people/migrations/0007_backfill_full_name.py with an empty operations list and the dependency on the previous migration already wired up. Always use --name so the filename documents intent instead of reading 0007_auto_20260610_1012.

Data migrations with RunPython

RunPython is the workhorse: it runs a Python function during migrate. Two rules matter most.

  • Load models through apps.get_model('app', 'Model'), never by importing them. The function runs against the historical state of your models at that point in history; a direct import gives you today's model, which may have fields that didn't exist yet (or have since been removed).
  • Provide a reverse function so the migration is reversible. If a step genuinely can't be undone, pass migrations.RunPython.noop rather than leaving the migration irreversible.

from django.db import migrations


def combine_names(apps, schema_editor):
    """Populate full_name from the existing first/last name columns."""
    Person = apps.get_model("people", "Person")
    for person in Person.objects.all().iterator():
        person.full_name = f"{person.first_name} {person.last_name}".strip()
        person.save(update_fields=["full_name"])


def clear_names(apps, schema_editor):
    """Reverse step so `migrate people 0006` rolls back cleanly."""
    Person = apps.get_model("people", "Person")
    Person.objects.update(full_name="")


class Migration(migrations.Migration):
    dependencies = [
        ("people", "0006_person_full_name"),
    ]

    operations = [
        migrations.RunPython(combine_names, clear_names),
    ]

A few practical refinements:

  • Use .iterator() (and bulk_update) on large tables so you don't pull every row into memory at once.
  • schema_editor.connection.alias tells you which database the migration is running against. In a multi-database project, check it (and query with .using(alias)), because the same migration runs again for every database you pass to migrate --database.
  • Pass elidable=True when the operation only mattered as a one-off historical fix; Django can then drop it when you squash migrations.

operations = [
    migrations.RunPython(
        combine_names,
        migrations.RunPython.noop,  # nothing meaningful to undo
        elidable=True,              # safe to drop when squashing
    ),
]
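
The database-alias guard from the list above might look like the sketch below. It reuses the Person model from the earlier example and assumes, purely for illustration, that only the "default" database holds the people tables; TARGET_ALIAS is a hypothetical name, and a project with a database router may prefer to consult that instead.

```python
# A sketch, not a drop-in: TARGET_ALIAS is a hypothetical project choice.
TARGET_ALIAS = "default"


def combine_names(apps, schema_editor):
    """Backfill full_name, but only on the database that owns the table."""
    db_alias = schema_editor.connection.alias
    if db_alias != TARGET_ALIAS:
        return  # running against some other database: nothing to do here

    Person = apps.get_model("people", "Person")
    # .using(db_alias) pins every query to the database being migrated.
    people = Person.objects.using(db_alias).all()
    for person in people.iterator():
        person.full_name = f"{person.first_name} {person.last_name}".strip()
        person.save(using=db_alias, update_fields=["full_name"])
```

The function is wired up exactly like the earlier one, for example migrations.RunPython(combine_names, migrations.RunPython.noop).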

Raw SQL with RunSQL

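When you need SQL the ORM won't generate (a trigger, a materialized view, an UPDATE ... FROM) or that you'd rather write by hand (a particular partial or GIN index, a CHECK constraint), use RunSQL. Django can declare many indexes and constraints on the model itself through Meta.indexes and Meta.constraints, so check those first. Always pass reverse_sql so the migration can be unapplied.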

from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [
        ("catalog", "0011_product_search"),
    ]

    operations = [
        migrations.RunSQL(
            sql="CREATE INDEX idx_product_active ON catalog_product (created_at) WHERE is_active;",
            reverse_sql="DROP INDEX idx_product_active;",
        ),
    ]

If the SQL changes the schema in a way Django's autodetector should know about (for example you hand-create a column), pass state_operations. Django then updates its in-memory model state to match without re-running the DDL:

from django.db import migrations, models

migrations.RunSQL(
    sql="ALTER TABLE catalog_product ADD COLUMN sku varchar(32) NOT NULL DEFAULT '';",
    reverse_sql="ALTER TABLE catalog_product DROP COLUMN sku;",
    state_operations=[
        migrations.AddField(
            model_name="product",
            name="sku",
            field=models.CharField(max_length=32, default=""),
        ),
    ],
)

A real-world example: backfill a new column safely

Combine the pieces. Say you're adding a denormalized order_total to Order. You can't add a NOT NULL column without a default to a populated table in one shot, so split the work into ordered steps: add the column nullable, backfill it with a data migration, then tighten the constraint. Migration 0020 adds the nullable field, the data migration below fills every row, and a follow-up migration alters the field once no NULLs remain.

from django.db import migrations
from django.db.models import Sum


def backfill_order_total(apps, schema_editor):
    Order = apps.get_model("shop", "Order")
    OrderLine = apps.get_model("shop", "OrderLine")
    # One aggregate query: total of line_total per order.
    totals = OrderLine.objects.values("order_id").annotate(total=Sum("line_total"))
    by_order = {row["order_id"]: row["total"] for row in totals}

    orders = []
    for order in Order.objects.all().iterator():
        order.order_total = by_order.get(order.id, 0)  # orders without lines get 0
        orders.append(order)
    Order.objects.bulk_update(orders, ["order_total"], batch_size=500)


class Migration(migrations.Migration):
    dependencies = [
        ("shop", "0020_order_order_total_nullable"),
    ]

    operations = [
        migrations.RunPython(backfill_order_total, migrations.RunPython.noop),
    ]

The follow-up migration then tightens the constraint:

from django.db import migrations, models


class Migration(migrations.Migration):
    dependencies = [
        # Hypothetical filename for the backfill migration above.
        ("shop", "0021_backfill_order_total"),
    ]

    operations = [
        migrations.AlterField(
            model_name="order",
            name="order_total",
            field=models.DecimalField(max_digits=10, decimal_places=2, default=0),
        ),
    ]

Keeping the schema changes (add the column, later tighten it) and the data change (backfill) in separate migrations is a good habit: each step stays small, easy to reason about, and easy to reverse. On PostgreSQL it also avoids the "cannot ALTER TABLE ... because it has pending trigger events" error you can hit when a RunPython step and a schema change share one migration's transaction.

Controlling order: dependencies and run_before

Every migration declares dependencies — a list of (app_label, migration_name) tuples that must run first. makemigrations --empty wires the latest one for you, but you often add cross-app dependencies by hand when a data migration reads another app's tables.

class Migration(migrations.Migration):
    dependencies = [
        ("billing", "0009_invoice"),
        ("accounts", "0014_customer_tier"),  # cross-app: this migration reads Customer
    ]
    run_before = [
        ("billing", "0010_drop_legacy_totals"),  # force this to run before that one
    ]

    operations = [...]

run_before is the inverse of dependencies: it pushes this migration ahead of another one you can't edit — often a third-party app's migration.
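
One way to picture how the two attributes interact is as edges in a single graph: each entry in dependencies says "that one runs before me", and each entry in run_before says "I run before that one". The sketch below is only a toy model built on the standard library's graphlib, not Django's actual loader, and the migration names (including 0009a_snapshot_totals) are hypothetical.

```python
from graphlib import TopologicalSorter

# Toy declarations keyed by (app_label, migration_name). Illustrative only.
MIGRATIONS = {
    ("accounts", "0014_customer_tier"): {"dependencies": [], "run_before": []},
    ("billing", "0009_invoice"): {"dependencies": [], "run_before": []},
    ("billing", "0010_drop_legacy_totals"): {
        "dependencies": [("billing", "0009_invoice")],
        "run_before": [],
    },
    ("billing", "0009a_snapshot_totals"): {
        "dependencies": [
            ("billing", "0009_invoice"),
            ("accounts", "0014_customer_tier"),
        ],
        "run_before": [("billing", "0010_drop_legacy_totals")],
    },
}


def plan(declared):
    """Return one valid execution order for the declared constraints."""
    sorter = TopologicalSorter()
    for key, spec in declared.items():
        # dependencies: every listed parent must run before this migration.
        sorter.add(key, *spec["dependencies"])
        # run_before: this migration must run before every listed child.
        for child in spec["run_before"]:
            sorter.add(child, key)
    return list(sorter.static_order())


order = plan(MIGRATIONS)
for app_label, name in order:
    print(f"{app_label}.{name}")
```

Whatever order comes out, 0009a_snapshot_totals lands after both of its dependencies and before 0010_drop_legacy_totals, even though 0010 never mentions it.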

Atomic and non-atomic migrations

On databases with transactional DDL (PostgreSQL, SQLite) Django wraps each migration in a transaction, so a failure rolls the whole file back. Sometimes you don't want that: a long data backfill you'd rather commit in batches, or a PostgreSQL CREATE INDEX CONCURRENTLY which cannot run inside a transaction. Set atomic = False.

class Migration(migrations.Migration):
    atomic = False

    dependencies = [("catalog", "0012_product_sku")]

    operations = [
        migrations.RunSQL(
            "CREATE INDEX CONCURRENTLY idx_sku ON catalog_product (sku);",
            reverse_sql="DROP INDEX CONCURRENTLY idx_sku;",
        ),
    ]

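With atomic = False a mid-way failure leaves partial changes applied, so make each operation idempotent and safe to re-run. The per-operation flag works in the opposite direction from what you might expect: inside a non-atomic migration, pass atomic=True to a RunPython operation, as in RunPython(forwards, backwards, atomic=True), to wrap just that step in its own transaction. Passing atomic=False to RunPython does not take a step out of the transaction that an atomic migration has already opened on PostgreSQL or SQLite.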
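
An idempotent, batched backfill for a non-atomic migration could be sketched as follows. It reuses the Person model from the earlier example and assumes an empty full_name means "not yet processed"; BATCH_SIZE and the batched helper are illustrative names, not Django APIs.

```python
from itertools import islice

BATCH_SIZE = 500  # hypothetical value; tune to your table and lock budget


def batched(iterable, size):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(iterable)
    while chunk := list(islice(iterator, size)):
        yield chunk


def backfill_full_name(apps, schema_editor):
    Person = apps.get_model("people", "Person")
    # Select only rows not yet backfilled, so a second run is harmless.
    pending = Person.objects.filter(full_name="").iterator(chunk_size=BATCH_SIZE)
    for chunk in batched(pending, BATCH_SIZE):
        for person in chunk:
            person.full_name = f"{person.first_name} {person.last_name}".strip()
        # One bulk_update call per batch. With atomic = False on the Migration,
        # each call should commit on its own rather than join one long transaction.
        Person.objects.bulk_update(chunk, ["full_name"])
```

Because the query skips rows that already have a value, re-running the migration after a crash continues from the first unprocessed row instead of redoing committed batches.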

Renames without data loss: SeparateDatabaseAndState

Occasionally Django's recorded state and the real database must diverge for one step. The classic case: you want a table renamed with SQL you control (or it has already been renamed by hand) and only need Django's state to catch up, rather than letting a misdetected rename generate a delete-and-create pair that drops the data. SeparateDatabaseAndState runs one set of operations against the database and a different set against the migration state.

from django.db import migrations


class Migration(migrations.Migration):
    dependencies = [("crm", "0007_lead")]

    operations = [
        migrations.SeparateDatabaseAndState(
            # What actually runs on the database:
            database_operations=[
                migrations.RunSQL(
                    "ALTER TABLE crm_lead RENAME TO crm_prospect;",
                    reverse_sql="ALTER TABLE crm_prospect RENAME TO crm_lead;",
                ),
            ],
            # What Django records in its model state:
            state_operations=[
                migrations.RenameModel(old_name="Lead", new_name="Prospect"),
            ],
        ),
    ]

This is also how you adopt an existing column under a new field name, or move a model between apps without Django trying to drop and recreate the table.

Writing a custom Operation class

For logic you repeat across projects, subclass migrations.operations.base.Operation. You implement state_forwards (update the in-memory project state), database_forwards / database_backwards (apply and undo on the database via schema_editor), and describe (the line printed during migrate).

from django.db.migrations.operations.base import Operation


class LoadExtension(Operation):
    """Enable a PostgreSQL extension as a reversible migration operation."""

    reversible = True

    def __init__(self, name):
        self.name = name

    def state_forwards(self, app_label, state):
        pass  # no change to Django's model state

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        schema_editor.execute(f'CREATE EXTENSION IF NOT EXISTS "{self.name}";')

    def database_backwards(self, app_label, schema_editor, from_state, to_state):
        schema_editor.execute(f'DROP EXTENSION IF EXISTS "{self.name}";')

    def describe(self):
        return f"Enable PostgreSQL extension {self.name}"

For extensions specifically, Django already ships CreateExtension in django.contrib.postgres.operations — but the pattern above is how you build your own reusable operation when no helper exists.

Zero-downtime patterns across deploys

On a live system the migration and the code that uses it ship in separate deploys, so ordering matters. Two rules cover most cases:

  • Adding a column: deploy 1 adds it nullable (old code ignores it, new code can write it); a data migration backfills; deploy 2 makes it NOT NULL once every row has a value. Never add a NOT NULL column without a default to a populated table in a single step.
  • Removing a column: deploy 1 ships code that no longer references the column; deploy 2 drops it. Dropping first would break the still-running old code mid-deploy.

Also beware migrations that hold locks. A plain CREATE INDEX, an ALTER TABLE ... ADD COLUMN ... DEFAULT on PostgreSQL before version 11 (which rewrote the whole table), or a backfill UPDATE over millions of rows can lock a table long enough to stall production. Prefer CREATE INDEX CONCURRENTLY (with atomic = False), batch your backfills, and run heavy data migrations off the request path.

Choosing the right tool

Mechanism | Use it when | Reversible? | Operates on
RunPython | Transforming or backfilling existing rows in Python with ORM access | Yes: supply a reverse function (or noop) | Data
RunSQL | Engine-specific DDL/DML the ORM can't express (triggers, partial indexes, bulk UPDATE) | Yes: supply reverse_sql | Data and/or schema
SeparateDatabaseAndState | Database and Django's recorded state must diverge (manual renames, app moves, adopting columns) | Yes: both sides reverse | State vs. database
Custom Operation class | Reusable, parameterized logic applied across migrations or projects | Yes: implement database_backwards | Anything

Inspect and test migrations before they run

Never apply a hand-written migration blind. A few commands earn their keep:

# Show the SQL a migration will run, without applying it:
python manage.py sqlmigrate people 0007

# Show the full execution plan/order without applying:
python manage.py migrate --plan

# Apply up to a point, or roll one back:
python manage.py migrate people 0007
python manage.py migrate people 0006   # reverse to the previous state

Test data migrations the way you test code: run them forward and backward on a copy of production data, assert the row counts and values you expect, and confirm migrate <app> <previous> reverses cleanly. Django's test runner builds a fresh database from your migrations, so a broken dependency graph fails fast in CI.
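
One low-friction way to make that testing easier is to keep the row-level transformation in a plain function that the RunPython callable delegates to, so the logic can be asserted on without a database. The sketch below reuses the full_name example; build_full_name is a hypothetical helper name, and it complements rather than replaces a forward and backward run on real data.

```python
def build_full_name(first_name, last_name):
    """Pure transformation used by the data migration; trivial to unit test."""
    return f"{first_name} {last_name}".strip()


def combine_names(apps, schema_editor):
    """RunPython callable: loads the historical model and applies the helper."""
    Person = apps.get_model("people", "Person")
    for person in Person.objects.all().iterator():
        person.full_name = build_full_name(person.first_name, person.last_name)
        person.save(update_fields=["full_name"])


def test_build_full_name():
    assert build_full_name("Ada", "Lovelace") == "Ada Lovelace"
    assert build_full_name("Cher", "") == "Cher"  # trailing space is stripped
    assert build_full_name("", "") == ""


test_build_full_name()
```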

Best practices

  • Always provide a reverse (reverse_code, reverse_sql, or an explicit noop) so any migration can be rolled back.
  • Load models with apps.get_model() inside RunPython — never import them directly.
  • Keep schema changes and data changes in separate migrations; it keeps each one small and reversible.
  • Make data migrations idempotent so re-running after a partial failure is safe, especially when atomic = False.
  • Batch large backfills with .iterator() and bulk_update, and run them off the hot path.
  • Test forward and backward on a copy of real data before you touch production.

At MicroPyramid we've spent 12+ years and 50+ projects writing, reviewing, and untangling migrations on production Django systems. When you'd rather hand off a tricky data migration or a zero-downtime schema change, our Django development team and database migration services do exactly this kind of work, and our Python development services cover the wider backend around it.

Frequently Asked Questions

When should I write a custom migration instead of using makemigrations?

Reach for a custom migration whenever the change involves existing data or engine-specific SQL. makemigrations only diffs your models and emits schema DDL; it never moves or transforms rows. Backfilling a new column, copying data between tables, creating a trigger or materialized view, or reshaping migration history all need a hand-written migration.

How do I make a data migration reversible?

Pass a reverse function (or reverse SQL) as the second argument. For RunPython that's RunPython(forwards, backwards); for RunSQL it's the reverse_sql argument. If a step truly can't be undone, use migrations.RunPython.noop so Django still treats the migration as reversible and reversing to the previous migration works.

Why use apps.get_model() instead of importing my model directly?

Because a migration runs against the historical version of your models at that point in history, not today's code. apps.get_model('app', 'Model') returns that historical model with exactly the fields that existed then. A direct import gives you the current model, which may reference columns that don't exist yet or have since been dropped, crashing the migration.

How do I create an empty migration to edit?

Run python manage.py makemigrations --empty --name your_description your_app. Django writes a numbered file with an empty operations list and the dependency on the previous migration already filled in. Add your RunPython, RunSQL, or other operations to that list.

What does atomic = False do in a migration?

It tells Django not to wrap the migration in a single database transaction. You need it for operations that can't run inside a transaction, such as PostgreSQL's CREATE INDEX CONCURRENTLY, or for long batched backfills you want committed incrementally. The trade-off: a mid-way failure leaves partial changes, so make each operation idempotent.

How can I rename a table or move a model without losing data?

Use SeparateDatabaseAndState. Put the real RunSQL rename in database_operations and put RenameModel or RenameField in state_operations, so Django updates its recorded state without issuing its own drop-and-recreate. The data stays in place while Django's migration history stays consistent.
