Django Aggregation: Count, Sum, Avg, Min, Max with aggregate() and annotate()

Blog / Django · December 27, 2023 · Updated June 10, 2026 · 10 min read
Django Aggregation: Count, Sum, Avg, Min, Max with aggregate() and annotate()

Aggregation in the Django ORM lets you compute summary values (counts, sums, averages, minimums and maximums) directly in the database instead of pulling rows into Python and looping. There are two entry points, and knowing which to reach for is the whole game:

  • aggregate() returns a single summary dictionary for the entire queryset. It is a terminal operation: you get one Python dict, not a queryset. Use it for questions like "what is the average order value across all orders?"
  • annotate() attaches a computed value to each object (or each group) and returns a queryset you can keep chaining. It is Django's GROUP BY equivalent. Use it for questions like "how many orders does each customer have?"

In one line: aggregate() collapses a queryset to one row of numbers; annotate() adds a per-row (or per-group) number and stays a queryset.

This guide covers both, plus conditional aggregation, the distinct=True count fix, the JOIN-inflation gotcha that silently doubles your numbers, F() expressions inside aggregates, and filtering before vs. after annotation. Examples target Python 3.12+ and Django 5.x.

The example models

Every query below runs against this small e-commerce schema. We have stores, the products they sell, and the orders placed against those products.

# shop/models.py
from django.db import models


class Store(models.Model):
    name = models.CharField(max_length=120)
    city = models.CharField(max_length=80)

    def __str__(self):
        return self.name


class Product(models.Model):
    name = models.CharField(max_length=120)
    store = models.ForeignKey(Store, on_delete=models.CASCADE, related_name="products")
    price = models.DecimalField(max_digits=10, decimal_places=2)
    in_stock = models.BooleanField(default=True)

    def __str__(self):
        return self.name


class Order(models.Model):
    PENDING, PAID, REFUNDED = "pending", "paid", "refunded"
    STATUS_CHOICES = [(PENDING, "Pending"), (PAID, "Paid"), (REFUNDED, "Refunded")]

    product = models.ForeignKey(Product, on_delete=models.CASCADE, related_name="orders")
    quantity = models.PositiveIntegerField(default=1)
    total = models.DecimalField(max_digits=10, decimal_places=2)
    status = models.CharField(max_length=10, choices=STATUS_CHOICES, default=PENDING)
    created_at = models.DateTimeField(auto_now_add=True)

aggregate(): one summary value over a queryset

Call aggregate() at the end of any queryset. Pass it one or more aggregate functions and it returns a single dictionary. Filters applied before aggregate() scope the calculation, so you can summarise a subset just as easily as the whole table.

>>> from django.db.models import Count, Sum, Avg, Min, Max
>>> from shop.models import Product, Order

# Whole-table summaries
>>> Product.objects.aggregate(Avg("price"))
{'price__avg': Decimal('249.50')}

>>> Product.objects.aggregate(Max("price"), Min("price"))
{'price__max': Decimal('1299.00'), 'price__min': Decimal('9.99')}

# Several aggregates at once, with your own keys
>>> Order.objects.aggregate(
...     orders=Count("id"),
...     revenue=Sum("total"),
...     avg_order=Avg("total"),
... )
{'orders': 1280, 'revenue': Decimal('318400.00'), 'avg_order': Decimal('248.75')}

# Filter first to scope the summary
>>> Order.objects.filter(status=Order.PAID).aggregate(paid_revenue=Sum("total"))
{'paid_revenue': Decimal('291050.00')}

By default the output key is <field>__<function> (for example price__avg). Supply your own keyword name when you want a cleaner key or are running several aggregates of the same type. Sum, Avg, Min and Max return None (not 0) when the queryset is empty, so guard against None or pass default=0 (available since Django 4.0):

>>> Order.objects.filter(status="refunded").aggregate(Sum("total"))
{'total__sum': None}

>>> Order.objects.filter(status="refunded").aggregate(Sum("total", default=0))
{'total__sum': Decimal('0')}

The built-in aggregate functions

Function Returns Empty queryset Typical use
Count(field) Number of rows (or related rows) 0 How many orders / products
Sum(field) Total of a numeric column None (or default) Revenue, total quantity
Avg(field) Mean of a numeric column None (or default) Average order value
Min(field) Smallest value None (or default) Cheapest product, earliest date
Max(field) Largest value None (or default) Most expensive product, latest date

Count also accepts distinct=True and every function accepts a filter=Q(...) argument (covered below). All of them live in django.db.models.

annotate(): per-object and per-group values (GROUP BY)

annotate() adds a calculated attribute to each row of the queryset and keeps the queryset alive, so you can filter, order and slice the result. Annotating across a reverse relation gives you a per-object count for free.

# How many orders does each product have? (per-object annotation)
>>> products = Product.objects.annotate(order_count=Count("orders"))
>>> for p in products:
...     print(p.name, p.order_count)
'Wireless Mouse' 36
'Mechanical Keyboard' 12

# The annotation is a real queryset field: order and slice by it
>>> Product.objects.annotate(order_count=Count("orders")).order_by("-order_count")[:5]

Grouping with values().annotate()

To produce a true SQL GROUP BY over a column (rather than per object), chain values() before annotate(). The fields in values() become the grouping columns, and the aggregate is computed once per distinct group.

# Revenue and order count per store city (GROUP BY city)
>>> from django.db.models import Count, Sum
>>> (
...     Order.objects
...     .values("product__store__city")
...     .annotate(orders=Count("id"), revenue=Sum("total"))
...     .order_by("-revenue")
... )
<QuerySet [
  {'product__store__city': 'Hyderabad', 'orders': 540, 'revenue': Decimal('142300.00')},
  {'product__store__city': 'Bengaluru', 'orders': 410, 'revenue': Decimal('118900.00')},
]>

# Equivalent SQL:
# SELECT store.city, COUNT(order.id), SUM(order.total)
# FROM order JOIN product ... JOIN store ...
# GROUP BY store.city ORDER BY SUM(order.total) DESC;

The order of clauses matters. values("city").annotate(...) groups by city. If you reverse it to annotate(...).values("city"), Django annotates per object first and values() only selects columns. Put values() before annotate() whenever you mean "group by".

Conditional aggregation

Conditional aggregation answers "count/sum only the rows matching a condition" in a single query, instead of running one query per status. The modern way is the filter argument that every aggregate accepts; under the hood Django emits SQL FILTER (WHERE ...) (or a CASE expression on databases that lack it).

>>> from django.db.models import Count, Sum, Q

# Per-status counts in ONE query
>>> Order.objects.aggregate(
...     paid=Count("id", filter=Q(status="paid")),
...     pending=Count("id", filter=Q(status="pending")),
...     refunded=Count("id", filter=Q(status="refunded")),
... )
{'paid': 1170, 'pending': 95, 'refunded': 15}

# Mix conditional and unconditional aggregates per group
>>> Product.objects.annotate(
...     total_orders=Count("orders"),
...     paid_orders=Count("orders", filter=Q(orders__status="paid")),
...     paid_revenue=Sum("orders__total", filter=Q(orders__status="paid")),
... )

The older Case(When(...)) form

Before the filter argument existed, the same effect was achieved by summing a Case/When expression. It still works and is occasionally needed for non-boolean conditional values (for example, summing a discounted price only for some rows).

>>> from django.db.models import Sum, Case, When, IntegerField, DecimalField, Value

# Equivalent of paid_orders above, the explicit way
>>> Order.objects.aggregate(
...     paid=Sum(Case(When(status="paid", then=1), default=0, output_field=IntegerField())),
... )

# Conditional VALUE, not just a count: apply a 10% discount to paid orders only
>>> Order.objects.aggregate(
...     adjusted=Sum(Case(
...         When(status="paid", then="total"),
...         default=Value(0),
...         output_field=DecimalField(max_digits=12, decimal_places=2),
...     )),
... )

Reach for filter=Q(...) for conditional counts and sums of an existing field (it is shorter and reads better); reach for Case(When(...)) when the value being aggregated changes based on the condition.

Counting correctly: distinct=True and the JOIN-inflation gotcha

This is the single most common aggregation bug in Django. When you aggregate across two or more multi-valued relations in the same query, the underlying SQL JOIN multiplies rows together, and your counts and sums come out too high.

Suppose a store has 10 products and 200 orders. Asking for both counts at once joins products and orders, producing up to 10 x 200 rows, so a naive Count sees inflated totals:

# WRONG: counts are inflated by the JOIN fan-out
>>> from django.db.models import Count
>>> Store.objects.annotate(
...     products=Count("products"),
...     orders=Count("products__orders"),
... ).first().products
2000   # should be 10 -- multiplied by the order rows

# RIGHT: distinct=True de-duplicates each relation's rows
>>> store = Store.objects.annotate(
...     products=Count("products", distinct=True),
...     orders=Count("products__orders", distinct=True),
... ).first()
>>> store.products, store.orders
(10, 200)

distinct=True fixes counts, but it does not fix Sum or Avg, which have no notion of de-duplication. If you sum a field across an inflated JOIN, every value is added once per duplicated row. The robust fix is to compute each aggregate in its own Subquery so the relations never get joined together in one query:

>>> from django.db.models import OuterRef, Subquery, Sum, Count, IntegerField, DecimalField
>>> from django.db.models.functions import Coalesce

>>> product_count = (
...     Product.objects.filter(store=OuterRef("pk"))
...     .order_by().values("store")
...     .annotate(c=Count("id")).values("c")
... )
>>> order_revenue = (
...     Order.objects.filter(product__store=OuterRef("pk"))
...     .order_by().values("product__store")
...     .annotate(s=Sum("total")).values("s")
... )

>>> Store.objects.annotate(
...     products=Coalesce(Subquery(product_count, output_field=IntegerField()), 0),
...     revenue=Coalesce(Subquery(order_revenue, output_field=DecimalField(max_digits=12, decimal_places=2)), 0),
... )

The empty .order_by() is important: it strips any default model ordering that would otherwise sneak into the subquery's GROUP BY and break the grouping.

distinct=True to count unique values

Count with distinct=True also counts unique values of a column, which is the ORM equivalent of COUNT(DISTINCT col).

# How many distinct cities do we ship to?
>>> Order.objects.aggregate(cities=Count("product__store__city", distinct=True))
{'cities': 7}

F() expressions inside aggregates

F() references a column (or a computed expression of columns) at the database level. Combined with an aggregate, it lets you sum or average a derived value without loading rows into Python. For example, if Order stored unit_price and quantity separately, you could sum the line totals directly.

>>> from django.db.models import F, Sum, DecimalField

# Sum of (unit_price * quantity) computed in SQL
>>> Order.objects.aggregate(
...     gross=Sum(F("unit_price") * F("quantity"), output_field=DecimalField(max_digits=14, decimal_places=2)),
... )

# Per-product average margin, using F() across the relation
>>> Product.objects.annotate(
...     avg_margin=Avg(F("price") - F("cost")),
... )

When you combine fields of different types (or arithmetic that changes the type), pass an explicit output_field so Django knows how to build the column. For deeper query-tuning patterns like this, see our guide on Django database access optimization.

Filtering and ordering aggregated results

There are two distinct places to filter, and they map to SQL WHERE vs HAVING:

  • .filter() before annotate() restricts which rows feed the aggregate (SQL WHERE). "Count only paid orders per product."
  • .filter() after annotate() restricts which groups survive based on the aggregate (SQL HAVING). "Keep only products with more than five orders."

You can use both together, and you can order_by() an annotated field directly.

>>> from django.db.models import Count, Q

# WHERE: only paid orders count toward each product's total
>>> qs = Product.objects.annotate(
...     paid_orders=Count("orders", filter=Q(orders__status="paid")),
... )

# HAVING: keep only products that have more than 5 paid orders, best sellers first
>>> best_sellers = (
...     qs.filter(paid_orders__gt=5).order_by("-paid_orders")
... )

# You can also pre-filter rows, then group, then filter on the aggregate:
>>> (
...     Order.objects.filter(created_at__year=2026)        # WHERE
...     .values("product__store__city")
...     .annotate(revenue=Sum("total"))
...     .filter(revenue__gte=100000)                       # HAVING
...     .order_by("-revenue")
... )

aggregate() vs annotate() at a glance

aggregate() annotate()
Returns A single dict of summary values A queryset with an extra field per object/group
Scope The whole queryset, collapsed to one row Per object, or per group with values()
Chainable? No -- it is terminal Yes -- filter / order / slice the result
SQL it maps to A bare SELECT COUNT/SUM/... SELECT ..., AGG(...) ... GROUP BY ...
Use when You need one overall number You need a number for each row or group

A handy rule: if the answer is "one number," use aggregate(); if the answer is "a number per something," use annotate().

Aggregating over related models

Aggregates follow relationships with the usual double-underscore syntax, in both directions. Spanning a relation in an aggregate implicitly joins the related table, which is exactly why the de-duplication rules above matter once more than one relation is involved.

# Forward relation: average price of products sold in a given city
>>> Product.objects.filter(store__city="Hyderabad").aggregate(Avg("price"))

# Reverse relation: each store's order count and revenue, busiest first
>>> Store.objects.annotate(
...     orders=Count("products__orders", distinct=True),
...     revenue=Sum("products__orders__total"),
... ).order_by("-revenue")

When conditions get complex, combine aggregates with Q objects (see querying with Django Q objects) and conditional expressions (see Django conditional expression in queries). For data-model design and query architecture on a production Django codebase, our Django development services and broader Python development services teams have shipped 50+ projects over 12+ years where the ORM does the heavy lifting in the database, not in Python.

Frequently Asked Questions

What's the difference between aggregate() and annotate() in Django?

aggregate() returns a single dictionary summarising the whole queryset, for example the average price across all products. It is terminal: you can't chain anything after it. annotate() adds a computed value to each object (or each group when you use values().annotate()) and returns a queryset you can keep filtering, ordering and slicing. Use aggregate() for one overall number and annotate() for a number per object or per group.

How do I group by a field in Django?

Chain values() before annotate(). The fields you pass to values() become the GROUP BY columns, and the aggregate is computed once per group. For example, Order.objects.values("status").annotate(total=Sum("total")) produces one row per status with its summed total. Put values() first; if you call annotate() before values(), Django annotates per object instead of grouping.

How do I do conditional counts or sums in Django?

Pass a filter argument to the aggregate: Count("id", filter=Q(status="paid")) counts only paid rows, and you can run several such aggregates in one query. Django translates this to SQL FILTER (WHERE ...) or a CASE expression. For conditional values (where the number being aggregated changes per row, not just whether it counts), use Sum(Case(When(...), default=..., output_field=...)).

Why are my aggregated counts or sums too high in Django?

This is JOIN inflation. When a single query aggregates across two or more multi-valued relations, the SQL join multiplies the rows, so counts and sums get duplicated. Fix counts with distinct=True (Count("products", distinct=True)). Sum and Avg can't be de-duplicated that way, so compute each aggregate in its own Subquery (with OuterRef and an empty .order_by()) so the relations are never joined together in one query.

Can I filter on an annotated value in Django?

Yes. Calling .filter() after annotate() filters on the aggregate and maps to SQL HAVING, for example .annotate(n=Count("orders")).filter(n__gt=5) keeps only objects with more than five orders. Calling .filter() before annotate() (or using the filter= argument inside the aggregate) restricts which rows feed the aggregate and maps to SQL WHERE. You can combine both in one queryset.

What does aggregate() return when the queryset is empty?

Count returns 0, but Sum, Avg, Min and Max return None on an empty queryset because there is nothing to compute. To get a numeric fallback instead of None, pass default, for example Sum("total", default=0) (supported since Django 4.0), or wrap the expression in Coalesce.

Should I aggregate in the database or in Python?

Aggregate in the database whenever you can. aggregate() and annotate() push the work into SQL, returning only the final numbers instead of transferring every row to Python to loop over. On large tables this is the difference between one fast query and thousands of objects loaded into memory, and it is one of the first things we check when tuning a slow Django view.

Share this article