Django Queryset is generally lazy in nature. It will not hit the database until it evaluates the query results.
Example:
queryset = User.objects.all() # It won't hit the database
print (queryset) # Now, ORM turns the query into raw sql and fetches results from the database
1.Caching and QuerySets
Generally Django stores the query results in the catche when it fetches the results for the first time.
Example: Get all users first names list and emails list from the database.
first_names_list = [ user.first_name for user in User.objects.all()]
emails_list = [ user.email for user in User.objects.all()]
above code hits the database twice. To avoid the extra request to the database we can use the Django's database cache in the following way.
users_list = User.objects.all() # No database activity
first_names_list = [user.first_name for user in users_list] # Hits the database and stores the results in the cache
emails_list = [user.email for user in users_list] # uses the results from the cache.
other way is
users_list = User.objects.all()
db = dict(users_list.values_list("first_name", "email"))
first_names_list, emails_list = db.keys(), db.values()
Note: Querysets are not cached if query is not evaluated.
Eaxmple: If you want to take a subset/part of the query results
queryset = Users.objects.all()
first_five = queryset[:5] # Hits the database
first_ten = queryset[:10] # Hits the database
If you want to use the cache in the above situation you can do it in the following way
queryset = Users.objects.all()
bool(queryset) # queryset is evaluated and results are cached
first_five = queryset[:5] # uses the cache
first_ten = queryset[:10] # uses the cache
2. Complex Database Queries with "Q" objects
Django Queries uses "AND" caluse to fetch results when using filter.
Example:
User.objects.filter(email__contains="adam", first_name="Adam")
# equivalent SQL Qurery is
SELECT * FROM user_table WHERE email LIKE '%adam%' AND first_name=Adam;
If you want to filter the users email starts with "an" or "sa"
Simple Example:
users_list = User.objects.all()
filtered_users = []
for user in users_list:
if user.email.startswith("an") or user.email.startswith("sa"):
filtered_users.append(user)
In above example it gets all the records from the table. Insted, we can fetch the records that we required by using the "Q" object.
Example:
from django.db.models import Q
users_list = User.objects.filter(Q(email__startswith="an") | Q(email__startswith="sa"))
# equivalent SQL Qurery is
SELECT * FROM user_table WHERE email LIKE 'an%' OR email LIKE 'sa%'
Q object usage:
~Q(email__startswith="an") # email don't starts with "an"
# SQL equivalent
| = OR
& = AND
~ = NO
We can use parenthesis with "Q" object
Example:
Model.objects.filter((Q(key1="value1") & ~Q(key2="value2")) | (~Q(key3="value"))
3. Create Multiple Objects at once with Bulk Create
Simple Example: To create 4 users
users_details = [
("Dian", "[email protected]"),
("Ravi", "[email protected]"),
("Santa", "[email protected]"),
("Shera", "[email protected]")
]
for first_name, email in users_details:
User.objects.create(first_name=first_name, email=email)
Above example hits/request the database 4 times to create 4 users, but we can create 4 users with single database hit/request.
Example:
instance_list = [User(first_name=first_name, email=email) for first_name, email in users_details]
User.objects.bulk_create(instance_list)
4. Update Multiple Objects or a filtered Queryset at once with update
Let us consider Employee model as
from django.db import models
DESIGNATIONS = (("Designer", "Designer"),
("Developer", "Developer"),
("HR", "HR"))
class Employee(models.Model):
name = models.CharField(max_length=30)
email = models.EmailField(max_length=30)
designation = models.CharField(max_length=30, choices=DESIGNATIONS)
salary = models.DecimalField(default=20000)
After one year manager wants to increase the salary for Developers at an amount of 5000
SimpleExample:
developers = Employee.objects.filter(designation="Developer")
for developer in developers:
developer.salary = developer.salary + 5000
develope
r.save()
Above example hits the database for several times[i.e number of developers]. we can do this with a single database hit/request.
Example:
from django.db.models import F
amount = 5000
developers = Employee.objects.filter(designation="Developer")
developers.update(salary=F("salary")+amount)
5. Select only required fields in the query from the database to decrease querying time
We use this when we need only the data[ie. fields values] we dont get access to the model functions and relational objects. We can do this by using QuerySet.values() and QuerySet.values_list()
QuerySet.values() returns the list of dictionaries. Each dictionary represents an instance of object.
QuerySet.values_list() returns the list of tuples. Each tutple represents an instance of object. order of values in the tuple is id followed by the fields in the model.
from .models import Employee
# create 2 objects
Employee.objects.create(name="Ravi", email="[email protected]", designation="Developer", salary=30000)
Employee.objects.create(name="Santa", email="[email protected]", designation="Designer", salary=40000)
queryset = Employee.objects.all() # you can also use filter
# Example for QuerySet.values()
users_dict_list = queryset.values()
# above line is equivalent to
users_dict_list = [
{
"id": 1,
"name": "Ravi",
"email": "[email protected]",
"designation": "Developer",
"salary": 30000
},
{
"id": 2,
"name": "Santa",
"email": "[email protected]",
"designation": "Designer",
"salary": 40000
},
]
# To get only required fields data ie. "name", "salary"
users_dict_list = queryset.values("name", "salary")
# above line is equivalent to
users_dict_list = [
{
"name": "Ravi",
"salary": 30000
},
{
"name": "Santa",
"salary": 40000
},
]
# Example for QuerySet.values_list()
users_tuple_list = queryset.values_list()
# above line is equivalent to
users_tuple_list = [
(1, "Ravi", "[email protected]", "Developer", 30000),
(2, "Santa", "[email protected]", "Designer", 40000),
]
# To get only required fields data ie. "name", "salary"
users_tuple_list = queryset.values_list("name", "salary")
# above line is equivalent to
users_tuple_list = [
("Ravi", 30000),
("Santharao", 40000),
]
# We can also get list of values of a single field in the model by setting "flat=True"
users_names_list = queryset.values_list("name", flat=True) # it works only for single field
# above line is equivalent to
users_names_list = ["Ravi", "Santa"]
Don't hit/request the database for related objects. Fetch all related objects in a single query using select_related and prefetch_related Let us consider the following models
from django.db import models
class Address(models.Model):
city = models.CharField(max_length=100)
state = models.CharField(max_length=100)
pin = models.CharField(max_length=100)
class Person(models.Model):
name = models.CharField(max_length=100)
email = models.EmailField(max_length=100)
present_address = models.ForeignKey(Address)
previous_address = models.ForeignKey(Address)
class Book(models.Model):
name = models.CharField(max_length=100)
author = models.ForeignKey(Person)
publishers = models.ManyToManyField(Person)
usage of select_related
# without select related
person = Person.objects.get(id=1)
present_address = person.present_address # Hits the database.
previous_address = person.previous_address # Hits the database.
# total database hits = 3
# with select related
person = Person.objects.select_related().get(id=1)
present_address = person.present_address # Doesn't hit the database.
previous_address = person.previous_address # Doesn't hit the database.
# total database hits = 1
# you can also select the specific related objects
person = Person.objects.select_related("present_address").get(id=1)
present_address = person.present_address # Doesn't hit the database.
previous_address = person.previous_address # Hits the database.
Limitaions of select_related
select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.
Usage of prefetch_related
# without prefetch related
book = Book.objects.get(id=1)
author = book.author # Hits the database.
publishers = book # Hits the database.
# total database hits = 3
# with prefetch related
book = Book.objects.prefetch_related().get(id=1)
author = book.author # Doesn't hit the database.
publishers = book.publishers.all() # Doesn't hit the database.
# total database hits = 1
# you can also select the specific related objects
book = Book.objects.prefetch_related("publishers").get(id=1)
author = book.author # Doesn't hit the database.
publishers = book.publishers.all() # Hits the database.
Advantage over select_related
prefetch_related does a separate lookup for each relationship, and does the ‘joining’ in Python. prefetch_related allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey
prefetching of related objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.
count = Book.objects.filter(author_id=5).count() # It returns the count(number of objects/records) only. So, operation is very fast.
7. use queryset.exists() if you want to know whether objects exists or not only
is_exists = Book.objects.filter(author_id=5).exists() # It returns the boolean value(True/False) only. So, operation is very fast.
8. Provide index to fields in database models & provide default ordering
If one of the models accessed very freequently use the indexes to the database tables.
class Book(models.Model):
name = models.CharField(max_length=100, db_index=True)
author = models.ForeignKey(Person)
published_on = models.DateField()
publishers = models.ManyToManyField(Person)
class Meta:
index_together = ["author", "published_on"]
ordering = ['-published_on']