Queries and Querysets¶
Field lookups¶
exact, iexact, contains, icontains, startswith, istartswith, endswith, iendswith, in, gt, gte, lt, lte, range, year, month, day, week_day, hour, minute, second, isnull, search, regex
Optimizing Django queries on big data¶
Suppose you have a query that needs to run against one or more tables with many millions of rows, and you need to operate on a couple of million of them. What are the dos and don’ts of writing a Django query that will not pessimize performance (time and memory use)?
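Don’t assume .iterator() streams rows from the database. Whether it does depends on the backend: on PostgreSQL and Oracle, current Django versions use a server-side cursor and fetch rows in chunks, but on MySQL the database driver loads the entire result set into memory before you start iterating. What .iterator() does on every backend is skip the queryset's result cache, which only saves memory on the Python side.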
Do limit the query ([start:end]) and run it repeatedly in reasonably sized batches, to avoid downloading too big a chunk at once.
Do use .only() and its kin (.defer(), .values(), .values_list()) to download only the parts of each record that you need.
Do not order_by() unless you must. When no index can supply the requested order, it forces the DB to collect the results of the whole query first so it can sort them, even if you then limit the results you retrieve.
The same goes for .distinct().
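The batching advice above can be sketched roughly as follows. This is a minimal illustration rather than the one right way to do it: iter_in_batches and batch_size are made-up names, and the helper only assumes that its argument can be sliced the way a QuerySet can, so a plain list is enough to try it out.

```python
def iter_in_batches(queryset, batch_size=1000):
    """Yield successive lists of at most batch_size rows.

    Hypothetical helper: works with anything sliceable, such as a
    Django QuerySet or, for experimenting, a plain list.
    """
    start = 0
    while True:
        # On a QuerySet this slice becomes LIMIT/OFFSET in the SQL,
        # and list() is what actually runs the query.
        batch = list(queryset[start:start + batch_size])
        if not batch:
            break
        yield batch
        start += batch_size


# Try it out on a plain list standing in for a queryset.
batches = list(iter_in_batches(list(range(5)), batch_size=2))
```

With a real model (assumed here to be called MyModel) this might be called as iter_in_batches(MyModel.objects.only("id", "name"), 5000). Without any ordering the database does not promise that successive slices line up, so rows can be repeated or skipped; ordering by an indexed column such as the primary key is usually the cheapest way to keep the batches stable.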
The model that a queryset is over¶
queryset.model
Combining querysets¶
Given two querysets over the same model, you can do things like this:
queryset = queryset1 & queryset2
queryset = queryset1 | queryset2
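queryset = queryset1.exclude(pk__in=queryset2)
(& and | behave much as they do for Q objects, but a QuerySet has no ~ operator; to subtract one queryset from another, use exclude() as above, or difference())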
Custom QuerySets¶
Calling custom QuerySet methods from the manager
Creating a manager with QuerySet methods:
class Person(models.Model):
    ...
    people = PersonQuerySet.as_manager()

class BaseManager(....):
    ....

class MyModel(models.Model):
    objects = BaseManager.from_queryset(CustomQuerySet)()
Custom Lookups¶
Custom lookups add to the keyword arguments (field__lookup=value) that you can pass to filter(), exclude(), etc.