django-treebeard stores the materialized path in the path column as a string like this: 000100020005002I. In this example, the following rows are its ancestors (given the default step length of 4):
0001
00010002
000100020005
000100020005002I
What django-treebeard does is to split a page's path into the aforementioned bits in Python and then perform a database query like so:
Organization.objects.filter(path__in=['0001', '00010002', '000100020005'])`
To avoid the n+1 query problem, we need to avoid splitting the path in Python and perform the ancestor lookup in the database via a subquery.
Pattern matching can be used to see if an ancestor's path is contained in the child's path: 00010002 matches 000100020005002I when the candidate's path is used as a pattern for the path of the organization in question:
000100020005002I LIKE 00010002% --- equals true
SELECT
organization.path,
ARRAY(
SELECT
name
FROM
organization o_
WHERE
organization.path LIKE o_.path || '%'
)
FROM
organization
| organization.path |
array |
| 0001 |
{root} |
| 00010001 |
{root, org_a} |
| 00010002 |
{root, org_b} |
| 000100020001 |
{root, org_b, org_b1} |
Django doesn't provide an out-of-the-box solution for switching the arguments around in a .filter(path__startswith='pattern') lookup (as is required in our case here). That's why I'm using the RawSQL expression.
>>> from django.db.models.expressions import RawSQL
>>> orgs = Organization.objects.annotate(
ancestors=RawSQL(
"""
ARRAY(
SELECT name FROM organization o_
WHERE organization.path LIKE o_.path || '%%'
)
FROM organization
""",
params=[],
)
)
>>> orgs[0].ancestors
['Root', "Org 1", "Org 2", "Org 3"]