I have a large hierarchy (2,500+ records) stored in Microsoft SQL Server (2019) using an adjacency list model (e.g., Id, ParentId). I am looking for an efficient approach to look up a record based on a particular path in the hierarchy. In other words, given a path (e.g. /Root/FolderA/SubfolderA), I'd like to retrieve the Id associated with the final node (i.e., SubfolderA in this case).
Note: Node names are not globally unique. I.e., we can't just look for
SubfolderAand assume it maps to/Root/FolderA/SubfolderA. There may be multiple nodes namedSubfolderAwithin the hierarchy.
Setup
Hierarchy
/Root
  /FolderA
    /SubfolderA
    /SubfolderB
  /FolderB
    /SubfolderA
    /SubfolderB
Structure
CREATE 
TABLE   [dbo].[Tree] (
        [Id]            INT             NOT NULL PRIMARY KEY, 
        [ParentId]      INT             NULL, 
        [Name]          VARCHAR(255)    NOT NULL, 
        CONSTRAINT      [FK_Hierarchy]  
        FOREIGN KEY     (ParentId) 
        REFERENCES      [Tree]([Id])
)
Data
INSERT INTO Tree VALUES (1,    NULL, 'Root');
INSERT INTO Tree VALUES (2,    1,    'FolderA');
INSERT INTO Tree VALUES (3,    2,    'SubfolderA');
INSERT INTO Tree VALUES (4,    2,    'SubfolderB');
INSERT INTO Tree VALUES (5,    1,    'FolderB');
INSERT INTO Tree VALUES (6,    5,    'SubfolderA');
INSERT INTO Tree VALUES (7,    5,    'SubfolderB');
Naïve Approach
There are quite a few threads on how to convert an adjacency list to materialized paths, including:
- SQL - Convert non-null adjacency list to path
- Build Enumeration Path from Adjacency List in SQL
- Flatten Adjacency List Hierarchy To A List Of All Paths
- Load hierarchical data from MSSQL with recursive common table expressions
View
We can use one of these approaches to convert the entire adjacency list into materialized paths using an rCTE:
CREATE 
VIEW            [dbo].[MaterializedPaths]
WITH            SCHEMABINDING
AS 
WITH RCTE AS (
  SELECT        Id,
                ParentId,
                CAST('/' + Name AS VARCHAR(255)) AS Path
  FROM          [dbo].[Tree] root
  WHERE         root.Id = 1
  UNION ALL
  SELECT        this.Id,
                this.ParentId,
                CAST(parent.Path + '/' + this.Name AS VARCHAR(255)) AS Path
  FROM          [dbo].[Tree] AS this
  INNER JOIN    RCTE parent
    ON          this.ParentId = parent.Id
)
SELECT          Id,
                Path
FROM            RCTE as hierarchy
Output
This produces the following output:
Id    Path
1     /Root
2     /Root/FolderA
3     /Root/FolderA/SubfolderA
4     /Root/FolderA/SubfolderB
5     /Root/FolderB
6     /Root/FolderB/SubfolderA
7     /Root/FolderB/SubfolderB
Query
We can filter that output using a simple WHERE clause:
SELECT          Id
FROM            MaterializedPaths
WHERE           Path = '/Root/FolderA/SubfolderA'
Problem
The naïve approach works fine. The problem is that it's incredibly inefficient—and, thus, slow—for querying large hierarchies, as it needs to dynamically reconstruct the entire set of materialized paths each call. In my case, that takes 8-9 seconds. Obviously, I could just store this data in a table and regenerate it via a trigger at any time the data changes. But I'd rather find a more efficient query and avoid the additional complexity.
Question
What is an efficient way of constructing this query? Or, at the risk of making this an XY problem, is there a way to limit the rCTE so that it only needs to evaluate the nodes in the hierarchy, instead of reconstructing the entire hierarchy each time?
 
    