Working with trees and hierarchies is a common case in SQL programming: product catalogs, organizational structure, menus. Initially, most DBMSs only supported a parent reference (ParentId), but over time, alternative methods emerged — Nested Sets and Path, as well as recursive queries via CTE.
The problem of the traditional Parent-Child approach is the low speed when traversing large hierarchies; queries at depth greater than 2-3 lead to an avalanche of JOINs. More complex techniques (Nested Sets, Path Enumeration) speed up bulk traversals, but complicate inserts/deletes.
The solution is to choose the optimal storage model for the specific task:
Example of a recursive subtree search via CTE:
WITH RecursiveTree AS ( SELECT id, parent_id, name, 0 as level FROM categories WHERE parent_id IS NULL UNION ALL SELECT c.id, c.parent_id, c.name, rt.level + 1 FROM categories c INNER JOIN RecursiveTree rt ON c.parent_id = rt.id ) SELECT * FROM RecursiveTree;
Key features:
Can a tree be stored in a single denormalized table with one level of links if the depth is fixed?
Only if the depth is small and always fixed (2-3 levels) — beyond that, JOINs become unmanageable, complicating change processing.
Will CTE "hang" with large trees — are there any stack or recursion depth limits?
Yes, in most DBMSs, limits are set on recursion depth (for example, 100/32767). For trees with 1000+ levels, manual depth management or custom traversal/splitting algorithms will be required.
Are Nested Sets the solution to all problems?
No, nested sets instantly find all descendants, but insertions/deletions require mass updates of all left/right indices — this is slow with a large number of frequent changes.
Example of an insertion (simplified):
UPDATE tree SET lft = lft + 2 WHERE lft > @parent_right; UPDATE tree SET rgt = rgt + 2 WHERE rgt >= @parent_right; INSERT INTO tree(id, name, lft, rgt) VALUES(@new_id, @name, @parent_right, @parent_right + 1);
A large online store stored its category tree in an adjacency list and built the menu using nested subqueries. With 6+ levels in the menu, the main query took several minutes to execute, and any change to the tree led to cascade updates. Pros:
Cons:
They switched to Nested Sets for static trees (categories) and used Path for the menu paths. They employed CTE for dynamic scenarios. Finding descendants became instantaneous, changes were made in batches. Pros:
Cons: