Answer.

Working with hierarchical structures is a classic case for relational databases. Examples include: directories, tree menus, organizational structures.

Background: Databases use a tabular model. There are several approaches for tree-like data, each with its advantages and disadvantages.

Problem: The standard parent_id model leads to slow recursive SELECTs. Real tasks (finding all descendants, paths, subtree counting) require optimization.

Solutions:

Adjacency List — a simple reference to parent_id.
Materialized Path — a string representing the full path.
Nested Sets — storing left/right boundaries (left/right) which allows for easy retrieval of subtrees.
Closure Table — a separate table of relationships (from->to) reflecting all the relationships in the tree.

Example of code for Materialized Path (PostgreSQL):

CREATE TABLE categories (
    id SERIAL PRIMARY KEY,
    name TEXT,
    path TEXT
);

-- Example of storing path: '1/2/5/' (root/subcategory/current)
SELECT * FROM categories WHERE path LIKE '1/2/%'; -- all descendants of 2

Example of code for Nested Sets:

CREATE TABLE nested_categories (
    id SERIAL PRIMARY KEY,
    name TEXT,
    lft INT NOT NULL,
    rgt INT NOT NULL
);

-- Child nodes
SELECT * FROM nested_categories WHERE lft > 2 AND rgt < 15;

Key features:

Adjacency List is convenient for simple tree structures (low nesting).
Materialized Path — fast subtree retrieval, easy to implement.
Nested Sets — instant access to all descendants, fast reads, but complex modification.

Trick questions.

Can we simply use a nested SELECT with parent_id to find all descendants to any depth?

This works only for shallow depths. For recursive searching, either a recursive CTE (WITH RECURSIVE) or more complex schemes are required, as simple JOINs lead to an enormous number of requests and poor performance.

Example:

WITH RECURSIVE cte AS (
    SELECT id, parent_id, name FROM categories WHERE id = 1
    UNION ALL
    SELECT c.id, c.parent_id, c.name
    FROM categories c
    INNER JOIN cte ON c.parent_id = cte.id
)
SELECT * FROM cte;

Why does the molecular tree (Nested Sets) quickly extract a subtree, but slowly add/remove nodes?

When modifying the tree, the lft/rgt values must be changed for many rows (the change step — all values greater than the insert/delete). The approach is ideal for reading, but for frequent changes, consider other methods.

When to use a Closure Table, instead of parent_id or materialized path?

Closure Table works perfectly for multiple requests for descendants at any level and regular relationship counting. The downside is that it requires more space.

Example:

CREATE TABLE closure (
    ancestor INT,
    descendant INT,
    depth INT
);

Typical mistakes and anti-patterns

Storing hierarchy only through parent_id, when fast searching for nested structures is needed.
Manually recalculating paths or lft/rgt without auxiliary functions.
Lack of indexing for key columns (parent_id/path/lft/rgt).

Real-life example

Negative case

In an online store, categories are implemented through parent_id. All nesting is done manually, searching for subcategories involves nested SELECTs.

Pros:

Simplicity, no advanced support required.

Cons:

Performance drops even at 3-4 levels of nesting.
Node moving operations are non-intuitive, leading to logical errors.

Positive case

Uses materialized path or closure table. All queries for nested categories are instantaneous, recalculation is done through batch scripts.