Classic SQL does not provide for storing multiple values in a single cell: the relational model requires normalization. However, modern applications often involve fields such as a "list of tags" or a "rating scale", where it is convenient to operate on several values within a single row. Some DBMSs (PostgreSQL, Oracle) provide an ARRAY data type or similar mechanisms.
Using arrays violates the principle of normalization, complicates many operations (filtering, updating, indexing), and makes the code less portable between DBMSs. Still, arrays can be convenient or even unavoidable, for example for caching or for fast lookups within small lists of values.
```sql
CREATE TABLE products (
    id   SERIAL PRIMARY KEY,
    tags TEXT[]
);

-- Insertion:
INSERT INTO products (tags) VALUES (ARRAY['eco','sale','hot']);

-- Searching in the array:
SELECT * FROM products WHERE 'eco' = ANY (tags);
```
Key features:
Can individual elements of an array be indexed?
In PostgreSQL, yes, via GIN (or GiST) indexes:

```sql
CREATE INDEX idx_tags ON products USING GIN (tags);
```
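A GIN index on an array column accelerates the array containment and overlap operators. Assuming the `products` table above, queries like these can use it:

```sql
-- Containment (@>): rows whose tags include ALL of the listed values
SELECT * FROM products WHERE tags @> ARRAY['eco','sale'];

-- Overlap (&&): rows whose tags share AT LEAST ONE value with the list
SELECT * FROM products WHERE tags && ARRAY['eco','hot'];
```

Note that a `'eco' = ANY (tags)` predicate is not accelerated by the GIN index; rewriting it as `tags @> ARRAY['eco']` lets the planner use the index.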
How can you quickly check whether a value is present in a string column that stores a delimiter-separated list?

Standard SQL has no built-in operation for this, so pattern matching is typically used (here, a `roles` column holding a comma-separated list):

```sql
SELECT * FROM users WHERE concat(',', roles, ',') LIKE '%,admin,%';
```

But this approach is unreliable and slow: it cannot use an ordinary index and breaks on stray spaces or differing case.
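In PostgreSQL specifically, a more robust variant is to convert the string into a real array first and test membership exactly (a sketch, assuming a `users` table whose `roles` column stores a comma-separated list):

```sql
-- Split the delimited string into an array, then test membership exactly,
-- avoiding the false matches that raw LIKE patterns can produce
SELECT * FROM users
WHERE 'admin' = ANY (string_to_array(roles, ','));
```

This still cannot use a plain index on the column, so for anything beyond small tables a real `TEXT[]` column or a normalized table remains the better choice.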
How many values can be stored in an array, and what limits it?
The limit depends on the DBMS. In PostgreSQL, an array is bounded mainly by the maximum size of a single field value (1 GB), so in practice it can hold a very large number of small elements, although query performance degrades long before that limit is reached.
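In PostgreSQL the current element count can be inspected, and an upper bound can be enforced at the schema level; a sketch, where the constraint name and the limit of 100 are illustrative choices:

```sql
-- Number of elements in each row's array
SELECT id, cardinality(tags) AS tag_count FROM products;

-- Hypothetical guard: reject arrays that grow beyond 100 elements
ALTER TABLE products
    ADD CONSTRAINT tags_max_100 CHECK (cardinality(tags) <= 100);
```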
In one e-commerce project, product tags were stored as a comma-separated string in a single column. This severely hindered fast product search by tag, caused filtering errors, and produced duplicate tags due to parsing mistakes.

Pros: minimal schema, no extra tables or types.

Cons: slow and fragile tag search, no integrity guarantees, duplicates caused by parsing errors.
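One way out of that situation is migrating to a normalized link table. A hedged sketch, assuming the old comma-separated data lives in a hypothetical `products.tags_csv` column:

```sql
-- Link table; the composite primary key also prevents duplicate tags per product
CREATE TABLE product_tags (
    product_id INT REFERENCES products(id),
    tag        TEXT NOT NULL,
    PRIMARY KEY (product_id, tag)
);

-- Explode the comma-separated column into rows, trimming stray whitespace
INSERT INTO product_tags (product_id, tag)
SELECT DISTINCT p.id, trim(t)
FROM products p, unnest(string_to_array(p.tags_csv, ',')) AS t
WHERE trim(t) <> '';
```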
In another PostgreSQL project, ARRAY columns with a GIN index were used for small, rarely changing sets (such as user roles); larger sets were moved to a separate roles table.

Pros: compact schema and fast containment queries thanks to the GIN index.

Cons: updating individual elements is awkward, there are no foreign-key integrity guarantees for array elements, and the code is less portable across DBMSs.
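The "separate table for larger sets" variant can be sketched like this (table and column names are illustrative, and a `users` table is assumed):

```sql
CREATE TABLE roles (
    id   SERIAL PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);

CREATE TABLE user_roles (
    user_id INT REFERENCES users(id),
    role_id INT REFERENCES roles(id),
    PRIMARY KEY (user_id, role_id)
);

-- Membership check becomes a join instead of an array lookup
SELECT u.*
FROM users u
JOIN user_roles ur ON ur.user_id = u.id
JOIN roles r      ON r.id = ur.role_id
WHERE r.name = 'admin';
```

This trades the compactness of an array for referential integrity and straightforward per-element updates and deletions.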