ProgrammingBackend Developer

Explain the differences between data types in SQL. What pitfalls may arise from type mismatches, and how does this affect the performance and correctness of queries? Provide examples.

Pass interviews with Hintsage AI assistant

Answer.

In SQL, there are several main data types — string types (e.g., VARCHAR, CHAR, TEXT), numeric types (INT, DECIMAL, FLOAT), and date and time types (DATE, TIMESTAMP). It is crucial to choose the smallest suitable type for particular data to save space and optimize indexing. A common mistake is selecting a type with excessive length (e.g., VARCHAR(255) instead of VARCHAR(50)), which can lead to unnecessary resource consumption.

Another important distinction is numeric types: FLOAT stores approximate values, while DECIMAL stores exact values, which is useful for financial operations. Comparing strings of different collations, for example, can yield unexpected results if COLLATE is not considered.

Example:

-- Example of working with incompatible types declare @price varchar(10) = '100.99'; select @price + 1; -- Error: cannot add a string and a number -- It won't work without explicit type conversion

Trick question.

Question: What happens if you compare a VARCHAR field with a number? For instance: WHERE code = 123? Will the index work quickly?

Answer: SQL will attempt to convert the number to a string and compare as strings. However, if a numeric index is built on this field, it will not be used for string comparison, and the query will slow down. It is recommended to always use strict data types and explicit casting when necessary.

Example:

SELECT * FROM products WHERE code = '123'; -- Index may be used SELECT * FROM products WHERE code = 123; -- Index NOT used, full scan possible

History

In an online store, the "price" field was stored as VARCHAR to support various currency formats. After a couple of years, sorting products by price became necessary: sortings worked incorrectly, and indexing was impossible. A complex migration and type conversion were needed, which took weeks.

History

In one bank, a developer used FLOAT to store amounts. During reporting, discrepancies occurred due to rounding, and after a year this led to significant errors in bonus distribution.

History

In the logs of a startup, date and time were recorded as text (VARCHAR). When calculating intervals became necessary, it turned out that the times were in different formats everywhere, and calculations required laborious normalizing parsing, negatively affecting performance.