In SQL, there are several main data types — string types (e.g., VARCHAR, CHAR, TEXT), numeric types (INT, DECIMAL, FLOAT), and date and time types (DATE, TIMESTAMP). It is crucial to choose the smallest suitable type for particular data to save space and optimize indexing. A common mistake is selecting a type with excessive length (e.g., VARCHAR(255) instead of VARCHAR(50)), which can lead to unnecessary resource consumption.
Another important distinction is numeric types: FLOAT stores approximate values, while DECIMAL stores exact values, which is useful for financial operations. Comparing strings of different collations, for example, can yield unexpected results if COLLATE is not considered.
Example:
-- Example of working with incompatible types declare @price varchar(10) = '100.99'; select @price + 1; -- Error: cannot add a string and a number -- It won't work without explicit type conversion
Question: What happens if you compare a VARCHAR field with a number? For instance: WHERE code = 123? Will the index work quickly?
Answer: SQL will attempt to convert the number to a string and compare as strings. However, if a numeric index is built on this field, it will not be used for string comparison, and the query will slow down. It is recommended to always use strict data types and explicit casting when necessary.
Example:
SELECT * FROM products WHERE code = '123'; -- Index may be used SELECT * FROM products WHERE code = 123; -- Index NOT used, full scan possible
History
VARCHAR to support various currency formats. After a couple of years, sorting products by price became necessary: sortings worked incorrectly, and indexing was impossible. A complex migration and type conversion were needed, which took weeks.History
FLOAT to store amounts. During reporting, discrepancies occurred due to rounding, and after a year this led to significant errors in bonus distribution.History
In the logs of a startup, date and time were recorded as text (VARCHAR). When calculating intervals became necessary, it turned out that the times were in different formats everywhere, and calculations required laborious normalizing parsing, negatively affecting performance.