Bulk data insertion is a typical task when migrating, importing, or populating large tables. The efficiency of this operation depends on several factors:
Use BULK INSERT (SQL Server) or COPY (PostgreSQL) — these set-based commands are far faster than standard INSERT statements in a loop:

BULK INSERT my_table
FROM 'C:\data\bulkdata.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    BATCHSIZE = 5000,
    TABLOCK
);

The TABLOCK hint takes a single table-level lock, reducing lock conflicts during bulk inserts.
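BULK INSERT and COPY run server-side, but the same principle applies from client code: one set-based call inside a single transaction beats thousands of individual INSERTs. A minimal sketch of that principle, using Python's sqlite3 module as a stand-in database (table and column names are illustrative):

```python
import sqlite3

# Sketch: set-based insert instead of a row-by-row loop.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, payload TEXT)")

rows = [(i, f"row-{i}") for i in range(10_000)]

# One executemany() inside a single transaction replaces 10,000
# separate INSERT statements, each with its own overhead.
with conn:
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
print(count)  # 10000
```

The `with conn:` block commits once at the end, so the whole batch is a single transaction rather than 10,000 autocommitted ones.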
Question: Can indexes be disabled and recreated at any time to speed up bulk insert if the table is involved in transactions?
Answer: No, if the table is involved in active transactions, disabling or recreating indexes can lead to locks, data integrity issues, or even data loss if the transaction is rolled back. This operation should only be performed outside of transactions or planned as part of maintenance windows.
-- Wrong: do not disable and rebuild indexes inside a transaction
BEGIN TRAN;
ALTER INDEX ALL ON my_table DISABLE;
-- ... bulk insert ...
ALTER INDEX ALL ON my_table REBUILD;
COMMIT;
Disabling indexes this way inside a long-running transaction is not acceptable!
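The safe pattern is the opposite: drop (or disable) the secondary index, run the bulk load, and rebuild the index, with each step in its own short transaction. A sketch of that sequence, again with sqlite3 standing in for the real server (index and table names are illustrative):

```python
import sqlite3

# Sketch: drop a secondary index before the load, rebuild it after,
# keeping each step in its own short transaction.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, payload TEXT)")
conn.execute("CREATE INDEX ix_payload ON my_table (payload)")
conn.commit()

conn.execute("DROP INDEX ix_payload")  # step 1: its own transaction
conn.commit()

rows = [(i, f"row-{i}") for i in range(5000)]
with conn:                             # step 2: the bulk load itself
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)

conn.execute("CREATE INDEX ix_payload ON my_table (payload)")  # step 3
conn.commit()

indexes = [r[1] for r in conn.execute("PRAGMA index_list('my_table')")]
print(indexes)  # ['ix_payload']
```

Because no step holds the others' locks, a rollback of the load leaves the schema consistent: either the index exists or the DROP/CREATE simply has not happened yet.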
Story 1: In one project, parallel bulk inserts into a table with several unique indexes led to frequent deadlocks and a sharp performance drop. The solution was to temporarily disable non-key indexes during the import and reduce the batch size.
Story 2: Developers forgot to disable foreign key constraints during the data loading period, and each insert checked for related records in other large tables. This increased loading time from 40 minutes to 9 hours. After disabling constraints, the insert took 12 minutes.
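The cost in Story 2 comes from the per-row lookup that each foreign-key check performs against the parent table. When the incoming data is already validated, that check can be switched off for the duration of the load. A hedged sketch using sqlite3's PRAGMA as a stand-in for disabling constraints (in SQL Server this would be ALTER TABLE ... NOCHECK CONSTRAINT; all table names here are illustrative):

```python
import sqlite3

# Sketch: toggle foreign-key enforcement around a pre-validated bulk load.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE child (id INTEGER, parent_id INTEGER REFERENCES parent(id))"
)
conn.execute("INSERT INTO parent VALUES (1)")
conn.commit()

# With enforcement on, every inserted child row triggers a parent lookup.
# For trusted data: switch off, load, switch back on.
conn.execute("PRAGMA foreign_keys = OFF")
rows = [(i, 1) for i in range(1000)]
with conn:
    conn.executemany("INSERT INTO child VALUES (?, ?)", rows)
conn.execute("PRAGMA foreign_keys = ON")

loaded = conn.execute("SELECT COUNT(*) FROM child").fetchone()[0]
print(loaded)  # 1000
```

The trade-off: while checks are off, nothing stops orphaned rows from entering, so this is only safe for data validated upstream, and re-enabling (with a validation pass where the server supports one) must be part of the load procedure.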
Story 3: Attempting to insert a large file in a single query (without batching and without transactions) caused the transaction log to overflow and crashed the database server. After switching to batch processing, the issue was resolved.
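The fix in Story 3 is to bound the size of any one transaction: load in fixed-size batches and commit after each, so the transaction log never has to absorb the whole file at once. A minimal sketch (BATCH_SIZE is an assumed tuning value; the chunking helper is hypothetical):

```python
import sqlite3
from itertools import islice

BATCH_SIZE = 5000  # assumption: tune to the workload and log capacity

def batches(iterable, size):
    """Yield successive fixed-size chunks from any iterable."""
    it = iter(iterable)
    while chunk := list(islice(it, size)):
        yield chunk

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, payload TEXT)")

rows = ((i, f"row-{i}") for i in range(23_000))  # simulated large input file
for chunk in batches(rows, BATCH_SIZE):
    with conn:  # each batch commits independently
        conn.executemany("INSERT INTO my_table VALUES (?, ?)", chunk)

total = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()[0]
print(total)  # 23000
```

A failure mid-load now loses at most one batch, and restart logic only needs to know which batch was last committed rather than replaying the entire file.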