Reclaiming Disk Space in PostgreSQL after DELETE operation
- 736 words
- 4 minutes read
- Updated on 4 Sep 2024
The above is a simple table with a corresponding index that we are going to use to describe
the data deletion flow using two methods: DELETE
and TRUNCATE
,
and how those methods affect physical space after completion.
What Is the Impact of DELETE and TRUNCATE on Disk Space?
There are scenarios where it becomes necessary to occasionally remove all the data from a table.
To achieve this, you can use either DELETE
or TRUNCATE
operation.
DELETE
operation may be more favorable when there are frequent queries against the table that is being cleaned up.
That’s because data is removed without waiting for all transactions to complete,
allowing them to still access the table rows which were marked for deletion.
In comparison, TRUNCATE
operation
violates the strict MVCC (multiversion concurrency control) semantics.
This may lead to inconsistency between the contents of the truncated table and other tables in the database.
A problem arises when dealing with tables containing gigabytes of data.
PostgreSQL does not release the freed memory back to the operating system after DELETE
operation.
To make it happen, you have to request it explicitly through the execution of VACUUM FULL
, as shown in Figure 2.
TRUNCATE
operation, on the other hand, immediately reclaims physical memory after completion, as shown in Figure 3.
Comparing Storage Space: Before and After Applying DELETE vs. TRUNCATE Operations
Let’s create a table and compare the storage space before and after applying both operations.
First, create initial table with some data and check its size:
CREATE TABLE books (id serial PRIMARY KEY, title VARCHAR (255) UNIQUE NOT NULL);
INSERT INTO books (title) VALUES ('Software Architecture: The Hard Parts');
INSERT INTO books (title) VALUES ('Fundamentals of Software Architecture');
INSERT INTO books (title) VALUES ('High Output Management');
INSERT INTO books (title) VALUES ('The History of Philosophy');
INSERT INTO books (title) VALUES ('Cassandra: The Definitive Guide');
SELECT pg_size_pretty(pg_total_relation_size('books'));
pg_size_pretty
----------------
40 kB
(1 row)
Next, execute TRUNCATE
operation and check the size:
TRUNCATE TABLE books;
SELECT pg_size_pretty(pg_total_relation_size('books'));
pg_size_pretty
----------------
16 kB
(1 row)
Now, do the same with DELETE
operation:
DELETE FROM books;
SELECT pg_size_pretty(pg_total_relation_size('books'));
pg_size_pretty
----------------
40 kB
(1 row)
As observed, the table size after TRUNCATE
is 16 kB
, when after DELETE
it is still 40 kB
.
Operating system thinks that the occupied disk space is still in use after DELETE
, while TRUNCATE
immediately reclaims it.
Using DELETE and VACUUM FULL to Emulate TRUNCATE Behavior
You can achieve a similar behavior to TRUNCATE
using DELETE
. Execute DELETE
followed by
VACUUM FULL
, as shown below. Keep in mind that because VACUUM FULL
places an ACCESS EXCLUSIVE
lock,
it may take some time to acquire that lock initially.
VACUUM (FULL, ANALYZE) books;
SELECT pg_size_pretty(pg_total_relation_size('books'));
pg_size_pretty
----------------
16 kB
(1 row)
Furthermore, when executing VACUUM FULL
,
it utilizes additional disk space roughly equivalent to the size of the table.
That is because the previous copy of the table cannot be released until the new one is completed.
Summary
Following a DELETE
operation, a VACUUM FULL
is essential to release space, but it necessitates exclusive access and temporary utilization of extra disk space.
TRUNCATE
quickly recovers physical disk space, while DELETE
does not free up physical space unless a VACUUM FULL
operation is performed.
DELETE
is recommended for tables that are frequently queried, as it enables ongoing transactions to access rows that are scheduled for deletion.
In the case of large tables where data is regularly removed, TRUNCATE
is more effective and easier to use compared to DELETE
followed by VACUUM FULL
.
Check routine vacuuming documentation which gives more details about VACUUM operations and their automation.
Frequently Asked Questions (FAQ)
Q: How Do I Reclaim Disk Space in PostgreSQL After Deleting Data?
A: After a DELETE
operation in PostgreSQL, you can reclaim disk space by executing the VACUUM FULL
command on the affected table, which cleans up dead tuples and frees space for future use.
Q: What Causes Bloat in PostgreSQL Databases?
A: Bloat in PostgreSQL databases is caused by accumulating dead tuples - rows marked for deletion but not yet removed by vacuuming - leading to unnecessary disk space usage and performance issues.
Q: Is Table Partitioning Effective Against PostgreSQL Bloat?
A: Partitioning tables in PostgreSQL can effectively manage bloat by segregating hot and cold data, enhancing query performance and easing maintenance efforts.