Shrinking my Matrix (Synapse) database

Published: 2025-12-27

During my routine backup procedures I wanted to investigate why the Synapse database dump was so big (about 740 MiB) for a Matrix server instance used basically just by me and a few acquaintances. Little did I know that I would dive into a rabbit hole.

Assisted by a useful blog post, I started looking around the Synapse admin documentation and found the links containing all I had to learn (kudos for the clear and useful documentation):

In retrospect, I intervened at four levels:

  1. Reduced the size of the state_groups_state table using an external tool
  2. Manually deleted rooms I was no longer a member of
  3. Deleted local media
  4. Compacted the Postgres database

I treat all my chats as ephemeral, so I have a great advantage: I can trim history aggressively. I am fine deleting everything older than 7 days.

§ Getting data about space used

First I checked how big the DB of the Matrix instance was:

synapse=> \l+

Name      | Owner      | Size    |
----------+------------+---------+
synapse   | db-user    | 3780 MB |

Then I checked which the biggest tables are. The Postgres command \dt+ is quick to remember, but the following SQL query gives a better overview (pg_total_relation_size also counts indexes and TOAST data):

synapse=> select
  table_name,
  pg_size_pretty(pg_total_relation_size(quote_ident(table_name))),
  pg_total_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 3 desc;

table_name           | pg_size_pretty | pg_total_relation_size
---------------------+----------------+------------------------
 state_groups_state  | 1100 MB        |             1152933888
 event_json          | 947 MB         |              992829440
 events              | 406 MB         |              425549824
 event_auth          | 309 MB         |              323715072

§ Shrinking the state_groups_state table

This was a point of intervention I didn't expect. Apparently this table grows unwieldy by design, because the server (quoting the docs) "need(s) to be able to relatively quickly calculate the state of a room at any point in that room's history. In other words, we need to know the state of the room at each event in that room".

To reduce the size of this table, Element suggests a dedicated tool, rust-synapse-compress-state.
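As a sketch, the automatic compressor shipped with that repo can be invoked along these lines. The connection string is a placeholder (adjust user, password and database name), and the chunk parameters follow the example in the tool's README:

```shell
# Connection string for the Synapse database: placeholder credentials,
# adjust user, password, host and database name to your setup.
DB_URL="postgresql://db-user:secret@localhost/synapse"

# Run the auto compressor if it is installed. Per the tool's README,
# -c is the number of state groups to load per chunk and -n the number
# of chunks to process in this run.
if command -v synapse_auto_compressor >/dev/null 2>&1; then
    synapse_auto_compressor -p "$DB_URL" -c 500 -n 100
fi
```

The tool is incremental: it remembers where it stopped, so it can be re-run periodically with small chunk counts.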

After running the tool, the table didn't shrink much; I didn't note down the exact numbers, but it went down by maybe 10%. Your mileage may vary: the result is probably very sensitive to how many users use (or federate with) your server, and thus how many events it has to keep track of.

This table could be further reduced but I wanted to stay on the safe side and only use officially endorsed tools.

§ Deleting rooms I am no longer in

I was not convinced by these (relatively) small wins, so I dug further. Which rooms take the most DB space? (room IDs are fictional):

synapse=> SELECT s.canonical_alias, g.room_id, count(*) AS num_rows
FROM
  state_groups_state AS g,
  room_stats_state AS s
WHERE g.room_id = s.room_id
GROUP BY s.canonical_alias, g.room_id
ORDER BY num_rows desc
LIMIT 10;
 canonical_alias |               room_id                | num_rows
-----------------+--------------------------------------+----------
                 | !rv51imyvsxk7won3jq:jki.re           |  3066094
                 | !q9n2izqzlu1u7awqd8:libera.chat      |   942549
                 | !gq3kibrecs2i3homk1:matrix.org       |   556880
                 | !tnyqy76xwpk8vto281:matrix.org       |   284976
                 | !f1ffkg07cmht5fzfgm:matrix.org       |    99785
                 | !tkvo8kwfbgfvz7o26s:entropia.de      |    87835
                 | !vtswnxw50382f2cf72:greyface.org     |    59242
                 | !cvc8hpga6vd40ozfy9:fachschaften.org |    56507
                 | !ihfxg48g1mwhm1fl2h:matrix.org       |    26956
                 | !ugdala480vtb2rqfc3:computer.surgery |    25465

Hmm ... I don't think I am in all those rooms. I can't be sure, but it looks like when I leave a room (or maybe a DM), the relevant events are not purged. I don't want to hand-craft SQL to further clean up state_groups_state, so I'll use another - blessed - approach.

I will just delete the rooms I am not in anymore using the purge_history API.

The following instructions are for "educational" purposes; I found it much easier to use the script matrix_prune_messages_v1.sh with a little addition: as per the documentation, one can also delete local events (which I am fine with) by adding {"delete_local_events": true} to the payload.

Example: create a timestamp of 1 year ago (in milliseconds, as the API expects, hence the trailing 000):

$ date --date='365 days ago' +%s000
1735246662000
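Before feeding such a timestamp to a destructive purge, it can be sanity-checked by converting it back to a readable date:

```shell
# Convert the millisecond timestamp back to a readable UTC date as a
# sanity check; the division strips the milliseconds date(1) doesn't expect.
ts=1735246662000
date --date="@$((ts / 1000))" --utc +%Y-%m-%d
```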

Start purging events older than 1 year:

curl -X POST http://127.0.0.1:8008/_synapse/admin/v1/purge_history/<room_id> \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
    -d '{ "delete_local_events": true, "purge_up_to_ts": 1735246662000 }' \
    | jq .

The server returns a purge_id that can be polled to check when the operation finishes.

curl http://127.0.0.1:8008/_synapse/admin/v1/purge_history_status/<purge_id> \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    | jq .
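Rather than polling by hand, a small loop can wait for completion. This is a sketch: the endpoint and status values ("complete", "failed") are from the admin API docs, while the 5-second interval and the simple pattern match on the raw JSON (instead of parsing it with jq) are my own choices:

```shell
# Poll purge_history_status until the purge completes or fails.
poll_purge() {
    purge_id="$1"
    while true; do
        body=$(curl -s "http://127.0.0.1:8008/_synapse/admin/v1/purge_history_status/$purge_id" \
            -H "Authorization: Bearer $TOKEN")
        case "$body" in
            *'"complete"'*) echo "purge $purge_id complete"; return 0 ;;
            *'"failed"'*)   echo "purge $purge_id failed" >&2; return 1 ;;
        esac
        sleep 5
    done
}
```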

I have been running this server for many years, so this brought the number of records down considerably (as expected).

§ Deleting local media

Another thing I am not interested in is storing media files. I removed all media files older than a week.

Timestamp of 7 days ago:

$ date --date='7 days ago' +%s000
1766186437000

Delete the files:

curl -X POST 'http://127.0.0.1:8008/_synapse/admin/v1/media/delete?before_ts=1766186437000' \
    -H "Authorization: Bearer $TOKEN" \
    -H "Content-Type: application/json" \
    | jq .

This immediately reclaimed about 1 GiB of disk space (which is a lot, considering how few users my server has).

§ Compacting the Postgres database

And now the cherry on top: running VACUUM FULL frees all that reclaimed space back to the operating system. Note that VACUUM FULL rewrites each table under an exclusive lock, so it's best done while Synapse is stopped or idle.

§ Conclusions

In the end the size of the database dump went down from 744 MiB to 104 MiB: that's a great win! I can't imagine how much space one could reclaim on a much busier server.

The next step is probably to set up a procedure (better yet, a script) to automate these actions and run them maybe once a year.
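A rough sketch of what such a script could look like, combining the purge and media-deletion calls above. The room IDs and the 7-day cutoff are placeholders; a real version would probably fetch the room list from the rooms admin API and finish with the compressor and a VACUUM:

```shell
#!/bin/sh
# Sketch of a periodic cleanup routine. Assumes the admin token in $TOKEN
# and Synapse listening on 127.0.0.1:8008; the room IDs are placeholders.
cleanup() {
    base="http://127.0.0.1:8008/_synapse/admin/v1"
    cutoff=$(date --date='7 days ago' +%s000)

    # Purge old history (including local events) from each room.
    for room in '!example1:example.org' '!example2:example.org'; do
        curl -s -X POST "$base/purge_history/$room" \
            -H "Authorization: Bearer $TOKEN" \
            -H "Content-Type: application/json" \
            -d "{ \"delete_local_events\": true, \"purge_up_to_ts\": $cutoff }"
    done

    # Delete local media older than the same cutoff.
    curl -s -X POST "$base/media/delete?before_ts=$cutoff" \
        -H "Authorization: Bearer $TOKEN"
}
```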