
The metadata JSON file contains the schemas for all snapshots. I have a few tables with thousands of columns, so the metadata JSON quickly grows to 1 GB, which impacts the Trino coordinator. Currently I have to manually remove the schemas of older snapshots.

I already run maintenance tasks (via Spark) to expire snapshots, but this does not clean the schemas of older snapshots from the latest metadata.json file.

How can this be fixed?

asked Dec 20, 2025 at 7:06

1 Answer


The clean_expired_metadata option was added to the expire_snapshots procedure in Iceberg 1.10.0.

When true, cleans up metadata such as partition specs and schemas that are no longer referenced by snapshots.

Example:

CALL {catalog}.system.expire_snapshots(table => '{table_name}', clean_expired_metadata => true)
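As a fuller sketch, the flag can be combined with the procedure's other documented arguments such as older_than and retain_last; the catalog, table name, timestamp, and retention count below are placeholders to adapt to your own maintenance schedule:

-- {catalog} and {table_name} are placeholders
CALL {catalog}.system.expire_snapshots(
  table => '{table_name}',
  older_than => TIMESTAMP '2025-11-01 00:00:00',  -- expire snapshots older than this cutoff
  retain_last => 5,                               -- but always keep the most recent 5 snapshots
  clean_expired_metadata => true                  -- also drop schemas/partition specs no longer referenced
)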
answered Dec 29, 2025 at 8:59