Unwanted white space after apply redactions · pymupdf/PyMuPDF · Discussion #4709

katcom
Sep 23, 2025

Hi, thanks for the library, which is great.

I'm currently using this library for document translation, and I ran into an issue when wiping out the original text and inserting the new translated text. I'm using pymupdf==1.26.4 and the following code to redact the original text:

page.add_redact_annot(p['rect'], text="")
page.apply_redactions()
page.clean_contents()

The result is that the original text is removed, but sometimes a white rectangle is left behind after applying the redactions, see below.

[Screenshot 2025-09-23 155659]

The original text and its rect are shown below.
[Screenshot 2025-09-23 155918]

Most of the time it works, but sometimes it doesn't. For example, only two of the redacted boxes have the unwanted white rectangles, highlighted with blue rectangles below:
[Image]

I believe my issue is related to this discussion. I have tried disabling the color fill with:
page.add_redact_annot(p['rect'], text="", fill=False)
and reducing the chance of overlaps with:
fitz.TOOLS.set_small_glyph_heights(True)

However, neither approach works and the issue persists.

My Configuration

Windows 11
Python 3.9.5
Pymupdf==1.26.4


robinwatts
Sep 23, 2025
Maintainer

The issue is not how you are adding the redaction annotations.

The redaction annotations merely describe the area that you want to remove content within.

The issue comes with how you apply the redactions.

page.apply_redactions(images=0, graphics=0, text=pymupdf.PDF_REDACT_TEXT_REMOVE)

That will leave images and graphics untouched, while still removing the text.

Please let us know whether that solves the problem for you.

katcom Sep 23, 2025
Author

Hi Robin,

Thanks a lot for your help and explanation. It works perfectly!

[Image]

After exploring the parameters, I also found that graphics=1 works best for my case, since text generated by WordArt in Word sometimes includes not only the text but also a layer of vector graphics. Using graphics=1 gets rid of those as well.

Again, thank you very much!

Best

Answer selected by katcom
