Unwanted white space after apply redactions #4709
Hi, thanks for the library, which is great.
I'm currently using this library for document translation, and I ran into an issue when wiping out the original text before inserting the translated text. I'm using pymupdf==1.26.4
and the following code to redact the original text:
page.add_redact_annot(p['rect'], text="")
page.apply_redactions()
page.clean_contents()
The original text is removed, but sometimes a white rectangle is left behind after the redactions are applied, see below.
[Screenshot 2025-09-23 155659]
The original text and rect are shown below.
[Screenshot 2025-09-23 155918]
Most of the time it works, but sometimes it doesn't. For example, only two of the redacted boxes have the unwanted white rectangles, as highlighted with blue rectangles below:
[Image]
I believe my issue is related to this discussion. I have tried disabling the fill color by using:
page.add_redact_annot(p['rect'], text="", fill=False)
I also tried to reduce the chance of overlap with:
fitz.TOOLS.set_small_glyph_heights(True)
However, neither method works and the issue persists.
My Configuration
- Windows 11
- Python 3.9.5
- Pymupdf==1.26.4
The issue is not how you are adding the redaction annotations.
The redaction annotations merely describe the area from which you want to remove content.
The issue is how you apply the redactions. Try:
page.apply_redactions(images=0, graphics=0, text=pymupdf.PDF_REDACT_TEXT_REMOVE)
That will leave images and graphics untouched, while still removing the text.
Please let us know whether that solves the problem for you.
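For readers who want to reuse the suggestion in the reply above, here is a minimal sketch of the whole redaction step. The helper name `wipe_text` and the constant names are hypothetical and chosen for illustration only. The integer values are the modes of `Page.apply_redactions` as I understand them from the PyMuPDF documentation, so please check them against the version you have installed.

```python
# Integer modes for Page.apply_redactions, as I understand the PyMuPDF docs.
# The constant names below are hypothetical, chosen only for readability.
IMAGES_IGNORE = 0        # leave images untouched
GRAPHICS_IGNORE = 0      # leave vector graphics untouched
GRAPHICS_IF_COVERED = 1  # remove graphics fully inside a redaction rect
TEXT_REMOVE = 0          # remove text inside the redaction rect


def wipe_text(page, rects, graphics=GRAPHICS_IGNORE):
    """Remove text inside `rects` and leave images alone (hypothetical helper).

    `page` is expected to behave like a pymupdf.Page.
    """
    for rect in rects:
        # fill=False asks for no fill color in the redacted area
        page.add_redact_annot(rect, text="", fill=False)
    # The arguments here decide what else is removed besides the text.
    return page.apply_redactions(
        images=IMAGES_IGNORE, graphics=graphics, text=TEXT_REMOVE
    )


# Possible usage (file names and coordinates are made up):
#   import pymupdf
#   doc = pymupdf.open("input.pdf")
#   wipe_text(doc[0], [pymupdf.Rect(72, 72, 300, 100)])
#   doc.save("output.pdf")
```

The point of the sketch is that `add_redact_annot` only marks the areas, while the arguments to `apply_redactions` control whether images and vector graphics under those areas are affected.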
Hi Robin,
Thanks a lot for your help and explanation. It works perfectly!
[Image]
After exploring the parameters, I also found that graphics=1 works best for my case, since text generated by WordArt in Word sometimes contains not only the text but also a layer of vector graphics. Using graphics=1 gets rid of those as well.
Again, thank you very much!
Best
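The difference between the `graphics` modes discussed in this thread can be pictured with a small geometric model. This is a simplified sketch based on the documented meaning of the modes (0 ignores graphics, 1 removes graphics fully covered by the redaction rect, 2 removes graphics that overlap it at all). It is not the library's actual implementation, and the function names and coordinates are made up for illustration.

```python
def contains(outer, inner):
    """True if rect `inner` lies fully inside rect `outer` (x0, y0, x1, y1)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def overlaps(a, b):
    """True if rects `a` and `b` share some area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def graphic_is_removed(redact_rect, graphic_bbox, graphics_mode):
    """Simplified model of which vector graphics a redaction removes."""
    if graphics_mode == 0:   # ignore graphics
        return False
    if graphics_mode == 1:   # only graphics fully covered by the rect
        return contains(redact_rect, graphic_bbox)
    if graphics_mode == 2:   # any graphic the rect overlaps
        return overlaps(redact_rect, graphic_bbox)
    raise ValueError(f"unknown graphics mode: {graphics_mode}")


redact = (100, 100, 300, 130)
wordart_outline = (105, 102, 295, 128)  # vector layer drawn under the text
cell_background = (50, 90, 500, 140)    # larger shape that merely overlaps

for mode in (0, 1, 2):
    print(mode,
          graphic_is_removed(redact, wordart_outline, mode),
          graphic_is_removed(redact, cell_background, mode))
```

Under this model, mode 1 removes the WordArt layer that sits entirely inside the redaction rect and keeps larger shapes that only overlap it, which matches the behavior described in the reply above.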