Unwanted white space after apply redactions #4709

Hi, thanks for the library, which is great.

I'm currently using this library for document translation, and I ran into an issue when wiping out the original text and inserting the new translated text. I'm using pymupdf==1.26.4, and I use the following code to redact the original text:

page.add_redact_annot(p['rect'], text="")
page.apply_redactions()
page.clean_contents() 

As a result, the original text is removed, but sometimes a white rectangle is left behind after the redactions are applied, see below.

Screenshot 2025-09-23 155659

The original text and rect are shown below.
Screenshot 2025-09-23 155918

Most of the time it works, but sometimes it doesn't. For example, only two of the redacted boxes have the unwanted white rectangles, as highlighted with blue rectangles below:
Image

I believe my issue is related to this discussion. I have tried disabling the fill color with:
page.add_redact_annot(p['rect'], text="", fill=False)
and reducing the chance of overlapping rectangles with:
fitz.TOOLS.set_small_glyph_heights(True)

However, neither method works and the issue persists.


My Configuration

  • Windows 11
  • Python 3.9.5
  • PyMuPDF==1.26.4

The issue is not how you are adding the redaction annotations.

The redaction annotations merely describe the area from which you want to remove content.

The issue is in how you apply the redactions.

page.apply_redactions(images=0, graphics=0, text=pymupdf.PDF_REDACT_TEXT_REMOVE)

That will leave images and graphics untouched, while still removing the text.

Please let us know whether that solves the problem for you.
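
For anyone landing here later, the advice above can be put together as a minimal sketch. The helper name wipe_text and the three constant names are made up for illustration. The integer values are the ones I understand PyMuPDF to document for Page.apply_redactions (images=0 ignores images, graphics=0 ignores vector graphics, text=0 removes text), so please check them against your installed version. The function only needs a page object that has add_redact_annot and apply_redactions.

```python
# Assumed integer values for Page.apply_redactions (verify against your PyMuPDF docs).
IMAGES_NONE = 0    # leave images untouched
GRAPHICS_NONE = 0  # leave vector graphics untouched
TEXT_REMOVE = 0    # remove text inside the redaction rectangles


def wipe_text(page, rects):
    """Hypothetical helper: remove only the text inside each rect on `page`.

    Returns the number of redaction annotations that were added.
    """
    count = 0
    for rect in rects:
        # Empty replacement text and no fill color, as in the original post.
        page.add_redact_annot(rect, text="", fill=False)
        count += 1
    if count:
        # Text is removed; images and vector graphics are left alone.
        page.apply_redactions(
            images=IMAGES_NONE, graphics=GRAPHICS_NONE, text=TEXT_REMOVE
        )
    return count
```

With a real document this would be called as wipe_text(page, [p['rect'] for p in paragraphs]), where paragraphs stands for whatever list of text blocks your translation step produces.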


Hi Robin,

Thanks a lot for your help and explanation. It works perfectly!

圖片

After experimenting with the parameters, I also found that
graphics=1
works best for my case, since text generated by WordArt in Word sometimes includes a layer of vector graphics in addition to the text itself. Using graphics=1 gets rid of that layer.
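
In case it helps others, here is a small sketch of how the graphics choice could be made explicit. The mode names ("keep", "covered", "touched") and the helper redaction_kwargs are hypothetical. The integers are the values I understand PyMuPDF to document for the graphics argument (0 = ignore vector graphics, 1 = remove graphics fully covered by the rectangle, 2 = remove graphics the rectangle touches), so treat them as an assumption and check your version.

```python
# Assumed mapping for the `graphics` argument of Page.apply_redactions.
GRAPHICS_MODES = {
    "keep": 0,     # leave vector graphics untouched
    "covered": 1,  # remove graphics fully inside a redaction rectangle
    "touched": 2,  # remove graphics that overlap a redaction rectangle at all
}


def redaction_kwargs(graphics="keep"):
    """Hypothetical helper: build keyword arguments for page.apply_redactions.

    Images are always left untouched (images=0) and text is always removed
    (text=0); only the handling of vector graphics varies.
    """
    if graphics not in GRAPHICS_MODES:
        raise ValueError(f"unknown graphics mode: {graphics!r}")
    return {"images": 0, "graphics": GRAPHICS_MODES[graphics], "text": 0}
```

For the WordArt case this would be used as page.apply_redactions(**redaction_kwargs("covered")), which is the same as passing graphics=1.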

Again, thank you very much!

Best

Answer selected by katcom