I tried to use update_object() to fix the cross-reference, but an error occurred. #4702

Unanswered
xiaolibuzai-ovo asked this question in Looking for help

code:

for xref in range(1, self.doc.xref_length()):
    try:
        _ = self.doc.xref_object(xref)
    except:
        try:
            self.doc.update_object(xref, "<<>>")
        except Exception as e:
            logger.error(f"save update_object error: {str(e)}, {traceback.format_exc()}")

error:

PageProcess error: RAISEPY() takes 2 positional arguments but 3 were given, Traceback (most recent call last):
 File "/usr/local/trpc/bin/process/pymupdf_process.py", line 152, in save
 _ = self.doc.xref_object(xref)
 File "/usr/local/trpc/bin/lib/pymupdf/__init__.py", line 6032, in xref_object
 ret = extra.xref_object( self.this, xref, compressed, ascii)
 File "/usr/local/trpc/bin/lib/pymupdf/extra.py", line 120, in xref_object
 return _extra.xref_object(*args)
RuntimeError: bad xref
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
 File "/usr/local/trpc/bin/pdf_atom_process_py.py", line 126, in PageProcess
 url = await pymupdf_cli.save(file_id, request.scene, request.namespace_id)
 File "/usr/local/trpc/bin/process/pymupdf_process.py", line 154, in save
 self.doc.update_object(xref, "<<>>")
 File "/usr/local/trpc/bin/lib/pymupdf/__init__.py", line 5829, in update_object
 RAISEPY("bad xref", MSG_BAD_XREF, PyExc_ValueError)
TypeError: RAISEPY() takes 2 positional arguments but 3 were given

version: pymupdf==1.25.5
platform: linux


pymupdf-1.25.5 is very old; please upgrade to the latest version.

In particular, the incorrect call of RAISEPY() with "bad xref" was fixed a while ago.
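
To illustrate why the first traceback ends in a TypeError rather than a "bad xref" error, here is a minimal pure-Python sketch. The helper `raise_py` is a hypothetical stand-in, modelled only on what the tracebacks in this thread show of RAISEPY (two positional arguments, raises `Exception(msg)`); it is not PyMuPDF's actual code, and the value 7 is an arbitrary placeholder.

```python
def raise_py(msg, code):
    """Hypothetical stand-in for an error helper that accepts two arguments."""
    raise Exception(msg)


def call_with_extra_argument():
    # A third positional argument makes the call itself fail, so the helper
    # body never runs and the caller sees TypeError instead of "bad xref".
    try:
        raise_py("bad xref", 7, ValueError)
    except TypeError as exc:
        return type(exc).__name__, str(exc)


def call_with_matching_arguments():
    # With the matching argument count, the intended error surfaces.
    try:
        raise_py("bad xref", 7)
    except Exception as exc:
        return type(exc).__name__, str(exc)
```

With the argument count corrected, the underlying "bad xref" error becomes visible, which matches the traceback later reported for v1.26.4.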


I have already upgraded to the latest version, but the issue still persists.

v1.26.4


PageProcess error: bad xref, Traceback (most recent call last):
File "/usr/local/trpc/bin/process/pymupdf_process.py", line 152, in save
_ = self.doc.xref_object(xref)
File "/usr/local/trpc/bin/lib/pymupdf/__init__.py", line 6095, in xref_object
ret = extra.xref_object( self.this, xref, compressed, ascii)
File "/usr/local/trpc/bin/lib/pymupdf/extra.py", line 120, in xref_object
return _extra.xref_object(*args)
RuntimeError: bad xref

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/trpc/bin/pdf_atom_process_py.py", line 125, in PageProcess
url = await pymupdf_cli.save(file_id, request.scene, request.namespace_id)
File "/usr/local/trpc/bin/process/pymupdf_process.py", line 154, in save
self.doc.update_object(xref, "<<>>")
File "/usr/local/trpc/bin/lib/pymupdf/__init__.py", line 5892, in update_object
RAISEPY("bad xref", MSG_BAD_XREF)
File "/usr/local/trpc/bin/lib/pymupdf/__init__.py", line 17449, in RAISEPY
raise Exception( msg)
Exception: bad xref


I have created a test with your code, and with the current pymupdf-1.26.4 it runs without raising an exception, although MuPDF emits the warning 'repairing PDF document'.

def test_4702():
    path = util.download(
            'https://github.com/user-attachments/files/22403483/01995b6ca7837b52abaa24e38e8c076d.pdf',
            'test_4702.pdf',
            )
    with pymupdf.open(path) as document:
        for xref in range(1, document.xref_length()):
            print(f'{xref=}')
            _ = document.xref_object(xref)
    wt = pymupdf.TOOLS.mupdf_warnings()
    assert wt == 'repairing PDF document'

Is your code modifying the document before it does the xref loop?
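
One way to answer that question is to record every unreadable xref instead of stopping at the first one, and to run the check both before and after the edits. The following is only a sketch: `find_bad_xrefs` is a hypothetical helper name, not a PyMuPDF function, and it relies solely on the `xref_length()` and `xref_object()` calls already used in this thread.

```python
def find_bad_xrefs(doc):
    """Return a list of (xref, error message) pairs for unreadable xrefs.

    `doc` is assumed to be any object that provides xref_length() and
    xref_object(xref), such as an open pymupdf.Document.
    """
    bad = []
    for xref in range(1, doc.xref_length()):
        try:
            doc.xref_object(xref)
        except Exception as exc:  # this thread shows RuntimeError("bad xref")
            bad.append((xref, str(exc)))
    return bad
```

Calling it once right after opening the file and again just before saving would show whether the bad entries already exist in the source PDF or only appear after the modifications.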


Yes, I am working on PDF layout restoration. After translating the English text, I cover the original with a white rectangle and then insert the translated text using insert_htmlbox. However, an error occurs when saving the file.


Something like this.

page = self.doc[page_num]
if del_links:
    for link in page.get_links():
        page.delete_link(link)
for i in range(len(blocks)):
    block = blocks[i]
    white = pymupdf.pdfcolor["white"]
    page.draw_rect(tuple(block.bbox), color=None, fill=white)

for i in range(len(blocks)):
    block = blocks[i]
    try:
        page.insert_htmlbox(tuple(block.bbox), block.text, css=block.css, rotate=block.rotate, archive=ARCHIVE)
    except OverflowError as e:
        logger.error_context(self.ctx, f"restore err: {str(e)}")
        continue

Please post a full reproducer. For example, it needs to specify page_num.
