Description

I hit a UnicodeDecodeError in the xmlrpc_mergeDiff function in MoinMoin/xmlrpc/__init__.py.

The error is raised in the following code:

# generate the new page revision by applying the diff
newcontents = patch(basepage.get_raw_body_str(), decompress(str(diff)))
#print "Diff against %r" % basepage.get_raw_body_str()

# write page
try:
    currentpage.saveText(newcontents.decode("utf-8"), last_remote_rev or 0, comment=comment)
except PageEditor.Unchanged: # could happen in case of both wiki's pages being equal
    pass
except PageEditor.EditConflict:
    return LASTREV_INVALID

Probably the byte-level patch() can break UTF-8 multibyte characters. My wiki is in Russian, so its pages contain multibyte characters. As a result, SyncPages does not work.
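
The suspected failure mode can be sketched in isolation. The snippet below is a minimal illustration in Python 3 syntax, not MoinMoin's patch() implementation: apply_byte_patch and its copy/insert operations are hypothetical stand-ins for a byte-level diff. The same operations give valid UTF-8 on the base they were computed against and cut a Cyrillic character in half on a different base.

```python
# Hypothetical byte-level patch; MoinMoin's real patch()/decompress() differ.
def apply_byte_patch(base, ops):
    """Apply ("copy", offset, length) and ("insert", data) ops to a bytes base."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)

# Base the diff was computed against.
remote_base = "abc привет".encode("utf-8")
# Different content that is wrongly treated as the same base revision.
local_base = "привет abc".encode("utf-8")

# Keep the first 3 bytes of the base, then append new text.
ops = [("copy", 0, 3), ("insert", " мир".encode("utf-8"))]

# On the right base the 3 bytes are "abc", so the result is valid UTF-8.
print(apply_byte_patch(remote_base, ops).decode("utf-8"))  # abc мир

# On the wrong base the 3 bytes are one and a half Cyrillic characters.
try:
    apply_byte_patch(local_base, ops).decode("utf-8")
except UnicodeDecodeError as err:
    print("decode failed:", err.reason)
```

If the copied range had happened to end on a character boundary, the decode would have succeeded and the page would have been silently corrupted instead.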

Steps to reproduce

  1. Make page A in a remote wiki.
    •  Revision 1.1
  2. Do sync
  3. Take full cold backup of the remote wiki.
  4. Edit page A in the remote wiki.
    •  Revision 1.1
       Revision 1.2
  5. Do sync
  6. Restore the remote wiki from backup.
  7. Edit page A in the remote wiki. It will get the same revision number 2, but will have different content.
    •  Revision 1.1
       Revision 2.2
  8. Edit page A in the remote wiki one more time to trigger synchronization.
    •  Revision 1.1
       Revision 2.2
       Revision 2.3
  9. Do sync.
  10. Local page A gets wrong content
    •  Revision 1.1
       Revision 1.2 <-- ERROR should be Revision 2.2
       Revision 2.3

Sorry, no stack trace for these steps, but the page is corrupted. If the page contained multibyte characters, the same steps could produce an invalid character, and that would give a stack trace.
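
The revision-number collision in steps 3 to 7 can be modelled with a toy example. This is only a sketch in Python 3 syntax: ToyWiki is a hypothetical in-memory stand-in for the remote wiki, not MoinMoin code, and the "sync" is reduced to remembering a revision number and its text.

```python
class ToyWiki(object):
    """Hypothetical stand-in: one page stored as a list of revision texts."""

    def __init__(self):
        self.revisions = []

    def edit(self, text):
        self.revisions.append(text)
        return len(self.revisions)  # new revision number

    def get(self, rev):
        return self.revisions[rev - 1]

    def backup(self):
        return list(self.revisions)

    def restore(self, snapshot):
        self.revisions = list(snapshot)


remote = ToyWiki()
remote.edit("Revision 1.1")                             # step 1
snapshot = remote.backup()                              # step 3
synced_rev = remote.edit("Revision 1.1\nRevision 1.2")  # step 4
synced_text = remote.get(synced_rev)                    # step 5: sync remembers rev 2
remote.restore(snapshot)                                # step 6
reused_rev = remote.edit("Revision 1.1\nRevision 2.2")  # step 7

# Same revision number as at sync time, different content.
print(reused_rev == synced_rev)               # True
print(remote.get(synced_rev) == synced_text)  # False
```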

Example

Component selection

  • general

Details

MoinMoin Version

1.9.3

OS and Version

Ubuntu 10.10

Python Version

2.6.6

Server Setup

Standalone

Server Details

Language you are using the wiki in (set in the browser/UserPreferences)

Russian (ru_RU)

Workaround

Unknown

Discussion

2010-11-07

Hit another UnicodeDecodeError with the following stack trace:

...
File "***/MoinMoin/wsgiapp.py", line 195, in handle_action
 handler(context.page.page_name, context)
File "***/MoinMoin/action/SyncPages.py", line 519, in execute
 ActionClass(pagename, request).render()
File "***/MoinMoin/action/SyncPages.py", line 220, in render
 self.sync(params, local, remote)
File "***/MoinMoin/action/SyncPages.py", line 515, in sync
 rpc_aggregator.scheduler(remote.create_multicall_object, handle_page, m_pages, 8, remote.prepare_multicall)
File "***/MoinMoin/util/rpc_aggregator.py", line 73, in scheduler
 call = gen.fetch_call()
File "***/MoinMoin/util/rpc_aggregator.py", line 32, in fetch_call
 next_item = self._gen.next()
File "***/MoinMoin/action/SyncPages.py", line 442, in run
 remote_contents_unicode = remote_contents.decode("utf-8")
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
 return codecs.utf_8_decode(input, errors, True)

I think it is the same error, but it occurred on the local side instead of the remote one. The debugger shows that SyncPages.py tries to reconstruct the current page contents from the local base and the remote diff, while the local base revision and the remote base revision have different contents. The result is garbage. Most such merges probably complete without any error. I was lucky that the patch broke a UTF-8 character, which gave me a chance to notice the page corruption.

I suppose the cause of the issue is frequent wiki backup/restore (same as in MoinMoinBugs/1.9WikiSyncCorruptedSynctags). After a restore, the page revision number may step back. After some edits the page reaches the same revision number it had at sync time, but probably with different content. A subsequent sync will blindly use this different revision in place of the old, lost revision and construct an invalid diff.
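
One way to avoid using the wrong base blindly would be to compare a content digest recorded at sync time with the digest of the revision about to be patched. The sketch below (Python 3 syntax) is only an assumption about a possible guard; content_digest, checked_base and BaseMismatch are hypothetical names, and nothing here is taken from MoinMoin 1.9.3.

```python
import hashlib


class BaseMismatch(Exception):
    """Raised when the stored base revision is not the one the diff expects."""


def content_digest(raw_body):
    # raw_body is the UTF-8 encoded page text (bytes).
    return hashlib.sha1(raw_body).hexdigest()


def checked_base(raw_body, expected_digest):
    """Return raw_body only if it matches the digest recorded at sync time."""
    if content_digest(raw_body) != expected_digest:
        raise BaseMismatch("base revision content changed since last sync")
    return raw_body


# Digest recorded when the two wikis were last synchronized.
recorded = content_digest("Revision 1.1\nRevision 1.2".encode("utf-8"))

# After a restore, the same revision number holds different text.
try:
    checked_base("Revision 1.1\nRevision 2.2".encode("utf-8"), recorded)
except BaseMismatch as err:
    print("refusing to patch:", err)
```

With such a check the sync could fall back to transferring the full page instead of applying a diff to a base that no longer matches.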

Will try to make a test case.

I can't currently reproduce the merge issue, but half of the issue is the same as MoinMoinBugs/1.9WikiSyncCorruptedSynctags. -- ReimarBauer 2012-02-05 12:24:23

Plan

  • Priority:
  • Assigned to:
  • Status:


CategoryMoinMoinBug

MoinMoin: MoinMoinBugs/1.9WikiSyncUnicodeDecodeErrorInMergeDiff (last edited 2012-02-05 12:24:23 by ReimarBauer)