Description

Unlike the usual bug pages, this is a collection of problems related to Xapian indexing.

Treat it as working notes made while experimenting with 1.9 rather than as a formal bug report.

Indexer Testing method

An easy way to test the indexing code is to feed it files you already have on the filesystem:

find /some/base/dir -iname "*" >files.lst
# you can also use *.doc or *.pdf to select specific stuff
MoinMoin/script/moin.py --config-dir=/configdir --wiki-url=http://wikiurl/ index build --mode=rebuild --files=files.lst

--files indexing

  • non-ASCII file names lead to Unicode-related exceptions
    • a quick fix is to just catch UnicodeError, so a single bad name doesn't crash the whole indexing run

  • maybe the same problem exists for attachments?
  • generated "item names" are FS//filename (double slash!?)
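
The quick fix mentioned above (catch UnicodeError per file instead of letting it abort the run) could look roughly like the following. This is only a sketch in modern Python 3, not the MoinMoin indexer code: `index_files` and the `index_one` callback are hypothetical names made up for illustration.

```python
import logging


def index_files(paths, index_one):
    """Call index_one(path) for every path; skip paths that raise UnicodeError.

    index_one is a hypothetical callback standing in for the real
    per-file indexing step.
    """
    indexed, skipped = [], []
    for path in paths:
        try:
            index_one(path)
        except UnicodeError as err:
            # Log and continue, so one bad file name does not end the run.
            logging.warning("skipping %r: %s", path, err)
            skipped.append(path)
        else:
            indexed.append(path)
    return indexed, skipped
```

Returning the skipped names keeps the problem visible, so the underlying encoding issue can still be tracked down later.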

PDF filter (poppler, Ubuntu 8.04)

  • hangs indefinitely on a Japanese PDF
  • sometimes terminates with rc=1 although it produced a fair amount of filtered text output
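
Both symptoms could be contained on the calling side: a timeout guards against a filter that never returns, and the captured output can be kept even when the return code is non-zero. The sketch below assumes Python 3 and the standard `subprocess` module. `run_filter` is a hypothetical helper, not how MoinMoin actually invokes its filters, and the 60 second default is an arbitrary choice.

```python
import subprocess


def run_filter(argv, timeout=60):
    """Run an external filter command and return (rc, text, timed_out).

    rc is None when the filter was killed after the timeout expired.
    Whatever text the filter wrote to stdout is returned either way.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        # The child has been killed; keep any partial output it produced.
        text = (exc.stdout or b"").decode("utf-8", "replace")
        return None, text, True
    return proc.returncode, proc.stdout.decode("utf-8", "replace"), False
```

A caller could then decide to index the text whenever it is non-empty, regardless of rc, for example with `run_filter(["pdftotext", path, "-"])`, assuming poppler's `pdftotext` is the filter in use.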

rc 127 for external filters

127 seems to be the rc you get when the shell did not find the filter command. But what if the shell finds the command and the command itself returns 127?
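
One possible way around the ambiguity is to not rely on the shell's 127 at all: look the command up first and run it without a shell, so "not found" and "exited with 127" are reported separately. This is a sketch under that assumption, using Python 3's `shutil.which` and `subprocess.run`. `run_command` is a hypothetical helper, not existing MoinMoin code.

```python
import shutil
import subprocess


def run_command(argv):
    """Return ("missing", None) if argv[0] cannot be found, else ("ran", rc)."""
    exe = shutil.which(argv[0])
    if exe is None:
        # The command is not on PATH (or not executable); nothing was run.
        return "missing", None
    proc = subprocess.run(
        [exe] + list(argv[1:]),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Any return code, including 127, now comes from the command itself.
    return "ran", proc.returncode
```

With this split, a filter that really exits with 127 is reported as `("ran", 127)`, while a missing filter never reaches the point of producing a return code.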


Related: PollAboutXapianSearchIndexingFilters


CategoryMoinMoinBug

MoinMoin: MoinMoinBugs/XapianIndexingProblems (last edited 2009-10-05 15:31:49 by ThomasWaldmann)
