2005-02-07T20:56:53 PA:
> On Feb 07, 2005, at 21:45, Jolan Luff wrote:
> >Is "full text search" some new buzzword that I'm not familiar with?
>
> Perhaps.
>
> http://en.wikipedia.org/wiki/Information_retrieval

That definitely looks like a nice overview, but I don't see a direct answer to the question there.

Doing full-text search with grep(1) is boring; that's a solution that only works well for small amounts of text. For large (that is, interesting) amounts of text, full-text search is a two-pass process. First there's a relatively slow, lengthy pass that builds an index, typically 1/3 to 1/2 the size of the corpus of text being indexed; after that come very fast searches using that index.

For years I used glimpse as my full-text search engine, but it wandered off behind a proprietary license and I lost track of it; lately I've been enjoying swish++ for full-text searching.

These tools are terrific when you want to perform multiple keyword searches across large bodies of text, e.g. all the documentation for all the packages in CPAN; all the RFCs; big email archives; trouble-ticket databases; etc. A freshmeat search for "full-text search" will turn up a lot of 'em.

-Bennett
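
The two-pass process described above (a slow index build, then fast lookups against the index) could be sketched roughly as follows. This is only a toy illustration in Python, not how glimpse or swish++ work internally; the function names (tokenize, build_index, search) and the sample documents are made up for the example, and a real engine would store the index on disk rather than in memory.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Slow pass: map each word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Fast pass: return the documents that contain every query keyword."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical miniature corpus standing in for RFCs, tickets, and so on.
docs = {
    "rfc-a.txt": "Simple Mail Transfer Protocol",
    "rfc-b.txt": "Hypertext Transfer Protocol",
    "ticket-1.txt": "mail server rejects large attachments",
}

index = build_index(docs)                           # done once, up front
print(sorted(search(index, "transfer protocol")))   # repeated cheaply
print(sorted(search(index, "mail")))
```

The point of the split is that build_index reads every document once, while each search only touches the posting sets for its keywords, which is why repeated multi-keyword queries stay fast as the corpus grows.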