Indri

Indri is a search engine that provides state-of-the-art text search and a rich structured query language for text collections of up to 50 million documents (single machine) or 500 million documents (distributed search). It is available for Linux, Solaris, Windows, and Mac OS X.

Indri is no longer under active development. Please check out our latest project, Lucindri, which implements the Indri search logic on top of the Lucene search engine.
Lucindri Home Page
Lucindri Source Code on Github

Features

Powerful Query Interface

Supports popular structured query operators from INQUERY
Suffix-based wildcard term matching
Field retrieval
Passage retrieval
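
To make the query features above more concrete, here is a minimal sketch of what structured queries can look like. The Python helpers (combine, ordered_window, in_field, wildcard, passages) are hypothetical names introduced only for illustration; they build query strings and do not call Indri. The operator spellings follow the Indri query language as commonly documented (#combine, #N ordered windows, term.field, a trailing * for suffix wildcards, and passage extents in square brackets), so check them against the query language reference for your release before relying on them.

```python
# Hypothetical string-building helpers; nothing here talks to an Indri index.

def combine(*parts):
    """Wrap sub-queries in #combine, which merges their scores."""
    return "#combine(" + " ".join(parts) + ")"

def ordered_window(n, *terms):
    """#n(...): the terms must occur in order inside a window of width n."""
    return f"#{n}(" + " ".join(terms) + ")"

def in_field(term, field):
    """term.field: match the term only inside the named field."""
    return f"{term}.{field}"

def wildcard(prefix):
    """prefix*: suffix-based wildcard, matching terms that start with prefix."""
    return f"{prefix}*"

def passages(width, step, *parts):
    """Score fixed-size passages (width words, advancing by step words)."""
    return f"#combine[passage{width}:{step}](" + " ".join(parts) + ")"

# An exact phrase, a field-restricted term, and a wildcard in one query.
query = combine(
    ordered_window(1, "white", "house"),
    in_field("president", "title"),
    wildcard("elect"),
)
print(query)

# The same idea applied to passage retrieval.
passage_query = passages(100, 50, "white", "house")
print(passage_query)
```

The resulting strings could then be passed to Indri's command line tools or API as the query text; the helpers exist only to keep the operator syntax in one place.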

Flexible Indexing and Document Support

Supports UTF-8 encoded text
Language-independent tokenization of UTF-8 encoded documents
Parses PDF, HTML, XML, and TREC documents
Word and PowerPoint parsing (Windows only)
Text Annotations
Document Metadata

Package Versatility

Open source, with a flexible BSD-inspired license
Includes both command line tools and a Java user interface
API can be used from Java, PHP, or C++
Works on Windows, Linux, Solaris and Mac OS X

Scalability and Efficiency

Best-in-class ad hoc retrieval performance
Can be used on a cluster of machines for faster indexing and retrieval
Scales to terabyte-sized collections

Download

Indri can be obtained from the SourceForge Lemur Project Page.

Release History

The first version (1.0) of Indri was released in January 2002, and subsequent releases were made two to three times a year. Release notes for the current release can be found on SourceForge.