Clustering Lucene

Besides using Apache Solr or Katta, this article describes several ways to cluster a Lucene index:

  1. Use a file system shared by all nodes, accessed through FSDirectory.
  2. Use indexes on each node's local file system, kept in step by a synchronization strategy.
  3. Store the index in a database using JDBCDirectory.
  4. Use a distributed file system (e.g. Google File System, Nutch Distributed File System).
  5. Use a local cache with a backup in the database.
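
As an illustration of option 2, here is a minimal sketch of one possible synchronization step: a node pulls the files of a master index directory into its local copy. It uses only the JDK, not the Lucene API; the class name IndexSync and the method sync are hypothetical, and comparing files by name and size is an assumption made for brevity. It also assumes no writer is modifying the master directory while the copy runs, which a real deployment would have to guarantee by some other means.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: mirrors a master index
// directory into a node-local replica directory.
public class IndexSync {

    /**
     * Copies every regular file that is missing from the replica or whose
     * size differs from the master's, then deletes replica files that no
     * longer exist on the master. Returns the number of files copied.
     */
    public static int sync(Path master, Path replica) throws IOException {
        Files.createDirectories(replica);
        Set<String> masterNames = new HashSet<>();
        int copied = 0;

        // Pass 1: copy new or changed files from master to replica.
        try (DirectoryStream<Path> files = Files.newDirectoryStream(master)) {
            for (Path src : files) {
                if (!Files.isRegularFile(src)) {
                    continue;
                }
                String name = src.getFileName().toString();
                masterNames.add(name);
                Path dst = replica.resolve(name);
                if (!Files.exists(dst) || Files.size(dst) != Files.size(src)) {
                    Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                    copied++;
                }
            }
        }

        // Pass 2: collect stale replica files first, then delete them.
        List<Path> stale = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(replica)) {
            for (Path dst : files) {
                String name = dst.getFileName().toString();
                if (Files.isRegularFile(dst) && !masterNames.contains(name)) {
                    stale.add(dst);
                }
            }
        }
        for (Path p : stale) {
            Files.delete(p);
        }
        return copied;
    }
}
```

Each node would call IndexSync.sync(masterDir, localDir) on a schedule or after the master commits, then reopen its local index for searching. How the master signals a commit and how readers are reopened is left out of this sketch.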

Some other ways to distribute the index are discussed here. A document written at HP describes a parallel, distributed free-text index called Distributed Lucene. This document from IBM gives some insight into scaling out versus scaling up using Nutch and Lucene.

A novel way is to use Terracotta and Compass to cluster the index, as described here.