- Use a file system shared between all nodes, with FSDirectory.
- Keep an index on each node's local file system and use a synchronization strategy.
- Store the index in a database, using JDBCDirectory.
- Use a distributed file system (e.g. Google File System, Nutch Distributed File System).
- Use a local cache with a backup in the database.
Some other ways to distribute the index are discussed here. A document written at HP describes a parallel, distributed free-text index called Distributed Lucene. A document from IBM offers some perspective on scaling out versus scaling up with Nutch and Lucene.
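
As an illustration of the second option (local indexes plus a synchronization strategy), here is a minimal sketch of a one-way mirror from a master index directory to a node's local copy. It uses only the Java standard library and is not part of Lucene. The class name `IndexSync` and the method `sync` are hypothetical. The sketch assumes the index is a flat directory of files, that a file with the same name and size is unchanged, and that nothing writes to the master directory while the copy runs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: mirrors a master index directory onto a node-local directory.
public final class IndexSync {
    private IndexSync() {}

    // Lists the regular files directly inside dir (no recursion).
    private static List<Path> regularFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }

    // Copies new or size-changed files from master to local, then deletes
    // local files that no longer exist on the master.
    // Returns the number of files copied.
    public static int sync(Path master, Path local) throws IOException {
        Files.createDirectories(local);
        Set<String> masterNames = new HashSet<>();
        int copied = 0;

        for (Path src : regularFiles(master)) {
            String name = src.getFileName().toString();
            masterNames.add(name);
            Path dst = local.resolve(name);
            // Assumption: same name and same size means the file is unchanged.
            if (!Files.exists(dst) || Files.size(dst) != Files.size(src)) {
                Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }

        for (Path stale : regularFiles(local)) {
            if (!masterNames.contains(stale.getFileName().toString())) {
                Files.delete(stale);
            }
        }
        return copied;
    }
}
```

Each node would call `IndexSync.sync(masterDir, localDir)` on a schedule and reopen its searcher on the local directory afterwards. A real deployment would also need to copy from a consistent snapshot of the master index rather than a directory that is still being written.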