Clustering lucene

Beside using Apache Solr or Katta, this article describes many ways to cluster a Lucene index:

  1. Use a shared file system between all nodes, and use FSDirectory.
  2. Use indexes on the nodes local file system and a synchronization strategy.
  3. Use a database using JDBCDirectory
  4. Use a distributed file system (eg Google File System, Nutch Distributed File System)
  5. Use a local cache with backup in the Database

Some other ways to distribute the index are discussed here. A document written at HP describes a parallel, distributed free text index called Distributed Lucene. This document from IBM gives some feelings about scaling-out versus scaling up using Nutch and Lucene.

A novel way is to use TerraCotta and Compass to cluster the index as described here.

Simple Thrift tutorial

I briefly explain here how to build a simple Thrift application in Java that returns the time on the server.

First, define the Thrift interface in file time.thrift:

# time.thrift
namespace java tserver.gen
typedef i64 Timestamp
service TimeServer {
   Timestamp time()
}

Generate the Java code:

thrift --gen java time.thrift

Import the generated code into your workspace. Now, we will implement the interface in the file TimeServerImpl.java:

Continue reading

Java NIO really faster than traditional IO ?

According to Jeremy Manson and Paul Tyma, NIO is not faster than the old traditional IO. Paul Tyma argues that since the Native Posix Thread Library (NPTL) arrived in Linux 2.6, multithreading is so cheap that it outperforms the select-based NIO alternative. He goes on to quote some impressive benchmarks from Rahul Bhargava, which show that multithreading gives at least 25% greater throughput that NIO, in a test with 1,700 concurrent connections.

Finally, here are the slides of an interesting presentation given by PaulTyma.

Review of distributed key-value stores

According to the blog entry of Richard Jones, the most mature distributed stores are:

Project-Voldemort

Despite being the winner of Richard Jones’ comparison, Voldemort presents several drawbacks:

  • JVM-only API
  • purely key-value, no structured storage
  • no dynamic group membership

Dynomite

Dynomite is currently used in production at powerset.com to serve images and supports thrift clients, dynamic adding of nodes. Unlike Voldemort it keeps all of the read repair, replication, and concurrency in the server, keeping the client code as simple as possible.  Although this project seems very interesting, I would like to see operational reports from Microsoft about its scalabiliy and performance.

Scalaris

Currently, Scalaris has no disk persistence: the system works only in-memory. They reported the following performance: 2500 transactions per second with 16 servers. Voldemort reports much better performance.

An issue with Scalaris as mentionned by Werner Vogels is that under realistic failures scenarios, even with 3 phase paxos, commit consitency can only be guaranteed by not taking writes. Hence, Scalaris could not be used in a sales-oriented environment like Amazon where write availability is more important than consistency.

Cassandra

Cassandra is not part of Richard Jones’ winner list due to the fact that the community and documentation about it is not much present.

Cassandra was accepted into Apache Incubator on 02.01.2009.  I think that the community around Cassandra will quickly grow during the next months. Moreover, the mailing lists are already more active that those of Voldemort. Here is a summary of its design choices and architecture.

Cassandra has some advantages over Voldemort:

  • provides structured storage (not only key-value as Voldemort)
  • Voldemort does not support dynamic group membership: one may not
    be able to add nodes to the cluster without bringing the cluster down. Cassandra uses dynamic group membership algorithm based on Gossip.
  • Thrift as a client protocol, and not only JVM API. Thrift was also accepted into Apache Incubator.
  • Hadoop integration