Review of distributed key-value stores

According to the blog entry of Richard Jones, the most mature distributed stores are:

Project-Voldemort

Despite being the winner of Richard Jones’ comparison, Voldemort presents several drawbacks:

  • JVM-only API
  • purely key-value, no structured storage
  • no dynamic group membership

Dynomite

Dynomite is currently used in production at powerset.com to serve images and supports thrift clients, dynamic adding of nodes. Unlike Voldemort it keeps all of the read repair, replication, and concurrency in the server, keeping the client code as simple as possible.  Although this project seems very interesting, I would like to see operational reports from Microsoft about its scalabiliy and performance.

Scalaris

Currently, Scalaris has no disk persistence: the system works only in-memory. They reported the following performance: 2500 transactions per second with 16 servers. Voldemort reports much better performance.

An issue with Scalaris as mentionned by Werner Vogels is that under realistic failures scenarios, even with 3 phase paxos, commit consitency can only be guaranteed by not taking writes. Hence, Scalaris could not be used in a sales-oriented environment like Amazon where write availability is more important than consistency.

Cassandra

Cassandra is not part of Richard Jones’ winner list due to the fact that the community and documentation about it is not much present.

Cassandra was accepted into Apache Incubator on 02.01.2009.  I think that the community around Cassandra will quickly grow during the next months. Moreover, the mailing lists are already more active that those of Voldemort. Here is a summary of its design choices and architecture.

Cassandra has some advantages over Voldemort:

  • provides structured storage (not only key-value as Voldemort)
  • Voldemort does not support dynamic group membership: one may not
    be able to add nodes to the cluster without bringing the cluster down. Cassandra uses dynamic group membership algorithm based on Gossip.
  • Thrift as a client protocol, and not only JVM API. Thrift was also accepted into Apache Incubator.
  • Hadoop integration

Thoughs on “An Architecture for a Service OrientedPeer-to-Peer System (SOPPS)”

I’ll just give some interesting excerpts related to P2P service discovery.

Summary

“It is based on p2p structures and offers a generic support of services as well as supply mechanisms to enable market management of those services offered”

“The SOPPS architecture needs to support completely different services, including management of content and other resources. Hence, mechanisms such as service fusion, i.e. the combination of existing services to create new, value-added services, and service description are required.”

“Peers are defined as nodes connected to a network, running the respective SOPPS software. SOPPS consists of peers communicating over a network in order to provide or use services.”

SOPPS architecture is based on 3 models:

  • Use model: a service is the provisioning of resources or the execution of tasks of one or more (temporarily provider) peers on behalf of one or more (temporarily user) peers. On the corresponding peer, it invokes functionality to discover an appropriate service and to determine and contact one or more provider peers offering that particular service.
  • Network model: abstracts the Internet into a overlay model
  • Peer model: Each peer has a number of local resources available at the bottom layer. These resources consist of hardware and software resources and can be used either for providing services to other peers or for support of the Core Functionality.

In order to access a service, the following steps need to be performed:

  • Service search phase: user creates a service description describing the content and his peer sends out a service request containing this description to other peers. This request is forwarded by the peers according to a search protocol.
  • Service negotiation phase: After a user has found one or several possible services he enters a negotiation with their providers to define the exact terms of service usage. This includes the fixing of parameters, e.g., the format the content will be encoded in, the data rate for sending the content, the price the user might have to pay for the service.
  • Service usage phase: the user can start using the service. Especially, this includes the remote invocation of methods offered by services during an initialization phase.
  • Application of rules: Throughout all the processes described above the rules management component watches over the fulfillment of previously defined rules.

Comments

Interesting high level view on the architecture. No technical details or solutions are given in the paper.

Paper [pdf]