7 lessons learned while building Reddit (270 million page views a month)

Very interesting article.


Zoie is another real-time search and indexing system built on Apache Lucene.

From their website:

Zoie is a mature open source project and has been deployed in a real-time large-scale consumer website: LinkedIn.com handling millions of searches as well as hundreds of thousands of updates daily.

All Zoie releases have gone through extensive functional and performance testing by LinkedIn before being made public. All major versions are released after a trial period on the production environment.

In a real-time search/indexing system, a document is made available as soon as it is added to the index. This functionality is especially important for time-sensitive information such as news, job openings, and tweets.
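
To make the "available as soon as it is added" idea concrete, here is a minimal sketch of a toy in-memory inverted index. It is not Zoie's or Lucene's API; the class name RealTimeIndex and its add and search methods are hypothetical, chosen only for illustration. It also ignores updates, deletes, persistence, and concurrency, which a real system has to handle.

```python
from collections import defaultdict


class RealTimeIndex:
    """Toy inverted index (hypothetical, not Zoie): no commit or rebuild step."""

    def __init__(self):
        # Maps each lowercase term to the set of document ids containing it.
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        """Index a document; it is visible to search() as soon as this returns."""
        for term in text.lower().split():
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return the sorted ids of documents that contain every query term."""
        terms = query.lower().split()
        if not terms:
            return []
        # .get() avoids creating empty posting lists for unknown terms.
        hits = set.intersection(*(self._postings.get(t, set()) for t in terms))
        return sorted(hits)


index = RealTimeIndex()
index.add(1, "Senior Java engineer job opening")
print(index.search("java job"))  # [1]

# A document added a moment ago shows up in the very next search.
index.add(2, "Java search engineer job opening posted a minute ago")
print(index.search("java job"))  # [1, 2]
```

The point of the sketch is only the visibility guarantee: there is no gap between adding a document and being able to find it, in contrast to systems that rebuild or swap in a new index periodically.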

NOSQL community

The young “nosql” community met recently in San Francisco. The intro session gave a solid introduction to how distributed, non-relational databases work, and the other talks gave an overview of the various projects out there.

Presentation slides and videos
Intro session – Todd Lipcon, Cloudera (slides, video1, video2)
Voldemort – Jay Kreps, LinkedIn (slides, video1, video2)
Cassandra – Avinash Lakshman, Facebook (slides, video)
Dynomite – Cliff Moon, Powerset (slides, video)
HBase – Ryan Rawson, StumbleUpon (slides, video)
Hypertable – Doug Judd, Zvents (slides, video1, video2)
CouchDB – Chris Anderson, couch.io (slides, video1, video2)

VPork – Jon Travis, SpringSource (slides, video)
MongoDB – Dwight Merriman, 10gen (slides, video)
Infinite Scalability – Jonas S Karlsson, Google (slides, video)

Some videos by Digg’s John Quinn, the rest by Martin Dittus from Last.fm. Pictures by Russ Garrett from Last.fm.

Source: http://blog.oskarsson.nu/2009/06/nosql-debrief.html