NOTE: Luke Lu pointed out in the comments that the benchmark was unfair to Hypertable because it uses the Thrift API, which is much slower than the native C++ API. Please interpret these results with that caveat in mind. I will try to do a better apples-to-apples comparison at some point.
Those of us who eat and breathe HBase at Facebook were curious to see how our internal version of HBase would fare against Hypertable. So the first thing we did was set up a YCSB workload to benchmark both systems. See this post for all the details, but the brief takeaway was that HBase was 1.4x better than Hypertable on writes, while Hypertable was 2-3x better on reads. To measure this gap and improve HBase reads, we set up a simple one-node read test consisting of 1M rows, 10 columns per row, and 100 bytes per column (1GB total data). The results are shown below:
Naturally, we wanted to see how to improve HBase read performance. So over the summer of 2012, we ran a series of experiments toward that goal; see this post for details. In summary, we were able to improve HBase reads quite a bit, and by the end HBase outperformed Hypertable, as shown in the chart below.
Many people at Facebook have been involved in this effort, and overall this is definitely an awesome improvement to HBase performance!