Hunch


Tokyo Cabinet

February 28, 2009 by Rasmus, tagged database, dbm, performance, python, software, tokyocabinet and tyrant, filed under software

Lately I’ve been doing some research into the holy grail of keyed data storage – the best combination of performance, scalability, efficiency and availability. There are many, many options available, ranging from Berkeley DB to BigTable implementations like Hypertable.

Last weekend I spent some time looking into using BDB in a BigTable fashion for managing schema-free tables. However, my tests revealed many problems with a solution like that. For instance, BDB is really slow when writing random keys into databases of more than 100k rows. Earlier this week I had a chat with Jon Åslund about this idea and he introduced me to Tokyo Cabinet – a modern, battle-tested and extremely high-performance DBM.
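
A rough way to run that kind of random-key write test is to time inserts under keys drawn in random order. The sketch below uses `dbm.dumb` from the Python standard library purely as a stand-in (it is neither BDB nor Tokyo Cabinet), and the function names, key format and record size are made up for illustration, so the timings it prints say nothing about either library.

```python
import dbm.dumb
import os
import random
import tempfile
import time


def time_random_writes(path, count, seed=0):
    """Write `count` records under random keys.

    Returns (elapsed seconds, number of records stored).
    """
    rng = random.Random(seed)
    # Sampling without replacement guarantees `count` distinct keys,
    # in an order unrelated to their numeric value.
    keys = rng.sample(range(count * 10), count)
    db = dbm.dumb.open(path, "n")  # "n": always create a new, empty database
    try:
        start = time.perf_counter()
        for k in keys:
            db[b"key:%d" % k] = b"x" * 64  # 64-byte dummy payload (assumption)
        elapsed = time.perf_counter() - start
        stored = len(db)
    finally:
        db.close()
    return elapsed, stored


def run_benchmark(count=1000):
    # Keep the database files in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        return time_random_writes(os.path.join(tmp, "bench"), count)


if __name__ == "__main__":
    elapsed, stored = run_benchmark()
    print(f"{stored} random-key writes in {elapsed:.3f}s")
```

To compare engines, the `dbm.dumb.open` call would be swapped for the binding under test and `count` raised well past 100k, since that is where the slowdown showed up.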

Despite the somewhat uncool name, Tokyo Cabinet is a silent beast developed by Mikio Hirabayashi and used in the high-load environment of Mixi, the Japanese equivalent of Facebook. TC (short for Tokyo Cabinet) is written in C99 and sports a clean and modern API.

Mikio states that TC improves on other DBMs in the following areas:

  • Improves space efficiency – smaller size of database file.
  • Improves time efficiency – faster processing speed.
  • Improves parallelism – higher performance in multi-thread environment.
  • Improves usability – simplified API.
  • Improves robustness – database file is not corrupted even under catastrophic situation.
  • Supports 64-bit architecture – enormous memory space and database file are available.
