Symas Corp., July 2012
This page follows on from Google's LevelDB benchmarks published in July 2011 at LevelDB. (A snapshot of that document is available here for reference.) In addition to the systems tested there, we add the venerable BerkeleyDB as well as the OpenLDAP MDB database. For this test, we compare LevelDB version 1.5 (git rev dd0d562b4d4fbd07db6a44f9e221f8d368fee8e4), SQLite3 (version 3.7.7.1), Kyoto Cabinet's TreeDB (version 1.2.76; a B+tree-based key-value store), Berkeley DB 5.3.21, and OpenLDAP MDB (git rev a0993354a603a970889ad5c160c289ecca316f81). We would like to acknowledge the LevelDB project for the original benchmark code.
Benchmarks were all performed on a Dell Precision M4400 laptop with a quad-core Intel(R) Core(TM)2 Extreme CPU Q9300 @ 2.53GHz, with 6144 KB of total L3 cache and 8 GB of DDR2 RAM at 800 MHz. (Note that LevelDB uses at most two CPUs since the benchmarks are single-threaded: one to run the benchmark, and one for background compactions.) The benchmarks were run on two different filesystems: tmpfs, and reiserfs on an SSD. The SSD is a relatively old model, a Samsung PM800 Series 256GB. The system had Ubuntu 12.04 installed, with kernel 3.2.0-26. Tests were all run in single-user mode to prevent variations due to other system activity. CPU frequency scaling was disabled (scaling_governor = performance) to ensure a consistent CPU clock speed for all tests. The numbers reported below are the median of three measurements. The databases are completely deleted between each of the three measurements.
Update: Additional tests were run on a Western Digital WD20EARX 2TB SATA hard drive. The HDD results start in Section 8. The results across multiple filesystems are in Section 11.
We wrote benchmark tools for SQLite, BerkeleyDB, MDB, and Kyoto TreeDB based on LevelDB's db_bench. The LevelDB, SQLite3, and TreeDB benchmark programs were originally provided in the LevelDB source distribution but we've made additional fixes to the versions used here. The code for each of the benchmarks resides here:
Most database vendors claim their product is fast and lightweight. Looking at the total size of each application gives some insight into the footprint of each database implementation.
size db_bench*
   text     data   bss      dec     hex  filename
 271991     1456   320   273767   42d67  db_bench
1682579     2288   296  1685163  19b6ab  db_bench_bdb
  96879     1500   296    98675   18173  db_bench_mdb
 655988     7768  1688   665444   a2764  db_bench_sqlite3
 296244     4808  1080   302132   49c34  db_bench_tree_db

The core of the MDB code is barely 32K of x86-64 object code. It fits entirely within most modern CPUs' on-chip caches. All of the other libraries are several times larger.
This section gives the baseline performance of all the databases. Following sections show how performance changes as various parameters are varied. For the baseline:
LevelDB has the fastest write operations. MDB has the fastest read operations by a huge margin, due to its single-level-store architecture. MDB was written for OpenLDAP; LDAP directory workloads tend to involve many reads and few writes, so read optimization is more critical there than write optimization. LevelDB is oriented toward many writes and few reads, so write optimization is emphasized there.
A batch write is a set of writes that are applied atomically to the underlying database. A single batch of N writes may be significantly faster than N individual writes. The following benchmark writes one thousand batches, each containing one thousand 100-byte values. TreeDB does not support batch writes, so its baseline numbers are repeated here for reference.
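For readers unfamiliar with the mechanism, a batched write in LevelDB's C++ API looks roughly like the following minimal sketch; the database path, key format, and value contents are placeholders rather than the benchmark's actual parameters:

    #include <cassert>
    #include <cstdio>
    #include <cstring>
    #include "leveldb/db.h"
    #include "leveldb/write_batch.h"

    int main() {
      leveldb::DB *db;
      leveldb::Options options;
      options.create_if_missing = true;
      leveldb::Status s = leveldb::DB::Open(options, "/tmp/batchdb", &db);
      assert(s.ok());

      // Accumulate 1000 Put operations in a single batch...
      leveldb::WriteBatch batch;
      char key[16], val[100];
      memset(val, 'x', sizeof(val));
      for (int i = 0; i < 1000; ++i) {
        snprintf(key, sizeof(key), "%015d", i);
        batch.Put(key, leveldb::Slice(val, sizeof(val)));
      }
      // ...then apply them all as one atomic write.
      s = db->Write(leveldb::WriteOptions(), &batch);
      assert(s.ok());
      delete db;
      return 0;
    }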
Because of the way LevelDB persistent storage is organized, batches of random writes are not much slower (only a factor of 1.6x) than batches of sequential writes. MDB has a special optimization for sequential writes, which is most effective in batched operation.
In the following benchmark, we enable the synchronous writing modes of all of the databases. Since this change significantly slows down the benchmark, we stop after 10,000 writes. Unfortunately the resulting numbers are not directly comparable to the async numbers, since overall database size is also a factor in write performance and the resulting databases here are much smaller than the baseline.
For both LevelDB and TreeDB, the cost of synchronous operation outweighs the benefit of the much smaller database; TreeDB in particular performs extremely poorly in synchronous mode. On random writes, for SQLite3, MDB, and BerkeleyDB, the smaller database size completely negates the cost of the synchronous writes.
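For illustration, synchronous mode in LevelDB is just a flag on each write request; the other libraries have analogous controls (SQLite3's PRAGMA synchronous, BerkeleyDB's DB_TXN_NOSYNC flag, and MDB's MDB_NOSYNC environment flag). A minimal LevelDB sketch, with a placeholder path and key/value:

    #include <cassert>
    #include "leveldb/db.h"

    int main() {
      leveldb::DB *db;
      leveldb::Options options;
      options.create_if_missing = true;
      leveldb::Status s = leveldb::DB::Open(options, "/tmp/syncdb", &db);
      assert(s.ok());

      leveldb::WriteOptions wo;
      wo.sync = true;  // flush from the OS buffer cache before the write is considered done
      s = db->Put(wo, "key", "value");
      assert(s.ok());
      delete db;
      return 0;
    }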
We increased the overall cache size for each database to 128 MB. For SQLite3, we kept the page size at 1024 bytes, but increased the number of pages to 131,072 (up from 4096). For TreeDB, we also kept the page size at 1024 bytes, but increased the cache size to 128 MB (up from 4 MB). MDB has no application-level cache of its own (it relies entirely on the filesystem cache), so its numbers are simply copied from the baseline. Both MDB and BerkeleyDB use the default system page size (4096 bytes).
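For reference, the cache settings for LevelDB and SQLite3 are configured roughly as follows; this is a sketch with a placeholder database path, not the exact option plumbing used in db_bench:

    #include <sqlite3.h>
    #include "leveldb/cache.h"
    #include "leveldb/db.h"

    int main() {
      // LevelDB: supply a 128 MB LRU block cache in the options used
      // to open the database (replacing the small default cache).
      leveldb::Options options;
      options.block_cache = leveldb::NewLRUCache(128 * 1048576);
      // ... open and use the DB with these options; the cache must be
      // deleted only after the DB itself has been deleted.

      // SQLite3: 131,072 pages at the 1024-byte page size = 128 MB.
      sqlite3 *sdb;
      sqlite3_open("/tmp/test.db", &sdb);
      sqlite3_exec(sdb, "PRAGMA page_size=1024", NULL, NULL, NULL);
      sqlite3_exec(sdb, "PRAGMA cache_size=131072", NULL, NULL, NULL);
      sqlite3_close(sdb);

      delete options.block_cache;
      return 0;
    }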
For this benchmark, we use 100,000 byte values. To keep the benchmark running time reasonable, we stop after writing 1000 values. Otherwise, all of the same tests as for the Baseline are run.
MDB's single-level-store architecture clearly outclasses all of the other designs; the others barely even register on the results. MDB's zero-memcpy reads mean its read rate is essentially independent of the size of the data items being fetched; it is only affected by the total number of keys in the database.
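To make the zero-memcpy point concrete, here is a minimal sketch of a read using MDB's C API (assuming the lmdb.h header of current releases; the database path and key are placeholders, and error checking is omitted). mdb_get() hands back a pointer directly into the read-only memory map, so no buffer is allocated and nothing is copied, regardless of the value's size:

    #include <string.h>
    #include <lmdb.h>

    int main() {
      MDB_env *env;
      MDB_txn *txn;
      MDB_dbi dbi;
      MDB_val key, data;

      mdb_env_create(&env);
      mdb_env_open(env, "./testdb", 0, 0664);
      mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
      mdb_dbi_open(txn, NULL, 0, &dbi);

      key.mv_size = strlen("somekey");
      key.mv_data = (void *)"somekey";
      if (mdb_get(txn, dbi, &key, &data) == 0) {
        // data.mv_data points directly into the memory map;
        // the value was never copied anywhere.
      }
      mdb_txn_abort(txn);  // releasing a read-only txn is cheap
      mdb_env_close(env);
      return 0;
    }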
TreeDB has very good performance with large values using asynchronous writes, but much worse performance in synchronous mode. Batch mode appears to have no benefit with large values; the work of writing the values cancels out the efficiency gained from batching. MDB has additional features for handling large values, but the current benchmark code doesn't exercise them.
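One such feature, and this is our assumption as to what is meant, is the MDB_RESERVE put flag: instead of copying a caller-supplied buffer, MDB reserves space for the value in the map and returns a pointer, letting the application generate a large value in place. A sketch, reusing the env/txn setup from the read example above:

    #include <string.h>
    #include <lmdb.h>

    // Hypothetical helper: write a 100 KB value without an extra copy.
    // Assumes txn is a write transaction and dbi is an open database.
    int put_reserved(MDB_txn *txn, MDB_dbi dbi) {
      MDB_val key, data;
      key.mv_size = strlen("bigkey");
      key.mv_data = (void *)"bigkey";
      data.mv_size = 100000;  // only the size is supplied, no buffer
      int rc = mdb_put(txn, dbi, &key, &data, MDB_RESERVE);
      if (rc == 0) {
        // MDB has reserved space and set data.mv_data to point at it;
        // fill the value directly, avoiding a 100 KB memcpy.
        memset(data.mv_data, 'x', data.mv_size);
      }
      return rc;
    }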
The same tests as in Section 2 are performed again, this time using the Samsung SSD with reiserfs. This drive has been in regular use over the past several years and was not reformatted for the tests. It has very poor random write speed as a result.
Read performance is essentially the same as for tmpfs since all of the data is present in the filesystem cache.
Most of the databases perform at close to their tmpfs speeds, which is expected since these are asynchronous writes. However, BerkeleyDB shows a large reduction in throughput.
Here the difference between SSD and tmpfs is made obvious.
The slowness of the SSD overshadows any difference between sequential and random write performance here.
We increased the overall cache size for each database to 128 MB, as in Section 3. The "baseline" in these tests refers to the values from Section 5.
This is the same as the test in Section 4, using the SSD.
The read results are about the same as for tmpfs.
As before, TreeDB's write performance is good on asynchronous writes. BerkeleyDB's performance degrades the least in synchronous mode.
The same tests as in Section 2 are performed again, this time using the Western Digital WD20EARX HDD with an EXT3 filesystem. The drive was attached to the laptop's eSATA port, so interface bottlenecks are not an issue. The MDB library used here is a little newer than in the previous tests (revision 5da67968afb599697d7557c13b65fb961ec408dd), which results in faster sequential write rates, so those numbers are not directly comparable to the earlier ones.
Note that this data does not represent the maximum performance that the drive is capable of. For completeness, the tests were repeated on multiple other filesystems including EXT2, EXT3, EXT4, JFS, XFS, NTFS, ReiserFS, BTRFS, and ZFS. Those results will be uploaded later.
This drive uses 4KB physical sectors. It was partitioned into two 1TB partitions, 4KB-aligned. The first partition was formatted with NTFS; the second partition was reused for each of the other filesystems.
Read performance is essentially the same as the previous tests since all of the data is present in the filesystem cache. LevelDB and BerkeleyDB are slightly slower than before.
Kyoto Cabinet performs close to its tmpfs speed, while the other databases show more of a reduction in throughput. BerkeleyDB slows down the most.
As slow as the SSD was, the HDD results are even slower.
Note, however, that further investigation shows these results are nowhere near the maximum performance of the HDD. More details on this are in Section 11.
The slowness of the HDD overshadows any difference between sequential and random write performance here. None of these systems are suitable for real-world use in this configuration, but Kyoto Cabinet is by far the worst. If an application demands full ACID transactions, Kyoto Cabinet should definitely be avoided.
We increased the overall cache size for each database to 128 MB, as in Section 3. The "baseline" in these tests refers to the values from Section 8.
This is the same as the test in Section 4, using the HDD.
Again, the read results are about the same as for tmpfs.
The slowness of the HDD makes most of the database implementations perform about the same. As before, Kyoto Cabinet is much slower than the rest.
The baseline test was repeated on the same HDD, but using a different filesystem each time. The filesystems tested are btrfs, ext2, ext3, ext4, jfs, ntfs, reiserfs, xfs, and zfs. In addition, the journaling filesystems that support using an external journal were retested with their journal stored on a tmpfs file. These were ext3, ext4, jfs, reiserfs, and xfs. Testing in this second configuration shows how much overhead the filesystem's journaling mechanism imposes, and how much performance is lost by using the default internal journal configuration.
Note: storing the journal on tmpfs was done only for the purposes of this test. In a real deployment the journal would need to be stored on an actual storage device, such as a separate disk; otherwise the filesystem would be lost after a reboot.
The filesystems are created fresh for each test. The tests are only run once each due to the great length of time needed to collect all of the data. (It takes several minutes just to run mkfs for some of these filesystems.) The full results are not presented in HTML here; you will have to download the Spreadsheet to view the results.
You can display the results for a specific benchmark operation across all the filesystem types using the selector in cell B23 of the sheet. Likewise, you can display the results for a specific filesystem across all the benchmark operations using the selector in cell B1, but because the results are so totally dominated by MDB read performance, this view isn't quite as informative.
To summarize: jfs with an external journal is the fastest for synchronous writes. If your workload demands fully synchronous transactions, it is clearly the best choice. Otherwise, the original ext2 filesystem is fastest for asynchronous writes.
The raw data for all of these tests is also available: tmpfs, SSD, and HDD. The results are also tabulated in an OpenOffice spreadsheet for further analysis here. The raw filesystem test results are in out.hdd.tar.gz.