Benchmarking CouchDB against MemcacheDB


shyam - Posted on 02 September 2008

This is by no means meant to represent a scientific evaluation, nor should it be seen as the absolute word on what other tweaks could be done to extract maximum performance out of either MemcacheDB or CouchDB. The components are mostly running with out-of-the-box settings and my aim was to get it as close as possible to what my needs are in the production set up I deal with. You may have a different way of setting the infrastructure up.

Moreover, it is really not a brilliant idea to compare CouchDB with MemcacheDB. They work differently and are, honestly, meant to be used for different things. I already have a set up in place which offloads a lot of front-end data crunching to Memcached, and it is Memcached's lack of persistence that led me to look at MemcacheDB and CouchDB.

Again, it is not a fair comparison because CouchDB can, in one request, get all the data needed for a particular record, while MemcacheDB would require multiple requests to cobble together a similar data set. Thus, CouchDB can afford to be slower and a bit more expensive per request. Then there is also the additional penalty of having to write the logic that assembles the data for MemcacheDB, which won't show up in a test like this. Even so, the tests do throw up some very interesting results.
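To make the request-count difference concrete, here is a minimal Python sketch (not the PHP used in the tests). It uses in-memory dictionaries as stand-ins for the two stores. The record layout, the `user42:name` key scheme and the helper names are all hypothetical, chosen only for illustration.

```python
import json

# Stand-in for a CouchDB database: one JSON document holds the whole record.
couch_like = {
    "user42": json.dumps({"name": "Asha", "email": "asha@example.com", "plan": "pro"}),
}

# Stand-in for MemcacheDB: flat key/value pairs, one per field
# (hypothetical key naming scheme).
kv_like = {
    "user42:name": "Asha",
    "user42:email": "asha@example.com",
    "user42:plan": "pro",
}

def fetch_document(store, doc_id):
    """One lookup returns the whole record. Returns (record, lookups)."""
    return json.loads(store[doc_id]), 1

def fetch_assembled(store, doc_id, fields):
    """One lookup per field, then reassemble the record in application code."""
    record = {}
    lookups = 0
    for field in fields:
        record[field] = store[f"{doc_id}:{field}"]
        lookups += 1
    return record, lookups

doc, doc_lookups = fetch_document(couch_like, "user42")
kv, kv_lookups = fetch_assembled(kv_like, "user42", ["name", "email", "plan"])
print(doc == kv, doc_lookups, kv_lookups)  # True 1 3
```

Both paths end up with the same record, but the key/value path pays for three round trips plus the assembly loop, which is the cost a raw transaction-rate benchmark does not capture.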

The methods used to create the test data were also different in each case, which could be a source of further discrepancy. There is an additional processing/time penalty to be paid for doing MD5 on a random number generated on-the-fly (for MemcacheDB). All CouchDB operations also happen over HTTP, which is again a significant difference compared to how PHP connects to MemcacheDB. CouchDB operations thus incur the additional penalty of PHP doing an fsockopen() to interact with it. There is also an anomaly in the second CouchDB write test, which shows some 240 failed transactions. This, I am assuming, was due to some one-off external issue, and I've not been able to reproduce it since.
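To illustrate the HTTP overhead, here is a rough Python sketch of the raw request a client would have to write to a socket to store one CouchDB document with PUT /{db}/{docid}. The host name, the database name `bench`, the document ID and the JSON body are assumptions made for the example; 5984 is CouchDB's default port. This is not the PHP code used in the tests.

```python
import json

def build_couch_put(host, db, doc_id, doc):
    """Build the raw HTTP/1.0 request bytes for PUT /{db}/{doc_id}."""
    body = json.dumps(doc).encode("utf-8")
    head = (
        f"PUT /{db}/{doc_id} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + body

payload = {"text": "hello"}
request = build_couch_put("couch.example:5984", "bench", "doc1", payload)

# Everything that is not the JSON body is framing the client must build
# and the server must parse on every single operation.
overhead = len(request) - len(json.dumps(payload))
print(request.decode("ascii"))
print("bytes of HTTP framing:", overhead)
```

For a small document the headers easily outweigh the payload, on top of the cost of opening the connection in the first place.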

At least from the numbers I've seen, MemcacheDB seems to have an easy advantage over CouchDB for writes, while CouchDB seems to have a bit of a lead for reads. Even then, you need to be really sure what it is that you are looking for from either solution. The tests generate loads which a lot of people would never have to deal with. Moreover, all these are single-instance set ups, with no clustering or load balancing in use. To give you an example, one of my applications in production has a requirement (at peak) of 180 concurrent requests that are primarily read operations.

From that point of view, the slowest read transaction rate on either set up is 290, which is still a lot more than what I get out of my current set up, which is multi-tiered (DB cluster, Memcache, file cache and CDN) and runs on far more hardware. Even with half that hardware behind the MemcacheDB/CouchDB set up, you would see large performance gains. A less resource-hungry web server than Apache should give a further speed bump: Apache load peaked at 32 (reads) and 2 (writes) with CouchDB, and at 13 (reads) and 4 (writes) with MemcacheDB.

In terms of disk usage, it was expected that CouchDB would eat up a lot of disk, but the surprise was that MemcacheDB also used up quite a bit of space. The worrying factor for me with CouchDB was CPU usage. MemcacheDB did not even wince once with all the load thrown at it, while CouchDB consistently ran at over 90% CPU under the same loads.

Finally, to the test:

The set-up:

Baseline config for all servers: 4 GB RAM, dual Xeon 2.0 GHz

Gigabit links

Server 1: Hosting the PHP script that connects to both MemcacheDB/CouchDB

PHP compiled from source using:
'./configure' '--prefix=/usr/local/php5' '--with-apxs2=/usr/local/apache2/bin/apxs' '--with-iconv' '--with-gettext' '--enable-mbstring' '--with-openssl' '--with-xsl=/usr' '--with-mysql=/usr/local/mysql' '--with-zlib' '--with-zlib-dir' '--with-mysqli=/usr/local/mysql/bin/mysql_config' '--with-pgsql=/usr/local/postgres' '--with-bz2' '--with-gd=/usr/local/gd' '--enable-gd-native-ttf' '--with-xsl' '--with-libxml-dir=/usr/lib' '--with-curl'

Server 2: Running siege

Server 3: Running MemcacheDB version 1.1.0-beta

Server 4: CouchDB version: 0.8.1

The test scripts:

memset.php: Does an md5(rand()) and uses it as the key name and inserts some random text with it.

memget.php: Takes an array of known keys, picks a random one every time, and fetches its value.

couchset.php: Takes rand as the key name and inserts random text with it.

couchget.php: Takes an array of known keys, picks a random one every time, and fetches its value.
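The logic of the four scripts can be sketched roughly as follows. This is a Python approximation, not the original PHP: a plain dictionary stands in for both stores, and the random-number range, the text length and the function names are assumptions.

```python
import hashlib
import random

store = {}  # in-memory stand-in for MemcacheDB / CouchDB

def random_text(length=64):
    """Generate some random filler text to store as the value."""
    return "".join(random.choice("abcdefghijklmnopqrstuvwxyz ") for _ in range(length))

def mem_style_set(store):
    """Like memset.php: the key is the MD5 hex digest of a random number."""
    number = random.randint(0, 2**31 - 1)  # assumed range
    key = hashlib.md5(str(number).encode()).hexdigest()
    store[key] = random_text()
    return key

def couch_style_set(store):
    """Like couchset.php: the random number itself is the key (no hashing step)."""
    key = str(random.randint(0, 2**31 - 1))  # assumed range
    store[key] = random_text()
    return key

def random_get(store, keys):
    """Like memget.php / couchget.php: pick one known key at random and fetch it."""
    return store[random.choice(keys)]

keys = [mem_style_set(store) for _ in range(5)] + [couch_style_set(store) for _ in range(5)]
value = random_get(store, keys)
print(len(keys[0]), len(value))  # 32 64
```

The only difference between the two write paths here is the extra hashing step on the MemcacheDB side, which is the additional processing penalty mentioned earlier.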

The complete numbers here.