Riak performance problems when LevelDB database grows beyond 16GB

Discussion:

J***@seznam.cz

2012-10-11 16:12:14 UTC

Hello,

I am writing a new application and I am testing it on a cluster with 4 Riak
nodes (16 GM RAM, 2 x i3 3.4GHz - 2 cores).

The application is tested with the expected load of 1000 requests/second,
90% of the requests cause a Riak read andÂ write of a new key. The problem
is that the performance starts falling after 18-20 hours and one of the Riak
nodes stops responding after 23-25 hours.

(Key is cca 61 bytes long, data is just 3 timestamps converted to binary,
and there is a secondary key containing an expiration time. There should be
a mapred job to delete keys older than 24 hours, but it is turned off while
researching the performance problem.)

Logs on the other nodes show that the problematic node cannot be contacted:

2012-10-11 11:33:57.473 [error] <0.908.0> ** Node '***@172.16.0.2' not
responding **
** Removing (timedout) connection **

The problematic node itself does not respond to "/usr/sbin/riak ping", but
beam.smp is running and ALIVE messages are written regularly to the erlang
log. There is nothing suspicious in logs on the node,Â its error log is
empty.

The beam.smp consumes 20% memory and 50-100% of 1 CPU (the other 3 CPUs sit
idle),Â and the process has 267 open LevelDB files.

The database sizes are:

node1: 16249M, 281 files in 21 dirs (with 4 additional files like /home/
riak/leveldb/0/lost/BLOCKS.bad); this is the problematic node
node2: 16183M, 264 files in 16 dirs
node3: 16664M, 264 files in 16 dirs
node4: 16205M, 265 files in 16 dirs

I tried to attach to the beam.smp process with Erlang, but it does not
respond to net_adm:ping/1.

I attached gdb to the process, and gdb shows that most of its 93 threads are
idle (in ethr_event_wait), but 2 threads are in LevelDB code:

Thread 24 (Thread 0x7f1a8ecc0700 (LWP 3912)):
#0Â 0x00007f1a91f74d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64
-linux-gnu/libpthread.so.0
#1Â 0x00007f1a0ee0ae9d in leveldb::port::CondVar::Wait() () from /usr/lib/
riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#2Â 0x00007f1a0ede3841 in leveldb::DBImpl::MakeRoomForWrite(bool) () from /
usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#3Â 0x00007f1a0ede91ad in leveldb::DBImpl::Write(leveldb::WriteOptions const
&, leveldb::WriteBatch*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#4Â 0x00007f1a0eddeca4 in eleveldb_write () from /usr/lib/riak/lib/eleveldb-
1.2.2p5/priv/eleveldb.so
#5Â 0x0000000000534c16 in process_main ()
#6Â 0x00000000004987e3 in ?? ()
#7Â 0x0000000000595320 in ?? ()
#8Â 0x00007f1a91f70e9a in start_thread () from /lib/x86_64-linux-gnu/
libpthread.so.0
#9Â 0x00007f1a91a964bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x0000000000000000 in ?? ()

Thread 20 (Thread 0x7f19fc727700 (LWP 3967)):
#0Â 0x00007f1a0ee05a67 in leveldb::crc32c::Extend(unsigned int, char const*,
unsigned long) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#1Â 0x00007f1a0ee012b9 in leveldb::TableBuilder::WriteRawBlock(leveldb::
Slice const&, leveldb::CompressionType, leveldb::BlockHandle*) () from /usr/
lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#2Â 0x00007f1a0ee01444 in leveldb::TableBuilder::WriteBlock(leveldb::
BlockBuilder*, leveldb::BlockHandle*) () from /usr/lib/riak/lib/eleveldb-
1.2.2p5/priv/eleveldb.so
#3Â 0x00007f1a0ee015e4 in leveldb::TableBuilder::Flush() () from /usr/lib/
riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#4Â 0x00007f1a0ee0178b in leveldb::TableBuilder::Add(leveldb::Slice const&,
leveldb::Slice const&) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#5Â 0x00007f1a0ede7cad in leveldb::DBImpl::DoCompactionWork(leveldb::
DBImpl::CompactionState*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#6Â 0x00007f1a0ede8456 in leveldb::DBImpl::BackgroundCompaction() () from /
usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#7Â 0x00007f1a0ede9038 in leveldb::DBImpl::BackgroundCall() () from /usr/
lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#8Â 0x00007f1a0ee06c1e in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#9Â 0x00007f1a91f70e9a in start_thread () from /lib/x86_64-linux-gnu/
libpthread.so.0
#10 0x00007f1a91a964bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x0000000000000000 in ?? ()

When I looked at thread 20 in the process again, the stack has shown some
Snappy compressions, and many later inspections have shown call to fdatasync
(2),
which was replaced by some more compaction work. Thread 24 still sits in
leveldb::DBImpl::MakeRoomForWrite.

Thread 20 samples:
#0Â 0x00007f1a0ee0ed6d in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#1Â 0x00007f1a0ee0edb3 in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#2Â 0x00007f1a0ee0f9dc in snappy::internal::CompressFragment(char const*,
unsigned long, char*, unsigned short*, int) () from /usr/lib/riak/lib/
eleveldb-1.2.2p5/priv/eleveldb.so
#3Â 0x00007f1a0ee10dc1 in snappy::Compress(snappy::Source*, snappy::Sink*) (
) from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#4Â 0x00007f1a0ee1115a in snappy::RawCompress(char const*, unsigned long,
char*, unsigned long*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#5Â 0x00007f1a0ee014eb in leveldb::TableBuilder::WriteBlock(leveldb::
BlockBuilder*, leveldb::BlockHandle*) () from /usr/lib/riak/lib/eleveldb-
1.2.2p5/priv/eleveldb.so
#6Â 0x00007f1a0ee015e4 in leveldb::TableBuilder::Flush() () from /usr/lib/
riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#7Â 0x00007f1a0ee0178b in leveldb::TableBuilder::Add(leveldb::Slice const&,
leveldb::Slice const&) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.soÂ
#8Â 0x00007f1a0ede7cad in leveldb::DBImpl::DoCompactionWork(leveldb::
DBImpl::CompactionState*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#9Â 0x00007f1a0ede8456 in leveldb::DBImpl::BackgroundCompaction() () from /
usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#10 0x00007f1a0ede9038 in leveldb::DBImpl::BackgroundCall() () from /usr/
lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#11 0x00007f1a0ee06c1e in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#12 0x00007f1a91f70e9a in start_thread () from /lib/x86_64-linux-gnu/
libpthread.so.0
#13 0x00007f1a91a964bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#14 0x0000000000000000 in ?? ()

#0Â 0x00007f1a91a8fa5d in fdatasync () from /lib/x86_64-linux-gnu/libc.so.6
#1Â 0x00007f1a0ee08d64 in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#2Â 0x00007f1a0ede3357 in leveldb::DBImpl::FinishCompactionOutputFile
(leveldb::DBImpl::CompactionState*, leveldb::Iterator*) () from /usr/lib/
riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#3Â 0x00007f1a0ede7e6e in leveldb::DBImpl::DoCompactionWork(leveldb::
DBImpl::CompactionState*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#4Â 0x00007f1a0ede8456 in leveldb::DBImpl::BackgroundCompaction() () from /
usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#5Â 0x00007f1a0ede9038 in leveldb::DBImpl::BackgroundCall() () from /usr/
lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#6Â 0x00007f1a0ee06c1e in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#7Â 0x00007f1a91f70e9a in start_thread () from /lib/x86_64-linux-gnu/
libpthread.so.0
#8Â 0x00007f1a91a964bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#9Â 0x0000000000000000 in ?? ()

#0Â 0x00007f1a0ee05765 in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#1Â 0x00007f1a0ee0b6da in leveldb::InternalKeyComparator::Compare(leveldb::
Slice const&, leveldb::Slice const&) const () from /usr/lib/riak/lib/
eleveldb-1.2.2p5/priv/eleveldb.so
#2Â 0x00007f1a0ee00218 in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#3Â 0x00007f1a0ee006aa in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#4Â 0x00007f1a0ede7ccd in leveldb::DBImpl::DoCompactionWork(leveldb::
DBImpl::CompactionState*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#5Â 0x00007f1a0ede8456 in leveldb::DBImpl::BackgroundCompaction() () from /
usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#6Â 0x00007f1a0ede9038 in leveldb::DBImpl::BackgroundCall() () from /usr/
lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#7Â 0x00007f1a0ee06c1e in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#8Â 0x00007f1a91f70e9a in start_thread () from /lib/x86_64-linux-gnu/
libpthread.so.0
#9Â 0x00007f1a91a964bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x0000000000000000 in ?? ()

#0Â 0x00007f1a0eb61dcb in std::string::_M_mutate(unsigned long, unsigned
long, unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1Â 0x00007f1a0eb61e1c in std::string::_M_replace_safe(unsigned long,
unsigned long, char const*, unsigned long) () from /usr/lib/x86_64-linux-
gnu/libstdc++.so.6
#2Â 0x00007f1a0ee03559 in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#3Â 0x00007f1a0ee037bd in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#4Â 0x00007f1a0ee00680 in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#5Â 0x00007f1a0ede7ccd in leveldb::DBImpl::DoCompactionWork(leveldb::
DBImpl::CompactionState*) () from /usr/lib/riak/lib/eleveldb-1.2.2p5/priv/
eleveldb.so
#6Â 0x00007f1a0ede8456 in leveldb::DBImpl::BackgroundCompaction() () from /
usr/lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#7Â 0x00007f1a0ede9038 in leveldb::DBImpl::BackgroundCall() () from /usr/
lib/riak/lib/eleveldb-1.2.2p5/priv/eleveldb.so
#8Â 0x00007f1a0ee06c1e in ?? () from /usr/lib/riak/lib/eleveldb-1.2.2p5/
priv/eleveldb.so
#9Â 0x00007f1a91f70e9a in start_thread () from /lib/x86_64-linux-gnu/
libpthread.so.0
#10 0x00007f1a91a964bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#11 0x0000000000000000 in ?? ()

Software used:

OS: Ubuntu 12.04 LTS, amd64
Riak: riak_1.2.1rc2, installed from the Basho-provided deb package
client accesses Riak via riak-erlang-client 1.2.1

Any hints?

Thanks, Jan

Evan Vigil-McClanahan

2012-10-11 20:36:51 UTC