Discussion:
Riak 1.4 test on Azure - Webmachine error at path ...
Christian Rosnes
2013-07-25 12:16:32 UTC
Hi,

During a test I just performed on a small Riak 1.4 cluster set up on Azure,
I started seeing the Riak error messages listed below after about 10 minutes.

The simple test was performed with the latest JMeter running on two Azure
instances, each of which also runs HAProxy and load-balances the HTTP/REST
requests across the 4 Riak nodes.

Each JMeter node was running 300 threads. For roughly the first 10 minutes
I was seeing about 2400 PUT req/s. Then some 23,000+ requests failed with
errors during a 5-minute period or so, and then the test continued without
errors until I stopped it some 5 minutes later.

The 'entries' bucket already contains some 20+ million entries from other
similar tests.

Q1: What could be the cause of these error messages?

Q2: Am I overwhelming the "small" Riak instances with too many requests/s?

Q3: Is there any remedy, besides maybe reducing the number of requests/s?

Thanks.

Christian
@NorSoulx (twitter)

#------------------------------------------------------------------------------

Azure images used for Riak:
CentOS release 6.3
Riak installation done according to this document:
* http://docs.basho.com/riak/latest/ops/building/installing/azure/

The Azure setup:
4 x Large (4 core, 7 GB) for the Riak 1.4 cluster.
2 x Medium (2 core, 3.5 GB) for the JMeter and HAProxy nodes.

I know these are small instances, but it's what I could squeeze into
20 compute units (during the free trial).

I'm testing with LevelDB and 5 x 2i (secondary indexes) in each JSON object.

Riak config is out-of-the-box, except for activating the LevelDB backend
and setting the LevelDB cache_size to:

{cache_size, 268435456}  %% 256MB cache size per partition
                         %% (64 partitions / 4 nodes = 16 per node)
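
As a rough back-of-envelope (assuming the default ring_creation_size of 64,
which is where the 16 vnodes per node above comes from), that setting alone
budgets about 4GB of block cache on each 7GB Large instance:

# 64 partitions / 4 nodes = 16 vnodes per node, each with a 256MB block cache
echo $(( 64 / 4 * 268435456 / 1024 / 1024 ))   # => 4096 (MB of block cache per node)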

Each JSON object is dynamically generated by JMeter and looks something
like this:

(the key used in this test is "<ts>:<uuid>")

curl -v http://localhost:8098/riak/entries/1374747941671:7f3e42b5-3477-434b-a43e-5ae34efac1ea
* About to connect() to localhost port 8098 (#0)
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8098 (#0)
> GET /riak/entries/1374747941671:7f3e42b5-3477-434b-a43e-5ae34efac1ea HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8098
> Accept: */*
< HTTP/1.1 200 OK
< X-Riak-Vclock: a85hYGBgzGDKBVIc6UvPHwh8v29PBlMiYx4rg0Zl6mm+LAA=
< x-riak-index-ts_int: 1374747941671
< x-riak-index-src_int: 38910224580
< x-riak-index-sno_int: 10010
< x-riak-index-pid_int: 6945
< x-riak-index-pfix_int: 389
< Vary: Accept-Encoding
< Server: MochiWeb/1.1 WebMachine/1.10.0 (never breaks eye contact)
< Link: </riak/entries>; rel="up"
< Last-Modified: Thu, 25 Jul 2013 10:25:44 GMT
< ETag: "3FNLsWE4uekv2y3OuIIGfa"
< Date: Thu, 25 Jul 2013 11:21:43 GMT
< Content-Type: application/json
< Content-Length: 386
<
* Connection #0 to host localhost left intact
{ "id":"1374747941671:7f3e42b5-3477-434b-a43e-5ae34efac1ea",
"ts":"1374747941671", "src": "38910224580", "sId":"SYSTEM_011",
"cid":"1374747941671:7f3e42b5-3477-434b-a43e-5ae34efac1ea",
"mTyp":"MTYPE_015","pid":"6945","dst":{"sno":"10010","sNum":"4580","dTyp":"DTYPE_020","co":"MK","pFix":"389"},
"bInfo":{"price":39, "cur":"MKD", "iTxt":"INVOICE_011"},
"prop":{"bMod":"BM_MEDIA_001"} }

The error messages:

/var/riak/log/console.log
-------------------------
2013-07-25 10:51:42.500 [error] <0.7736.172> Webmachine error at path
"/riak/entries/1374749501619:d7832329-2596-40f2-a8d0-56c278c5c1d7" :
"Internal Server Error"
2013-07-25 10:51:42.589 [error] <0.708.177> Webmachine error at path
"/riak/entries/1374749501502:9ef09821-ba94-4462-b36e-486d8098a305" :
"Internal Server Error"
2013-07-25 10:51:43.148 [error] <0.21245.174> Webmachine error at path
"/riak/entries/1374749502638:891e855b-018a-4809-975e-e813d034f7a7" :
"Internal Server Error"
2013-07-25 10:51:43.407 [error] <0.4894.167> Webmachine error at path
"/riak/entries/1374749502988:089d2c2e-fcff-4e0e-8c56-891f891393cb" :
"Internal Server Error"

2013-07-25 11:05:09.627 [error] <0.10618.174> Webmachine error at path
"/riak/entries/1374750249401:543cc0b9-1c5a-46b3-8631-47cc6297606e" :
"Service Unavailable"
2013-07-25 11:05:09.696 [error] <0.21230.174> Webmachine error at path
"/riak/entries/1374750249278:ed744534-c528-4c40-9865-4729c2246a0e" :
"Service Unavailable"
2013-07-25 11:05:10.396 [error] <0.19147.174> Webmachine error at path
"/riak/entries/1374750250000:311c493e-797a-422e-9569-d40d1245787d" :
"Service Unavailable"
2013-07-25 11:05:13.326 [error] <0.21076.166> Webmachine error at path
"/riak/entries/1374750250096:3a263a33-43db-4ec2-b16b-ce31af919963" :
"Service Unavailable"


/var/riak/log/error.log
-----------------------
2013-07-25 10:51:42.500 [error] <0.7736.172> Webmachine error at path
"/riak/entries/1374749501619:d7832329-2596-40f2-a8d0-56c278c5c1d7" :
"Internal Server Error"
2013-07-25 10:51:42.589 [error] <0.708.177> Webmachine error at path
"/riak/entries/1374749501502:9ef09821-ba94-4462-b36e-486d8098a305" :
"Internal Server Error"
2013-07-25 10:51:43.148 [error] <0.21245.174> Webmachine error at path
"/riak/entries/1374749502638:891e855b-018a-4809-975e-e813d034f7a7" :
"Internal Server Error"
2013-07-25 10:51:43.407 [error] <0.4894.167> Webmachine error at path
"/riak/entries/1374749502988:089d2c2e-fcff-4e0e-8c56-891f891393cb" :
"Internal Server Error"

/var/riak/log/crash.log
-----------------------
2013-07-25 10:51:42 =ERROR REPORT====
webmachine error:
path="/riak/entries/1374749501619:d7832329-2596-40f2-a8d0-56c278c5c1d7"
"Internal Server Error"
2013-07-25 10:51:42 =ERROR REPORT====
webmachine error:
path="/riak/entries/1374749501502:9ef09821-ba94-4462-b36e-486d8098a305"
"Internal Server Error"
2013-07-25 10:51:43 =ERROR REPORT====
webmachine error:
path="/riak/entries/1374749502638:891e855b-018a-4809-975e-e813d034f7a7"
"Internal Server Error"
2013-07-25 10:51:43 =ERROR REPORT====
webmachine error:
path="/riak/entries/1374749502988:089d2c2e-fcff-4e0e-8c56-891f891393cb"
"Internal Server Error"
Christian Rosnes
2013-07-28 19:53:32 UTC
Post by Christian Rosnes
During a test I just performed on a small Riak 1.4 cluster set up on Azure,
I started seeing the Riak error messages listed below after about 10 minutes.
The simple test was performed with the latest JMeter running on two Azure
instances, each of which also runs HAProxy and load-balances the HTTP/REST
requests across the 4 Riak nodes.
An Update:

Increased some sysctl.conf network parameters on the Riak instances.
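
The exact parameters aren't listed here; purely as an illustration, these are
the kind of TCP settings typically raised for a workload with this many
concurrent HTTP connections (not necessarily the values used in this test):

# Illustrative only: common TCP tuning for many concurrent/short-lived HTTP
# connections; not necessarily the settings that were applied on these nodes.
sysctl -w net.core.somaxconn=4096
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
sysctl -w net.ipv4.tcp_fin_timeout=15
sysctl -w net.ipv4.tcp_tw_reuse=1
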
600 JMeter threads - 3600 seconds
[FINAL RESULTS]
total count: 5526844, overall avg: 386 (ms),
overall tps: 1536.6 (p/sec), recent tps: 1932.1 (p/sec), errors: 0
400 JMeter threads - 3600 seconds
[FINAL RESULTS]
total count: 6350600, overall avg: 224 (ms),
overall tps: 1765.7 (p/sec), recent tps: 1849.1 (p/sec), errors: 0
300 JMeter threads - 3600 seconds
[FINAL RESULTS]
total count: 5997689, overall avg: 178 (ms),
overall tps: 1666.5 (p/sec), recent tps: 1744.7 (p/sec), errors: 0

This is a drop in performance compared to previous Azure results
(around 2000 req/s for 1-hour tests), but it may be caused by the
move of the Riak data directory from the ephemeral /mnt/resource
partitions to persistent XFS partitions; 'iostat' reports that the
new partitions are near 100% I/O utilization during the tests.

I will run a few more tests, then move on to testing on some larger
(and a few more) Azure instances and compare the results with
similar instances on AWS.

Christian
@NorSoulx
Matthew Von-Maszewski
2013-07-28 20:08:13 UTC
Christian,

leveldb has two independent caches: file cache and data block cache. You have raised the data block cache from its default 8M to 256M per your earlier note. I would recommend the following:

{max_open_files, 50}, %% 50 * 4Mbytes allocation for file cache
{cache_size, 104857600}, %% 100Mbytes for data block cache

The max_open_files default is 20 (which is internally reduced by 10). You are likely thrashing file opens. The file cache is far more important to performance than the data block cache.

Find the LOG file within one of your database "vnode" directories and look for a line like this: 'compacted to: files[ 0 9 25 14 2 0 0 ]'. You want your max_open_files setting to cover that total file count (plus 10). Take the cache_size down to as low as 8Mbytes to achieve the coverage. Once you are down to 8Mbytes of cache_size, go no lower and give up on full max_open_files coverage.
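
A quick way to run that check across every vnode is sketched below (the path
assumes the default leveldb data_root; adjust it to wherever your vnode
directories actually live):

# Sum the file counts on the last 'compacted to' line of each vnode LOG;
# adding 10 gives a max_open_files value that would cover that vnode.
for log in /var/lib/riak/leveldb/*/LOG; do
  grep 'compacted to: files' "$log" | tail -1 |
    awk -v f="$log" -F'[][]' '{
      n = split($2, c, " "); s = 0
      for (i = 1; i <= n; i++) s += c[i]
      print f ": " s " files, covered by max_open_files >= " s + 10
    }'
done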

Summary: total memory per vnode in 1.4 is (max_open_files - 10) * 4Mbytes + cache_size.
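
Applying that formula with the suggested values and ~16 vnodes per node
(ring size 64 across 4 nodes), the worst-case budget is roughly 4GB per node:

# (50 - 10) open files * 4MB file cache + 100MB block cache = 260MB per vnode; times 16 vnodes
echo $(( ((50 - 10) * 4 + 100) * 16 ))   # => 4160 (MB per node, upper bound)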



Matthew
Christian Rosnes
2013-07-30 08:51:34 UTC
Post by Matthew Von-Maszewski
leveldb has two independent caches: file cache and data block cache. You
have raised the data block cache from its default 8M to 256M per your
earlier note. I would recommend the following:
{max_open_files, 50}, %% 50 * 4Mbytes allocation for file cache
{cache_size, 104857600}, %% 100Mbytes for data block cache
The max_open_files default is 20 (which is internally reduced by 10). You
are likely thrashing file opens. The file cache is far more important to
performance than the data block cache.
Find the LOG file within one of your database "vnode" directories. Look
for a line like this ' compacted to: files[ 0 9 25 14 2 0 0 ]'. You would
like to be covering that total count of files (plus 10) with your
max_open_files setting. Take the cache_size down to as low as 8Mbytes to
achieve the coverage. Once you are down to 8Mbytes of cache_size, you
should go no lower and give up on full max_open_files coverage.
Summary: total memory per vnode in 1.4 is (max_open_files - 10) * 4Mbytes + cache_size;
Thank you.

In app.config I have now set this for the eleveldb section:

{cache_size, 8388608} %% 8MB
{max_open_files, 260}

The 'max_open_files' was based on the highest sum of the numbers I read
in the 'compacted to ...' line. Is this the right way to set this parameter,
or is it too high to have any significant benefit?

After the sysctl.conf and app.config changes I've inserted about
50 million JSON objects via HTTP and not seen any errors. The performance
for each 1-hour test has ranged from 1700 to 1850 insert req/s.
The bucket used during testing now contains over 84 million JSON objects.

Btw - when I now check the logs:

[***@riak01 leveldb]# grep 'compacted' */LOG
2013/07/29-19:38:14.650538 7f96a1bdc700 compacted to: files[ 0 1 21 289 0 0 0 ]
2013/07/29-19:34:58.838520 7f9692138700 compacted to: files[ 2 0 30 288 0 0 0 ]
2013/07/29-19:38:02.037188 7f96c51b8700 compacted to: files[ 1 0 27 301 0 0 0 ]
2013/07/29-19:38:12.214409 7f96c51b8700 compacted to: files[ 1 0 26 302 0 0 0 ]
2013/07/29-19:37:06.503530 7f96a1bdc700 compacted to: files[ 1 1 22 284 0 0 0 ]
2013/07/29-19:31:41.932370 7f96bffff700 compacted to: files[ 0 1 25 291 0 0 0 ]
2013/07/29-19:32:18.097417 7f96bffff700 compacted to: files[ 0 1 24 292 0 0 0 ]
2013/07/29-19:30:20.986832 7f968e538700 compacted to: files[ 2 1 24 278 0 0 0 ]
2013/07/29-19:37:47.139039 7f96c51b8700 compacted to: files[ 3 0 20 300 0 0 0 ]
2013/07/29-19:15:10.950633 7f968db37700 compacted to: files[ 0 2 33 263 0 0 0 ]
2013/07/29-19:33:01.001246 7f968e538700 compacted to: files[ 1 1 30 280 0 0 0 ]
2013/07/29-19:31:41.494208 7f96bffff700 compacted to: files[ 1 1 25 283 0 0 0 ]
2013/07/29-19:31:57.008503 7f96bffff700 compacted to: files[ 1 1 24 284 0 0 0 ]
2013/07/29-19:39:00.008635 7f96a1bdc700 compacted to: files[ 0 1 31 289 0 0 0 ]

Could there be a benefit to increasing 'max_open_files' even further, say to 340
(assuming there is enough memory)?
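
As a rough sanity check against the formula from the earlier reply (treating it
as a planning cap rather than actual resident memory, since the file cache only
fills as files are actually opened), ~16 vnodes per node would give:

# Worst-case budget per node: ((max_open_files - 10) * 4MB + cache_size) * 16 vnodes
echo $(( ((260 - 10) * 4 + 8) * 16 ))   # => 16128 (MB cap at max_open_files 260)
echo $(( ((340 - 10) * 4 + 8) * 16 ))   # => 21248 (MB cap at max_open_files 340)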

Christian
@NorSoulx
Matthew Von-Maszewski
2013-07-30 12:48:47 UTC
Yes, you can raise max_open_files as high as you have memory to cover.

The plan for Riak 2.0 is to make these settings automatic and dynamic. Then the cache_size will automatically decline as max_open_files increases to cover your dataset. Of course, there will be limits to prevent memory going to swap or OOM.

Matthew