Riak KV 2.0.4 Release Notes

Changes

AAE Fullsync Performance Improvements

Version 2.0.4 brings a number of improvements to the AAE fullsync feature. These improvements were first introduced in Riak 1.4.12 but had not yet been ported to Riak 2.0.x. The following improvements have been introduced:

  • Data transfers are now pipelined instead of individually acknowledged, to maximize throughput
  • The code now avoids redundant scans of replicated data. For a replication factor of, say, 3 (i.e. an n_val of 3), only a third of the relevant vnodes are scanned.
  • The algorithm is now smarter about small differences
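
The first two improvements can be illustrated with a toy model. This is a minimal sketch, not Riak's actual implementation: it assumes that the replicas of an object live on n_val consecutive partitions of the ring, and it models transfer time with a fixed per-object send time and a fixed round-trip time. All function names and the example numbers are hypothetical.

```python
def vnodes_to_scan(ring_size, n_val):
    """Pick every n_val-th partition (illustrative strategy, not Riak's).

    Assumes each replica set is n_val consecutive partitions, so any
    replica set contains at least one of the picked partitions.
    """
    return list(range(0, ring_size, n_val))


def covers_every_replica_set(ring_size, n_val, scanned):
    """Check that each run of n_val consecutive partitions has a scanned one."""
    scanned = set(scanned)
    for first in range(ring_size):
        replica_set = {(first + i) % ring_size for i in range(n_val)}
        if not replica_set & scanned:
            return False
    return True


def acked_transfer_seconds(num_objects, send_s, rtt_s):
    """Each object is sent, then the sender waits a round trip for its ack."""
    return num_objects * (send_s + rtt_s)


def pipelined_transfer_seconds(num_objects, send_s, rtt_s):
    """Objects are sent back to back; only the final ack costs a round trip."""
    return num_objects * send_s + rtt_s


if __name__ == "__main__":
    picked = vnodes_to_scan(16, 3)
    print("partitions scanned:", picked)
    print("all replica sets covered:", covers_every_replica_set(16, 3, picked))
    # Hypothetical numbers: 1000 objects, 1 ms to send each, 50 ms round trip.
    print("acked:    ", acked_transfer_seconds(1000, 0.001, 0.05), "s")
    print("pipelined:", pipelined_transfer_seconds(1000, 0.001, 0.05), "s")
```

In this model a 16-partition ring with an n_val of 3 needs 6 of its 16 partitions scanned, slightly more than a third because 16 is not a multiple of 3.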

Performing an AAE fullsync operation between two identical clusters is now very fast, and the time it takes to finish an AAE fullsync is much closer to linear in the number of differences between the clusters. Below are the results of one of our benchmarks. Two 8-node clusters were used, each with the following characteristics:

  • Each node has 630 GB of SSD storage
  • Each node is running only two vnodes (16-partition ring)
  • 23 million objects per vnode
  • 99% of objects are 8 KB
  • 1% of objects are outliers ranging from 8 KB to 40 MB
  • 450 GB of data on disk per server

The results for each stage of fullsync:

  • Empty cluster to empty cluster: 6 seconds
  • Full cluster to empty cluster: 14 hours, 40 minutes
  • 10% changes: 3 hours, 45 minutes
  • 1% changes: 40.5 minutes
  • No changes: 42.5 seconds
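
To compare these runs on a common scale, the durations can be converted to seconds. The sketch below does that and, treating the "no changes" run as fixed overhead (an assumption made here, not a claim from the benchmark), computes the extra time per percentage point of changed data. The helper names are hypothetical.

```python
# Durations copied from the benchmark results above, converted to seconds.
RESULTS_SECONDS = {
    "empty to empty": 6.0,
    "full to empty": 14 * 3600 + 40 * 60,  # 14 hours, 40 minutes
    "10% changes": 3 * 3600 + 45 * 60,     # 3 hours, 45 minutes
    "1% changes": 40.5 * 60,               # 40.5 minutes
    "no changes": 42.5,
}


def seconds_above_baseline(label):
    """Time spent beyond the 'no changes' run (assumed to be fixed overhead)."""
    return RESULTS_SECONDS[label] - RESULTS_SECONDS["no changes"]


def seconds_per_percent(label, percent_changed):
    """Time above the baseline per percentage point of changed data."""
    return seconds_above_baseline(label) / percent_changed


if __name__ == "__main__":
    for label, seconds in RESULTS_SECONDS.items():
        print(f"{label:>15}: {seconds:>8.1f} s")
    print("1% run, s per percent: ", seconds_per_percent("1% changes", 1))
    print("10% run, s per percent:", seconds_per_percent("10% changes", 10))
```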

Fixes

  • Fix stats process crash if no leader riak_repl/645
  • Address some minor bugs around establishing SSL connections riak_repl/644
  • 2.0 port of AAE transient filesystem failures riak_repl/640
  • Fix error/retry exit counts on location down messages riak_repl/639
  • Fix deadlock when performing AAE fullsync over SSL (Erlang VM patch)
  • Prevent servers from accepting SSLv3 connections (Erlang VM patch)
  • The map Data Type is now more efficient when merging
  • Fix a case in which sibling explosion could occur during handoff
  • Special handling for the net_ticktime setting in admin scripts node_package/166
  • Add a missing function clause in riak_kv_node that could result in crashes riak_core/693
  • Avoid timeouts when handoff sender is folding over a large number of keys riak_core/627
  • Handoff sender no longer performs extra work after a TCP error has made that work useless riak_core/626
  • Report error when failing to open file instead of crashing when calling riak_core_util:replace_file/2 riak_core/646
  • Debian package fixes
  • Ensure creation of ensembles when strongly consistent bucket types with different n_vals from default bucket type do not yet have buckets riak_kv/1008
  • Avoid SSL deadlocks that occur when sending data bidirectionally using Erlang SSL sockets. The fix is a patch to the Erlang VM shipped with the build.

Merged Pull Requests

Added Repositories

Known Issues

  • Clique can’t handle config values with Cuttlefish transformations

Download

Please see our downloads page.

Feedback

We would love to hear from you. You can reach us via email at info@riak.com.

Riak 2.0.3 Release Notes

Merged PRs