Riak backup solution for a single bucket


What are your recommendations for solutions that back up a single Riak bucket to a file, either by streaming output or by snapshotting?

Backing up just one bucket is going to be a difficult operation in Riak.

All solutions boil down to the following two steps:

  1. List all the objects in the bucket. This is the difficult part, since there is no "manifest" or list of the contents of any bucket anywhere in the cluster.

  2. Issue a GET for each object in the list above and write it to a backup file. This part is usually easy, but for maximum performance you want to issue those GETs in parallel, in a multithreaded fashion, using some kind of connection pooling.
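Step 2 could be sketched roughly as follows in Python. This is only an illustration, not a tested backup tool: `RIAK_URL`, `fetch_object`, `backup_objects`, and the JSON-lines output format are all names and choices made up for this example, and it assumes a node reachable over HTTP on the default port 8098. It uses a thread pool for the parallel GETs but does not implement connection pooling, and it saves only the object body (no content type, indexes, or siblings).

```python
import base64
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import quote
from urllib.request import urlopen

RIAK_URL = "http://127.0.0.1:8098"  # assumption: local node, default HTTP port


def fetch_object(bucket, key):
    """GET one object's raw body over Riak's HTTP interface."""
    url = f"{RIAK_URL}/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    with urlopen(url, timeout=30) as resp:
        return resp.read()


def backup_objects(bucket, keys, out, fetch=fetch_object, workers=16):
    """Fetch `keys` in parallel and write one JSON line per object to `out`.

    `fetch` is injectable so the function can be exercised without a cluster.
    Returns the number of objects written.
    """
    keys = list(keys)  # note: this holds the whole key list in memory
    count = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map runs the fetches concurrently but yields results in key order.
        values = pool.map(lambda k: fetch(bucket, k), keys)
        for key, value in zip(keys, values):
            # Values are arbitrary bytes, so base64-encode them for the text file.
            record = {"key": key, "value": base64.b64encode(value).decode("ascii")}
            out.write(json.dumps(record) + "\n")
            count += 1
    return count
```

Usage would look something like `with open("bucket.jsonl", "w") as f: backup_objects("mybucket", keys, f)`, where `keys` comes from whichever listing approach you choose. For a very large bucket you would want to process the keys in batches rather than materializing them all at once.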

    As far as listing all the objects goes, you have one of three options.

    One is to run a Streaming List Keys operation on the bucket via HTTP (e.g. /buckets/bucket/keys?keys=stream) or Protocol Buffers; see the Riak documentation for details. Under no circumstances should you run a regular, non-streaming List Keys operation. (It will hang your entire cluster, and will eventually either time out or crash once the number of keys grows large enough.)

    Two is to issue a Secondary Index (2i) query to get that object list. See the Riak documentation for discussion and caveats.

    Three, if you are using Riak Search, is to retrieve all the objects through a single paginated search query. (However, Riak Search has a query result limit of 10,000 results, so this approach is far from ideal.)
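For the first option, here is a rough Python sketch of consuming a streaming key list over HTTP. It assumes the response body arrives as a series of concatenated JSON objects, each with a `keys` array, and that chunk boundaries may fall in the middle of an object; check that against your Riak version. The function names, the base URL, and the chunk size are made up for this example.

```python
import codecs
import json
from urllib.parse import quote
from urllib.request import urlopen


def iter_streamed_keys(chunks):
    """Yield keys from an iterable of text chunks of concatenated JSON objects."""
    decoder = json.JSONDecoder()
    buf = ""
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()  # raw_decode does not skip leading whitespace
            if not buf:
                break
            try:
                obj, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                break  # object is incomplete; wait for the next chunk
            buf = buf[end:]
            yield from obj.get("keys", [])


def stream_bucket_keys(bucket, base_url="http://127.0.0.1:8098", chunk_size=65536):
    """Stream the keys of `bucket` without holding the full list in memory."""
    url = f"{base_url}/buckets/{quote(bucket, safe='')}/keys?keys=stream"
    # An incremental decoder copes with multi-byte characters split across reads.
    utf8 = codecs.getincrementaldecoder("utf-8")()

    with urlopen(url) as resp:
        def text_chunks():
            while True:
                data = resp.read(chunk_size)
                if not data:
                    return
                yield utf8.decode(data)

        yield from iter_streamed_keys(text_chunks())
```

Because `stream_bucket_keys` is a generator, you can start issuing GETs for the first keys while the rest of the listing is still arriving.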

    For an example of a standalone app that can back up a single bucket, there is an experimental Java app that combines the Streaming List Keys approach with parallel GETs.
