Struct rocksdb::Options

pub struct Options { /* private fields */ }

Database-wide options around performance and behavior.

Please read the official tuning guide and most importantly, measure performance under realistic workloads with realistic hardware.

Examples

use rocksdb::{Options, DB};
use rocksdb::DBCompactionStyle;

fn badly_tuned_for_somebody_elses_disk() -> DB {
   let path = "path/for/rocksdb/storageX";
   let mut opts = Options::default();
   opts.create_if_missing(true);
   opts.set_max_open_files(10000);
   opts.set_use_fsync(false);
   opts.set_bytes_per_sync(8388608);
   opts.optimize_for_point_lookup(1024);
   opts.set_table_cache_num_shard_bits(6);
   opts.set_max_write_buffer_number(32);
   opts.set_write_buffer_size(536870912);
   opts.set_target_file_size_base(1073741824);
   opts.set_min_write_buffer_number_to_merge(4);
   opts.set_level_zero_stop_writes_trigger(2000);
   opts.set_level_zero_slowdown_writes_trigger(0);
   opts.set_compaction_style(DBCompactionStyle::Universal);
   opts.set_max_background_compactions(4);
   opts.set_max_background_flushes(4);
   opts.set_disable_auto_compactions(true);

   DB::open(&opts, path).unwrap()
}

Implementations

By default, RocksDB uses only one background thread for flush and compaction. Calling this function configures it to use total_threads threads in total. A good value for total_threads is the number of cores. You almost certainly want to call this function if your system is bottlenecked by RocksDB.

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.increase_parallelism(3);

Optimize level style compaction.

Default values for some parameters in Options are not optimized for heavy workloads and big datasets, which means you might observe write stalls under some conditions.

This can be used as one of the starting points for tuning RocksDB options in such cases.

Internally, it sets write_buffer_size, min_write_buffer_number_to_merge, max_write_buffer_number, level0_file_num_compaction_trigger, target_file_size_base, and max_bytes_for_level_base, so it overrides any values previously set for those parameters.

It sets buffer sizes so that memory consumption would be constrained by memtable_memory_budget.

Optimize universal style compaction.

Default values for some parameters in Options are not optimized for heavy workloads and big datasets, which means you might observe write stalls under some conditions.

This can be used as one of the starting points for tuning RocksDB options in such cases.

Internally, it sets write_buffer_size, min_write_buffer_number_to_merge, max_write_buffer_number, level0_file_num_compaction_trigger, target_file_size_base, and max_bytes_for_level_base, so it overrides any values previously set for those parameters.

It sets buffer sizes so that memory consumption would be constrained by memtable_memory_budget.

If true, the database will be created if it is missing.

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.create_if_missing(true);

If true, any column families that didn’t exist when opening the database will be created.

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.create_missing_column_families(true);

Specifies whether an error should be raised if the database already exists.

Default: false

Enable/disable paranoid checks.

If true, the implementation will do aggressive checking of the data it is processing and will stop early if it detects any errors. This may have unforeseen ramifications: for example, a corruption of one DB entry may cause a large number of entries to become unreadable or for the entire DB to become unopenable. If any of the writes to the database fails (Put, Delete, Merge, Write), the database will switch to read-only mode and fail all other Write operations.

Default: false

A list of paths where SST files can be placed, each with a target size. Newer data is placed into paths specified earlier in the vector, while older data gradually moves to paths specified later in the vector.

For example, if you have a flash device with 10GB allocated for the DB as well as a 2TB hard drive, you should configure it as: [{“/flash_path”, 10GB}, {“/hard_drive”, 2TB}]

The system will try to keep the data under each path close to, but not larger than, the target size. However, the current and future file sizes used to decide where to place a file are best-effort estimates, so the actual size under a directory may slightly exceed the target size under some workloads. Users should leave some buffer room for those cases.

If none of the paths has sufficient room to place a file, the file will be placed in the last path anyway, regardless of the target size.

Placing newer data in earlier paths is also best-effort. Users should expect user files to be placed in higher levels in some extreme cases.

If left empty, only one path will be used, which is the path passed when opening the DB.

Default: empty

Use the specified object to interact with the environment, e.g. to read/write files, schedule background work, etc. In the near future, support for doing storage operations such as read/write files through env will be deprecated in favor of file_system.

Default: Env::default()

Sets the compression algorithm that will be used for compressing blocks.

Default: DBCompressionType::Snappy (DBCompressionType::None if snappy feature is not enabled).

Examples
use rocksdb::{Options, DBCompressionType};

let mut opts = Options::default();
opts.set_compression_type(DBCompressionType::Snappy);

Sets the bottom-most compression algorithm that will be used for compressing blocks at the bottom-most level.

Note that to actually enable the bottom-most compression configuration after setting the compression type, it needs to be enabled by calling the [set_bottommost_compression_options] or [set_bottommost_zstd_max_train_bytes] method with the enabled argument set to true.

Examples
use rocksdb::{Options, DBCompressionType};

let mut opts = Options::default();
opts.set_bottommost_compression_type(DBCompressionType::Zstd);
opts.set_bottommost_zstd_max_train_bytes(0, true);

Different levels can have different compression policies. There are cases where most lower levels would like to use quick compression algorithms while the higher levels (which have more data) use compression algorithms that have better compression but could be slower. This array, if non-empty, should have an entry for each level of the database; these override the value specified in the previous field ‘compression’.

Examples
use rocksdb::{Options, DBCompressionType};

let mut opts = Options::default();
opts.set_compression_per_level(&[
    DBCompressionType::None,
    DBCompressionType::None,
    DBCompressionType::Snappy,
    DBCompressionType::Snappy,
    DBCompressionType::Snappy
]);

Maximum size of dictionaries used to prime the compression library. Enabling dictionary can improve compression ratios when there are repetitions across data blocks.

The dictionary is created by sampling the SST file data. If zstd_max_train_bytes is nonzero, the samples are passed through zstd’s dictionary generator. Otherwise, the random samples are used directly as the dictionary.

When compression dictionary is disabled, we compress and write each block before buffering data for the next one. When compression dictionary is enabled, we buffer all SST file data in-memory so we can sample it, as data can only be compressed and written after the dictionary has been finalized. So users of this feature may see increased memory usage.

Default: 0

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_compression_options(4, 5, 6, 7);

Sets compression options for blocks at the bottom-most level. Meaning of all settings is the same as in [set_compression_options] method but affect only the bottom-most compression which is set using [set_bottommost_compression_type] method.

Examples
use rocksdb::{Options, DBCompressionType};

let mut opts = Options::default();
opts.set_bottommost_compression_type(DBCompressionType::Zstd);
opts.set_bottommost_compression_options(4, 5, 6, 7, true);

Sets maximum size of training data passed to zstd’s dictionary trainer. Using zstd’s dictionary trainer can achieve even better compression ratio improvements than using max_dict_bytes alone.

The training data will be used to generate a dictionary of max_dict_bytes.

Default: 0.

Sets maximum size of training data passed to zstd’s dictionary trainer when compressing the bottom-most level. Using zstd’s dictionary trainer can achieve even better compression ratio improvements than using max_dict_bytes alone.

The training data will be used to generate a dictionary of max_dict_bytes.

Default: 0.

If non-zero, we perform bigger reads when doing compaction. If you’re running RocksDB on spinning disks, you should set this to at least 2MB. That way RocksDB’s compaction is doing sequential instead of random reads.

When non-zero, we also force new_table_reader_for_compaction_inputs to true.

Default: 0

Allow RocksDB to pick dynamic base of bytes for levels. With this feature turned on, RocksDB will automatically adjust max bytes for each level. The goal of this feature is to have lower bound on size amplification.

Default: false.

👎Deprecated since 0.5.0: add_merge_operator has been renamed to set_merge_operator

Sets a compaction filter used to determine if entries should be kept, changed, or removed during compaction.

An example use case is to remove entries with an expired TTL.

If you take a snapshot of the database, only values written since the last snapshot will be passed through the compaction filter.

If multi-threaded compaction is used, filter_fn may be called multiple times simultaneously.
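
The TTL use case can be sketched as a plain decision function that is testable without a database. This is a minimal sketch under stated assumptions: the Decision enum, the ttl_decision helper, and the value layout (an 8-byte big-endian expiry timestamp followed by the payload) are all hypothetical. The crate defines its own decision type, and the closure passed as filter_fn must match the signature of the crate version in use.

```rust
/// Hypothetical stand-in for the crate's compaction decision type.
#[derive(Debug, PartialEq)]
enum Decision {
    Keep,
    Remove,
}

/// Decides whether an entry survives compaction.
/// Assumed value layout: 8-byte big-endian expiry (Unix seconds), then payload.
fn ttl_decision(now_secs: u64, value: &[u8]) -> Decision {
    // Values too short to carry a timestamp are kept rather than guessed at.
    if value.len() < 8 {
        return Decision::Keep;
    }
    let mut prefix = [0u8; 8];
    prefix.copy_from_slice(&value[..8]);
    let expiry = u64::from_be_bytes(prefix);
    if expiry <= now_secs {
        Decision::Remove
    } else {
        Decision::Keep
    }
}
```

Because filter_fn may be called from several compaction threads at once, keeping the decision a pure function of its inputs, as here, avoids shared mutable state.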

This is a factory that provides compaction filter objects which allow an application to modify/delete a key-value during background compaction.

A new filter will be created on each compaction run. If multithreaded compaction is being used, each created CompactionFilter will only be used from a single thread and so does not need to be thread-safe.

Default: nullptr

Sets the comparator used to define the order of keys in the table. Default: a comparator that uses lexicographic byte-wise ordering

The client must ensure that the comparator supplied here has the same name and orders keys exactly the same as the comparator provided to previous open calls on the same DB.
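
The ordering contract can be illustrated with a plain comparison function. This is a minimal sketch assuming a hypothetical reverse byte-wise ordering; the name reverse_bytewise is illustrative only, and the exact function type the crate expects depends on its version.

```rust
use std::cmp::Ordering;

/// Hypothetical comparator: reverse lexicographic byte-wise order.
/// It must define a total order and must not change once data has been
/// written with it, because on-disk files are sorted by this function.
fn reverse_bytewise(a: &[u8], b: &[u8]) -> Ordering {
    b.cmp(a)
}
```

Reopening an existing DB with a function that orders keys differently, even under the same name, would violate the requirement stated above.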

👎Deprecated since 0.5.0: add_comparator has been renamed to set_comparator

Sets the optimize_filters_for_hits flag

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_optimize_filters_for_hits(true);

Sets the periodicity when obsolete files get deleted.

The files that go out of scope during the compaction process will still be deleted automatically on every compaction, regardless of this setting.

Default: 6 hours

Prepare the DB for bulk loading.

All data will be in level 0 without any automatic compaction. It’s recommended to manually call CompactRange(NULL, NULL) before reading from the database, because otherwise the read can be very slow.

Sets the number of open files that can be used by the DB. You may need to increase this if your database has a large working set. Value -1 means files opened are always kept open. You can estimate number of files based on target_file_size_base and target_file_size_multiplier for level-based compaction. For universal-style compaction, you can usually set it to -1.

Default: -1

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_open_files(10);

If max_open_files is -1, DB will open all files on DB::Open(). You can use this option to increase the number of threads used to open the files. Default: 16

If true, then every store to stable storage will issue a fsync. If false, then every store to stable storage will issue a fdatasync. This parameter should be set to true while storing data to filesystem like ext3 that can lose files after a reboot.

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_use_fsync(true);

Specifies the absolute info LOG dir.

If it is empty, the log files will be in the same dir as the data. If it is non-empty, the log files will be in the specified dir, and the db data dir’s absolute path will be used as the log file name’s prefix.

Default: empty

Specifies the log level. Consider the LogLevel enum for a list of possible levels.

Default: Info

Examples
use rocksdb::{Options, LogLevel};

let mut opts = Options::default();
opts.set_log_level(LogLevel::Warn);

Allows OS to incrementally sync files to disk while they are being written, asynchronously, in the background. This operation can be used to smooth out write I/Os over time. Users shouldn’t rely on it for persistency guarantee. Issue one request for every bytes_per_sync written. 0 turns it off.

Default: 0

You may consider using rate_limiter to regulate write rate to device. When rate limiter is enabled, it automatically enables bytes_per_sync to 1MB.

This option applies to table files

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_bytes_per_sync(1024 * 1024);

Same as bytes_per_sync, but applies to WAL files.

Default: 0, turned off

Dynamically changeable through SetDBOptions() API.

Sets the maximum buffer size that is used by WritableFileWriter.

On Windows, we need to maintain an aligned buffer for writes. We allow the buffer to grow until its size hits the limit in buffered IO, and fix the buffer size when using direct IO to ensure alignment of write requests if the logical sector size is unusual.

Default: 1024 * 1024 (1 MB)

Dynamically changeable through SetDBOptions() API.

If true, allow multiple writers to update memtables in parallel. Only some memtable factories support concurrent writes; currently it is implemented only for SkipListFactory. Concurrent memtable writes are not compatible with inplace_update_support or filter_deletes. It is strongly recommended to set enable_write_thread_adaptive_yield if you are going to use this feature.

Default: true

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_allow_concurrent_memtable_write(false);

If true, threads synchronizing with the write batch group leader will wait for up to write_thread_max_yield_usec before blocking on a mutex. This can substantially improve throughput for concurrent workloads, regardless of whether allow_concurrent_memtable_write is enabled.

Default: true

Specifies whether an iteration->Next() sequentially skips over keys with the same user-key or not.

This number specifies the number of keys (with the same userkey) that will be sequentially skipped before a reseek is issued.

Default: 8

Enable direct I/O mode for reading. This may or may not improve performance depending on the use case.

Files will be opened in “direct I/O” mode which means that data read from the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by these parameters.

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_use_direct_reads(true);

Enable direct I/O mode for flush and compaction

Files will be opened in “direct I/O” mode, which means that data written to the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by these parameters. This may or may not improve performance depending on the use case.

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_use_direct_io_for_flush_and_compaction(true);

Enable/disable inheritance of open files by child processes.

Default: true

👎Deprecated since 0.7.0: replaced with set_use_direct_reads/set_use_direct_io_for_flush_and_compaction methods

Hints to the OS that it should not buffer disk I/O. Enabling this parameter may improve performance but increases pressure on the system cache.

The exact behavior of this parameter is platform dependent.

On POSIX systems, after RocksDB reads data from disk it will mark the pages as “unneeded”. The operating system may or may not evict these pages from memory, reducing pressure on the system cache. If the disk block is requested again this can result in additional disk I/O.

On Windows systems, files will be opened in “unbuffered I/O” mode, which means that data read from the disk will not be cached or buffered. The hardware buffer of the devices may however still be used. Memory mapped files are not impacted by this parameter.

Default: true

Examples
#[allow(deprecated)]
use rocksdb::Options;

let mut opts = Options::default();
opts.set_allow_os_buffer(false);

Sets the number of shards used for table cache.

Default: 6

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_table_cache_num_shard_bits(4);

By default target_file_size_multiplier is 1, which means by default files in different levels will have similar size.

Dynamically changeable through SetOptions() API

Sets the minimum number of write buffers that will be merged together before writing to storage. If set to 1, then all write buffers are flushed to L0 as individual files, which increases read amplification because a get request has to check all of these files. Also, an in-memory merge may result in writing less data to storage if there are duplicate records in each of these individual write buffers.

Default: 1

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_min_write_buffer_number(2);

Sets the maximum number of write buffers that are built up in memory. The default and the minimum number is 2, so that when 1 write buffer is being flushed to storage, new writes can continue to the other write buffer. If max_write_buffer_number > 3, writing will be slowed down to options.delayed_write_rate if we are writing to the last write buffer allowed.

Default: 2

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_write_buffer_number(4);

Sets the amount of data to build up in memory (backed by an unsorted log on disk) before converting to a sorted on-disk file.

Larger values increase performance, especially during bulk loads. Up to max_write_buffer_number write buffers may be held in memory at the same time, so you may wish to adjust this parameter to control memory usage. Also, a larger write buffer will result in a longer recovery time the next time the database is opened.

Note that write_buffer_size is enforced per column family. See db_write_buffer_size for sharing memory across column families.

Default: 0x4000000 (64MiB)

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_write_buffer_size(128 * 1024 * 1024);
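
Because up to max_write_buffer_number buffers of write_buffer_size bytes may be resident per column family, a rough upper bound on memtable memory can be estimated before choosing these values. This is a minimal sketch: the helper name and the column-family count parameter are assumptions made for illustration, and real usage also includes allocator and arena overhead.

```rust
/// Rough upper bound on memtable memory in bytes:
/// write_buffer_size * max_write_buffer_number * column_families.
/// Returns None if the product overflows u64.
fn memtable_memory_upper_bound(
    write_buffer_size: u64,
    max_write_buffer_number: u64,
    column_families: u64,
) -> Option<u64> {
    write_buffer_size
        .checked_mul(max_write_buffer_number)?
        .checked_mul(column_families)
}
```

For instance, a 128 MiB write buffer with max_write_buffer_number set to 4 and a single column family gives a bound of 512 MiB.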

Amount of data to build up in memtables across all column families before writing to disk.

This is distinct from write_buffer_size, which enforces a limit for a single memtable.

This feature is disabled by default. Specify a non-zero value to enable it.

Default: 0 (disabled)

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_db_write_buffer_size(128 * 1024 * 1024);

Control maximum total data size for a level. max_bytes_for_level_base is the max total for level-1. Maximum number of bytes for level L can be calculated as (max_bytes_for_level_base) * (max_bytes_for_level_multiplier ^ (L-1)). For example, if max_bytes_for_level_base is 200MB and max_bytes_for_level_multiplier is 10, total data size for level-1 will be 200MB, total file size for level-2 will be 2GB, and total file size for level-3 will be 20GB.

Default: 0x10000000 (256MiB).

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_bytes_for_level_base(512 * 1024 * 1024);
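
The per-level size formula for max_bytes_for_level_base can be checked with a few lines of arithmetic. This is a minimal sketch: the function name is hypothetical, and the result is expressed in whatever unit the base is given in.

```rust
/// Maximum total size of level `level` (1-based), following
/// max_bytes_for_level_base * max_bytes_for_level_multiplier^(level - 1).
fn max_size_for_level(base: u64, multiplier: f64, level: u32) -> f64 {
    assert!(level >= 1, "the formula is defined for level 1 and above");
    base as f64 * multiplier.powi(level as i32 - 1)
}
```

With a base of 200 (MB) and a multiplier of 10, levels 1 to 3 come out as 200, 2000, and 20000 MB, matching the 200MB, 2GB, and 20GB figures in the description.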

Default: 10

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_bytes_for_level_multiplier(4.0);

The manifest file is rolled over on reaching this limit. The older manifest file will be deleted. The default value is MAX_INT so that roll-over does not take place.

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_manifest_file_size(20 * 1024 * 1024);

Sets the target file size for compaction. target_file_size_base is the per-file size for level-1. Target file size for level L can be calculated by target_file_size_base * (target_file_size_multiplier ^ (L-1)). For example, if target_file_size_base is 2MB and target_file_size_multiplier is 10, then each file on level-1 will be 2MB, each file on level-2 will be 20MB, and each file on level-3 will be 200MB.

Default: 0x4000000 (64MiB)

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_target_file_size_base(128 * 1024 * 1024);
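
The target file size formula also gives a rough way to estimate how many files a level holds, which can inform the choice of max_open_files. This is a minimal sketch: both helper names are hypothetical, and the file count is only an estimate, since actual file sizes vary.

```rust
/// Target size of a single file on `level` (1-based):
/// target_file_size_base * target_file_size_multiplier^(level - 1).
/// Returns None for level 0 or if the result overflows u64.
fn target_file_size(base: u64, multiplier: u64, level: u32) -> Option<u64> {
    let exponent = level.checked_sub(1)?; // level 0 is outside the formula
    base.checked_mul(multiplier.checked_pow(exponent)?)
}

/// Rough number of files needed to hold `level_bytes`, rounding up.
/// Returns None when `file_size` is zero.
fn estimated_files(level_bytes: u64, file_size: u64) -> Option<u64> {
    if file_size == 0 {
        return None;
    }
    let partial = u64::from(level_bytes % file_size != 0);
    Some(level_bytes / file_size + partial)
}
```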

Sets the minimum number of write buffers that will be merged together before writing to storage. If set to 1, then all write buffers are flushed to L0 as individual files, which increases read amplification because a get request has to check all of these files. Also, an in-memory merge may result in writing less data to storage if there are duplicate records in each of these individual write buffers.

Default: 1

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_min_write_buffer_number_to_merge(2);

Sets the number of files to trigger level-0 compaction. A value < 0 means that level-0 compaction will not be triggered by number of files at all.

Default: 4

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_level_zero_file_num_compaction_trigger(8);

Sets the soft limit on number of level-0 files. We start slowing down writes at this point. A value < 0 means that no writing slow down will be triggered by number of files in level-0.

Default: 20

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_level_zero_slowdown_writes_trigger(10);

Sets the maximum number of level-0 files. We stop writes at this point.

Default: 24

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_level_zero_stop_writes_trigger(48);

Sets the compaction style.

Default: DBCompactionStyle::Level

Examples
use rocksdb::{Options, DBCompactionStyle};

let mut opts = Options::default();
opts.set_compaction_style(DBCompactionStyle::Universal);

Sets the options needed to support Universal Style compactions.

Sets the options for FIFO compaction style.

Setting unordered_write to true trades a relaxed snapshot immutability guarantee for higher write throughput. This violates the repeatability one expects from ::Get from a snapshot, as well as ::MultiGet and Iterator’s consistent-point-in-time view property. If the application cannot tolerate the relaxed guarantees, it can implement its own mechanisms to work around that and yet benefit from the higher throughput. Using TransactionDB with WRITE_PREPARED write policy and two_write_queues=true is one way to achieve immutable snapshots despite unordered_write.

By default, i.e., when it is false, rocksdb does not advance the sequence number for new snapshots unless all the writes with lower sequence numbers are already finished. This provides the immutability that we expect from snapshots. Moreover, since Iterator and MultiGet internally depend on snapshots, the snapshot immutability results in Iterator and MultiGet offering a consistent-point-in-time view. If set to true, although the Read-Your-Own-Write property is still provided, the snapshot immutability property is relaxed: the writes issued after the snapshot is obtained (with larger sequence numbers) will still not be visible to the reads from that snapshot; however, there still might be pending writes (with lower sequence numbers) that will change the state visible to the snapshot after they land in the memtable.

Default: false

Sets maximum number of threads that will concurrently perform a compaction job by breaking it into multiple, smaller ones that are run simultaneously.

Default: 1 (i.e. no subcompactions)

Sets maximum number of concurrent background jobs (compactions and flushes).

Default: 2

Dynamically changeable through SetDBOptions() API.

👎Deprecated since 0.15.0: RocksDB automatically decides this based on the value of max_background_jobs

Sets the maximum number of concurrent background compaction jobs, submitted to the default LOW priority thread pool. We first try to schedule compactions based on base_background_compactions. If the compaction cannot catch up, we will increase the number of compaction threads up to max_background_compactions.

If you’re increasing this, also consider increasing number of threads in LOW priority thread pool. For more information, see Env::SetBackgroundThreads

Default: 1

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_background_compactions(2);

👎Deprecated since 0.15.0: RocksDB automatically decides this based on the value of max_background_jobs

Sets the maximum number of concurrent background memtable flush jobs, submitted to the HIGH priority thread pool.

By default, all background jobs (major compaction and memtable flush) go to the LOW priority pool. If this option is set to a positive number, memtable flush jobs will be submitted to the HIGH priority pool. It is important when the same Env is shared by multiple db instances. Without a separate pool, long running major compaction jobs could potentially block memtable flush jobs of other db instances, leading to unnecessary Put stalls.

If you’re increasing this, also consider increasing number of threads in HIGH priority thread pool. For more information, see Env::SetBackgroundThreads

Default: 1

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_max_background_flushes(2);

Disables automatic compactions. Manual compactions can still be issued on this column family

Default: false

Dynamically changeable through SetOptions() API

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_disable_auto_compactions(true);

SetMemtableHugePageSize sets the huge page size for the arena used by the memtable. If <= 0, it will not allocate from huge pages but from malloc. Users are responsible for reserving huge pages for it to allocate from, for example: sysctl -w vm.nr_hugepages=20. See the Linux doc Documentation/vm/hugetlbpage.txt. If there are not enough free huge pages available, it will fall back to malloc.

Dynamically changeable through SetOptions() API

Sets the maximum number of successive merge operations on a key in the memtable.

When a merge operation is added to the memtable and the maximum number of successive merges is reached, the value of the key will be calculated and inserted into the memtable instead of the merge operation. This will ensure that there are never more than max_successive_merges merge operations in the memtable.

Default: 0 (disabled)

Control locality of bloom filter probes to improve cache miss rate. This option only applies to memtable prefix bloom and plaintable prefix bloom. It essentially limits the max number of cache lines each bloom filter check can touch.

This optimization is turned off when set to 0. The number should never be greater than the number of probes. This option can boost performance for in-memory workloads but should be used with care since it can cause a higher false positive rate.

Default: 0

Enable/disable thread-safe inplace updates.

An update is performed in place only if:

  • the key exists in the current memtable
  • sizeof(new_value) <= sizeof(old_value)
  • old_value for that key is a put, i.e. kTypeValue
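
The three conditions in the list can be read as a single predicate. This is a minimal sketch for illustration only: the ExistingEntry type and the can_update_in_place function are hypothetical and do not correspond to any type exposed by the crate.

```rust
/// Hypothetical summary of the existing memtable entry for a key.
struct ExistingEntry {
    /// Size in bytes of the stored value.
    value_len: usize,
    /// True if the entry is a plain put (kTypeValue), not a merge or delete.
    is_put: bool,
}

/// Returns true when the key is present in the current memtable, the old
/// entry is a put, and the new value is no larger than the old one.
/// `existing` is None when the key is absent from the current memtable.
fn can_update_in_place(existing: Option<&ExistingEntry>, new_value_len: usize) -> bool {
    match existing {
        Some(entry) => entry.is_put && new_value_len <= entry.value_len,
        None => false,
    }
}
```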

Default: false.

Sets the number of locks used for inplace update.

Default: 10000 when inplace_update_support = true, otherwise 0.

Different max-size multipliers for different levels. These are multiplied by max_bytes_for_level_multiplier to arrive at the max-size of each level.

Default: 1

Dynamically changeable through SetOptions() API

If true, then DB::Open() will not fetch and check sizes of all sst files. This may significantly speed up startup if there are many sst files, especially when using non-default Env with expensive GetFileSize(). We’ll still check that all required sst files exist. If paranoid_checks is false, this option is ignored, and sst files are not checked at all.

Default: false

The total maximum size(bytes) of write buffers to maintain in memory including copies of buffers that have already been flushed. This parameter only affects trimming of flushed buffers and does not affect flushing. This controls the maximum amount of write history that will be available in memory for conflict checking when Transactions are used. The actual size of write history (flushed Memtables) might be higher than this limit if further trimming will reduce write history total size below this limit. For example, if max_write_buffer_size_to_maintain is set to 64MB, and there are three flushed Memtables, with sizes of 32MB, 20MB, 20MB. Because trimming the next Memtable of size 20MB will reduce total memory usage to 52MB which is below the limit, RocksDB will stop trimming.
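
The trimming rule in the example above can be sketched as a small simulation. This is a minimal sketch of the rule as described, not RocksDB's actual implementation: the function name is hypothetical, the queue holds flushed memtable sizes with the oldest at the front, and all sizes share one unit (MB here).

```rust
use std::collections::VecDeque;

/// Drops the oldest flushed memtables (front of the queue) as long as the
/// remaining total stays at or above `limit`.
fn trim_write_history(flushed: &mut VecDeque<u64>, limit: u64) {
    let mut total: u64 = flushed.iter().sum();
    while let Some(&oldest) = flushed.front() {
        // Stop if removing the oldest buffer would drop the total below the limit.
        if total - oldest < limit {
            break;
        }
        flushed.pop_front();
        total -= oldest;
    }
}
```

With a limit of 64 and flushed sizes of 20, 20, and 32 (oldest first), nothing is trimmed and 72 MB of history is retained, because removing the oldest 20 MB buffer would leave only 52 MB. With a limit of 0 every flushed buffer is dropped.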

When using an OptimisticTransactionDB: If this value is too low, some transactions may fail at commit time due to not being able to determine whether there were any write conflicts.

When using a TransactionDB: If Transaction::SetSnapshot is used, TransactionDB will read either in-memory write buffers or SST files to do write-conflict checking. Increasing this value can reduce the number of reads to SST files done for conflict detection.

Setting this value to 0 will cause write buffers to be freed immediately after they are flushed. If this value is set to -1, ‘max_write_buffer_number * write_buffer_size’ will be used.

Default: If using a TransactionDB/OptimisticTransactionDB, the default value will be set to the value of ‘max_write_buffer_number * write_buffer_size’ if it is not explicitly set by the user. Otherwise, the default is 0.

By default, a single write thread queue is maintained. The thread that gets to the head of the queue becomes the write batch group leader and is responsible for writing to the WAL and memtable for the batch group.

If enable_pipelined_write is true, separate write thread queues are maintained for WAL writes and memtable writes. A write thread first enters the WAL writer queue and then the memtable writer queue. A pending thread on the WAL writer queue thus only has to wait for previous writers to finish their WAL writing, not the memtable writing. Enabling the feature may improve write throughput and reduce latency of the prepare phase of two-phase commit.

Default: false

Defines the underlying memtable implementation. See official wiki for more information. Defaults to using a skiplist.

Examples
use rocksdb::{Options, MemtableFactory};
let mut opts = Options::default();
let factory = MemtableFactory::HashSkipList {
    bucket_count: 1_000_000,
    height: 4,
    branching_factor: 4,
};

opts.set_allow_concurrent_memtable_write(false);
opts.set_memtable_factory(factory);

Sets the table factory to a CuckooTableFactory (the default table factory is a block-based table factory that provides a default implementation of TableBuilder and TableReader with default BlockBasedTableOptions). See official wiki for more information on this table format.

Examples
use rocksdb::{Options, CuckooTableOptions};

let mut opts = Options::default();
let mut factory_opts = CuckooTableOptions::default();
factory_opts.set_hash_ratio(0.8);
factory_opts.set_max_search_depth(20);
factory_opts.set_cuckoo_block_size(10);
factory_opts.set_identity_as_first_hash(true);
factory_opts.set_use_module_hash(false);

opts.set_cuckoo_table_factory(&factory_opts);

Sets the factory as plain table. See official wiki for more information.

Examples
use rocksdb::{Options, PlainTableFactoryOptions};

let mut opts = Options::default();
let factory_opts = PlainTableFactoryOptions {
  user_key_length: 0,
  bloom_bits_per_key: 20,
  hash_table_ratio: 0.75,
  index_sparseness: 16,
};

opts.set_plain_table_factory(&factory_opts);

Sets the start level to use compression.

Measure IO stats in compactions and flushes, if true.

Default: false

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_report_bg_io_stats(true);

Once write-ahead logs exceed this size, we will start forcing the flush of column families whose memtables are backed by the oldest live WAL file (i.e. the ones that are causing all the space amplification).

Default: 0

Examples
use rocksdb::Options;

let mut opts = Options::default();
// Set max total wal size to 1G.
opts.set_max_total_wal_size(1 << 30);

Recovery mode to control the consistency while replaying WAL.

Default: DBRecoveryMode::PointInTime

Examples
use rocksdb::{Options, DBRecoveryMode};

let mut opts = Options::default();
opts.set_wal_recovery_mode(DBRecoveryMode::AbsoluteConsistency);

If not zero, dump rocksdb.stats to LOG every stats_dump_period_sec.

Default: 600 (10 mins)

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_stats_dump_period_sec(300);

If not zero, persist rocksdb.stats to RocksDB every stats_persist_period_sec.

Default: 600 (10 mins)

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_stats_persist_period_sec(5);

When set to true, reading SST files will opt out of the filesystem’s readahead. Setting this to false may improve sequential iteration performance.

Default: true

Specifies the file access pattern once a compaction is started.

It will be applied to all input files of a compaction.

Default: Normal

Enable/disable adaptive mutex, which spins in user space before resorting to the kernel.

This can reduce context switches when the mutex is not heavily contended. However, if the mutex is hot, we could end up wasting spin time.

Default: false

Sets the number of levels for this database.

When a prefix_extractor is defined through opts.set_prefix_extractor this creates a prefix bloom filter for each memtable with the size of write_buffer_size * memtable_prefix_bloom_ratio (capped at 0.25).

Default: 0

Examples
use rocksdb::{Options, SliceTransform};

let mut opts = Options::default();
let transform = SliceTransform::create_fixed_prefix(10);
opts.set_prefix_extractor(transform);
opts.set_memtable_prefix_bloom_ratio(0.2);
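The size formula can be checked with a few lines of arithmetic. A minimal sketch, assuming the cap is applied to the ratio before multiplying; `memtable_bloom_bytes` is a hypothetical helper, not a crate function.

```rust
/// Hypothetical helper: bytes given to each memtable's prefix bloom filter.
fn memtable_bloom_bytes(write_buffer_size: u64, ratio: f64) -> u64 {
    // The ratio is capped at 0.25; negative input is treated as 0 here.
    let capped = ratio.clamp(0.0, 0.25);
    (write_buffer_size as f64 * capped) as u64
}
```

With a 64 MiB write buffer, any ratio of 0.25 or more yields a 16 MiB filter, and a ratio of 0 (the default) yields no filter at all.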

Sets the maximum number of bytes in all compacted files. RocksDB tries to keep the number of bytes in one compaction below this threshold, but this is not guaranteed.

Value 0 will be sanitized.

Default: target_file_size_base * 25
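A minimal sketch of the sanitization, assuming that "sanitized" means a value of 0 is replaced by the documented default. `effective_max_compaction_bytes` is a hypothetical helper, not a crate function.

```rust
/// Hypothetical helper: the compaction byte limit after sanitization.
fn effective_max_compaction_bytes(configured: u64, target_file_size_base: u64) -> u64 {
    if configured == 0 {
        // Assumption: 0 falls back to the default, target_file_size_base * 25.
        target_file_size_base * 25
    } else {
        configured
    }
}
```

With a target_file_size_base of 64 MiB, leaving the option at 0 gives a limit of 1600 MiB.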

Specifies the absolute path of the directory the write-ahead log (WAL) should be written to.

Default: same directory as the database

Examples
use rocksdb::Options;

let mut opts = Options::default();
opts.set_wal_dir("/path/to/dir");

Sets the WAL ttl in seconds.

The following two options affect how archived logs will be deleted.

  1. If both are set to 0, logs will be deleted asap and will not get into the archive.
  2. If wal_ttl_seconds is 0 and wal_size_limit_mb is not 0, WAL files will be checked every 10 min and if the total size is greater than wal_size_limit_mb, they will be deleted starting with the earliest until size_limit is met. All empty files will be deleted.
  3. If wal_ttl_seconds is not 0 and wal_size_limit_mb is 0, then WAL files will be checked every wal_ttl_seconds / 2 and those that are older than wal_ttl_seconds will be deleted.
  4. If both are not 0, WAL files will be checked every 10 min and both checks will be performed, with the ttl check first.

Default: 0
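The four cases can be written down as a decision table. The sketch below only restates the rules listed above; `WalArchivePolicy` and `wal_archive_policy` are hypothetical names for illustration and are not exposed by the rocksdb crate.

```rust
/// How archived WAL files get cleaned up (illustrative type only).
#[derive(Debug, PartialEq)]
enum WalArchivePolicy {
    /// Case 1: logs are deleted asap and never archived.
    DeleteImmediately,
    /// Case 2: only the total size is checked.
    SizeOnly { check_every_secs: u64, limit_mb: u64 },
    /// Case 3: only the age is checked.
    TtlOnly { check_every_secs: u64, ttl_secs: u64 },
    /// Case 4: both checks run, TTL first.
    TtlThenSize { check_every_secs: u64, ttl_secs: u64, limit_mb: u64 },
}

fn wal_archive_policy(wal_ttl_seconds: u64, wal_size_limit_mb: u64) -> WalArchivePolicy {
    const TEN_MINUTES_SECS: u64 = 600;
    match (wal_ttl_seconds, wal_size_limit_mb) {
        (0, 0) => WalArchivePolicy::DeleteImmediately,
        (0, limit_mb) => WalArchivePolicy::SizeOnly {
            check_every_secs: TEN_MINUTES_SECS,
            limit_mb,
        },
        (ttl_secs, 0) => WalArchivePolicy::TtlOnly {
            // Checked every wal_ttl_seconds / 2.
            check_every_secs: ttl_secs / 2,
            ttl_secs,
        },
        (ttl_secs, limit_mb) => WalArchivePolicy::TtlThenSize {
            check_every_secs: TEN_MINUTES_SECS,
            ttl_secs,
            limit_mb,
        },
    }
}
```

For example, a TTL of one hour with no size limit means archived files are checked every 30 minutes.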

Sets the WAL size limit in MB.

If the total size of WAL files is greater than wal_size_limit_mb, they will be deleted starting with the earliest until size_limit is met.

Default: 0

Sets the number of bytes to preallocate (via fallocate) for the manifest files.

Default is 4MB, which is reasonable to reduce random IO as well as prevent overallocation for mounts that preallocate large amounts of data (such as xfs’s allocsize option).

If true, then DB::Open() will not update the statistics used to optimize compaction decisions, which it otherwise does by loading table properties from many files. Skipping this update improves DB::Open() time, especially in disk environments.

Default: false

Specify the maximal number of info log files to be kept.

Default: 1000

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_keep_log_file_num(100);

Allow the OS to mmap file for writing.

Default: false

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_allow_mmap_writes(true);

Allow the OS to mmap file for reading sst tables.

Default: false

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_allow_mmap_reads(true);

If enabled, WAL is not flushed automatically after each write. Instead it relies on manual invocation of DB::flush_wal() to write the WAL buffer to its file.

Default: false

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_manual_wal_flush(true);

Guarantee that all column families are flushed together atomically. This option applies to both manual flushes (db.flush()) and automatic background flushes caused when memtables are filled.

Note that this is only useful when the WAL is disabled. When using the WAL, writes are always consistent across column families.

Default: false

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_atomic_flush(true);

Sets global cache for table-level rows. Cache must outlive DB instance which uses it.

Default: null (disabled)

Not supported in ROCKSDB_LITE mode!

Used to control the write rate of flush and compaction. Flush has higher priority than compaction. If the rate limiter is enabled, bytes_per_sync is set to 1MB by default.

Default: disabled

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_ratelimiter(1024 * 1024, 100 * 1000, 10);
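The three arguments in the example are not named on this page. Assuming they are, in order, the rate in bytes per second, the refill period in microseconds, and a fairness value, the byte budget granted per refill follows from simple arithmetic. `bytes_per_refill` is a hypothetical helper for that calculation, not a crate function.

```rust
/// Hypothetical helper: bytes made available at each refill of the limiter,
/// assuming a rate in bytes per second and a period in microseconds.
fn bytes_per_refill(rate_bytes_per_sec: i64, refill_period_us: i64) -> i64 {
    rate_bytes_per_sec * refill_period_us / 1_000_000
}
```

Under that reading, the call above allows 1 MiB per second, handed out in chunks of roughly 100 KiB every 100 ms.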

Sets the maximal size of the info log file.

If the log file is larger than max_log_file_size, a new info log file will be created. If max_log_file_size is equal to zero, all logs will be written to one log file.

Default: 0

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_max_log_file_size(0);

Sets the time for the info log file to roll (in seconds).

If specified with a non-zero value, the log file will be rolled if it has been active longer than log_file_time_to_roll.

Default: 0 (disabled)
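Taken together with max_log_file_size, the rolling rules amount to two independent checks. A minimal sketch, assuming a new file is started as soon as either condition holds; `should_roll_info_log` and its first two parameters are hypothetical.

```rust
/// Hypothetical helper: should the current info log file be rolled?
fn should_roll_info_log(
    size_bytes: usize,
    age_secs: u64,
    max_log_file_size: usize,   // 0 = never roll by size
    log_file_time_to_roll: u64, // 0 = never roll by age
) -> bool {
    let too_big = max_log_file_size != 0 && size_bytes > max_log_file_size;
    let too_old = log_file_time_to_roll != 0 && age_secs > log_file_time_to_roll;
    too_big || too_old
}
```

With both options left at 0, the function never returns true, which matches "all logs will be written to one log file".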

Controls the recycling of log files.

If non-zero, previously written log files will be reused for new logs, overwriting the old data. The value indicates how many such files we will keep around at any point in time for later use. This is more efficient because the blocks are already allocated and fdatasync does not need to update the inode after each write.

Default: 0

Examples
use rocksdb::Options;

let mut options = Options::default();
options.set_recycle_log_file_num(5);

Sets the threshold at which all writes will be slowed down to at least delayed_write_rate: this happens once the estimated number of bytes needing compaction exceeds the threshold.

Default: 64GB

Sets the threshold at which all writes are stopped: this happens once the estimated number of bytes needing compaction exceeds the threshold.

Default: 256GB
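The soft and hard limits together define three write states. The sketch below is a simplified illustration of that relationship, not RocksDB's stall logic; `WriteState` and `write_state` are hypothetical names.

```rust
/// Illustrative write states; not a type from the rocksdb crate.
#[derive(Debug, PartialEq)]
enum WriteState {
    Normal,
    /// Writes are slowed down to at least delayed_write_rate.
    Delayed,
    /// Writes are stopped.
    Stopped,
}

fn write_state(pending_compaction_bytes: u64, soft_limit: u64, hard_limit: u64) -> WriteState {
    if pending_compaction_bytes > hard_limit {
        WriteState::Stopped
    } else if pending_compaction_bytes > soft_limit {
        WriteState::Delayed
    } else {
        WriteState::Normal
    }
}
```

With the default limits of 64GB and 256GB, an estimated backlog of 100GB puts writes in the delayed state.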

Sets the size of one block in arena memory allocation.

If <= 0, a proper value is automatically calculated (usually 1/10 of write_buffer_size).

Default: 0

If true, then print malloc stats together with rocksdb.stats when printing to LOG.

Default: false

Enables the whole-key bloom filter in the memtable. Note this will only take effect if memtable_prefix_bloom_size_ratio is not 0. Enabling whole-key filtering can potentially reduce CPU usage for point lookups.

Default: false (disable)

Dynamically changeable through SetOptions() API

Trait Implementations§

Returns a copy of the value.
Performs copy-assignment from source.
Returns the “default value” for a type.
Executes the destructor for this type.
