Crate try_runtime_cli

Try-runtime

Substrate’s ultimate testing framework for the power users.

As the name suggests, try-runtime is a detailed testing framework that gives you a lot of control over what is being executed in which environment. It is recommended that users first familiarize themselves with substrate in depth, particularly the execution model. It is critical to deeply understand how the wasm/native interactions and the runtime apis work in the substrate runtime before starting to work with try-runtime.

The basis of all try-runtime commands is the same: connect to a live node, scrape its state and put it inside a TestExternalities, then call into a specific runtime-api using the given state and some runtime.

The above statement has three variables: the state, the runtime-api, and the runtime. Let’s look at each of them:

State is the key-value pairs of data that comprise the canonical information that any blockchain is keeping. A state can be full (all key-value pairs), or be partial (only pairs related to some pallets). Moreover, some keys are special and are not related to specific pallets, known as well_known_keys in substrate. The most important of these is the :CODE: key, which contains the code used for execution, when wasm execution is chosen.
A runtime-api call is a call into a function defined in the runtime, on top of a given state. Each subcommand of try-runtime utilizes a specific runtime-api.
Finally, the runtime is the actual code that is used to execute the aforementioned runtime-api. All substrate based chains always have two runtimes: native and wasm. The decision of which one is chosen is non-trivial. First, let’s look at the options:
1. Native: this means that the runtime that is in your codebase, aka whatever you see in your editor, is being used. This runtime is easier for diagnostics. We refer to this as the “local runtime”.
2. Wasm: this means that whatever is stored in the :CODE: key of the state that you scrape is being used. At first glance, since the entire state (including :CODE:) is scraped from a remote chain, you could conclude that the wasm runtime, if used, is always equal to the canonical runtime of the live chain (i.e. NOT the “local runtime”). That’s factually true, but then the testing would be quite lame. Typically, with try-runtime, you don’t want to execute whatever code is already on the live chain. Instead, you want your local runtime (which typically includes a non-released feature) to be used. This is why try-runtime overwrites the wasm runtime (at :CODE:) with the local runtime as well. That being said, this behavior can be controlled in certain subcommands with a special flag (--overwrite-wasm-code).
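
The overwrite of :CODE: described above can be pictured with a toy model. This is only a sketch and not try-runtime's implementation: it assumes the scraped state is a plain in-memory map, and State, CODE_KEY and overwrite_wasm_code are hypothetical names chosen for illustration. The key is assumed to be the byte string `:code`, which is how substrate's well-known keys spell it.

```rust
use std::collections::BTreeMap;

/// Toy model of a scraped state: raw keys mapped to raw values.
type State = BTreeMap<Vec<u8>, Vec<u8>>;

/// Key under which the wasm runtime blob lives (assumed to be `:code`).
const CODE_KEY: &[u8] = b":code";

/// Replace the scraped (on-chain) wasm blob with the local one.
/// Returns the blob that was stored before, if any.
fn overwrite_wasm_code(state: &mut State, local_wasm: &[u8]) -> Option<Vec<u8>> {
    state.insert(CODE_KEY.to_vec(), local_wasm.to_vec())
}
```

After the call, every other key of the scraped state is untouched; only the code that a wasm execution would pick up has changed.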

The decision of which runtime is eventually used is based on two facts:

--execution flag. If you specify wasm, then it is always wasm. If you specify native, then the native runtime is used if and ONLY IF the spec versions match. Otherwise, the wasm runtime is used again.
--chain flag (if present in your cli), which determines which local runtime is selected. This will specify:
1. which native runtime is used, if you select --execution Native
2. which wasm runtime is used to replace the :CODE:, if try-runtime is instructed to do so.

All in all, if the term “local runtime” is used in the rest of this crate’s documentation, it means either the native runtime, or the wasm runtime when overwritten inside :CODE:. In other words, it means your… well, “local runtime”, regardless of wasm or native.
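
The selection rule for the --execution flag can be written down as a small decision function. This is a sketch of the rule as stated above, not the executor's actual code; Execution, RuntimeUsed and runtime_used are hypothetical names, and the spec versions are passed in as plain integers.

```rust
/// Value of the `--execution` flag, reduced to the two cases discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Execution {
    Wasm,
    Native,
}

/// Which runtime ends up executing the runtime-api call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RuntimeUsed {
    Wasm,
    Native,
}

/// Native is used only if it was requested AND the spec versions match;
/// in every other case execution falls back to wasm.
fn runtime_used(
    execution: Execution,
    native_spec_version: u32,
    wasm_spec_version: u32,
) -> RuntimeUsed {
    match execution {
        Execution::Wasm => RuntimeUsed::Wasm,
        Execution::Native if native_spec_version == wasm_spec_version => RuntimeUsed::Native,
        Execution::Native => RuntimeUsed::Wasm,
    }
}
```

For example, requesting native execution with versions 9900 and 9901 yields RuntimeUsed::Wasm, which is exactly the silent fallback that the executor=trace logs help you spot.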

See Command for more information about each command’s specific customization flags, and assumptions regarding the runtime being used.

Finally, to make sure there are no errors regarding this, always run any try-runtime command with the executor=trace logging target, which will specify which runtime is being used per api call.

Furthermore, other relevant log targets are: try-runtime::cli, remote-ext, and runtime.

Spec name check

A common pitfall is that you might be running some test on top of the state of chain x, with the runtime of chain y. To avoid this, all commands do a spec-name check before executing anything by default. This compares the spec name of the remote node you are connected to with the spec name of your local runtime and ensures that they match.

Should you need to disable this on certain occasions, a top level flag of --no-spec-name-check can be used.

The spec version is also always inspected, but if it is a mismatch, it will only emit a warning.
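
The two checks can be summarised in a short sketch. It is a model of the behaviour described in this section only, under the assumption that a name mismatch is a hard error and a version mismatch is merely reported; Spec, SpecCheck and check_spec are hypothetical names, not items of this crate.

```rust
/// Minimal stand-in for a runtime's identity (hypothetical type).
struct Spec<'a> {
    name: &'a str,
    version: u32,
}

/// Outcome of a check that did not fail outright.
#[derive(Debug, PartialEq, Eq)]
enum SpecCheck {
    /// Name and version are both acceptable.
    Match,
    /// Versions differ: only worth a warning.
    VersionMismatch,
}

/// `no_spec_name_check` mirrors the `--no-spec-name-check` flag.
fn check_spec(
    remote: &Spec<'_>,
    local: &Spec<'_>,
    no_spec_name_check: bool,
) -> Result<SpecCheck, String> {
    // The name check is fatal unless it was explicitly disabled.
    if !no_spec_name_check && remote.name != local.name {
        return Err(format!(
            "spec name mismatch: remote is `{}`, local is `{}`",
            remote.name, local.name
        ));
    }
    // The version is always inspected, but never fatal.
    if remote.version != local.version {
        Ok(SpecCheck::VersionMismatch)
    } else {
        Ok(SpecCheck::Match)
    }
}
```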

Note on nodes that operate with try-runtime

There are a number of flags that should preferably be set on a running node in order to work well with try-runtime’s expensive RPC queries:

set --rpc-max-payload 1000 to ensure large RPC queries can work.
set --ws-max-out-buffer-capacity 1000 to ensure the websocket connection can handle large RPC queries.
set --rpc-cors all to ensure ws connections can come through.

Note that none of the try-runtime operations need unsafe RPCs.

Migration Best Practices

One of the main use-cases of try-runtime is testing storage migrations. The following points help make sure you can effectively test your migrations with try-runtime.

Adding pre/post hooks

Among the gems that come only with the try-runtime feature flag are the pre_upgrade and post_upgrade hooks for OnRuntimeUpgrade. This trait is implemented either inside the pallet, or manually in a runtime, to define a migration. In both cases, these functions can be added, given the right flag:

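// Runs before the migration; return Err to signal a failed check.
#[cfg(feature = "try-runtime")]
fn pre_upgrade() -> Result<(), &'static str> { Ok(()) }

// Runs after the migration; return Err to signal a failed check.
#[cfg(feature = "try-runtime")]
fn post_upgrade() -> Result<(), &'static str> { Ok(()) }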

(The pallet macro syntax will support this simply as a part of #[pallet::hooks]).

These hooks allow you to execute some code, only within the on-runtime-upgrade command, before and after the migration. If any data needs to be temporarily stored between the pre/post migration hooks, OnRuntimeUpgradeHelpersExt can help with that. Note that you should be mindful of any mutable storage ops in the pre/post migration checks, as you almost certainly will not want to mutate any of the storage that is to be migrated.
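
The flow of pre-check, migration, post-check can be illustrated with a toy model. The sketch below does not use the real OnRuntimeUpgrade trait: Storage, migrate and run_with_checks are hypothetical, the hooks take the storage as an explicit argument, and the value carried from the pre-check to the post-check is passed directly instead of going through temporary storage.

```rust
use std::collections::BTreeMap;

/// Toy storage: account id -> balance. Stands in for real pallet storage.
type Storage = BTreeMap<u32, u64>;

/// The migration under test: doubles every balance (purely illustrative).
fn migrate(storage: &mut Storage) {
    for balance in storage.values_mut() {
        *balance *= 2;
    }
}

/// Read-only check before the migration; returns data for the post-check.
fn pre_upgrade(storage: &Storage) -> Result<usize, &'static str> {
    if storage.is_empty() {
        return Err("nothing to migrate");
    }
    Ok(storage.len())
}

/// Read-only check after the migration, using the data from the pre-check.
fn post_upgrade(storage: &Storage, count_before: usize) -> Result<(), &'static str> {
    if storage.len() != count_before {
        return Err("entry count changed");
    }
    Ok(())
}

/// Pre-check, migrate, post-check, stopping at the first failed check.
fn run_with_checks(storage: &mut Storage) -> Result<(), &'static str> {
    let count_before = pre_upgrade(storage)?;
    migrate(storage);
    post_upgrade(storage, count_before)
}
```

Both checks take a shared reference, so the compiler rules out accidental mutation of the storage being migrated, which is the discipline recommended above.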

Logging

It is super helpful to make sure your migration code uses logging (always with a runtime log target prefix, e.g. runtime::balance) and states exactly which stage it is at, and what it is doing.

Guarding migrations

Always make sure that any migration code is guarded either by StorageVersion, or by some custom storage item, so that it is NEVER executed twice, even if the code lives in two consecutive runtimes.
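
A version guard of this kind can be sketched as follows. This is a toy model rather than frame_support's StorageVersion API: PalletState, TARGET_VERSION and on_runtime_upgrade are hypothetical, and the counter exists only to make the effect of the guard observable.

```rust
/// Toy pallet state (hypothetical): the on-chain storage version plus a
/// counter recording how often the migration body actually ran.
struct PalletState {
    storage_version: u16,
    times_migrated: u32,
}

/// Version this migration upgrades to; it applies only to the version before it.
const TARGET_VERSION: u16 = 2;

/// Returns `true` if the migration ran, `false` if the guard skipped it.
fn on_runtime_upgrade(state: &mut PalletState) -> bool {
    // Guard: only migrate from exactly the previous version.
    if state.storage_version != TARGET_VERSION - 1 {
        return false;
    }
    state.times_migrated += 1; // the actual migration work would go here
    state.storage_version = TARGET_VERSION; // bump so a second run is a no-op
    true
}
```

Calling on_runtime_upgrade twice on a state at version 1 runs the migration body once; the second call sees version 2 and does nothing, even though the same code is still present.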

Examples

Run the migrations of the local runtime on the state of polkadot, from the polkadot repo where we have --chain polkadot-dev, on the latest finalized block’s state:

RUST_LOG=runtime=trace,try-runtime::cli=trace,executor=trace \
    cargo run try-runtime \
    --execution Native \
    --chain polkadot-dev \
    on-runtime-upgrade \
    live \
    --uri wss://rpc.polkadot.io

Same as the previous one, but let’s say we want to run this command from the substrate repo, where we don’t have a matching spec name/version:

# Mind the --no-spec-name-check flag!
RUST_LOG=runtime=trace,try-runtime::cli=trace,executor=trace \
    cargo run try-runtime \
    --execution Native \
    --chain dev \
    --no-spec-name-check \
    on-runtime-upgrade \
    live \
    --uri wss://rpc.polkadot.io

Same as the previous one, but run it on the state of a specific block. This means that the state of this block must not yet have been pruned by rpc.polkadot.io.

# Mind the --no-spec-name-check flag!
RUST_LOG=runtime=trace,try-runtime::cli=trace,executor=trace \
    cargo run try-runtime \
    --execution Native \
    --chain dev \
    --no-spec-name-check \
    on-runtime-upgrade \
    live \
    --uri wss://rpc.polkadot.io \
    --at <block-hash>

Moving on to execute-block and offchain-workers. For these commands, you always need to specify a block hash. For the rest of these examples, we assume we’re in the polkadot repo.

First, let’s assume you are in a branch that has the same spec name/version as the live polkadot network.

RUST_LOG=runtime=trace,try-runtime::cli=trace,executor=trace \
    cargo run try-runtime \
    --execution Wasm \
    --chain polkadot-dev \
    execute-block \
    live \
    --uri wss://rpc.polkadot.io \
    --at <block-hash>

This is wasm, so it will technically execute the code that lives on the live network. Let’s say you want to execute your local runtime. Since you have matching spec versions, you can simply change --execution Wasm to --execution Native to achieve this. Your executor=trace logs should show something along the lines of:

Request for native execution succeeded (native: polkadot-9900 (parity-polkadot-0.tx7.au0), chain: polkadot-9900 (parity-polkadot-0.tx7.au0))

If you don’t have matching spec versions, then you are doomed to execute wasm. In this case, you can manually overwrite the wasm code with your local runtime:

RUST_LOG=runtime=trace,try-runtime::cli=trace,executor=trace \
    cargo run try-runtime \
    --execution Wasm \
    --chain polkadot-dev \
    execute-block \
    live \
    --uri wss://rpc.polkadot.io \
    --at <block-hash> \
    --overwrite-wasm-code

In all of these examples, the block with hash <block-hash> is being executed, and the initial state is the state of its parent. This is because, by omitting ExecuteBlockCmd::block_at, the --at is used for both. This should be good enough for 99% of the cases. The only case where you need to specify block-at and block-ws-uri is with snapshots. Let’s say you have a file snap and you know it corresponds to the state of the parent block of X. Then you’d do:

RUST_LOG=runtime=trace,try-runtime::cli=trace,executor=trace \
    cargo run try-runtime \
    --execution Wasm \
    --chain polkadot-dev \
    execute-block \
    --block-at <x> \
    --block-ws-uri wss://rpc.polkadot.io \
    --overwrite-wasm-code \
    snap \
    -s snap