Black-Box Ledger Testing

Cardano Canonical Ledger State

Note: this document describes the current export/import shape, but some details of the export/import interface and the test runner are still being aligned. The contents are therefore subject to change.

Overview

The goal of the project is to make ledger conformance testing usable by implementers who do not want to depend on the Agda specification or the full internal Haskell test stack. Instead of validating the ledger by inspecting its internals, the suite treats the ledger as a black box:

  1. Start from a canonical snapshot of ledger state.
  2. Apply a sequence of transactions or a block.
  3. Compare the resulting canonical snapshot with the expected one.

This gives other node or ledger implementations a stable, language-agnostic reference corpus for testing their ledger transition function against known-good scenarios.

Why It Matters

The main benefit is practical interoperability testing with a much lower integration cost than the formal-spec pipeline. A consumer only needs to understand the exported canonical format and drive the ledger transition function it already implements.

The project is useful because it:

  1. Provides a concrete reference suite of ledger scenarios instead of a purely internal test harness.
  2. Lets new implementations validate themselves against known-good snapshots without depending on Haskell or Agda tooling.
  3. Supports regression testing as the ledger evolves, because the suite is versioned by protocol version.
  4. Gives a path to derive fresh black-box fixtures from existing cardano-ledger conformance tests.

Exported Test Suite Format

The exporter writes test cases as a directory tree. Each test case lives under a protocol-version directory and preserves the original HSpec path and example description, so the structure remains readable and stable.

The directory layout is:

<base>/metadata.json
<base>/Protocol <version>/<spec path>/<example description>/
  initial-<n>.scls
  txn-<n>.cbor
  block-<n>-issuer.cbor
  block-<n>-tx-<i>.cbor
  final-<n>.scls
  failures-<n>.cbor
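As an illustration, the layout above can be discovered programmatically. The following Python sketch is based only on the naming conventions shown here (the function name and the heuristic of treating any directory containing an initial-<n>.scls file as a test case are assumptions, not a published API):

```python
import glob
import os
import re

def find_case_dirs(base):
    """Yield (protocol_version, case_dir) pairs for an exported suite.

    Assumes the `Protocol <version>` directory naming shown above; a
    directory counts as a test case when it contains at least one
    initial-<n>.scls snapshot.
    """
    for pdir in sorted(glob.glob(os.path.join(base, "Protocol *"))):
        match = re.fullmatch(r"Protocol (\d+)", os.path.basename(pdir))
        if match is None:
            continue
        version = int(match.group(1))
        for root, _dirs, files in os.walk(pdir):
            if any(f.startswith("initial-") and f.endswith(".scls") for f in files):
                yield version, root
```

In practice a runner would rarely need this, since metadata.json is the entry point, but it is handy for validating a suite on disk.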

The metadata.json file is the entry point for the suite. It records one entry per scenario, and each entry lists the fixtures exported for that scenario. Concretely, metadata.json is a JSON array of objects with the following top-level fields:

  • era - era label, for example Conway.
  • era_imp - test implementation label, for example Conway.
  • protocol_version - ledger protocol version as a number.
  • description - HSpec example description.
  • states - array of fixtures for this test case.
  • path - array of HSpec path segments.
  • globals - exported ledger globals used to run the scenario.

Each fixture in states is a JSON object with:

  • epoch_no - the epoch number recorded for the fixture.
  • initial_state - path to the initial .scls snapshot.
  • transactions - a tagged sum type describing either a transaction or a block.
  • final_state - a tagged sum type describing either an expected final snapshot or an expected failure file.

A concrete exported example looks like this in practice:

{
  "era": "Conway",
  "era_imp": "Conway",
  "protocol_version": 9,
  "description": "BodyRefScriptsSizeTooBig",
  "path": ["BBODY"],
  "globals": {
    "fixed_epoch_size": 4320,
    "fixed_slot_length": {
      "get_slot_length": 1
    },
    "slots_per_kes_period": 129600,
    "stability_window": 1620,
    "randomness_stabilisation_window": 2160,
    "security_parameter": 108,
    "max_kes_evo": 62,
    "quorum": 5,
    "max_lovelace_supply": 45000000000000000,
    "active_slot_coeff": {
      "un_active_slot_val": 0.2,
      "un_active_slot_log": -2231435513142097557662950166345559
    },
    "network_id": "Testnet",
    "system_start": "2017-09-23T21:44:51Z"
  },
  "states": [
    {
      "epoch_no": 899,
      "initial_state": "initial-2285.scls",
      "transactions": {
        "tag": "block",
        "contents": [
          "block-2285-issuer.cbor",
          [
            "block-2285-tx-0.cbor",
            "block-2285-tx-1.cbor",
            "block-2285-tx-2.cbor",
            "block-2285-tx-3.cbor",
            "block-2285-tx-4.cbor",
            "block-2285-tx-5.cbor",
            "block-2285-tx-6.cbor",
            "block-2285-tx-7.cbor",
            "block-2285-tx-8.cbor",
            "block-2285-tx-9.cbor",
            "block-2285-tx-10.cbor",
            "block-2285-tx-11.cbor"
          ]
        ]
      },
      "final_state": {
        "tag": "failures",
        "contents": "failures-2285.cbor"
      }
    },
    {
      "epoch_no": 900,
      "initial_state": "initial-2286.scls",
      "transactions": {
        "tag": "tx",
        "contents": "txn-2286.cbor"
      },
      "final_state": {
        "tag": "final_state",
        "contents": "final-2286.scls"
      }
    }
  ]
}

In this export, the first fixture is a block scenario that ends in failures-2285.cbor, and the second fixture is a single-transaction success case that points to a final-*.scls file. For a block fixture, transactions uses the block constructor with the issuer file followed by the per-transaction CBOR files. For a failing scenario, final_state uses the failures branch and points to the failure file instead of a final snapshot.

Each transactions field is a tagged sum type encoded in JSON as an object with tag and contents. The two cases are:

  • Single transaction: {"tag": "tx", "contents": "<txn file>"}
  • Block: {"tag": "block", "contents": ["<issuer file>", ["<tx file>", ...]]}

Each final_state field is also a tagged sum type encoded in JSON as an object with tag and contents. The two cases are:

  • Successful final snapshot: {"tag": "final_state", "contents": "<final .scls file>"}
  • Expected failure payload: {"tag": "failures", "contents": "<failures .cbor file>"}
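Both tagged encodings can be handled with a small dispatch on the tag field. A minimal Python sketch (the function names are illustrative, not part of any specified API):

```python
def transaction_files(transactions):
    """List the CBOR files referenced by a fixture's `transactions` field.

    For a block, the issuer file comes first, followed by the
    per-transaction files; for a single transaction there is just the
    one payload file.
    """
    tag, contents = transactions["tag"], transactions["contents"]
    if tag == "tx":
        return [contents]
    if tag == "block":
        issuer, txs = contents
        return [issuer, *txs]
    raise ValueError(f"unknown transactions tag: {tag}")

def expected_outcome(final_state):
    """Return ("success", <.scls file>) or ("failure", <.cbor file>)."""
    tag, contents = final_state["tag"], final_state["contents"]
    if tag == "final_state":
        return ("success", contents)
    if tag == "failures":
        return ("failure", contents)
    raise ValueError(f"unknown final_state tag: {tag}")
```

Raising on an unknown tag keeps the runner honest if the export format gains new constructors in a later version.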

The exported state files use the canonical-state serialization format (SCLS). In practice, the suite contains:

  1. An initial .scls snapshot.
  2. A transaction payload in CBOR, or a block issuer plus per-transaction CBOR files.
  3. Either a final .scls snapshot or a .cbor file containing the expected failures.

How to Export the Test Suite

The exporter is implemented as a Haskell program that lives in the cardano-ledger repository. At the time of writing, the exporter implementation exists on Tweag’s fork of cardano-ledger, on the export-scls branch. It depends on the internal test harness and the SCLS serialization format.

To build the exporter, you need a working Haskell environment with the necessary dependencies; see the README in the cardano-ledger repository for instructions on setting up the environment and building the project. To run the exporter, set the SCLS_EXPORT_PATH environment variable to the desired output directory, then run the exporter binary with no arguments. It will generate the test suite in that directory.

As a reference, we provide a sample exported test suite in this Google Drive folder: Sample Exported Test Suite. This sample suite contains the same set of scenarios one can export using the instructions above. You can use this sample suite to understand the structure and contents of the exported data, as well as to test your own runner implementations.

What a Foreign-Language Test Runner Must Implement

A test runner in another language should implement the same contract as the reference Haskell test runner flow.

At a minimum, it should:

  1. Read the metadata.json file and understand each scenario’s metadata and the list of fixtures.
  2. For each fixture in each scenario:
    1. Decode the initial SCLS snapshot into an internal ledger state.
    2. Decode the transaction or block payloads from CBOR using the recorded protocol version.
    3. Apply the ledger transition function with the recorded epoch number, globals, and epoch/protocol metadata.
    4. Either:
      • when the result is successful, compare the expected final SCLS snapshot against the resulting state; or
      • when the result is a failure, compare the expected failure payload against the observed failure.

The runner should also respect the protocol-versioned nature of the data. That means the transaction decoding, block decoding, and state comparison logic all need to use the same version that was recorded in the metadata.
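The contract above can be sketched as a driver loop that delegates everything implementation-specific to hooks. The sketch below is a Python illustration under assumed names; the hook methods on impl are not a prescribed API, just the seams every runner needs:

```python
import json
import os

def run_suite(metadata_path, impl):
    """Drive a black-box test runner over an exported suite.

    `impl` supplies the implementation-specific hooks; the method
    names below are illustrative only:
      decode_state(path)                        -> internal ledger state
      decode_transactions(tagged, base, pv)     -> decoded tx/block payload
      apply(state, payload, epoch_no, globals_) -> (succeeded, result)
      states_match(expected_scls_path, state)   -> bool
      failures_match(expected_cbor_path, fail)  -> bool
    """
    base = os.path.dirname(metadata_path)
    with open(metadata_path) as f:
        scenarios = json.load(f)
    report = []
    for scenario in scenarios:
        pv = scenario["protocol_version"]
        for fixture in scenario["states"]:
            state = impl.decode_state(os.path.join(base, fixture["initial_state"]))
            payload = impl.decode_transactions(fixture["transactions"], base, pv)
            succeeded, result = impl.apply(
                state, payload, fixture["epoch_no"], scenario["globals"])
            expected = fixture["final_state"]
            if succeeded:
                # A success must be met by a matching final snapshot.
                passed = (expected["tag"] == "final_state" and
                          impl.states_match(
                              os.path.join(base, expected["contents"]), result))
            else:
                # A failure must be met by a matching failure payload.
                passed = (expected["tag"] == "failures" and
                          impl.failures_match(
                              os.path.join(base, expected["contents"]), result))
            report.append((scenario["description"], fixture["epoch_no"], passed))
    return report
```

Note that a scenario expected to fail but applied successfully (or vice versa) is reported as a test failure, because the tag no longer matches the observed outcome.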

Practical Notes for Implementers

SCLS Format

Details about the SCLS format can be found in the cardano-cls repository and in CIP-0165.

Transaction and Block Formats

The transaction and block payloads are encoded in CBOR using the same formats as the internal Haskell implementation. Take as reference the CDDL definitions in the cardano-ledger repository.

A transaction payload matches the transaction type in the CDDL, and a block payload consists of an issuer file (matching the $vkey type) plus a list of transaction files (each one matching the transaction type).
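As a cheap sanity check before handing payloads to a full CBOR decoder, a runner can inspect the initial byte of each file: per RFC 8949, the major type of the first data item lives in the top three bits. This stdlib-only Python sketch is an illustration of that framing, not a substitute for a real CBOR library:

```python
# CBOR major types, per RFC 8949 section 3.
CBOR_MAJOR_TYPES = {
    0: "unsigned integer", 1: "negative integer", 2: "byte string",
    3: "text string", 4: "array", 5: "map", 6: "tag", 7: "simple/float",
}

def cbor_major_type(payload: bytes) -> str:
    """Name the CBOR major type of the first data item in `payload`."""
    if not payload:
        raise ValueError("empty payload")
    return CBOR_MAJOR_TYPES[payload[0] >> 5]
```

Which major type a given payload should start with depends on the CDDL definition it matches, so consult the CDDL rather than hard-coding expectations.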

Failures Format

When a scenario is expected to fail, the exporter writes the expected failures to a CBOR file whose structure follows the internal Haskell test suite. At the time of writing there is no formal specification of the failure format, but it generally includes the type of failure, the error message, and any relevant context. A CDDL specification for the failure format should be provided in the future to give implementers a clear contract for decoding and comparing failure cases.

Epoch Size and Slot Length

The test scenarios assume a fixed epoch size and slot length, which are recorded in the globals field of the metadata. Implementers should use these values when interpreting the epoch and slot numbers in the scenarios, as well as when applying the ledger transition function.
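With a fixed epoch size, the epoch and slot arithmetic reduces to integer division. A sketch using the fixed_epoch_size of 4320 from the example globals above, assuming slots and epochs are numbered from zero:

```python
def epoch_of_slot(slot: int, epoch_size: int) -> int:
    """Epoch containing the given absolute slot, for a fixed epoch size."""
    return slot // epoch_size

def first_slot_of_epoch(epoch: int, epoch_size: int) -> int:
    """First absolute slot of the given epoch, for a fixed epoch size."""
    return epoch * epoch_size
```

Using the recorded globals rather than hard-coded mainnet values matters here, because the exported scenarios use test-sized epochs.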

State Comparison

If a runner cannot preserve byte-for-byte equality on exported state files, it should still compare the important canonical fields. The Haskell harness already follows that idea by checking the manifest root hash first and then falling back to namespace-level equality checks when the files are not byte-identical. Tools like scls-util can help with this by providing a way to inspect the contents of the SCLS files and compare them at the field level.

For example, you can diff two exported snapshots with scls-util:

scls-util diff ./initial-2286.scls ./final-2286.scls

This prints field-level differences in the canonical state, which is useful when byte-level equality is not expected.

Like other diff tools, it returns a non-zero exit code when differences are found and zero when the files are considered equal, so it can be used in test runners to assert equality of canonical states.
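That exit-code convention makes the comparison easy to wrap from a runner. A Python sketch; the default command assumes scls-util is on the PATH, and the helper works with any diff-style tool that exits zero on equality:

```python
import subprocess

def canonical_states_equal(expected_path, actual_path,
                           diff_cmd=("scls-util", "diff")):
    """True when the diff tool reports the two snapshots as equal.

    Relies only on the exit-code convention described above: zero
    means equal, non-zero means differences were found.
    """
    result = subprocess.run([*diff_cmd, expected_path, actual_path],
                            capture_output=True)
    return result.returncode == 0
```

Capturing the output also lets a runner attach the field-level diff to its test-failure message.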

Reference Test Runner Implementation

For reference, there is a sample test runner implementation in Haskell that can be found in the cardano-ledger repository. This sample runner demonstrates how to read the exported test suite, decode the initial state and transactions, apply the ledger transition function, and compare the results against the expected outcomes. Implementers can use this sample as a guide for building their own test runners in other languages.

Design Alternative: Standalone Runner Executable With Hooks

An initial design was to provide a standalone executable that would run scenarios and call user-provided hooks for loading and storing state.

During implementation, we found that most of the meaningful logic sits inside those hooks and is implementation-dependent, including:

  1. Decoding and mapping canonical state into the target ledger representation.
  2. Wiring transaction and block decoding into era- and version-specific code paths.
  3. Invoking the local ledger transition function with project-specific runtime context.
  4. Mapping and asserting failure values in the host language’s testing framework.
  5. Encoding the expected final state back into the canonical format for comparison.

Because this logic is the bulk of the work and must be integrated with each language’s own test framework, introducing a separate runner executable would add another interface to maintain without materially reducing implementation effort.

For that reason, the current direction is to standardize the exported data and runner contract, while encouraging native test-runner integrations per implementation.

More Documentation