Skip to content

test: quint spec for stf#15

Open
tac0turtle wants to merge 11 commits intomainfrom
marko/quint_trial
Open

test: quint spec for stf#15
tac0turtle wants to merge 11 commits intomainfrom
marko/quint_trial

Conversation

@tac0turtle
Copy link
Contributor

@tac0turtle tac0turtle commented Feb 24, 2026

Overview

This pr is a trial of testing quint for stf formal verification. There are a few simple tests

Summary by CodeRabbit

  • Tests

    • Added comprehensive test infrastructure for STF (State Transition Function) conformance validation, including support modules for core execution, post-transaction handling, and call depth verification.
  • Documentation

    • Added formal specifications defining STF behavior for core operations, post-transaction processing, and call depth limits, enabling model-based testing.
  • Chores

    • Updated build configuration with testing dependencies and added test and specification build recipes.

@coderabbitai
Copy link

coderabbitai bot commented Feb 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3e18f17-f4c6-4904-afc7-8d6625993b25

📥 Commits

Reviewing files that changed from the base of the PR and between cb0dc79 and 51c60eb.

📒 Files selected for processing (3)
  • crates/app/stf/Cargo.toml
  • crates/app/stf/tests/quint_common.rs
  • justfile
🚧 Files skipped from review as they are similar to previous changes (3)
  • crates/app/stf/tests/quint_common.rs
  • crates/app/stf/Cargo.toml
  • justfile

📝 Walkthrough

Walkthrough

This PR introduces formal Quint specifications for STF (State Transition Function) semantics and corresponding Rust test harnesses for conformance validation. Three new specs model call-depth constraints, core ledger operations, and post-transaction execution. Three test modules implement Quint-compatible driver harnesses to replay these specifications against the STF implementation. Configuration updates add token generation and spec conformance test workflows.

Changes

Cohort / File(s) Summary
Formatting & Dependencies
crates/app/sdk/extensions/token/src/lib.rs, crates/app/stf/Cargo.toml
Removed blank line in test code; added dev-dependencies (anyhow, quint-connect, serde) for test infrastructure.
Quint Specifications
specs/stf_call_depth.qnt, specs/stf_core.qnt, specs/stf_post_tx.qnt
Three new formal STF specifications: call-depth constraint model (~193 lines), comprehensive ledger/transaction/block/gas model (~779 lines), and post-tx handler semantics (~392 lines). Each includes actions, invariants, witnesses, and extensive test suites.
Rust Test Harnesses
crates/app/stf/tests/quint_common.rs, crates/app/stf/tests/quint_core_connect.rs, crates/app/stf/tests/quint_post_tx_connect.rs
New test support module providing in-memory storage, noop STF components, and helper utilities; two Quint-compatible Driver harnesses (~438 and ~376 lines) to validate core and post-tx STF behavior via test state models, transaction/block abstractions, result conversion, and error normalization.
Build & Test Workflows
justfile
Added token code generation recipe (evolve_specgen) and four spec-related test recipes (spec-test-core, spec-test-extended, spec-test, spec-traces) for Quint and conformance testing.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hops through specs so shiny new,
Call depths bounded, storage true,
Tests and harnesses align,
Formal models, oh so fine!
Quint and Rust in harmony—
STF conformance verified! 🎉

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 4.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'test: quint spec for stf' clearly and concisely summarizes the primary change: introducing Quint specifications and test infrastructure for STF validation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch marko/quint_trial

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tac0turtle tac0turtle marked this pull request as ready for review March 9, 2026 14:07
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (2)
specs/traces/out_postTxRejectsButKeepsStateTest_0.itf.json (1)

1-1: Normalize volatile generator metadata before committing fixtures.

#meta.description and #meta.timestamp will change on every regeneration without changing STF behavior, so they will create noisy fixture diffs and avoidable merge conflicts. If the harness does not assert on them, strip or normalize those fields before committing the ITF files.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/traces/out_postTxRejectsButKeepsStateTest_0.itf.json` at line 1, The
ITF fixture contains volatile metadata fields "#meta.description" and
"#meta.timestamp" that change on every generation and cause noisy diffs; update
the trace-writing path that emits these records (where the fixture JSON is
produced) to strip or normalize "#meta.description" and "#meta.timestamp" before
writing/committing—either remove those keys entirely or replace them with fixed
placeholders (e.g., empty string or zero) so regenerated traces do not include
volatile values while preserving all functional fields ("#meta", "vars",
"states", etc.).
crates/app/stf/tests/quint_post_tx_connect.rs (1)

223-230: Fail fast on invalid driver steps.

register_account currently accepts duplicates, and apply_registered_tx never checks that the recipient is actually registered. When a trace violates those preconditions, the harness replays an impossible step instead of failing at the boundary.

♻️ Proposed guard rails
 fn register_account(&mut self, recipient: u64) -> Result {
+    if self.accounts().contains(&recipient) {
+        return Err(anyhow!("account {recipient} is already registered"));
+    }
     register_account(
         &mut self.storage,
         AccountId::from_u64(recipient),
         "test_account",
     );
@@
 fn apply_registered_tx(
     &mut self,
     recipient: u64,
     gas_limit: u64,
     fail_e: bool,
     reject_pt: bool,
 ) -> Result {
+    if !self.accounts().contains(&recipient) {
+        return Err(anyhow!(
+            "account {recipient} must be registered before apply_registered_tx"
+        ));
+    }
     let msg = TestMsg {
         key: WRITE_KEY.to_vec(),
         value: WRITE_VALUE.to_vec(),
         fail_after_write: fail_e,
     };

As per coding guidelines Fail fast with assertions for invariants and explicit error handling in Rust.

Also applies to: 232-238

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/app/stf/tests/quint_post_tx_connect.rs` around lines 223 - 230, Update
the test harness to fail fast on invalid driver steps: make register_account
(the harness method) check for duplicate registrations and return Err (or
assert!) when the AccountId already exists instead of silently accepting
duplicates, and update apply_registered_tx to validate that the recipient
AccountId is registered before replaying the transaction, returning an error or
panicking immediately if the precondition is violated; use the existing
register_account helper and apply_registered_tx identifiers to locate the code
and ensure both explicit error handling paths are added so invalid traces fail
at the boundary rather than proceeding.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/app/stf/tests/quint_call_depth_conformance.rs`:
- Around line 155-183: The suite never tests the depth-exceeded path because
every ConformanceCase in known_test_cases() has expect_ok: true; add a failing
case that triggers ERR_CALL_DEPTH_EXCEEDED by appending a ConformanceCase with a
distinctive test_name (e.g., "depthExceededTest"), set requested_depth to a
value above the allowed limit (one greater than the max used by other cases,
e.g., 4), and set expect_ok: false so the branch handling the error (the code
around the current error-path) is exercised and validated.

In `@crates/app/stf/tests/quint_common.rs`:
- Line 14: The test harness imports and types use hashbrown::HashMap causing
nondeterministic iteration; replace that with an ordered map type
(std::collections::BTreeMap) across this file: change the import (remove use
hashbrown::HashMap), update type declarations and return types for CodeStore,
InMemoryStorage, and any storage helper functions that currently use HashMap to
use BTreeMap instead, and update all construction/usage sites (including the
regions around the referenced sections ~185-220 and ~272-302) so iteration order
is deterministic and consistent with list_identifiers().
- Around line 295-297: The code silently wraps out-of-range ITF bytes by using
`as u8` when building `key` and `value` from `key_itf`/`value_itf` in the loop
over `account_store_itf.entries`; replace the unchecked casts with a fallible
conversion (e.g. `u8::try_from(b.as_u64())` or `b.as_u64().try_into()`) and
either propagate or panic with a clear message so malformed traces fail fast
(apply to both the `key` and `value` mappings).

In `@crates/app/stf/tests/quint_conformance.rs`:
- Around line 40-45: The test state struct ItfState is missing the accounts
field so final assertions (in bootstrapTest and bootstrapFailureTest) only check
storage and miss whether senders were registered; add an accounts field to
ItfState matching the test harness account map type used elsewhere (e.g.,
ItfMap<ItfBigInt, ItfAccount> or the appropriate ItfAccount value type) and
update the final assertions in bootstrapTest and bootstrapFailureTest to
explicitly assert the traced account set (presence/absence and any expected
account metadata) in addition to the existing storage checks so a missing sender
registration will fail the test.

In `@crates/app/stf/tests/quint_core_connect.rs`:
- Around line 427-438: The #[quint_run] test quint_core_qc_run depends on an
external quint CLI that is not available in CI; either install the Quint
toolchain in your CI workflow before running tests, or gate the test so it’s
only compiled/executed when the environment provides Quint (e.g., add
#[cfg(feature = "quint")] or #[cfg(env = "QUINT_AVAILABLE")] above the fn
quint_core_qc_run and enable that feature/env only in dev machines/CI that
install Quint); apply the same gating change to the other test that uses
#[quint_run] to avoid breaking CI.

In `@justfile`:
- Around line 252-257: The spec-traces justfile recipe currently only
regenerates traces from stf_core.qnt; update the recipe (spec-traces) to
regenerate traces for all spec files by running quint test against stf_core.qnt,
stf_post_tx.qnt, and stf_call_depth.qnt (or a glob like specs/*.qnt) and remove
old traces as before, and also update the preceding comment to say it
regenerates all spec traces rather than only the core traces; reference the
spec-traces target and the spec filenames stf_core.qnt, stf_post_tx.qnt, and
stf_call_depth.qnt when making the change.

In `@specs/stf_call_depth.qnt`:
- Around line 174-192: Add a new run that exercises the actual call-depth
boundary by repeatedly invoking do_exec (or do_exec_rejected if defined) from
init until the stack length reaches MAX_CALL_DEPTH, assert that the last allowed
call succeeds and call_stack.length() == MAX_CALL_DEPTH and entries are correct,
then attempt one more call that should return ERR_CALL_DEPTH and assert that
call_stack remains unchanged; reference the existing init, do_exec,
do_exec_rejected, call_stack, MAX_CALL_DEPTH, and ERR_CALL_DEPTH symbols when
adding this test so it mirrors the recursiveCallsTest pattern but stops at
MAX_CALL_DEPTH and checks the rejected-call behavior.

In `@specs/stf_core.qnt`:
- Around line 418-441: The transition step currently uses apply_block with an
arbitrary recipient, allowing generation of storage for unregistered accounts
and violating the intended abstraction; update the step transition so execution
goes through the registered-account path (mirror qc_step) by either replacing
the apply_block call with apply_registered_block or constraining the nondet
recipient in action step to only registered accounts (e.g., ensure recipient is
chosen from registered accounts or call apply_registered_block with tx), and
keep register_account as the other non-deterministic choice so
storage_accounts_registered becomes invariant under the transition relation.

In `@specs/stf_post_tx.qnt`:
- Around line 181-198: The action "step" can immediately violate the invariant
"storage_accounts_registered" because the nondeterministic branch calls
apply_txs([tx]) without ensuring the recipient is registered; fix by guarding
the apply_txs branch so it only runs when recipient is already registered or by
composing the branch to perform register_account(recipient) before apply_txs;
specifically update the action step (the nondet branch that invokes
apply_txs([tx]) in the step action) to check membership in accounts (or call
register_account(recipient) first) so that writing storage[recipient] cannot
occur for an unregistered recipient.

---

Nitpick comments:
In `@crates/app/stf/tests/quint_post_tx_connect.rs`:
- Around line 223-230: Update the test harness to fail fast on invalid driver
steps: make register_account (the harness method) check for duplicate
registrations and return Err (or assert!) when the AccountId already exists
instead of silently accepting duplicates, and update apply_registered_tx to
validate that the recipient AccountId is registered before replaying the
transaction, returning an error or panicking immediately if the precondition is
violated; use the existing register_account helper and apply_registered_tx
identifiers to locate the code and ensure both explicit error handling paths are
added so invalid traces fail at the boundary rather than proceeding.

In `@specs/traces/out_postTxRejectsButKeepsStateTest_0.itf.json`:
- Line 1: The ITF fixture contains volatile metadata fields "#meta.description"
and "#meta.timestamp" that change on every generation and cause noisy diffs;
update the trace-writing path that emits these records (where the fixture JSON
is produced) to strip or normalize "#meta.description" and "#meta.timestamp"
before writing/committing—either remove those keys entirely or replace them with
fixed placeholders (e.g., empty string or zero) so regenerated traces do not
include volatile values while preserving all functional fields ("#meta", "vars",
"states", etc.).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e8a79d3a-3d9a-475a-b92e-4ea3a5c05e5c

📥 Commits

Reviewing files that changed from the base of the PR and between adb107e and cb0dc79.

📒 Files selected for processing (35)
  • crates/app/sdk/extensions/token/src/lib.rs
  • crates/app/stf/Cargo.toml
  • crates/app/stf/tests/quint_call_depth_conformance.rs
  • crates/app/stf/tests/quint_common.rs
  • crates/app/stf/tests/quint_conformance.rs
  • crates/app/stf/tests/quint_core_connect.rs
  • crates/app/stf/tests/quint_post_tx_conformance.rs
  • crates/app/stf/tests/quint_post_tx_connect.rs
  • justfile
  • specs/stf_call_depth.qnt
  • specs/stf_core.qnt
  • specs/stf_post_tx.qnt
  • specs/traces/out_blockGasLimitTest_5.itf.json
  • specs/traces/out_bootstrapFailureTest_8.itf.json
  • specs/traces/out_bootstrapTest_7.itf.json
  • specs/traces/out_cannotReturnFromEmptyTest_4.itf.json
  • specs/traces/out_emptyBlockTest_0.itf.json
  • specs/traces/out_executionFailureRollbackTest_3.itf.json
  • specs/traces/out_fullUnwindTest_3.itf.json
  • specs/traces/out_happyPathTest_2.itf.json
  • specs/traces/out_mixedOutcomesTest_6.itf.json
  • specs/traces/out_mixedPostTxTest_3.itf.json
  • specs/traces/out_multiBlockTest_10.itf.json
  • specs/traces/out_nestedCallsTest_1.itf.json
  • specs/traces/out_outOfGasIgnoresPostTxTest_4.itf.json
  • specs/traces/out_outOfGasTest_4.itf.json
  • specs/traces/out_overwriteTest_11.itf.json
  • specs/traces/out_postTxDoesNotMaskExecFailureTest_1.itf.json
  • specs/traces/out_postTxRejectsButKeepsStateTest_0.itf.json
  • specs/traces/out_recursiveCallsTest_5.itf.json
  • specs/traces/out_returnUnwindsStackTest_2.itf.json
  • specs/traces/out_singleCallTest_0.itf.json
  • specs/traces/out_successfulTxTest_1.itf.json
  • specs/traces/out_unregisteredSenderTest_9.itf.json
  • specs/traces/out_validationFailureTest_2.itf.json
💤 Files with no reviewable changes (1)
  • crates/app/sdk/extensions/token/src/lib.rs

Comment on lines +155 to +183
fn known_test_cases() -> Vec<ConformanceCase> {
vec![
ConformanceCase {
test_name: "singleCallTest",
requested_depth: 1,
expect_ok: true,
},
ConformanceCase {
test_name: "nestedCallsTest",
requested_depth: 3,
expect_ok: true,
},
ConformanceCase {
test_name: "returnUnwindsStackTest",
requested_depth: 2,
expect_ok: true,
},
ConformanceCase {
test_name: "fullUnwindTest",
requested_depth: 2,
expect_ok: true,
},
ConformanceCase {
test_name: "recursiveCallsTest",
requested_depth: 3,
expect_ok: true,
},
]
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add a depth-exceeded case.

Every ConformanceCase currently sets expect_ok: true, so the branch at Lines 253-260 is dead and the suite never validates the actual call-depth boundary or ERR_CALL_DEPTH_EXCEEDED. One failing trace/case is needed to make this file test depth enforcement rather than only successful recursion.

Also applies to: 253-260

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/app/stf/tests/quint_call_depth_conformance.rs` around lines 155 - 183,
The suite never tests the depth-exceeded path because every ConformanceCase in
known_test_cases() has expect_ok: true; add a failing case that triggers
ERR_CALL_DEPTH_EXCEEDED by appending a ConformanceCase with a distinctive
test_name (e.g., "depthExceededTest"), set requested_depth to a value above the
allowed limit (one greater than the max used by other cases, e.g., 4), and set
expect_ok: false so the branch handling the error (the code around the current
error-path) is exercised and validated.

AccountsCodeStorage, BeginBlocker, EndBlocker, PostTxExecution, StateChange, TxValidator,
WritableKV,
};
use hashbrown::HashMap;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Use ordered maps in this shared harness.

CodeStore, InMemoryStorage, and the storage helper return types are all HashMap-based. list_identifiers() already exposes that iteration order directly, so the conformance harness is still carrying nondeterminism in the one place it should be deterministic.

♻️ Minimal refactor
-use hashbrown::HashMap;
 use serde::de::DeserializeOwned;
 use serde::Deserialize;
+use std::collections::BTreeMap;
 use std::io::BufReader;
 use std::path::Path;
@@
 pub struct CodeStore {
-    codes: HashMap<String, Box<dyn AccountCode>>,
+    codes: BTreeMap<String, Box<dyn AccountCode>>,
 }
@@
         Self {
-            codes: HashMap::new(),
+            codes: BTreeMap::new(),
         }
@@
 pub struct InMemoryStorage {
-    pub data: HashMap<Vec<u8>, Vec<u8>>,
+    pub data: BTreeMap<Vec<u8>, Vec<u8>>,
 }
@@
 ) -> HashMap<Vec<u8>, Vec<u8>> {
     let prefix = account.as_bytes();
-    let mut result = HashMap::new();
+    let mut result = BTreeMap::new();
@@
 ) -> HashMap<Vec<u8>, Vec<u8>> {
-    let mut expected = HashMap::new();
+    let mut expected = BTreeMap::new();

As per coding guidelines, "Use BTreeMap or BTreeSet instead of HashMap or HashSet to maintain deterministic iteration order".

Also applies to: 185-220, 272-302

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/app/stf/tests/quint_common.rs` at line 14, The test harness imports
and types use hashbrown::HashMap causing nondeterministic iteration; replace
that with an ordered map type (std::collections::BTreeMap) across this file:
change the import (remove use hashbrown::HashMap), update type declarations and
return types for CodeStore, InMemoryStorage, and any storage helper functions
that currently use HashMap to use BTreeMap instead, and update all
construction/usage sites (including the regions around the referenced sections
~185-220 and ~272-302) so iteration order is deterministic and consistent with
list_identifiers().

Comment on lines +40 to +45
#[derive(Deserialize)]
struct ItfState {
block_height: ItfBigInt,
last_result: ItfBlockResult,
storage: ItfMap<ItfBigInt, ItfMap<Vec<ItfBigInt>, Vec<ItfBigInt>>>,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Assert the traced account set too.

ItfState drops accounts, and the final assertions at Lines 673-679 only compare user storage. That makes bootstrapTest and bootstrapFailureTest blind to a missing sender registration, because “account exists with empty storage” and “account does not exist” both satisfy the current checks.

Also applies to: 405-441, 673-679

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/app/stf/tests/quint_conformance.rs` around lines 40 - 45, The test
state struct ItfState is missing the accounts field so final assertions (in
bootstrapTest and bootstrapFailureTest) only check storage and miss whether
senders were registered; add an accounts field to ItfState matching the test
harness account map type used elsewhere (e.g., ItfMap<ItfBigInt, ItfAccount> or
the appropriate ItfAccount value type) and update the final assertions in
bootstrapTest and bootstrapFailureTest to explicitly assert the traced account
set (presence/absence and any expected account metadata) in addition to the
existing storage checks so a missing sender registration will fail the test.

justfile Outdated
Comment on lines +252 to +257
# Regenerate ITF traces from core Quint spec (run after editing specs/stf_core.qnt)
[group('spec')]
spec-traces:
rm -f specs/traces/*.itf.json
quint test specs/stf_core.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

spec-traces recipe only regenerates core traces, missing extended specs.

The recipe only regenerates traces from stf_core.qnt, but the project also has stf_post_tx.qnt and stf_call_depth.qnt specs. If someone edits those specs and runs just spec-traces, the traces will be stale.

Proposed fix to regenerate all spec traces
 # Regenerate ITF traces from core Quint spec (run after editing specs/stf_core.qnt)
 [group('spec')]
 spec-traces:
     rm -f specs/traces/*.itf.json
     quint test specs/stf_core.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"
+    quint test specs/stf_post_tx.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"
+    quint test specs/stf_call_depth.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"

Also update the comment to reflect regenerating all specs:

-# Regenerate ITF traces from core Quint spec (run after editing specs/stf_core.qnt)
+# Regenerate ITF traces from all Quint specs (run after editing any specs/*.qnt)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Regenerate ITF traces from core Quint spec (run after editing specs/stf_core.qnt)
[group('spec')]
spec-traces:
rm -f specs/traces/*.itf.json
quint test specs/stf_core.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"
# Regenerate ITF traces from core Quint spec (run after editing specs/stf_core.qnt)
[group('spec')]
spec-traces:
rm -f specs/traces/*.itf.json
quint test specs/stf_core.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"
quint test specs/stf_post_tx.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"
quint test specs/stf_call_depth.qnt --out-itf "specs/traces/out_{test}_{seq}.itf.json"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@justfile` around lines 252 - 257, The spec-traces justfile recipe currently
only regenerates traces from stf_core.qnt; update the recipe (spec-traces) to
regenerate traces for all spec files by running quint test against stf_core.qnt,
stf_post_tx.qnt, and stf_call_depth.qnt (or a glob like specs/*.qnt) and remove
old traces as before, and also update the preceding comment to say it
regenerates all spec traces rather than only the core traces; reference the
spec-traces target and the spec filenames stf_core.qnt, stf_post_tx.qnt, and
stf_call_depth.qnt when making the change.

Comment on lines +174 to +192
/// Test: cannot return from empty stack.
run cannotReturnFromEmptyTest =
init
.then(return_from_call)
.fail()

/// Test: recursive calls (same account) track depth.
run recursiveCallsTest =
init
.then(do_exec(1))
.then(do_exec(1))
.then(do_exec(1))
.then(all {
assert(call_stack.length() == 3),
assert(call_stack[0] == 1),
assert(call_stack[1] == 1),
assert(call_stack[2] == 1),
stutter,
})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add a run that hits the actual call-depth boundary.

The module models do_exec_rejected, but none of the run blocks exercise it. That leaves the main edge case here untested, so the committed ITF traces will not catch an off-by-one bug at depth 64. Please add a run that fills the stack to MAX_CALL_DEPTH, verifies the last allowed call succeeds, then verifies the next call returns ERR_CALL_DEPTH and leaves call_stack unchanged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/stf_call_depth.qnt` around lines 174 - 192, Add a new run that
exercises the actual call-depth boundary by repeatedly invoking do_exec (or
do_exec_rejected if defined) from init until the stack length reaches
MAX_CALL_DEPTH, assert that the last allowed call succeeds and
call_stack.length() == MAX_CALL_DEPTH and entries are correct, then attempt one
more call that should return ERR_CALL_DEPTH and assert that call_stack remains
unchanged; reference the existing init, do_exec, do_exec_rejected, call_stack,
MAX_CALL_DEPTH, and ERR_CALL_DEPTH symbols when adding this test so it mirrors
the recursiveCallsTest pattern but stops at MAX_CALL_DEPTH and checks the
rejected-call behavior.

Comment on lines +418 to +441
action step = {
nondet id = MC_ACCOUNTS.oneOf()
nondet sender = MC_ACCOUNTS.oneOf()
nondet recipient = MC_ACCOUNTS.oneOf()
nondet gas_limit = MC_GAS_LIMITS.oneOf()
nondet block_gas = MC_BLOCK_GAS_LIMITS.oneOf()
nondet fail_v = Set(true, false).oneOf()
nondet fail_e = Set(true, false).oneOf()
nondet needs_b = Set(true, false).oneOf()
nondet fail_b = Set(true, false).oneOf()
val tx: Tx = {
sender: sender, recipient: recipient, gas_limit: gas_limit,
write_key: K1, write_value: V1,
fail_validate: fail_v, fail_execute: fail_e,
needs_bootstrap: needs_b, fail_bootstrap: fail_b,
}
any {
apply_block({
height: block_height + 1, time: 0,
txs: [tx], gas_limit: block_gas,
}),
register_account(id),
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

step still explores states the runtime rejects.

qc_step already routes execution through apply_registered_block, but step still calls apply_block with an arbitrary recipient. That lets model checking produce storage for unregistered accounts, so storage_accounts_registered is not actually invariant under the current transition relation.

🛠️ One way to align the transition relation
 action step = {
   nondet id = MC_ACCOUNTS.oneOf()
   nondet sender = MC_ACCOUNTS.oneOf()
   nondet recipient = MC_ACCOUNTS.oneOf()
   nondet gas_limit = MC_GAS_LIMITS.oneOf()
   nondet block_gas = MC_BLOCK_GAS_LIMITS.oneOf()
   nondet fail_v = Set(true, false).oneOf()
   nondet fail_e = Set(true, false).oneOf()
   nondet needs_b = Set(true, false).oneOf()
   nondet fail_b = Set(true, false).oneOf()
-  val tx: Tx = {
-    sender: sender, recipient: recipient, gas_limit: gas_limit,
-    write_key: K1, write_value: V1,
-    fail_validate: fail_v, fail_execute: fail_e,
-    needs_bootstrap: needs_b, fail_bootstrap: fail_b,
-  }
   any {
-    apply_block({
-      height: block_height + 1, time: 0,
-      txs: [tx], gas_limit: block_gas,
-    }),
+    apply_registered_block(sender, recipient, gas_limit, block_gas, fail_v, fail_e, needs_b, fail_b),
     register_account(id),
   }
 }

If the looser abstraction is intentional, then INV-5 needs to be relaxed instead of being checked as an invariant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/stf_core.qnt` around lines 418 - 441, The transition step currently
uses apply_block with an arbitrary recipient, allowing generation of storage for
unregistered accounts and violating the intended abstraction; update the step
transition so execution goes through the registered-account path (mirror
qc_step) by either replacing the apply_block call with apply_registered_block or
constraining the nondet recipient in action step to only registered accounts
(e.g., ensure recipient is chosen from registered accounts or call
apply_registered_block with tx), and keep register_account as the other
non-deterministic choice so storage_accounts_registered becomes invariant under
the transition relation.

Comment on lines +181 to +198
action step = {
nondet recipient = MC_ACCOUNTS.oneOf()
nondet gas_limit = Set(1, 100, 10000).oneOf()
nondet fail_e = Set(true, false).oneOf()
nondet reject_pt = Set(true, false).oneOf()
val tx: Tx = {
recipient: recipient,
gas_limit: gas_limit,
write_key: K1,
write_value: V1,
fail_execute: fail_e,
reject_post_tx: reject_pt,
}
any {
apply_txs([tx]),
register_account(recipient),
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

step can violate storage_accounts_registered immediately.

The apply_txs([tx]) branch on Lines 194-197 has no registration guard, so a successful write creates storage[recipient] while accounts stays unchanged. Any invariant/model-checking run over step can therefore falsify storage_accounts_registered by construction.

🔧 Minimal fix
   action step = {
     nondet recipient = MC_ACCOUNTS.oneOf()
     nondet gas_limit = Set(1, 100, 10000).oneOf()
     nondet fail_e = Set(true, false).oneOf()
     nondet reject_pt = Set(true, false).oneOf()
-    val tx: Tx = {
-      recipient: recipient,
-      gas_limit: gas_limit,
-      write_key: K1,
-      write_value: V1,
-      fail_execute: fail_e,
-      reject_post_tx: reject_pt,
-    }
     any {
-      apply_txs([tx]),
+      apply_registered_tx(recipient, gas_limit, fail_e, reject_pt),
       register_account(recipient),
     }
   }

Also applies to: 225-227

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@specs/stf_post_tx.qnt` around lines 181 - 198, The action "step" can
immediately violate the invariant "storage_accounts_registered" because the
nondeterministic branch calls apply_txs([tx]) without ensuring the recipient is
registered; fix by guarding the apply_txs branch so it only runs when recipient
is already registered or by composing the branch to perform
register_account(recipient) before apply_txs; specifically update the action
step (the nondet branch that invokes apply_txs([tx]) in the step action) to
check membership in accounts (or call register_account(recipient) first) so that
writing storage[recipient] cannot occur for an unregistered recipient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant