SMT2 back-end: handle unflatten for arrays when use_as_const is false#8895
SMT2 back-end: handle unflatten for arrays when use_as_const is false#8895tautschnig wants to merge 1 commit intodiffblue:developfrom
Conversation
When using --smt2 --outfile without an explicit solver, CBMC defaults to the GENERIC solver which has use_as_const=false. The unflatten function for array types had PRECONDITION(use_as_const) which caused a crash in this configuration. Provide an alternative code path when use_as_const is false: instead of using the non-standard ((as const ArrayType) value) syntax, use a pre-declared auxiliary base array variable with nested store operations for all array elements. The auxiliary arrays are pre-declared during find_symbols_rec (for type discovery) and in set_to (before define-fun emission), ensuring they are available before any expression context that might need them. Fixes: diffblue#8637 Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
ce172b5 to
ae1e512
Compare
There was a problem hiding this comment.
Pull request overview
Fixes a crash in the SMT2 back-end when emitting --smt2 --outfile without an explicit solver (GENERIC solver, use_as_const=false) by adding a non-as const unflatten path for arrays and extending the regression suite accordingly.
Changes:
- Adds an auxiliary-array mechanism to support array unflattening when
use_as_constis false. - Pre-declares auxiliary base arrays during symbol/type discovery and before
define-funemission. - Adds a regression test and excludes it from the incremental SMT suite via tagging.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/solvers/smt2/smt2_conv.h | Adds state and a helper for managing auxiliary base arrays used by unflatten. |
| src/solvers/smt2/smt2_conv.cpp | Implements auxiliary array declarations and a fallback unflatten path when use_as_const is false. |
| regression/cbmc/smt2-outfile-array/test.desc | Adds a regression test for --smt2 --outfile and tags it out of incremental SMT runs. |
| regression/cbmc/smt2-outfile-array/main.c | New C test exercising array handling that previously triggered the crash. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| std::string smt2_convt::declare_unflatten_base(const array_typet &array_type) | ||
| { | ||
| auto it = unflatten_cache.find(array_type); | ||
| if(it != unflatten_cache.end()) | ||
| return it->second; | ||
| std::string name = "unflatten_base_" + std::to_string(unflatten_counter++); | ||
| out << "(declare-fun " << name << " () "; | ||
| convert_type(array_type); | ||
| out << ")\n"; | ||
| unflatten_cache[array_type] = name; | ||
| return name; | ||
| } |
There was a problem hiding this comment.
declare_unflatten_base emits SMT-LIB commands directly to out, and it’s called from find_symbols_rec, which is conceptually a discovery pass. Emitting (declare-fun ...) during symbol/type discovery risks producing output in an invalid order (e.g., before any required prologue such as set-logic, or interleaved with other command phases), depending on how/when find_symbols_rec is invoked. A more robust approach is to make find_symbols_rec record required auxiliary decls (type -> name) without writing to out, and then emit all recorded (declare-fun ...) commands in a dedicated, well-defined top-level declaration phase (or at least gate emission behind a ‘currently emitting declarations’ state).
| const array_typet &array_type=to_array_type(type); | ||
| find_symbols(array_type.size()); | ||
| find_symbols_rec(array_type.element_type(), recstack); | ||
|
|
||
| // Pre-declare auxiliary base arrays for unflatten | ||
| // when use_as_const is not available. | ||
| if(!use_as_const && array_type.size().is_constant()) | ||
| declare_unflatten_base(array_type); |
There was a problem hiding this comment.
declare_unflatten_base emits SMT-LIB commands directly to out, and it’s called from find_symbols_rec, which is conceptually a discovery pass. Emitting (declare-fun ...) during symbol/type discovery risks producing output in an invalid order (e.g., before any required prologue such as set-logic, or interleaved with other command phases), depending on how/when find_symbols_rec is invoked. A more robust approach is to make find_symbols_rec record required auxiliary decls (type -> name) without writing to out, and then emit all recorded (declare-fun ...) commands in a dedicated, well-defined top-level declaration phase (or at least gate emission behind a ‘currently emitting declarations’ state).
| else | ||
| { | ||
| // use_as_const is not available; use a | ||
| // pre-declared auxiliary base array with stores | ||
| // for all elements | ||
| auto cache_it = unflatten_cache.find(array_type); | ||
| INVARIANT( | ||
| cache_it != unflatten_cache.end(), | ||
| "unflatten auxiliary must be pre-declared"); | ||
|
|
||
| for(mp_integer i = 0; i < size; ++i) | ||
| out << "(store "; | ||
|
|
||
| out << cache_it->second; | ||
|
|
||
| std::size_t offset = 0; | ||
| for(mp_integer i = 0; i < size; ++i, offset += subtype_width) | ||
| { | ||
| out << ' '; | ||
| convert_expr(from_integer(i, array_type.index_type())); | ||
| out << ' '; | ||
| unflatten(wheret::BEGIN, array_type.element_type(), nesting + 1); | ||
| out << "((_ extract " << offset + subtype_width - 1 << " " << offset | ||
| << ") ?ufop" << nesting << ")"; | ||
| unflatten(wheret::END, array_type.element_type(), nesting + 1); | ||
| out << ")"; // store | ||
| } |
There was a problem hiding this comment.
The fallback builds the array by storing all in-range elements onto an unconstrained auxiliary base array. This changes semantics compared to the use_as_const path, where all out-of-range indices are forced to a specific default (element 0). With the unconstrained base, any select at an index not explicitly stored remains unconstrained, which can introduce spurious counterexamples (or otherwise change solver behavior) if such selects can occur. If the intention is semantic equivalence with the previous encoding, the base needs a defined default value; if that’s not expressible without as const, consider (a) documenting/guarding the assumption that only stored indices are ever selected in this mode, or (b) switching to array-theory encoding for these cases to avoid relying on an unconstrained default.
| mp_integer size = | ||
| numeric_cast_v<mp_integer>(to_constant_expr(array_type.size())); |
There was a problem hiding this comment.
The new unflatten path has a hard dependency on unflatten_cache being populated (invariant failure otherwise), but the pre-declaration in find_symbols_rec is conditional on array_type.size().is_constant(), while unflatten also assumes the size is constant (via to_constant_expr). It would be clearer and less brittle to centralize and align these preconditions: either (1) explicitly INVARIANT(array_type.size().is_constant(), ...) in unflatten with a targeted message, and ensure the pre-declaration pass is guaranteed to run for exactly the same set of array types that can reach this path, or (2) change the pre-declaration/caching to be driven by the same code path that decides unflattening is needed (so a missing pre-scan cannot crash).
| auto cache_it = unflatten_cache.find(array_type); | ||
| INVARIANT( | ||
| cache_it != unflatten_cache.end(), | ||
| "unflatten auxiliary must be pre-declared"); |
There was a problem hiding this comment.
The new unflatten path has a hard dependency on unflatten_cache being populated (invariant failure otherwise), but the pre-declaration in find_symbols_rec is conditional on array_type.size().is_constant(), while unflatten also assumes the size is constant (via to_constant_expr). It would be clearer and less brittle to centralize and align these preconditions: either (1) explicitly INVARIANT(array_type.size().is_constant(), ...) in unflatten with a targeted message, and ensure the pre-declaration pass is guaranteed to run for exactly the same set of array types that can reach this path, or (2) change the pre-declaration/caching to be driven by the same code path that decides unflattening is needed (so a missing pre-scan cannot crash).
| if(!use_as_const && array_type.size().is_constant()) | ||
| declare_unflatten_base(array_type); |
There was a problem hiding this comment.
The new unflatten path has a hard dependency on unflatten_cache being populated (invariant failure otherwise), but the pre-declaration in find_symbols_rec is conditional on array_type.size().is_constant(), while unflatten also assumes the size is constant (via to_constant_expr). It would be clearer and less brittle to centralize and align these preconditions: either (1) explicitly INVARIANT(array_type.size().is_constant(), ...) in unflatten with a targeted message, and ensure the pre-declaration pass is guaranteed to run for exactly the same set of array types that can reach this path, or (2) change the pre-declaration/caching to be driven by the same code path that decides unflattening is needed (so a missing pre-scan cannot crash).
| if(!use_as_const && array_type.size().is_constant()) | |
| declare_unflatten_base(array_type); | |
| if(!use_as_const) | |
| { | |
| INVARIANT( | |
| array_type.size().is_constant(), | |
| "unflatten is only supported for arrays with constant size"); | |
| declare_unflatten_base(array_type); | |
| } |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #8895 +/- ##
========================================
Coverage 80.41% 80.41%
========================================
Files 1703 1703
Lines 188398 188429 +31
Branches 73 73
========================================
+ Hits 151502 151530 +28
- Misses 36896 36899 +3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
When using --smt2 --outfile without an explicit solver, CBMC defaults to the GENERIC solver which has use_as_const=false. The unflatten function for array types had PRECONDITION(use_as_const) which caused a crash in this configuration.
The fix provides an alternative code path when use_as_const is false: instead of using the non-standard ((as const ArrayType) value) syntax, it uses a pre-declared auxiliary base array variable with nested store operations for all array elements.
The auxiliary arrays are pre-declared at the top level during find_symbols_rec (for type discovery) and in set_to (before define-fun emission), ensuring they are available before any expression context that might need them.
Exclude smt2-outfile-array test from incremental SMT backend as the test uses --smt2 --outfile which is incompatible with the incremental SMT2 solver backend (--incremental-smt2-solver). Add no-new-smt tag to exclude it from that test suite.
Fixes: #8637