Skip to content

SMT2 back-end: handle unflatten for arrays when use_as_const is false#8895

Open
tautschnig wants to merge 1 commit intodiffblue:developfrom
tautschnig:fix-8637-outfile-unflatten
Open

SMT2 back-end: handle unflatten for arrays when use_as_const is false#8895
tautschnig wants to merge 1 commit intodiffblue:developfrom
tautschnig:fix-8637-outfile-unflatten

Conversation

@tautschnig
Copy link
Copy Markdown
Collaborator

When using --smt2 --outfile without an explicit solver, CBMC defaults to the GENERIC solver which has use_as_const=false. The unflatten function for array types had PRECONDITION(use_as_const) which caused a crash in this configuration.

The fix provides an alternative code path when use_as_const is false: instead of using the non-standard ((as const ArrayType) value) syntax, it uses a pre-declared auxiliary base array variable with nested store operations for all array elements.

The auxiliary arrays are pre-declared at the top level during find_symbols_rec (for type discovery) and in set_to (before define-fun emission), ensuring they are available before any expression context that might need them.

Exclude smt2-outfile-array test from incremental SMT backend as the test uses --smt2 --outfile which is incompatible with the incremental SMT2 solver backend (--incremental-smt2-solver). Add no-new-smt tag to exclude it from that test suite.

Fixes: #8637

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@tautschnig tautschnig self-assigned this Mar 26, 2026
When using --smt2 --outfile without an explicit solver, CBMC defaults
to the GENERIC solver which has use_as_const=false. The unflatten
function for array types had PRECONDITION(use_as_const) which caused
a crash in this configuration.

Provide an alternative code path when use_as_const is false: instead
of using the non-standard ((as const ArrayType) value) syntax, use a
pre-declared auxiliary base array variable with nested store operations
for all array elements.

The auxiliary arrays are pre-declared during find_symbols_rec (for type
discovery) and in set_to (before define-fun emission), ensuring they
are available before any expression context that might need them.

Fixes: diffblue#8637

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the fix-8637-outfile-unflatten branch from ce172b5 to ae1e512 Compare March 26, 2026 19:21
@tautschnig tautschnig changed the title fix: handle unflatten for arrays when use_as_const is false SMT2 back-end: handle unflatten for arrays when use_as_const is false Mar 26, 2026
@tautschnig tautschnig marked this pull request as ready for review March 26, 2026 19:21
Copilot AI review requested due to automatic review settings March 26, 2026 19:21
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a crash in the SMT2 back-end when emitting --smt2 --outfile without an explicit solver (GENERIC solver, use_as_const=false) by adding a non-as const unflatten path for arrays and extending the regression suite accordingly.

Changes:

  • Adds an auxiliary-array mechanism to support array unflattening when use_as_const is false.
  • Pre-declares auxiliary base arrays during symbol/type discovery and before define-fun emission.
  • Adds a regression test and excludes it from the incremental SMT suite via tagging.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
src/solvers/smt2/smt2_conv.h Adds state and a helper for managing auxiliary base arrays used by unflatten.
src/solvers/smt2/smt2_conv.cpp Implements auxiliary array declarations and a fallback unflatten path when use_as_const is false.
regression/cbmc/smt2-outfile-array/test.desc Adds a regression test for --smt2 --outfile and tags it out of incremental SMT runs.
regression/cbmc/smt2-outfile-array/main.c New C test exercising array handling that previously triggered the crash.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +4946 to +4957
std::string smt2_convt::declare_unflatten_base(const array_typet &array_type)
{
auto it = unflatten_cache.find(array_type);
if(it != unflatten_cache.end())
return it->second;
std::string name = "unflatten_base_" + std::to_string(unflatten_counter++);
out << "(declare-fun " << name << " () ";
convert_type(array_type);
out << ")\n";
unflatten_cache[array_type] = name;
return name;
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare_unflatten_base emits SMT-LIB commands directly to out, and it’s called from find_symbols_rec, which is conceptually a discovery pass. Emitting (declare-fun ...) during symbol/type discovery risks producing output in an invalid order (e.g., before any required prologue such as set-logic, or interleaved with other command phases), depending on how/when find_symbols_rec is invoked. A more robust approach is to make find_symbols_rec record required auxiliary decls (type -> name) without writing to out, and then emit all recorded (declare-fun ...) commands in a dedicated, well-defined top-level declaration phase (or at least gate emission behind a ‘currently emitting declarations’ state).

Copilot uses AI. Check for mistakes.
Comment on lines 6078 to +6085
const array_typet &array_type=to_array_type(type);
find_symbols(array_type.size());
find_symbols_rec(array_type.element_type(), recstack);

// Pre-declare auxiliary base arrays for unflatten
// when use_as_const is not available.
if(!use_as_const && array_type.size().is_constant())
declare_unflatten_base(array_type);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

declare_unflatten_base emits SMT-LIB commands directly to out, and it’s called from find_symbols_rec, which is conceptually a discovery pass. Emitting (declare-fun ...) during symbol/type discovery risks producing output in an invalid order (e.g., before any required prologue such as set-logic, or interleaved with other command phases), depending on how/when find_symbols_rec is invoked. A more robust approach is to make find_symbols_rec record required auxiliary decls (type -> name) without writing to out, and then emit all recorded (declare-fun ...) commands in a dedicated, well-defined top-level declaration phase (or at least gate emission behind a ‘currently emitting declarations’ state).

Copilot uses AI. Check for mistakes.
Comment on lines +5016 to +5042
else
{
// use_as_const is not available; use a
// pre-declared auxiliary base array with stores
// for all elements
auto cache_it = unflatten_cache.find(array_type);
INVARIANT(
cache_it != unflatten_cache.end(),
"unflatten auxiliary must be pre-declared");

for(mp_integer i = 0; i < size; ++i)
out << "(store ";

out << cache_it->second;

std::size_t offset = 0;
for(mp_integer i = 0; i < size; ++i, offset += subtype_width)
{
out << ' ';
convert_expr(from_integer(i, array_type.index_type()));
out << ' ';
unflatten(wheret::BEGIN, array_type.element_type(), nesting + 1);
out << "((_ extract " << offset + subtype_width - 1 << " " << offset
<< ") ?ufop" << nesting << ")";
unflatten(wheret::END, array_type.element_type(), nesting + 1);
out << ")"; // store
}
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback builds the array by storing all in-range elements onto an unconstrained auxiliary base array. This changes semantics compared to the use_as_const path, where all out-of-range indices are forced to a specific default (element 0). With the unconstrained base, any select at an index not explicitly stored remains unconstrained, which can introduce spurious counterexamples (or otherwise change solver behavior) if such selects can occur. If the intention is semantic equivalence with the previous encoding, the base needs a defined default value; if that’s not expressible without as const, consider (a) documenting/guarding the assumption that only stored indices are ever selected in this mode, or (b) switching to array-theory encoding for these cases to avoid relying on an unconstrained default.

Copilot uses AI. Check for mistakes.
Comment on lines 4986 to 4987
mp_integer size =
numeric_cast_v<mp_integer>(to_constant_expr(array_type.size()));
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new unflatten path has a hard dependency on unflatten_cache being populated (invariant failure otherwise), but the pre-declaration in find_symbols_rec is conditional on array_type.size().is_constant(), while unflatten also assumes the size is constant (via to_constant_expr). It would be clearer and less brittle to centralize and align these preconditions: either (1) explicitly INVARIANT(array_type.size().is_constant(), ...) in unflatten with a targeted message, and ensure the pre-declaration pass is guaranteed to run for exactly the same set of array types that can reach this path, or (2) change the pre-declaration/caching to be driven by the same code path that decides unflattening is needed (so a missing pre-scan cannot crash).

Copilot uses AI. Check for mistakes.
Comment on lines +5021 to +5024
auto cache_it = unflatten_cache.find(array_type);
INVARIANT(
cache_it != unflatten_cache.end(),
"unflatten auxiliary must be pre-declared");
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new unflatten path has a hard dependency on unflatten_cache being populated (invariant failure otherwise), but the pre-declaration in find_symbols_rec is conditional on array_type.size().is_constant(), while unflatten also assumes the size is constant (via to_constant_expr). It would be clearer and less brittle to centralize and align these preconditions: either (1) explicitly INVARIANT(array_type.size().is_constant(), ...) in unflatten with a targeted message, and ensure the pre-declaration pass is guaranteed to run for exactly the same set of array types that can reach this path, or (2) change the pre-declaration/caching to be driven by the same code path that decides unflattening is needed (so a missing pre-scan cannot crash).

Copilot uses AI. Check for mistakes.
Comment on lines +6084 to +6085
if(!use_as_const && array_type.size().is_constant())
declare_unflatten_base(array_type);
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new unflatten path has a hard dependency on unflatten_cache being populated (invariant failure otherwise), but the pre-declaration in find_symbols_rec is conditional on array_type.size().is_constant(), while unflatten also assumes the size is constant (via to_constant_expr). It would be clearer and less brittle to centralize and align these preconditions: either (1) explicitly INVARIANT(array_type.size().is_constant(), ...) in unflatten with a targeted message, and ensure the pre-declaration pass is guaranteed to run for exactly the same set of array types that can reach this path, or (2) change the pre-declaration/caching to be driven by the same code path that decides unflattening is needed (so a missing pre-scan cannot crash).

Suggested change
if(!use_as_const && array_type.size().is_constant())
declare_unflatten_base(array_type);
if(!use_as_const)
{
INVARIANT(
array_type.size().is_constant(),
"unflatten is only supported for arrays with constant size");
declare_unflatten_base(array_type);
}

Copilot uses AI. Check for mistakes.
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.41%. Comparing base (59d2211) to head (ae1e512).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #8895   +/-   ##
========================================
  Coverage    80.41%   80.41%           
========================================
  Files         1703     1703           
  Lines       188398   188429   +31     
  Branches        73       73           
========================================
+ Hits        151502   151530   +28     
- Misses       36896    36899    +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Printing SMT file with Z3 backend crashes CBMC

2 participants