Skip to content

InsertSync: guard sync motion by branch predicate domain#515

Open
TaoTao-real wants to merge 1 commit intohw-native-sys:mainfrom
TaoTao-real:codex/issue-nested-if-sync-motion-guard-clean
Open

InsertSync: guard sync motion by branch predicate domain#515
TaoTao-real wants to merge 1 commit intohw-native-sys:mainfrom
TaoTao-real:codex/issue-nested-if-sync-motion-guard-clean

Conversation

@TaoTao-real
Copy link
Copy Markdown
Contributor

Summary

  • Add predicate-domain guard in MoveSyncState so sync motion (hoist/sink) only happens when target anchor is in the same effective branch predicate domain as the sync's paired peer (set/wait).
  • Add regression test issue_nested_if_predicate_sync_motion_guard.pto to lock nested-if behavior.

Background

  • In complex nested-if scenarios, moving wait/set across the nearest enclosing control-flow predicate can cause mismatch risk (e.g. outer-if false path executes moved sync while original producer path does not).
  • This PR makes motion conservative for predicate crossing while preserving existing legal loop/if motion opportunities inside the same predicate domain.

Design

  • New helpers in MoveSyncState:
    • GetBranchPredicateKey(syncIRIndex)
    • HasSameBranchPredicate(lhs, rhs)
    • CanMoveToPredicateCompatibleAnchor(sync, targetIndex)
  • Motion paths guarded:
    • PlanMoveOutIfWaitSync
    • PlanMoveOutIfSetSync
    • PlanMoveOutWaitSync
    • PlanMoveOutSetSync

Validation

  • Built ptoas locally on this branch.
  • Targeted checks:
    • New nested-if repro: generated C++ shows set_flag/wait_flag(PIPE_MTE1, PIPE_M, EVENT_ID0) remain inside outer if before inner if.
    • Existing regressions re-generated and inspected:
      • issue454_loop_if_else_loop_carried_sync_regression.pto
      • issue454_nested_loop_same_pipe_pair_regression.pto

Notes

  • Environment lacks FileCheck binary in this worktree, so validation used generated C++ structural inspection (rg on emitted output).

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a branch-predicate guard to ensure that synchronization operations remain within the same execution predicate as their dependency anchors during motion optimizations. It adds logic to calculate predicate keys based on enclosing branch structures and uses these keys to validate whether a sync operation can be safely hoisted or sunk across branch or loop boundaries. A regression test case for nested-if sync motion is also included. The review feedback highlights potential performance issues due to $O(N^2)$ complexity in predicate key calculation, logical fragility in handling default dependency indices, and the need to use push_front instead of push_back to maintain the original program order of sync operations when they are not moved.

Comment on lines +23 to +55
MoveSyncState::GetBranchPredicateKey(unsigned int syncIRIndex) const {
SmallVector<int, 8> key;
if (syncIRIndex >= syncIR_.size()) {
return key;
}

for (const auto &element : syncIR_) {
auto *branch = dyn_cast<BranchInstanceElement>(element.get());
if (!branch || branch->getBranchKind() != KindOfBranch::IF_BEGIN) {
continue;
}
if (!(branch->beginId < branch->endId)) {
continue;
}
if (syncIRIndex <= branch->beginId || syncIRIndex >= branch->endId) {
continue;
}

// Encode each enclosing if's side into a stable token:
// token*4+0 => then-side
// token*4+1 => else-side
// token*4+2 => else-boundary marker
const int token = static_cast<int>(branch->beginId) + 1;
if (syncIRIndex < branch->branchId) {
key.push_back(token * 4 + 0);
} else if (syncIRIndex > branch->branchId) {
key.push_back(token * 4 + 1);
} else {
key.push_back(token * 4 + 2);
}
}
return key;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation of GetBranchPredicateKey iterates over the entire syncIR_ vector on every call to find enclosing branches. Since this function is called multiple times for every sync operation during the transformation (via HasSameBranchPredicate), it results in $O(N^2)$ complexity relative to the IR size.

Consider pre-calculating these predicate keys once at the beginning of the Run() method or implementing a cache to store the keys for each instruction index. This will significantly improve performance for large IRs.

Comment on lines +89 to +93
const unsigned depIndex = sync->GetDepSyncIRIndex();
if (depIndex >= syncIR_.size()) {
return false;
}
return HasSameBranchPredicate(depIndex, targetIndex);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The depSyncIRIndex in the SyncOperation class defaults to 0. If a barrier-type sync operation has not had its dependency anchor explicitly set, this fallback logic will compare the target predicate against the predicate of the first instruction in the IR (index 0). While this is likely conservative (as the first instruction is usually in the root predicate domain), it is logically fragile. It would be safer to initialize depSyncIRIndex_ to an invalid value (e.g., ~0U) and check for that value here before proceeding.

if ((waitSync->GetSyncIRIndex() >= pair.second) ||
(waitSync->GetSyncIRIndex() <= pair.first)) {
if (!CanMoveToPredicateCompatibleAnchor(s, bound.second)) {
newPipeAfter.push_back(s);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since the caller PlanMoveOutBranchSync iterates over e->pipeAfter in reverse order (line 157), using push_back here will cause the relative order of sync operations that are not moved to be reversed when they are reassigned to e->pipeAfter. To maintain the original program order for these syncs, you should use push_front instead.

Suggested change
newPipeAfter.push_back(s);
newPipeAfter.push_front(s);

if ((waitSync->GetSyncIRIndex() > pair.second) ||
(waitSync->GetSyncIRIndex() < pair.first)) {
if (!CanMoveToPredicateCompatibleAnchor(s, pair.second)) {
newPipeAfter.push_back(s);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Similar to the logic in branch motion, the caller MoveOutSync iterates over e->pipeAfter in reverse order (line 270). Using push_back for syncs that stay in place will reverse their original relative order. Use push_front to preserve the order in the newPipeAfter deque.

Suggested change
newPipeAfter.push_back(s);
newPipeAfter.push_front(s);

@reedhecre
Copy link
Copy Markdown

Codex Review

该评论由 review 机器人自动更新。

  • PR: InsertSync: guard sync motion by branch predicate domain #515 InsertSync: guard sync motion by branch predicate domain
  • Author: TaoTao-real
  • Base/Head: main / codex/issue-nested-if-sync-motion-guard-clean
  • Head SHA: 884da7f15bfd
  • Trigger: 检测到新的 open PR
  • Generated At: 2026-04-18T16:00:22Z
  • Status: completed

Summary

未检查到 PR #515 存在问题

Findings

No issues found.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants