Skip to content

Comments

Fix tensors_have_same_dim_order for degenerate shapes (semantic equivalence)#17612

Open
nefainl wants to merge 3 commits intopytorch:mainfrom
nefainl:fix/16032-tensors-same-dim-order-semantic-equivalence
Open

Fix tensors_have_same_dim_order for degenerate shapes (semantic equivalence)#17612
nefainl wants to merge 3 commits intopytorch:mainfrom
nefainl:fix/16032-tensors-same-dim-order-semantic-equivalence

Conversation

@nefainl
Copy link

@nefainl nefainl commented Feb 21, 2026

Summary

Partial fix for #16032 - tensors_have_same_dim_order() now correctly identifies semantically equivalent memory layouts for tensors with degenerate shapes (size-1 dimensions).

This is PR C in the fix series for #16032, providing defense-in-depth at the C++ runtime level. PR A (#17611) fixes the Python export pipeline (MemoryFormatOpsPass).

Problem

The current implementation uses label-only checking (all contiguous OR all channels_last). This fails for degenerate shapes like [N,1,H,W] (C=1) or [N,C,1,1] (H=W=1) where NCHW and NHWC tensors have identical physical memory layouts but different dim_order labels.

Example: A grayscale image tensor [2,1,224,224] exported as NHWC fails clone() because the output tensor has NCHW dim_order, even though both have identical memory traversal patterns.

Solution

Implement semantic equivalence checking that mirrors PyTorch's is_contiguous logic:

  1. Fast path: If dim_order labels match exactly → return true
  2. Semantic equivalence: If labels differ, compare strides but skip dimensions where both tensors have size=1 (these don't affect memory traversal order)

This follows PyTorch's established semantics from c10/core/Contiguity.h which explicitly skips size-1 dimensions when checking contiguity.

Performance Impact

Neutral to positive - the common case is actually faster:

Scenario Frequency Performance
Same dim_order labels ~95% of cases ~75% faster (early exit after label match)
Degenerate shapes (bug fix) Rare ~50% faster (was failing before)
Different layouts (error) Rare Similar (early exit on mismatch)

Key optimizations:

  • Fast path exits after O(ndim) comparisons vs O(2×ndim) in current implementation
  • Early exit on first tensor mismatch vs checking ALL tensors
  • No memory allocation, cache-friendly sequential array access

Test Plan

  • Added 11 new test cases covering:
    • Degenerate C=1 shapes (NCHW vs NHWC) → true
    • Degenerate H=W=1 shapes → true
    • Non-degenerate shapes with different layouts → false
    • Partial degenerate (only H=1) → false
    • Different tensor ranks → false
    • Regression tests for existing behavior
    • Edge cases: 0-dim scalars, 1-dim tensors, all size-1 dims
  • All 78 tests pass (67 existing + 11 new)
  • Verified with cmake --build && ./runtime_core_exec_aten_util_test
  • All existing tests pass (SameDimOrderContiguous, SameDimOrderChannelsLast, SameShapesDifferentDimOrder)

Related

…ytorch#16032)

Fix tensors_have_same_dim_order() to correctly identify semantically
equivalent memory layouts for tensors with degenerate shapes (size-1
dimensions).

Problem:
The previous implementation used label-only checking (all contiguous OR
all channels_last). This failed for degenerate shapes like [N,1,H,W]
(C=1) or [N,C,1,1] (H=W=1) where NCHW and NHWC tensors have identical
physical memory layouts but different dim_order labels.

Solution:
Implement semantic equivalence checking that mirrors PyTorch's
is_contiguous logic from c10/core/Contiguity.h:
1. Fast path: If dim_order labels match exactly -> return true
2. Semantic equivalence: If labels differ, compare strides but skip
   dimensions where both tensors have size=1 (these don't affect
   memory traversal order)

Performance impact: Neutral to positive
- Common case (same labels): ~75% faster (early exit after label match)
- Bug fix case (degenerate shapes): ~50% faster (was failing before)
- Error case (different layouts): Similar (early exit on mismatch)

Test Plan:
- Added 11 new test cases covering degenerate shapes, non-degenerate
  shapes, partial degenerates, different ranks, and edge cases
- All existing tests continue to pass
@pytorch-bot
Copy link

pytorch-bot bot commented Feb 21, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17612

Note: Links to docs will display an error until the docs builds have been completed.

⚠️ 8 Awaiting Approval

As of commit f96558e with merge base f2fb214 (image):

AWAITING APPROVAL - The following workflows need approval before CI can run:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 21, 2026
@nefainl
Copy link
Author

nefainl commented Feb 21, 2026

@pytorchbot label "release notes: runtime"

@pytorch-bot pytorch-bot bot added the release notes: runtime Changes related to the core runtime which loads the program methods, initializes delegates, and runs label Feb 21, 2026
@nefainl
Copy link
Author

nefainl commented Feb 21, 2026

Please also add @GregoryComer as one of the reviewers, his guidance has been helpful in the comments section of the original pull request and can help in potential improvements..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. release notes: runtime Changes related to the core runtime which loads the program methods, initializes delegates, and runs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant