
[ET-VK][q8ta] Fix addmm arg indexing in QuantizedLinearMatch #17567

Open
SS-JIA wants to merge 6 commits into gh/SS-JIA/441/base from gh/SS-JIA/441/head

Conversation

SS-JIA (Contributor) commented Feb 19, 2026

Stack from ghstack (oldest at bottom):

QuantizedLinearMatch always used args[1] for the weight and args[0] for the
input, which is correct for mm(input, weight) and linear(input, weight, bias?)
but wrong for addmm(bias, input, weight) where the weight is at args[2] and the
input is at args[1].
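
The argument orderings can be sanity-checked with plain PyTorch; the snippet below is only an illustration of the arg positions (shapes are arbitrary) and is independent of the ExecuTorch pattern matcher. Note that in the edge dialect the addmm weight arg is typically the transposed weight produced by a permute node, but only the index positions matter here:

```python
import torch

# Arbitrary shapes for illustration: batch 4, in_features 8, out_features 16.
inp = torch.randn(4, 8)
weight = torch.randn(16, 8)   # nn.Linear layout: (out_features, in_features)
bias = torch.randn(16)

# linear(input, weight, bias): input = args[0], weight = args[1]
out_linear = torch.nn.functional.linear(inp, weight, bias)

# mm(input, weight): input = args[0], weight = args[1]
out_mm = torch.mm(inp, weight.t()) + bias

# addmm(bias, input, weight): bias = args[0], input = args[1], weight = args[2]
out_addmm = torch.addmm(bias, inp, weight.t())

assert torch.allclose(out_linear, out_mm)
assert torch.allclose(out_linear, out_addmm)
```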

This was exposed by a torchao change (D69887498) that added Linear+BatchNorm
fusion to prepare_pt2e(). The fusion adds a bias to Linear nodes that previously
had none, causing them to decompose to addmm instead of mm in the edge dialect.
The pattern matcher then read the input's per-tensor dequantize scale (a float
literal) as if it were the weight's per-channel scale (a Node), causing an
assertion failure.

The fix determines the correct arg indices based on whether the anchor node is
addmm. The bias handling at args[0] for addmm was already correct.
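
A minimal sketch of the index selection, assuming a small helper inside the matcher; the helper name and the usage lines are illustrative assumptions, not the exact code in this PR:

```python
from executorch.exir.dialects._ops import ops as exir_ops


def _linear_arg_indices(anchor_node):
    """Return (input_idx, weight_idx) for the anchor op of a quantized linear match.

    Hypothetical helper for illustration only.
    """
    if anchor_node.target == exir_ops.edge.aten.addmm.default:
        # addmm(bias, input, weight): bias stays at args[0] (already handled),
        # input is at args[1], weight at args[2].
        return 1, 2
    # mm(input, weight) and linear(input, weight, bias?) keep the original layout.
    return 0, 1


# Usage inside the matcher (illustrative):
# input_idx, weight_idx = _linear_arg_indices(anchor_node)
# fp_input_node = anchor_node.args[input_idx]
# weight_node = anchor_node.args[weight_idx]
```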

Authored-by: Claude

Differential Revision: [D93768640](https://our.internmc.facebook.com/intern/diff/D93768640/)

pytorch-bot bot commented Feb 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17567

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 162 Pending

As of commit 42103c0 with merge base 9a58ce8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

ssjia added 5 commits February 20, 2026 15:58

Labels

CLA Signed, fb-exported, meta-exported
