Add quantize fused convbn bias pass #17348
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17348
Note: Links to docs will display an error until the docs builds have been completed.
❌ 8 New Failures as of commit 6b825e9 with merge base 6c1dc31.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@JakeStevens has exported this pull request. If you are a Meta employee, you can view the originating Diff in D92733079.
This PR needs a `release notes:` label.
Force-pushed from b935be8 to 91bbac7 (Compare)
Summary: When performing QAT with a model that has a conv layer with no bias followed by batch norm, the fusion process creates a bias. This is done *after* observers are attached, so the resulting bias is kept as float. This diff adds a pass which grabs the proper qparams and applies them to the non-quantized bias. Differential Revision: D92733079
Force-pushed from 91bbac7 to 6b825e9 (Compare)
StrycekSimon left a comment
I tried running it with our conversion pipeline, but not successfully. It seems like the bias is being added as another input of the model. Can you take a look at it? Or is there some postprocessing step I am missing?
    return node.target in (
        exir_ops.edge.aten.convolution.default,
        torch.ops.aten.convolution.default,
        torch.ops.aten.conv2d.default,
Can you also add transposed convs here? Or is there a reason for omitting them?
    )
    pass_instance = QuantizeFusedConvBnBiasPass(exported_program)
    result = pass_instance.call(exported_program.graph_module)
Is there some postprocessing needed after this line of code? I tried plugging it into our pipeline right after the calibrate_and_quantize call (same as here) and it fails with "TypeError: missing a required argument: 'b__scale_0'" when trying to call export(result.graph_module, ...), which we do here.
Can you put up a draft PR to show how you tried plugging it in? Then I can iterate on that and make sure it works appropriately.
Summary:
When performing QAT with a conv layer (bias=False) followed by batch norm, the fusion process introduces a bias after observers are attached, so the bias remains unquantized. These passes find such biases, compute the correct scale from the input and weight dequantize nodes, and insert proper quantize/dequantize nodes for the bias.
Two pass variants are provided:
- QuantizeFusedConvBnBiasPass (ExportPass) — operates on edge dialect graphs after to_edge()
- QuantizeFusedConvBnBiasAtenPass (PassBase) — operates on aten dialect graphs, supporting both plain GraphModules (get_attr nodes) and ExportedPrograms (placeholder nodes)
Differential Revision: D92733079
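For reference, a minimal sketch of the bias-quantization math the pass applies. The helper name and the use of per-tensor int32 quantization with zero_point 0 are assumptions for illustration, not the exact implementation; the actual pass reads the scales from the existing input and weight dequantize nodes in the graph and inserts quantize/dequantize nodes rather than calling a helper.

    import torch

    def quantize_fused_bias(bias, input_scale, weight_scale):
        # Hypothetical helper. The bias of a quantized conv is conventionally
        # quantized with scale = input_scale * weight_scale and zero_point = 0,
        # so it can be added directly to the int32 accumulator.
        bias_scale = input_scale * weight_scale
        qmin, qmax = -(2 ** 31), 2 ** 31 - 1
        q_bias = torch.clamp(torch.round(bias / bias_scale), qmin, qmax).to(torch.int32)
        # The dequantized value is what the rest of the float graph consumes;
        # in the pass this round-trip becomes a quantize/dequantize node pair.
        return q_bias.to(torch.float32) * bias_scale

    # Example: bias produced by conv+bn fusion; scales would be taken from the
    # input and weight dequantize nodes feeding the convolution.
    bias = torch.randn(16)
    fq_bias = quantize_fused_bias(bias, input_scale=0.02, weight_scale=0.005)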
@StrycekSimon the NXP changes and test are now here: this diff is now a "standalone" pass, and the integration with your backend is in the above.
Summary:
When performing QAT with a model that has a conv layer with no bias followed by batch norm, the fusion process creates a bias. This is done after observers are attached, so the resulting bias is kept as float.
This diff adds a pass which grabs the proper qparams and applies them to the non-quantized bias.
Differential Revision: D92733079
cc @robert-kalmar @digantdesai