
fix: Enable asm requantized max pooling with differing qinfo #1285

Open
morgolock wants to merge 1 commit into main from pr/enable_max_pool_asm_diff_qinfo

@morgolock
Contributor

Restrict differing src/dst quantization-info support in CpuPool2dAssemblyWrapperKernel to MAX pooling, while keeping AVG pooling on the generic fallback path.

Fix the quantized multiplier validation check and wire the requantization shifts correctly for the asm pooling path so requantized MAX pooling validates and executes correctly.

Add NEON validate coverage for padded NHWC QASYMM8 MAX with differing quantization info.

Resolves MLCE-1821

Change-Id: I33c99d0d4ea1bf57ed28d0750422403e60e2a276

@morgolock morgolock requested a review from gunes-arm April 27, 2026 22:05
Contributor

@gunes-arm gunes-arm left a comment

fix: enable asm requantized max pooling with differing qinfo
-->
fix: Enable asm requantized max pooling with differing qinfo

Comment thread src/cpu/kernels/internal/CpuPool2dAssemblyWrapperKernel.cpp
Contributor

We now seem to accept qasymm8_signed; I think that wasn't the case before.

Contributor Author

@morgolock morgolock Apr 28, 2026

We have asm kernels for this use-case when qinfos are the same. So this is correct, I'll add new tests for this in the next patchset.

See a64_s8_nhwc_max_generic_depthfirst

I'll tighten this to reject QASYMM8_SIGNED AVG pooling when padding is enabled.

Comment thread tests/validation/NEON/PoolingLayer.cpp Outdated
TensorInfo input_info_configured = input_info;
TensorInfo output_info_configured = output_info;

if(input_info_configured.tensor_shape() == TensorShape(3U, 15U, 11U, 1U))
Contributor

This looks a bit hacky; I would prefer a simple TEST_CASE for this.

Contributor Author

Sure, will be reworked in next patchset.

Comment thread tests/validation/NEON/PoolingLayer.cpp
Restrict differing src/dst quantization-info support in CpuPool2dAssemblyWrapperKernel to MAX pooling, while keeping AVG pooling on the generic fallback path.

Fix the quantized multiplier validation check and wire the requantization shifts correctly for the asm pooling path so requantized MAX pooling validates and executes correctly.

Reject padded QASYMM8_SIGNED AVG pooling in the asm wrapper for same-qinfo configurations, matching the existing QASYMM8 policy.

Add coverage for QASYMM8_SIGNED padded MAX pooling on the asm path for same qinfo and zero-offset differing qinfo.
Add validate coverage for padded NHWC QASYMM8 MAX with differing quantization info.

Resolves MLCE-1821

Signed-off-by: Pablo Marquez Tello <pablo.tello@arm.com>
Change-Id: I33c99d0d4ea1bf57ed28d0750422403e60e2a276
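
For context on the "quantized multiplier" and "requantization shifts" mentioned above, here is a minimal sketch of one common way to express a real rescale factor (for example `src_scale / dst_scale`) as a 32-bit fixed-point mantissa plus a power-of-two exponent. `FixedPointMultiplier` and `decompose_multiplier` are hypothetical names for illustration, not the library's functions, and the sign convention for the shift here is an assumption that may differ from the one the asm kernels expect.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical result type: multiplier ~= mantissa * 2^(exponent - 31).
struct FixedPointMultiplier
{
    int32_t mantissa;
    int     exponent;
    bool    valid;
};

// Split a positive, finite real multiplier into a Q0.31 mantissa and an exponent.
inline FixedPointMultiplier decompose_multiplier(double multiplier)
{
    FixedPointMultiplier out{0, 0, false};
    // Reject zero, negative, NaN and infinite multipliers up front.
    if (!(multiplier > 0.0) || !std::isfinite(multiplier))
    {
        return out;
    }
    int          exponent = 0;
    const double fraction = std::frexp(multiplier, &exponent); // fraction in [0.5, 1)
    int64_t      q        = std::llround(fraction * static_cast<double>(1LL << 31));
    // Rounding can reach exactly 2^31; renormalise so the mantissa fits in int32_t.
    if (q == (1LL << 31))
    {
        q /= 2;
        ++exponent;
    }
    out.mantissa = static_cast<int32_t>(q);
    out.exponent = exponent;
    out.valid    = true;
    return out;
}
```

A validation step along these lines would check `valid` before selecting a requantizing kernel and fall back to the generic path otherwise.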
@morgolock morgolock force-pushed the pr/enable_max_pool_asm_diff_qinfo branch from e640361 to 447d3ad Compare April 28, 2026 12:00
@morgolock morgolock requested a review from gunes-arm April 28, 2026 12:03
@gunes-arm gunes-arm changed the title fix: enable asm requantized max pooling with differing qinfo fix: Enable asm requantized max pooling with differing qinfo Apr 29, 2026
input_info.set_quantization_info(QuantizationInfo(0.25f, 11));
output_info.set_quantization_info(QuantizationInfo(0.5f, 7));

const bool is_valid = bool(NEPoolingLayer::validate(&input_info.clone()->set_is_resizable(false),
Contributor

I don't think you need to clone here. It was for constant objects as far as I recall.

}
TEST_SUITE_END() // QASYMM8
TEST_SUITE(QASYMM8_SIGNED)
FIXTURE_DATA_TEST_CASE(PaddedMaxSameQInfo,
Contributor

Can you add QASYMM8 tests, too? They use different underlying kernels. We probably need to set arbitrary quantization infos on src and dst, as they don't have to be zero-offset.

@gunes-arm gunes-arm self-requested a review April 29, 2026 10:32
Contributor

@gunes-arm gunes-arm left a comment

Accidental approval last time
