Fix aoco_relation_size() using wrong snapshot to read pg_aocsseg #1668

Merged
yjhjstz merged 1 commit into apache:main from yjhjstz:fix_aocs_snapshot
Apr 9, 2026

Conversation

@yjhjstz (Member) commented Apr 8, 2026

The inconsistency between relpages=0 (read with the wrong snapshot) and num_tuples=56 (read with the correct catalog snapshot) trips an assertion in vac_update_relstats(). That assertion assumes an AO table with zero pages but non-zero tuples can only occur in utility mode. Because the QE runs in GP_ROLE_EXECUTE, the check Gp_role == GP_ROLE_UTILITY fails and the segment process aborts.
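
The invariant behind that assertion can be modelled in isolation. This is a minimal sketch under stated assumptions, not the code in vacuum.c: the `Role` enum, the function `relstats_combination_allowed`, and its parameters are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the Gp_role values named in the description. */
typedef enum { ROLE_UTILITY, ROLE_EXECUTE } Role;

/*
 * Models the assumption described above: an AO table that reports zero
 * pages together with a non-zero tuple count is tolerated only in
 * utility mode. Every other combination is accepted.
 */
static bool
relstats_combination_allowed(Role role, long relpages, double num_tuples)
{
    if (relpages == 0 && num_tuples > 0)
        return role == ROLE_UTILITY;
    return true;
}
```

With the values from this report, `relstats_combination_allowed(ROLE_EXECUTE, 0, 56)` is false, which corresponds to the failing assertion on the QE. Once relpages is read consistently and is non-zero, the check is not reached.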

Fix by passing NULL to GetAllAOCSFileSegInfo() so that systable_beginscan() uses GetCatalogSnapshot() internally, consistent with appendonly_relation_size() for AO row tables.

@my-ship-it (Contributor) left a comment

LGTM, thanks

aoco_relation_size() used GetLatestSnapshot() to read pg_aocsseg
catalog metadata. During ALTER TABLE SET DISTRIBUTED BY on AOCO
tables, the reader gang's GetLatestSnapshot() cannot see pg_aocsseg
rows written by the writer gang within the same distributed
transaction (uncommitted local xid), causing the function to return
0 bytes. This led to relpages=0 being passed to vac_update_relstats()
alongside a non-zero totalrows from sampling (which correctly uses
GetCatalogSnapshot()), triggering an assertion failure:

  FailedAssertion: "Gp_role == GP_ROLE_UTILITY", vacuum.c:1738

Fix by passing NULL to GetAllAOCSFileSegInfo() so that
systable_beginscan() uses GetCatalogSnapshot() internally, consistent
with appendonly_relation_size() for AO row tables.
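
The effect of passing NULL can be illustrated with a toy model. The sketch below is not Cloudberry code and does not reproduce real MVCC visibility rules: `Snapshot`, `SegRow`, `latest_snapshot`, `catalog_snapshot`, and `total_visible_bytes` are all hypothetical, and the single visibility flag is an assumption that stands in for the behaviour described in the commit message.

```c
#include <stddef.h>

/* Hypothetical snapshot: one flag saying whether rows written earlier in
 * the same distributed transaction (not yet committed locally) are visible. */
typedef struct { int sees_writer_rows; } Snapshot;

/* Hypothetical pg_aocsseg-like row: a segment file size and commit state. */
typedef struct { long eof_bytes; int committed; } SegRow;

/* Assumption: models the reader gang's latest snapshot in the scenario above. */
static const Snapshot latest_snapshot = { 0 };
/* Assumption: models the catalog snapshot, which sees the writer's rows. */
static const Snapshot catalog_snapshot = { 1 };

/*
 * Sums the sizes of all rows visible to the given snapshot. A NULL
 * snapshot falls back to the catalog snapshot, mirroring the fallback
 * the fix relies on.
 */
static long
total_visible_bytes(const SegRow *rows, size_t n, const Snapshot *snap)
{
    const Snapshot *use = snap ? snap : &catalog_snapshot;
    long total = 0;

    for (size_t i = 0; i < n; i++)
        if (rows[i].committed || use->sees_writer_rows)
            total += rows[i].eof_bytes;
    return total;
}
```

In this model, two uncommitted rows of 4096 and 8192 bytes sum to 0 when scanned with `&latest_snapshot` and to 12288 when NULL is passed, which is the same shape as the relpages=0 symptom and its fix.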

@yjhjstz force-pushed the fix_aocs_snapshot branch from 327560b to 513442e on April 9, 2026 13:02
@yjhjstz merged commit 32373aa into apache:main on Apr 9, 2026
55 of 56 checks passed
3 participants