HBASE-30043: Truncate procedure with recovery enabled could process non-existent regions if run concurrently with merge procedure #8003

Open
dParikesit wants to merge 2 commits into apache:master from dParikesit:6-merge-truncate

@dParikesit
Contributor

JIRA: HBASE-30043

During truncate recovery, the procedure can release the region lock between states. If the region is merged away in that gap, the truncate procedure could still try to unassign or reassign it, or touch filesystem state, even though the region no longer exists.

The proposed fix adds an existence check before MAKE_OFFLINE, centralizes region-node lookup in a helper that throws UnknownRegionException when the region is gone, and makes rollback only reassign if the region still exists and is in an offline/closed state.
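As a rough illustration of the guard described above, here is a minimal, self-contained sketch. Every name in it (`TruncateGuardSketch`, `RegionRegistry`, `getOrThrow`, `shouldReassignOnRollback`, and the local `UnknownRegionException`) is a hypothetical stand-in written for this example; it is not the code in this PR and does not use the real HBase classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins only; none of these are the real HBase classes.
final class TruncateGuardSketch {
  enum RegionState { OPEN, CLOSED, OFFLINE }

  /** Local stand-in for an "unknown region" failure. */
  static final class UnknownRegionException extends Exception {
    UnknownRegionException(String regionName) {
      super("Region no longer exists: " + regionName);
    }
  }

  /** Stand-in for the master's in-memory view of regions. */
  static final class RegionRegistry {
    private final Map<String, RegionState> regions = new ConcurrentHashMap<>();

    void put(String name, RegionState state) {
      regions.put(name, state);
    }

    /** Models a merge removing the parent region. */
    void remove(String name) {
      regions.remove(name);
    }

    /** Central lookup: every caller fails the same way when the region is gone. */
    RegionState getOrThrow(String name) throws UnknownRegionException {
      RegionState state = regions.get(name);
      if (state == null) {
        throw new UnknownRegionException(name);
      }
      return state;
    }

    /** Rollback reassigns only if the region still exists and is offline or closed. */
    boolean shouldReassignOnRollback(String name) {
      RegionState state = regions.get(name);
      return state == RegionState.OFFLINE || state == RegionState.CLOSED;
    }
  }
}
```

In this sketch a region removed by a concurrent merge makes `getOrThrow` fail before any offline step runs, and `shouldReassignOnRollback` returns false, so rollback queues nothing for it.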

The tests verify two things: truncate fails cleanly instead of recreating state or scheduling child procedures, and rollback neither recreates the removed region nor queues assignment work.

@Apache9
Contributor

Apache9 commented Mar 30, 2026

Is it OK to just change holdLock to true for TruncateRegionProcedure? A region that is being truncated should not be able to be split or merged...
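For readers unfamiliar with the idea, the suggestion can be sketched roughly as follows. This is a simplified model, not HBase code: `HoldLockSketch`, `Proc`, `DefaultProc`, `TruncateProc`, and `lockFreeBetweenSteps` are hypothetical names, and the real `holdLock` hook takes an environment argument that is omitted here.

```java
// Simplified, hypothetical model of a procedure framework's lock handling.
final class HoldLockSketch {
  /** Stand-in base class: by default the lock is released between steps. */
  abstract static class Proc {
    boolean holdLock() {
      return false;
    }
  }

  /** A procedure that keeps the default behaviour. */
  static final class DefaultProc extends Proc {
  }

  /** A truncate-like procedure that keeps its lock across all steps. */
  static final class TruncateProc extends Proc {
    @Override
    boolean holdLock() {
      return true;
    }
  }

  /**
   * Returns true if another procedure (for example a merge) could take
   * the region lock in the gap between two steps of the given procedure.
   */
  static boolean lockFreeBetweenSteps(Proc p) {
    boolean locked = true; // acquired before the first step
    if (!p.holdLock()) {
      locked = false; // released after the step unless the procedure holds it
    }
    return !locked;
  }
}
```

Under this model, `lockFreeBetweenSteps(new TruncateProc())` is false, which is the property the suggestion relies on: a merge cannot slip in between truncate states.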

Contributor

@vaijosh left a comment

Thanks @dParikesit for fixing this. Code changes LGTM.

@dParikesit
Contributor Author

Thanks for the suggestions. I'll try the holdLock approach and get back to you once I've updated the PR.
