[client] Fix unbounded growth of stale partition entries in writer metadata cache#3367
Open
lilei1128 wants to merge 1 commit into
close #3362
Purpose
Linked issue: close #3362
When partitions are continuously created and dropped, physical table path
entries in RecordAccumulator.writeBatches and Cluster.partitionsIdByPath
were never removed, causing unbounded memory growth and CPU waste from
iterating over stale entries in ready() and drain() hot loops.
Brief change log
Fix 1 (RecordAccumulator): introduce markPathsAsStale() and
removeStalePathIfEmpty() with a dedicated staleLock so that the
check-then-remove is atomic with respect to concurrent append() calls.
New appends to stale paths are rejected immediately.
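For reviewers, the locking idea in Fix 1 can be sketched as below. This is a minimal illustration, not the code in this PR: the class name StaleAwareAccumulatorSketch, the use of String for both paths and batches, and the poll() and containsPath() helpers are assumptions made to keep the example self-contained. Only the names writeBatches, staleLock, markPathsAsStale() and removeStalePathIfEmpty() come from the description above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for RecordAccumulator.
// Paths and batches are plain Strings purely for illustration.
class StaleAwareAccumulatorSketch {
    private final Map<String, Deque<String>> writeBatches = new ConcurrentHashMap<>();
    private final Set<String> stalePaths = new HashSet<>();
    // One lock guards the stale set and every deque mutation, so the
    // "is it empty? then remove" check cannot interleave with append().
    private final Object staleLock = new Object();

    /** Returns false (append rejected) when the path has been marked stale. */
    boolean append(String path, String batch) {
        synchronized (staleLock) {
            if (stalePaths.contains(path)) {
                return false;
            }
            writeBatches.computeIfAbsent(path, p -> new ArrayDeque<>()).addLast(batch);
            return true;
        }
    }

    void markPathsAsStale(Set<String> paths) {
        synchronized (staleLock) {
            stalePaths.addAll(paths);
        }
    }

    /** Removes the path only if it is stale and has no pending batches. */
    boolean removeStalePathIfEmpty(String path) {
        synchronized (staleLock) {
            if (!stalePaths.contains(path)) {
                return false;
            }
            Deque<String> deque = writeBatches.get(path);
            if (deque != null && !deque.isEmpty()) {
                return false; // pending batches must be drained first
            }
            writeBatches.remove(path);
            stalePaths.remove(path); // assumption: forget the path once it is gone
            return true;
        }
    }

    /** Hypothetical drain helper: takes the oldest batch, or null if none. */
    String poll(String path) {
        synchronized (staleLock) {
            Deque<String> deque = writeBatches.get(path);
            return deque == null ? null : deque.pollFirst();
        }
    }

    boolean containsPath(String path) {
        return writeBatches.containsKey(path);
    }
}
```

In this sketch the path is dropped from the stale set once it is removed, so the stale set itself stays bounded; whether the PR does the same is not stated in the description.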
Fix 2 (MetadataUtils): during partial metadata updates, remove stale
partition entries for any table whose partition list was refreshed, so
that dropped partitions no longer linger in partitionsIdByPath.
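A rough sketch of the merge rule in Fix 2, under simplifying assumptions: PartialMetadataMergeSketch, merge(), tableOf() and the "table/partition" String path format are all hypothetical and only stand in for the real path and cluster types. The point it illustrates is that entries of refreshed tables are replaced wholesale, while tables not covered by the partial update keep their entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of a partial metadata merge.
// Assumes a path is encoded as "table/partition".
class PartialMetadataMergeSketch {

    static Map<String, Long> merge(
            Map<String, Long> currentPartitionsIdByPath,
            Set<String> refreshedTables,
            Map<String, Long> freshPartitionsIdByPath) {
        Map<String, Long> merged = new HashMap<>();
        for (Map.Entry<String, Long> entry : currentPartitionsIdByPath.entrySet()) {
            // Keep old entries only for tables the update did not refresh.
            if (!refreshedTables.contains(tableOf(entry.getKey()))) {
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        // For refreshed tables, the fresh partition list is the whole truth,
        // so dropped partitions simply do not reappear.
        merged.putAll(freshPartitionsIdByPath);
        return merged;
    }

    static String tableOf(String path) {
        int slash = path.indexOf('/');
        return slash < 0 ? path : path.substring(0, slash);
    }
}
```

For example, if t1 previously had partitions p1 and p2 and the refreshed list for t1 contains only p2 and p3, the merged map no longer holds t1/p1, while entries of an unrefreshed table t2 are untouched.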
Fix 3 (Sender): call cleanupStaleWriteBatches() on every sendWriteData()
cycle to mark paths absent from the cluster as stale and remove them once
their deques are fully drained.
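The per-cycle cleanup in Fix 3 could look roughly like the following. It is a single-threaded sketch that ignores the locking from Fix 1: SenderCleanupSketch, its fields, and the String path and batch types are assumptions for illustration, and only the method name cleanupStaleWriteBatches() is taken from the description.

```java
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

// Hypothetical, single-threaded sketch of the sender-side cleanup pass.
class SenderCleanupSketch {
    final Map<String, Deque<String>> writeBatches = new HashMap<>();
    final Set<String> stalePaths = new HashSet<>();

    /** Intended to run once per send cycle with the paths the cluster currently knows. */
    void cleanupStaleWriteBatches(Set<String> clusterPaths) {
        // Step 1: any path with batches state that the cluster no longer
        // contains is marked stale.
        for (String path : writeBatches.keySet()) {
            if (!clusterPaths.contains(path)) {
                stalePaths.add(path);
            }
        }
        // Step 2: stale paths are removed only once their deque is empty,
        // so batches that are still pending are not lost.
        Iterator<String> it = stalePaths.iterator();
        while (it.hasNext()) {
            String path = it.next();
            Deque<String> deque = writeBatches.get(path);
            if (deque == null || deque.isEmpty()) {
                writeBatches.remove(path);
                it.remove();
            }
        }
    }
}
```

A stale path with pending batches therefore survives the first pass and disappears on a later pass, after its deque has been drained.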
Tests
Verifies that append() is rejected after markPathsAsStale(), and that
removeStalePathIfEmpty() succeeds once the deque is drained.
Verifies that removeStalePathIfEmpty() returns false and preserves the
path when pending batches remain.
Verifies end-to-end that runOnce() removes a stale path from
writeBatches after the cluster no longer contains it and its deque
has been drained.
API and Format
Documentation