Fix pending HostDB DNS queue removal race #13294

Open
bneradt wants to merge 1 commit into apache:master from bneradt:fix-hostdb-pending-dns-race

@bneradt bneradt commented Jun 17, 2026

A crash was observed while HostDB was probing pending DNS state for a
request, with the stack unwinding through the DNS lookup path:

    #4 swoc::bwf::ExternalNames::operator()(...)
       at libswoc_1.5.15/include/swoc/bwf_base.h:610
    #7 HostDBContinuation::do_dns(...)
       at src/iocore/hostdb/HostDB.cc:1352
    #11 probe(...)
       at src/iocore/hostdb/HostDB.cc:581

The core showed an obviously corrupted `this` pointer for ExternalNames while
HostDB was allocating a continuation from the pending DNS path. The local
signal cleanup path could edit the pending-DNS queue without holding the
bucket lock, so stale revalidation readers could walk the queue links while
those links were being changed.

This routes that cleanup through the locked pending-DNS removal helper
and uses its result to decide whether the continuation still owns
self-cleanup. This keeps queue membership checks and removals
synchronized without changing the timeout path's lifetime behavior.
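
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the HostDB code: `PendingBucket`, `PendingEntry`, and `cleanup_owns_entry` are hypothetical names, and a `std::mutex` stands in for the bucket lock. The point it shows is that the membership check and the unlink happen under one lock, and the boolean result tells the caller whether it still owns its own cleanup.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>

// Hypothetical stand-in for a continuation waiting on a pending DNS lookup.
struct PendingEntry {
  int id;
};

// Hypothetical stand-in for one bucket of the pending-DNS queue.
// Every access to queue_ goes through mutex_.
class PendingBucket {
public:
  void enqueue(PendingEntry *entry) {
    assert(entry != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    queue_.push_back(entry);
  }

  // Locked removal helper: the membership check and the unlink happen
  // under the same lock. Returns true only if this call unlinked the entry.
  bool remove(PendingEntry *entry) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = std::find(queue_.begin(), queue_.end(), entry);
    if (it == queue_.end()) {
      return false; // already removed by some other path
    }
    queue_.erase(it);
    return true;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> lock(mutex_);
    return queue_.size();
  }

private:
  std::mutex mutex_;
  std::list<PendingEntry *> queue_;
};

// Cleanup path: the caller owns self-cleanup only if it was the one that
// removed the entry from the queue.
bool cleanup_owns_entry(PendingBucket &bucket, PendingEntry *entry) {
  return bucket.remove(entry);
}
```

In this sketch a second call to `cleanup_owns_entry` for the same entry returns false, so only one path ever treats itself as the owner of the cleanup.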

@bneradt bneradt added this to the 11.0.0 milestone Jun 17, 2026
Copilot AI review requested due to automatic review settings June 17, 2026 22:54
@bneradt bneradt self-assigned this Jun 17, 2026

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

@bneradt bneradt force-pushed the fix-hostdb-pending-dns-race branch from bd24b72 to 8de3716 Compare June 17, 2026 23:30
@bneradt bneradt requested a review from Copilot June 18, 2026 15:30

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.
