feat: detect server-side document rejections on a 200 ingest#23
Merged
Conversation
A Quickwit ingest 200 only means the documents were queued — its response body reports num_ingested_docs / num_rejected_docs, and a parse-rejected document (bad JSON or schema) is dropped while the client still saw 200. Today the client discards the body and Acks those documents as durable: silent loss (quickwit-oss/quickwit#5356, #3190). flush() now reads the response (bounded to 1 MiB) and inspects it: - Truncated/unreadable 200 -> retryable (ambiguous, don't settle). - num_rejected_docs > 0 with num_ingested_docs == 0 (whole batch rejected) -> settle tracked items with a new terminal ReasonRejected and discard fire-and-forget items. The verdict is exact because nothing was ingested. Thanks to idle-flush, a low-volume IngestSync document is flushed as its own one-doc batch, so the common pub/sub case gets a precise per-document verdict. - Partial rejection (some ingested, some rejected) can't be attributed to specific docs in a coalesced batch, so the batch is still Acked and the count is surfaced via a new OnReject(numRejected) callback + a warning log. - Absent/empty/unparseable body or missing fields (older servers) -> accepted, preserving the accept-on-200 behavior. Adds OnReject, a ReasonRejected DiscardReason, and an opt-in SetIngestDetailedResponse(true) that requests ?detailed_response=true so per-document parse-failure reasons are logged. Fire-and-forget batching, ordering, the failure taxonomy, and the no-double-settle invariant are unchanged (the all-rejected path settles each item once and drops the batch). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The highest-stakes reliability item from the review (#1): close the silent-loss hole where Quickwit returns 200 but dropped documents.
Problem
A Quickwit ingest 200 only means the docs were queued. The response body reports
num_ingested_docs/num_rejected_docs, and a parse-rejected document (bad JSON or schema) is dropped while the client still sees 200. Todayflush()doesio.Copy(io.Discard, resp.Body)and Acks those docs as durable — silent loss (quickwit-oss/quickwit#5356, #3190).Change
flush()now reads the response (bounded to 1 MiB) and inspects it:num_rejected_docs > 0andnum_ingested_docs == 0(whole batch rejected)ReasonRejected(terminal), discard fire-and-forget — exact, nothing was ingestednum_ingested_docs > 0andnum_rejected_docs > 0)OnReject(n)+ warning — can't attribute per-doc in a coalesced batchIdle-flush makes the common case precise. A low-volume
IngestSyncdocument is flushed as its own one-doc batch (from #20), so a rejected doc hits the exactnum_ingested_docs == 0path and the caller gets a preciseReasonRejectedverdict — not a guess.API added
ReasonRejected(DiscardReason) — terminal: dead-letter and Ack.OnReject(func(numRejected int))— the signal for partial rejections the client can't attribute.SetIngestDetailedResponse(true)— opt-in?detailed_response=trueso per-document parse-failure reasons are logged (off by default; adds server cost).Invariants
Fire-and-forget batching, ordering, the 4xx failure taxonomy, and the no-double-settle guarantee are unchanged — the all-rejected path settles each item exactly once and drops the batch (so it's never retried or double-settled). Lenient parsing means servers that don't return the rejection JSON behave exactly as before.
Honest limitation
Partial rejection within a coalesced batch can't be attributed to specific documents, so those docs are Acked and surfaced only via
OnReject/log. One-doc-per-IngestSync+ idle-flush avoids this for the typical pub/sub flow; high-volume coalesced batches degrade to count-only reporting.Tests
7 new (
ingest_reject_test.go): all-rejected →ReasonRejected+OnReject; no-rejection → nil; partial → nil +OnReject(count); unknown/missing-fields body → nil; truncated 200 (hijack-close) → retried to success;?detailed_response=truesent; fire-and-forget all-rejected →OnDiscard+ Close returns. Full suite green,go vetclean, race-clean across-count=2and the reject/truncate tests held over-count=5.🤖 Generated with Claude Code