http_server: spawn internal server in threaded mode#11811
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds POSIX mutexes and a new flb_hs_buf_acquire API to make HTTP server cached buffers and health metrics thread-safe; refactors ingestion to use locked buffer set/acquire/release, updates endpoints to rely on acquisition instead of manual refcounts, and adds an integration test exercising concurrent requests. ChangesHTTP Server Thread-Safety Implementation
Sequence DiagramsequenceDiagram
participant Client as HTTP Client
participant Endpoint as Metrics Endpoint
participant Acquire as flb_hs_buf_acquire
participant BufLock as buffer mutex
participant Buffer as flb_hs_buf
participant Release as flb_hs_buf_release
Client->>Endpoint: GET /api/v1/metrics
Endpoint->>Acquire: call flb_hs_buf_acquire(buf, require_data, require_raw_data)
Acquire->>BufLock: pthread_mutex_lock(buffer->lock)
BufLock->>Buffer: validate data/raw_data and increment users
BufLock->>Acquire: pthread_mutex_unlock(buffer->lock)
Acquire-->>Endpoint: return buffer or NULL
Endpoint->>Endpoint: encode response using buffer->data / raw_data
Endpoint->>Release: flb_hs_buf_release(buffer, NULL)
Release->>BufLock: pthread_mutex_lock(buffer->lock)
BufLock->>Buffer: decrement users and perform deferred cleanup if needed
BufLock->>Release: pthread_mutex_unlock(buffer->lock)
Endpoint-->>Client: 200 OK with payload
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 11dc0c62b9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (buffer->users > 0) { | ||
| buffer->pending_free = FLB_TRUE; | ||
| pthread_mutex_unlock(&buffer->lock); | ||
| return -1; |
There was a problem hiding this comment.
Preserve metrics snapshots during concurrent scrapes
With the monitoring server now running in its own thread, this branch is hit whenever a metrics/storage push overlaps a request that has acquired the current buffer. The new snapshot is rejected and freed, while pending_free causes the reader's later release to clean the old buffer and leave data/raw_data NULL, so subsequent /api/v1/metrics, /api/v2/metrics, or storage requests return 404 until a later push wins the race. Instead, keep serving the old snapshot while queueing or swapping in the new one after active users drain.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/integration/scenarios/internal_http_server/config/internal_http_server_exec_deadlock.yaml (1)
15-17: ⚡ Quick winAdd a request timeout to the
execcurl command.Line 15 uses
curl -swithout connect/overall time limits; if the endpoint stalls, this scenario can block longer than necessary and make the test flaky.Suggested patch
- command: curl -s http://127.0.0.1:${FLUENT_BIT_HTTP_MONITORING_PORT}/api/v1/metrics/prometheus + command: curl -s --connect-timeout 1 --max-time 2 http://127.0.0.1:${FLUENT_BIT_HTTP_MONITORING_PORT}/api/v1/metrics/prometheus🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/scenarios/internal_http_server/config/internal_http_server_exec_deadlock.yaml` around lines 15 - 17, The curl command used in the exec step (the string beginning with "curl -s http://127.0.0.1:${FLUENT_BIT_HTTP_MONITORING_PORT}/api/v1/metrics/prometheus") has no timeouts; update it to include connection and overall time limits (e.g., add --connect-timeout and --max-time or short flags -m) so the exec won't hang indefinitely if the endpoint stalls, ensuring the test scenario fails fast instead of blocking.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py`:
- Around line 65-75: The lambda passed to service.service.wait_for_condition
closes over loop variables path and pattern causing late-binding (Ruff B023);
fix by binding them as default parameters in the lambda signature so each
iteration captures the current values (i.e., change the callback to use lambda
path=path, pattern=pattern: ...), keeping the same inner logic that calls
service.request(path) and checks response["status_code"] and pattern in
response["body"].
---
Nitpick comments:
In
`@tests/integration/scenarios/internal_http_server/config/internal_http_server_exec_deadlock.yaml`:
- Around line 15-17: The curl command used in the exec step (the string
beginning with "curl -s
http://127.0.0.1:${FLUENT_BIT_HTTP_MONITORING_PORT}/api/v1/metrics/prometheus")
has no timeouts; update it to include connection and overall time limits (e.g.,
add --connect-timeout and --max-time or short flags -m) so the exec won't hang
indefinitely if the endpoint stalls, ensuring the test scenario fails fast
instead of blocking.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4bf0a9f2-f6ff-49e6-a5a6-aa531bce241c
📒 Files selected for processing (9)
include/fluent-bit/http_server/flb_hs.hsrc/http_server/api/v1/health.csrc/http_server/api/v1/metrics.csrc/http_server/api/v1/storage.csrc/http_server/api/v2/metrics.csrc/http_server/flb_hs.csrc/http_server/flb_http_server_http1.ctests/integration/scenarios/internal_http_server/config/internal_http_server_exec_deadlock.yamltests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py
| for path, pattern in endpoints: | ||
| result = service.service.wait_for_condition( | ||
| lambda: ( | ||
| response | ||
| if response["status_code"] == 200 and pattern in response["body"] | ||
| else None | ||
| ) if (response := service.request(path)) else None, | ||
| timeout=10, | ||
| interval=0.5, | ||
| description=f"internal endpoint {path}", | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Check if file exists and read the relevant section
file_path="tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py"
if [ -f "$file_path" ]; then
echo "=== File contents (lines 60-80) ==="
sed -n '60,80p' "$file_path" | cat -n
else
echo "File not found: $file_path"
# Try to find similar files
find . -name "*test_internal_http_server*" -type f 2>/dev/null | head -10
fiRepository: fluent/fluent-bit
Length of output: 1040
🏁 Script executed:
# Also check the repository structure around this file
ls -la tests/integration/scenarios/internal_http_server/tests/ 2>/dev/null || echo "Directory not found, listing integration tests:"
find tests/integration -name "*.py" -type f 2>/dev/null | head -20Repository: fluent/fluent-bit
Length of output: 1782
🏁 Script executed:
# Run Ruff linting on the file to check for B023 warnings
python_file="tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py"
echo "=== Running Ruff on the file ==="
python -m pip install ruff -q 2>/dev/null
python -m ruff check "$python_file" --select B023 2>&1 || trueRepository: fluent/fluent-bit
Length of output: 1447
🏁 Script executed:
# Get the full context around the lambda to understand the code better
echo "=== Full file content ==="
wc -l tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.pyRepository: fluent/fluent-bit
Length of output: 192
🏁 Script executed:
# Read the full file
cat -n tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.pyRepository: fluent/fluent-bit
Length of output: 3715
Bind loop variables in the lambda to avoid late-binding issues flagged by Ruff B023.
Lines 69 and 71 reference path and pattern from the loop without binding them in the lambda signature. Add them as default arguments to ensure each callback iteration captures the correct loop values.
Suggested fix
for path, pattern in endpoints:
result = service.service.wait_for_condition(
- lambda: (
+ lambda path=path, pattern=pattern: (
response
if response["status_code"] == 200 and pattern in response["body"]
else None
) if (response := service.request(path)) else None,🧰 Tools
🪛 Ruff (0.15.12)
[warning] 69-69: Function definition does not bind loop variable pattern
(B023)
[warning] 71-71: Function definition does not bind loop variable path
(B023)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py`
around lines 65 - 75, The lambda passed to service.service.wait_for_condition
closes over loop variables path and pattern causing late-binding (Ruff B023);
fix by binding them as default parameters in the lambda signature so each
iteration captures the current values (i.e., change the callback to use lambda
path=path, pattern=pattern: ...), keeping the same inner logic that calls
service.request(path) and checks response["status_code"] and pattern in
response["body"].
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
11dc0c6 to
7e48b19
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py (1)
65-75:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winBind loop variables in the lambda to avoid late-binding capture.
Line 67 closes over
path/patternfrom theforloop; this is the Ruff B023 warning and can mis-bind values if callback execution is deferred. Bind both as default args.Proposed fix
for path, pattern in endpoints: result = service.service.wait_for_condition( - lambda: ( + lambda path=path, pattern=pattern: ( response if response["status_code"] == 200 and pattern in response["body"] else None ) if (response := service.request(path)) else None,#!/bin/bash # Verify Ruff B023 on this file python -m pip install -q ruff python -m ruff check tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py --select B023🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py` around lines 65 - 75, The lambda passed to service.service.wait_for_condition closes over the loop variables path and pattern causing late-binding (Ruff B023); fix by binding them as default arguments in the lambda (e.g., lambda path=path, pattern=pattern: ...) so each callback captures the current loop values, keeping the existing condition logic that calls service.request(path) and checks response["status_code"] and pattern in response["body"] intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In
`@tests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py`:
- Around line 65-75: The lambda passed to service.service.wait_for_condition
closes over the loop variables path and pattern causing late-binding (Ruff
B023); fix by binding them as default arguments in the lambda (e.g., lambda
path=path, pattern=pattern: ...) so each callback captures the current loop
values, keeping the existing condition logic that calls service.request(path)
and checks response["status_code"] and pattern in response["body"] intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2bd8f604-663b-476e-89a5-af79256ddeaf
📒 Files selected for processing (9)
include/fluent-bit/http_server/flb_hs.hsrc/http_server/api/v1/health.csrc/http_server/api/v1/metrics.csrc/http_server/api/v1/storage.csrc/http_server/api/v2/metrics.csrc/http_server/flb_hs.csrc/http_server/flb_http_server_http1.ctests/integration/scenarios/internal_http_server/config/internal_http_server_exec_deadlock.yamltests/integration/scenarios/internal_http_server/tests/test_internal_http_server_exec_deadlock_001.py
🚧 Files skipped from review as they are similar to previous changes (8)
- src/http_server/flb_http_server_http1.c
- include/fluent-bit/http_server/flb_hs.h
- tests/integration/scenarios/internal_http_server/config/internal_http_server_exec_deadlock.yaml
- src/http_server/api/v1/metrics.c
- src/http_server/api/v2/metrics.c
- src/http_server/api/v1/storage.c
- src/http_server/api/v1/health.c
- src/http_server/flb_hs.c
Signed-off-by: Eduardo Silva <eduardo@chronosphere.io>
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
Bug Fixes
Improvements
Tests