Skip to content

[None][bugfix] Fix executor test response timeout#15502

Open
fallintoplace wants to merge 1 commit into
NVIDIA:mainfrom
fallintoplace:fix-executor-utils-timeout
Open

[None][bugfix] Fix executor test response timeout#15502
fallintoplace wants to merge 1 commit into
NVIDIA:mainfrom
fallintoplace:fix-executor-utils-timeout

Conversation

@fallintoplace

@fallintoplace fallintoplace commented Jun 19, 2026

Copy link
Copy Markdown

What changed

runThroughRequests() now enforces its timeout while reading executor responses. It passes the remaining deadline into awaitResponses() and throws if no responses arrive before all requests finish.

Why

The previous helper called wait_for(timeout) on an async reader but ignored the status, then called get() unconditionally. If responses never completed, the helper could block forever and wedge CI.

This also removes the boxed std::exception path, so executor exceptions propagate directly with their original type and context.

Validation

  • git diff --check -- cpp/tests/utils/executorUtils.cpp
  • Checked line lengths stay within 120 columns
  • c++ -std=c++17 -fsyntax-only -Icpp -Icpp/include cpp/tests/utils/executorUtils.cpp was attempted, but this local environment is missing CUDA headers (cuda_fp16.h) before compilation reaches this file.

Summary by CodeRabbit

  • Refactor
    • Improved timeout handling and request processing efficiency in test utilities with enhanced error handling and deadline-based completion detection.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

cpp/tests/utils/executorUtils.cpp gains an SPDX/Apache-2.0 license header and a <chrono> include. The runThroughRequests implementation replaces a background std::async task (which called awaitResponses() then wait_for(timeout)) with a synchronous steady_clock deadline loop that passes remaining time directly to awaitResponses and throws on timeout or empty response batches.

Changes

runThroughRequests synchronous deadline loop

Layer / File(s) Summary
License header and chrono include
cpp/tests/utils/executorUtils.cpp
Adds SPDX/Apache-2.0 copyright block and introduces <chrono> to support the new deadline computation.
Synchronous deadline-driven response loop
cpp/tests/utils/executorUtils.cpp
Replaces std::async/future pattern with a steady_clock deadline computed from timeout, a per-iteration remainingTime passed to executor.awaitResponses, and explicit throw on deadline expiration or empty response sets; retains per-response error logging, accumulation, and final-response completion counting.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main fix: enforcing timeout in executor test response handling.
Description check ✅ Passed The description clearly explains the issue, solution, and validation performed. However, it does not include a dedicated Test Coverage section as required by the template.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
cpp/tests/utils/executorUtils.cpp (1)

40-47: ⚡ Quick win

Extract timeout message literal into a named constant.

Lines 40 and 47 repeat the same string literal; please promote it to a k... constant and reuse it.

Proposed refactor
 std::unordered_map<tensorrt_llm::batch_manager::RequestIdType, std::vector<tensorrt_llm::executor::Response>>
 tensorrt_llm::testing::runThroughRequests(executor::Executor& executor, std::vector<executor::Request> const& requests,
     std::chrono::duration<float, std::milli> timeout)
 {
+    constexpr char const* kTimeoutMsg = "Timed out waiting for executor responses";
     std::unordered_map<batch_manager::RequestIdType, std::vector<executor::Response>> accumulatedResponses;
@@
         if (now >= deadline)
         {
-            TLLM_THROW("Timed out waiting for executor responses");
+            TLLM_THROW(kTimeoutMsg);
         }
@@
         if (responses.empty())
         {
-            TLLM_THROW("Timed out waiting for executor responses");
+            TLLM_THROW(kTimeoutMsg);
         }

As per coding guidelines, "All literals except 0, nullptr, true, false should only be used for variable initialization; other uses should extract the literal into a named constant."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/tests/utils/executorUtils.cpp` around lines 40 - 47, Extract the
duplicate string literal "Timed out waiting for executor responses" that appears
in both TLLM_THROW calls (at lines 40 and 47) into a named constant following
the coding convention of using a 'k' prefix (e.g., kTimeoutMessage). Define this
constant at an appropriate scope (such as at the top of the function or file
level), and then replace both occurrences of the string literal with references
to this constant to eliminate the duplication.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/tests/utils/executorUtils.cpp`:
- Around line 33-35: The `remainingRequests` variable is unsigned and decrements
on every final response without tracking which requests have already been
processed. If a request emits duplicate final responses, the unsigned
`remainingRequests` will underflow and wrap to a very large number, causing the
while loop to continue until the deadline. Fix this by implementing
deduplication logic that tracks which request IDs have already had their final
response processed using a data structure like a set, and only decrement
`remainingRequests` once per unique request ID. Update the decrement logic
around line 63 to check if the request ID has been seen before and only proceed
with the decrement if it's a new request.

---

Nitpick comments:
In `@cpp/tests/utils/executorUtils.cpp`:
- Around line 40-47: Extract the duplicate string literal "Timed out waiting for
executor responses" that appears in both TLLM_THROW calls (at lines 40 and 47)
into a named constant following the coding convention of using a 'k' prefix
(e.g., kTimeoutMessage). Define this constant at an appropriate scope (such as
at the top of the function or file level), and then replace both occurrences of
the string literal with references to this constant to eliminate the
duplication.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 57ee1276-dd97-45b0-9a48-1430242d1f08

📥 Commits

Reviewing files that changed from the base of the PR and between a76c818 and de1561a.

📒 Files selected for processing (1)
  • cpp/tests/utils/executorUtils.cpp

Comment thread cpp/tests/utils/executorUtils.cpp Outdated
Comment on lines +33 to +35
auto remainingRequests = requests.size();
auto const deadline = std::chrono::steady_clock::now() + timeout;
while (remainingRequests > 0)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard remainingRequests against duplicate final responses.

Line 63 decrements on every final response without deduplicating by request ID. If a request emits duplicate final responses, remainingRequests (unsigned) can underflow and the loop can run until deadline with a false timeout.

Proposed fix
 `#include` <chrono>
+#include <unordered_set>
@@
     auto remainingRequests = requests.size();
+    std::unordered_set<batch_manager::RequestIdType> completedRequestIds;
@@
-            auto const isFinal = response.hasError() || response.getResult().isFinal;
+            auto const isFinal = response.getResult().isFinal;
             accumulatedResponses[requestId].emplace_back(response);
             if (isFinal)
             {
-                TLLM_LOG_DEBUG("Final response received for request: %lu", requestId);
-                --remainingRequests;
+                if (completedRequestIds.emplace(requestId).second)
+                {
+                    TLLM_LOG_DEBUG("Final response received for request: %lu", requestId);
+                    --remainingRequests;
+                }
             }

Also applies to: 50-64

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/tests/utils/executorUtils.cpp` around lines 33 - 35, The
`remainingRequests` variable is unsigned and decrements on every final response
without tracking which requests have already been processed. If a request emits
duplicate final responses, the unsigned `remainingRequests` will underflow and
wrap to a very large number, causing the while loop to continue until the
deadline. Fix this by implementing deduplication logic that tracks which request
IDs have already had their final response processed using a data structure like
a set, and only decrement `remainingRequests` once per unique request ID. Update
the decrement logic around line 63 to check if the request ID has been seen
before and only proceed with the decrement if it's a new request.

Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>
@fallintoplace fallintoplace force-pushed the fix-executor-utils-timeout branch from de1561a to 4c99d05 Compare June 19, 2026 18:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant