[None][bugfix] Fix executor test response timeout#15502
Conversation
📝 WalkthroughWalkthrough
ChangesrunThroughRequests synchronous deadline loop
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cpp/tests/utils/executorUtils.cpp (1)
40-47: ⚡ Quick winExtract timeout message literal into a named constant.
Lines 40 and 47 repeat the same string literal; please promote it to a
k...constant and reuse it.Proposed refactor
std::unordered_map<tensorrt_llm::batch_manager::RequestIdType, std::vector<tensorrt_llm::executor::Response>> tensorrt_llm::testing::runThroughRequests(executor::Executor& executor, std::vector<executor::Request> const& requests, std::chrono::duration<float, std::milli> timeout) { + constexpr char const* kTimeoutMsg = "Timed out waiting for executor responses"; std::unordered_map<batch_manager::RequestIdType, std::vector<executor::Response>> accumulatedResponses; @@ if (now >= deadline) { - TLLM_THROW("Timed out waiting for executor responses"); + TLLM_THROW(kTimeoutMsg); } @@ if (responses.empty()) { - TLLM_THROW("Timed out waiting for executor responses"); + TLLM_THROW(kTimeoutMsg); }As per coding guidelines, "All literals except
0,nullptr,true,falseshould only be used for variable initialization; other uses should extract the literal into a named constant."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/tests/utils/executorUtils.cpp` around lines 40 - 47, Extract the duplicate string literal "Timed out waiting for executor responses" that appears in both TLLM_THROW calls (at lines 40 and 47) into a named constant following the coding convention of using a 'k' prefix (e.g., kTimeoutMessage). Define this constant at an appropriate scope (such as at the top of the function or file level), and then replace both occurrences of the string literal with references to this constant to eliminate the duplication.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cpp/tests/utils/executorUtils.cpp`:
- Around line 33-35: The `remainingRequests` variable is unsigned and decrements
on every final response without tracking which requests have already been
processed. If a request emits duplicate final responses, the unsigned
`remainingRequests` will underflow and wrap to a very large number, causing the
while loop to continue until the deadline. Fix this by implementing
deduplication logic that tracks which request IDs have already had their final
response processed using a data structure like a set, and only decrement
`remainingRequests` once per unique request ID. Update the decrement logic
around line 63 to check if the request ID has been seen before and only proceed
with the decrement if it's a new request.
---
Nitpick comments:
In `@cpp/tests/utils/executorUtils.cpp`:
- Around line 40-47: Extract the duplicate string literal "Timed out waiting for
executor responses" that appears in both TLLM_THROW calls (at lines 40 and 47)
into a named constant following the coding convention of using a 'k' prefix
(e.g., kTimeoutMessage). Define this constant at an appropriate scope (such as
at the top of the function or file level), and then replace both occurrences of
the string literal with references to this constant to eliminate the
duplication.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 57ee1276-dd97-45b0-9a48-1430242d1f08
📒 Files selected for processing (1)
cpp/tests/utils/executorUtils.cpp
| auto remainingRequests = requests.size(); | ||
| auto const deadline = std::chrono::steady_clock::now() + timeout; | ||
| while (remainingRequests > 0) |
There was a problem hiding this comment.
Guard remainingRequests against duplicate final responses.
Line 63 decrements on every final response without deduplicating by request ID. If a request emits duplicate final responses, remainingRequests (unsigned) can underflow and the loop can run until deadline with a false timeout.
Proposed fix
`#include` <chrono>
+#include <unordered_set>
@@
auto remainingRequests = requests.size();
+ std::unordered_set<batch_manager::RequestIdType> completedRequestIds;
@@
- auto const isFinal = response.hasError() || response.getResult().isFinal;
+ auto const isFinal = response.getResult().isFinal;
accumulatedResponses[requestId].emplace_back(response);
if (isFinal)
{
- TLLM_LOG_DEBUG("Final response received for request: %lu", requestId);
- --remainingRequests;
+ if (completedRequestIds.emplace(requestId).second)
+ {
+ TLLM_LOG_DEBUG("Final response received for request: %lu", requestId);
+ --remainingRequests;
+ }
}Also applies to: 50-64
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/tests/utils/executorUtils.cpp` around lines 33 - 35, The
`remainingRequests` variable is unsigned and decrements on every final response
without tracking which requests have already been processed. If a request emits
duplicate final responses, the unsigned `remainingRequests` will underflow and
wrap to a very large number, causing the while loop to continue until the
deadline. Fix this by implementing deduplication logic that tracks which request
IDs have already had their final response processed using a data structure like
a set, and only decrement `remainingRequests` once per unique request ID. Update
the decrement logic around line 63 to check if the request ID has been seen
before and only proceed with the decrement if it's a new request.
Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>
de1561a to
4c99d05
Compare
What changed
runThroughRequests()now enforces its timeout while reading executor responses. It passes the remaining deadline intoawaitResponses()and throws if no responses arrive before all requests finish.Why
The previous helper called
wait_for(timeout)on an async reader but ignored the status, then calledget()unconditionally. If responses never completed, the helper could block forever and wedge CI.This also removes the boxed
std::exceptionpath, so executor exceptions propagate directly with their original type and context.Validation
git diff --check -- cpp/tests/utils/executorUtils.cppc++ -std=c++17 -fsyntax-only -Icpp -Icpp/include cpp/tests/utils/executorUtils.cppwas attempted, but this local environment is missing CUDA headers (cuda_fp16.h) before compilation reaches this file.Summary by CodeRabbit