[https://nvbugs/6329227][fix] Use pkgutil.extend_path to merge the two flash_attn distributions before… by tensorrt-cicd · Pull Request #15498 · NVIDIA/TensorRT-LLM

tensorrt-cicd · 2026-06-19T14:28:25Z

Summary

Root cause: Co-installed flash-attn v2 (regular pkg with init.py) shadows the flash-attn-4 cute-only wheel so flash_attn.cute is invisible; additionally REPO_ROOT in test_visual_gen.py was not updated when the test moved one directory deeper.
Fix: Use pkgutil.extend_path to merge the two flash_attn distributions before importing flash_attn.cute.* in flash_attn4.py and parallel.py; add the missing ".." segment to REPO_ROOT in test_visual_gen.py.
Automated fix generated by repair-bot

Test plan

Verify fix on the same GPU type as the original failure
Check for regressions in related tests

Links

Bug: https://nvbugs/6329227

Summary by CodeRabbit

Bug Fixes
- Improved flash attention library compatibility by implementing namespace merging to handle scenarios where both legacy and current versions are installed.
Tests
- Re-enabled three previously waived multi-GPU visual generation test cases that validate tensor parallelism configurations.
- Enhanced distributed test infrastructure to properly propagate Python import paths across spawned child processes.

The integration test test_wan22_t2v_lpips_against_golden_tp[*] failed in post-merge CI with `ModuleNotFoundError: No module named 'tensorrt_llm.bindings'` raised inside a torch.multiprocessing.spawn worker. The worker reaches `tensorrt_llm/_torch/visual_gen/mapping.py: 246` (`from tensorrt_llm.bindings.internal.process_group import init_pg`) during VisualGenMapping.setup_communicators(), which the C++ allreduce path requires for tp_size > 1. mp.spawn launches a fresh interpreter that inherits env vars but does not replay the parent's runtime sys.path mutations. In the failing CI image, the compiled tensorrt_llm.bindings*.so is reachable from the parent through a sys.path entry that the child does not rebuild, so the import fails in the worker only. The matching non-TP variants (test_wan22_t2v_lpips_against_golden_multi_gpu[*]) skip setup_communicators() and therefore never hit this import. Propagate the parent's resolved sys.path through PYTHONPATH before mp.spawn so the spawn child sees the same import roots the parent used and init_pg resolves identically in both processes. Removes the three nvbugs/6329227 waivers added for these variants. Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>

…ualGen test relocation The TP variants of test_wan22_t2v_lpips_against_golden_tp execute the FA4 attention backend, which imports flash_attn.cute.* from the flash-attn-4 wheel. When the legacy flash-attn (v2) wheel is co-installed and ships its own __init__.py, Python pins flash_attn.__path__ to the v2 location and never sees the cute/ subpackage. Use pkgutil.extend_path so flash_attn behaves as a namespace-extending package and merges both installs. Additionally, commit d202244 moved test_visual_gen.py one directory deeper but did not update its REPO_ROOT computation, so the LPIPS eval script lookup pointed at /code/tensorrt_llm/tests instead of the repo root. Add the missing '..' so the eval script resolves correctly. Signed-off-by: Repair Bot <noreply@nvidia.com> Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>

coderabbitai · 2026-06-19T14:34:30Z

📝 Walkthrough

Walkthrough

Adds a _merge_flash_attn_namespace() utility that uses pkgutil.extend_path to merge co-installed flash-attn v2 and flash-attn-4 wheel namespaces; calls it in both flash_attn4.py and parallel.py before their respective flash_attn JIT imports. Also fixes REPO_ROOT path depth in the single-GPU test, propagates PYTHONPATH to spawned child processes in the multi-GPU test, and removes three test waivers.

Changes

Visual gen flash-attn namespace fix and test env repairs

Layer / File(s)	Summary
flash_attn namespace merge utility and callers `tensorrt_llm/_torch/visual_gen/attention_backend/utils.py`, `tensorrt_llm/_torch/visual_gen/attention_backend/flash_attn4.py`, `tensorrt_llm/_torch/visual_gen/attention_backend/parallel.py`	Adds `_merge_flash_attn_namespace()` that re-extends `flash_attn.__path__` via `pkgutil.extend_path`, then calls it in both `flash_attn4.py` and `parallel.py` inside their existing `ImportError`/`OSError` try-blocks before importing the flash_attn JIT symbols.
Test environment: PYTHONPATH propagation, REPO_ROOT fix, and waiver removal `tests/integration/defs/examples/visual_gen/test_visual_gen_multi_gpu.py`, `tests/integration/defs/examples/visual_gen/test_visual_gen.py`, `tests/integration/test_lists/waives.txt`	Propagates parent `sys.path` as `PYTHONPATH` to `mp.spawn` child processes; corrects `REPO_ROOT` traversal depth by one level; removes three `test_wan22_t2v_lpips_against_golden_tp` waivers.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

NVIDIA/TensorRT-LLM#15395: Both PRs modify tests/integration/test_lists/waives.txt to change which visual_gen WAN22 multi-GPU/t2v LPIPS test cases are skipped.
NVIDIA/TensorRT-LLM#15389: Both PRs modify tests/integration/test_lists/waives.txt by changing the set of skip/waive entries.
NVIDIA/TensorRT-LLM#15391: Both PRs modify the same tests/integration/test_lists/waives.txt file to adjust test skip entries.

Suggested reviewers

juney-nvidia
jieli-matrix

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: using pkgutil.extend_path to merge flash_attn distributions, which is the primary fix addressing the core issue.
Description check	✅ Passed	The description covers the root cause, fix, test verification, and includes the bug link. All key sections are present and adequately detailed.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/defs/examples/visual_gen/test_visual_gen_multi_gpu.py`:
- Around line 112-117: The code modifies os.environ["PYTHONPATH"] on line 115 to
set up the environment for the spawned workers but never restores the original
value after mp.spawn() completes, which can cause subsequent tests to inherit
this modified state and become order-dependent. Save the original PYTHONPATH
value before setting it (handle the case where it may not exist), then restore
it after the mp.spawn() call finishes to ensure the environment is clean for
subsequent tests.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: f1a13485-c43c-4694-83ef-fc78270d1d05

📥 Commits

Reviewing files that changed from the base of the PR and between b9b132b and 422a575.

⛔ Files ignored due to path filters (1)

tests/integration/defs/examples/visual_gen/golden/visual_gen_lpips/visual_gen_lpips_golden_media.zip is excluded by !**/*.zip

📒 Files selected for processing (6)

tensorrt_llm/_torch/visual_gen/attention_backend/flash_attn4.py
tensorrt_llm/_torch/visual_gen/attention_backend/parallel.py
tensorrt_llm/_torch/visual_gen/attention_backend/utils.py
tests/integration/defs/examples/visual_gen/test_visual_gen.py
tests/integration/defs/examples/visual_gen/test_visual_gen_multi_gpu.py
tests/integration/test_lists/waives.txt

💤 Files with no reviewable changes (1)

tests/integration/test_lists/waives.txt

coderabbitai · 2026-06-19T14:34:33Z

+    # mp.spawn starts a fresh interpreter that does not inherit the parent's
+    # runtime sys.path mutations; without this, the child cannot import
+    # tensorrt_llm.bindings.internal.process_group (needed by the tp_size>1
+    # C++ allreduce path).
+    os.environ["PYTHONPATH"] = os.pathsep.join(filter(None, sys.path))
    mp.spawn(


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Restore PYTHONPATH after spawning workers.

Line 116 mutates process-global environment state and never restores it; later tests can inherit this synthetic value and become order-dependent/flaky.

🔧 Proposed fix

- os.environ["PYTHONPATH"] = os.pathsep.join(filter(None, sys.path)) - mp.spawn( - _distributed_worker, - args=(world_size, backend, test_fn, port, kwargs), - nprocs=world_size, - join=True, - ) + previous_pythonpath = os.environ.get("PYTHONPATH") + os.environ["PYTHONPATH"] = os.pathsep.join(map(str, filter(None, sys.path))) + try: + mp.spawn( + _distributed_worker, + args=(world_size, backend, test_fn, port, kwargs), + nprocs=world_size, + join=True, + ) + finally: + if previous_pythonpath is None: + os.environ.pop("PYTHONPATH", None) + else: + os.environ["PYTHONPATH"] = previous_pythonpath

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/integration/defs/examples/visual_gen/test_visual_gen_multi_gpu.py` around lines 112 - 117, The code modifies os.environ["PYTHONPATH"] on line 115 to set up the environment for the spawned workers but never restores the original value after mp.spawn() completes, which can cause subsequent tests to inherit this modified state and become order-dependent. Save the original PYTHONPATH value before setting it (handle the case where it may not exist), then restore it after the mp.spawn() call finishes to ensure the environment is clean for subsequent tests.

tensorrt-cicd added 2 commits June 19, 2026 06:36

tensorrt-cicd requested review from a team as code owners June 19, 2026 14:28

tensorrt-cicd requested a review from pengbowang-nv June 19, 2026 14:28

tensorrt-cicd assigned NVShreyas Jun 19, 2026

github-actions Bot assigned tensorrt-cicd Jun 19, 2026

coderabbitai Bot reviewed Jun 19, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[https://nvbugs/6329227][fix] Use pkgutil.extend_path to merge the two flash_attn distributions before…#15498

[https://nvbugs/6329227][fix] Use pkgutil.extend_path to merge the two flash_attn distributions before…#15498
tensorrt-cicd wants to merge 2 commits into
NVIDIA:mainfrom
tensorrt-cicd:repair-bot-bug6329227

tensorrt-cicd commented Jun 19, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 19, 2026

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

tensorrt-cicd commented Jun 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Links

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 19, 2026

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

tensorrt-cicd commented Jun 19, 2026 •

edited by coderabbitai Bot

Loading