
feat: enable graphical GPU access out of the box for gpu=all projects #396

Open
konard wants to merge 2 commits into ProverCoderAI:main from konard:issue-395-845b0d2bd9bb

Conversation

@konard

@konard konard commented Jun 10, 2026

Contributor

Source TZ / Issues

Summary

Make graphical GPU access work out of the box for projects configured with gpu: "all". Previously, gpus: all alone only exposed the compute/utility GPU capabilities; getting EGL/GLX (graphics/display) working required manual, per-container steps:

  1. Adding NVIDIA_DRIVER_CAPABILITIES=all (and NVIDIA_VISIBLE_DEVICES=all) to the container env and recreating the container so the NVIDIA runtime injects libGLX_nvidia / libEGL_nvidia.
  2. Hand-creating the glvnd EGL vendor ICD JSON at /usr/share/glvnd/egl_vendor.d/10_nvidia.json.
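For reference, the vendor ICD file from step 2 is a small JSON document. The sketch below builds it in TypeScript. The `file_format_version` and `ICD.library_path` keys follow the usual libglvnd EGL vendor ICD layout. The constant names are illustrative and are not taken from the repository.

```typescript
// Path as given in the PR description.
const NVIDIA_EGL_ICD_PATH = "/usr/share/glvnd/egl_vendor.d/10_nvidia.json";

// Content of the vendor ICD: it tells glvnd which vendor EGL library to load.
// The library itself is injected by the NVIDIA runtime, not shipped in the image.
const nvidiaEglIcd = {
  file_format_version: "1.0.0",
  ICD: { library_path: "libEGL_nvidia.so.0" },
} as const;

// Serialized form that would be written to NVIDIA_EGL_ICD_PATH.
const nvidiaEglIcdJson: string = JSON.stringify(nvidiaEglIcd, null, 4);
```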

This PR automates both, gated on the existing gpu === "all" flag:

  • Compose: the generated docker-compose.yml environment now includes NVIDIA_VISIBLE_DEVICES: "all" and NVIDIA_DRIVER_CAPABILITIES: "all" on the main service (only when gpu: "all").
  • Dockerfile: the generated image registers the NVIDIA EGL vendor ICD at /usr/share/glvnd/egl_vendor.d/10_nvidia.json so glvnd resolves libEGL_nvidia.so.0 at runtime (the driver library itself is injected by the NVIDIA runtime).

CUDA/nvcc is intentionally not installed: the issue investigation established it is not required for graphical EGL/GLX access, only the driver capabilities and the EGL vendor ICD.

Both source-of-truth template copies are updated to keep the API controller and the CLI consistent:

  • packages/lib/src/core/templates/* (bundled into the docker-git-api controller)
  • packages/app/src/lib/core/templates/* (vendored copy used by the docker-git CLI)

Requirements Alignment

  • Implemented:
    • renderGpuEnv in docker-compose.ts emits the NVIDIA driver-capability env for gpu: "all", wired into the main service environment block.
    • renderDockerfileGpu in dockerfile.ts registers the EGL vendor ICD JSON for gpu: "all".
    • Both pure CORE renderers return "" for non-GPU projects (established empty-fragment pattern), so non-GPU output is byte-for-byte unchanged.
    • Tests covering compose env wiring and the Dockerfile EGL ICD registration in both lib and app test suites.
    • Changeset (minor) for @prover-coder-ai/docker-git.
  • Out of scope:
    • CUDA / nvcc toolkit installation — not needed for graphical access per the issue investigation.
    • Changes to the GpuMode domain type or adding new GPU modes.
  • Security-sensitive changes: none. The new env/ICD wiring is only emitted for projects that already opt into gpu: "all"; no new privileges, mounts, or credentials.
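The empty-fragment pattern mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR. The `FROM ubuntu:24.04` base line, the exact `RUN` command, and the one-argument signatures are hypothetical. The function name `renderDockerfileGpu` and the ICD path are taken from the description.

```typescript
// Assumed domain type; the real GpuMode definition is not shown in this PR text.
type GpuMode = "none" | "all";

const EGL_ICD_PATH = "/usr/share/glvnd/egl_vendor.d/10_nvidia.json";

// Emits a RUN block that writes the EGL vendor ICD, or "" for non-GPU projects.
const renderDockerfileGpu = (gpu: GpuMode): string =>
  gpu === "all"
    ? [
        "# Register the NVIDIA EGL vendor ICD so glvnd can resolve libEGL_nvidia.so.0",
        "RUN mkdir -p /usr/share/glvnd/egl_vendor.d \\",
        ` && printf '%s\\n' '{"file_format_version":"1.0.0","ICD":{"library_path":"libEGL_nvidia.so.0"}}' > ${EGL_ICD_PATH}`,
        "",
      ].join("\n")
    : "";

// Fragments are concatenated; an empty fragment leaves the output untouched,
// which is what keeps non-GPU Dockerfiles byte-for-byte unchanged.
const renderDockerfile = (gpu: GpuMode): string =>
  ["FROM ubuntu:24.04\n", renderDockerfileGpu(gpu)].join("");
```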

Verification

  • bunx vitest run (packages/lib) — 283 passed
  • bunx vitest run tests/docker-git/ (packages/app) — 418 passed
  • bun run --filter @prover-coder-ai/docker-git-lib typecheck — clean (lib/app/api)
  • bun run lint (packages/lib) — exit 0
  • Rendered compose verified to parse as valid YAML with NVIDIA_VISIBLE_DEVICES: "all", NVIDIA_DRIVER_CAPABILITIES: "all", and gpus: all; rendered Dockerfile contains the EGL ICD RUN block. Non-GPU rendering contains none of these.
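The last check could look roughly like the sketch below. It only searches for the expected marker strings and does not parse YAML, which would need a parser library. The sample compose snippets and helper names are hypothetical stand-ins for real renderer output.

```typescript
// Strings the description says must appear for gpu: "all" and must not appear otherwise.
const GPU_MARKERS: ReadonlyArray<string> = [
  'NVIDIA_VISIBLE_DEVICES: "all"',
  'NVIDIA_DRIVER_CAPABILITIES: "all"',
  "gpus: all",
];

// Returns the markers that occur in a rendered template string.
const findGpuMarkers = (rendered: string): ReadonlyArray<string> =>
  GPU_MARKERS.filter((marker) => rendered.includes(marker));

// Hypothetical rendered output for a GPU project.
const gpuCompose = [
  "services:",
  "  dev:",
  "    gpus: all",
  "    environment:",
  '      NVIDIA_VISIBLE_DEVICES: "all"',
  '      NVIDIA_DRIVER_CAPABILITIES: "all"',
].join("\n");

// Hypothetical rendered output for a non-GPU project.
const plainCompose = ["services:", "  dev:", "    environment: {}"].join("\n");

// GPU output carries every marker; non-GPU output carries none.
const gpuOk = findGpuMarkers(gpuCompose).length === GPU_MARKERS.length;
const plainOk = findGpuMarkers(plainCompose).length === 0;
```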

Adding .gitkeep for PR creation (default mode).
This file will be removed when the task is complete.

Issue: ProverCoderAI#395
@coderabbitai

coderabbitai Bot commented Jun 10, 2026


Review Change Stack

Warning

Review limit reached

@konard, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 30 minutes and 12 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 109182db-1995-4d7b-abbc-d292439946e2

📥 Commits

Reviewing files that changed from the base of the PR and between b47351e and 3d96ba4.

📒 Files selected for processing (7)
  • .changeset/gpu-graphics-out-of-the-box.md
  • packages/app/src/lib/core/templates/docker-compose.ts
  • packages/app/src/lib/core/templates/dockerfile.ts
  • packages/app/tests/docker-git/core-templates.test.ts
  • packages/lib/src/core/templates/docker-compose.ts
  • packages/lib/src/core/templates/dockerfile.ts
  • packages/lib/tests/core/templates.test.ts
📝 Walkthrough

Overview

This PR updates the .gitkeep file by adding a single line with a comment containing auto-generation metadata for the PR: a timestamp, the branch name, and a link to the related issue.

Changes

Housekeeping update of generation metadata

| Layer / File(s) | Description |
| --- | --- |
| Adding generation metadata to .gitkeep (`.gitkeep`) | The file is updated with one added line containing a comment with auto-generation metadata for the PR (time, branch, issue). |

Review effort estimate

🎯 1 (Trivial) | ⏱️ ~1 minute


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Requirements Alignment | ❌ Error | The PR contains only .gitkeep; renderDockerfileCuda and the conditional installation of nvidia-cuda-toolkit in dockerfile.ts are missing, contrary to the requirements of issue #395. | Add renderDockerfileCuda to dockerfile.ts with conditional installation of nvidia-cuda-toolkit when config.gpu === "all", and integrate it into the renderDockerfile chain. |
| Linked Issues check | ⚠️ Warning | The PR does not meet the requirements of issue #395: the actual change (adding .gitkeep) implements neither the long-term goal (modifying the Dockerfile template) nor the short-term alternatives for resolving the GPU/CUDA problem. | Implement the required changes: either add renderDockerfileCuda to dockerfile.ts, or apply one of the alternative quick fixes (apt-install CUDA, env vars, EGL vendor JSON). |
| Out of Scope Changes check | ⚠️ Warning | Adding the .gitkeep file is unrelated to the requirements of issue #395, which concern GPU/CUDA in dev containers; this is outside the stated goals. | Remove .gitkeep or explain its purpose; focus on actually implementing GPU support in docker-git containers. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Security Regression | ✅ Passed | No security regressions found. Secret detection hooks present, Docker socket gated with safe defaults, no hardcoded credentials, appropriate workflow permissions. |
| Title check | ✅ Passed | The title fully matches the main change: it describes enabling graphical GPU access for projects with gpu=all, which is the main goal of the PR. |
| Description check | ✅ Passed | The PR description follows the template and contains all required sections: Source TZ/Issues (referencing #395), a detailed Summary, Requirements Alignment, and Verification with test results. |

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.



Generated dev containers with gpu: "all" now set NVIDIA_DRIVER_CAPABILITIES=all
and NVIDIA_VISIBLE_DEVICES=all so the NVIDIA runtime injects the graphics/display
libraries (libGLX_nvidia, libEGL_nvidia) at container creation, and the image
registers the NVIDIA EGL vendor ICD at /usr/share/glvnd/egl_vendor.d/10_nvidia.json.

This removes the manual per-container env edit, recreate, and vendor-JSON copy
previously required for graphical GPU/EGL over SSH. Non-GPU projects are unaffected.

Edits both source-of-truth template copies (lib + vendored app) and adds tests
covering compose env wiring and the Dockerfile EGL ICD registration.

Refs ProverCoderAI#395
@konard konard changed the title from "[WIP] Make GPU connection more convenient" to "feat: enable graphical GPU access out of the box for gpu=all projects" Jun 10, 2026
@konard konard marked this pull request as ready for review June 10, 2026 14:32
@konard

konard commented Jun 10, 2026

Contributor Author

Working session summary

Done. Issue #395 is fully addressed and PR #396 is ready for review.

Summary

Problem: Graphical GPU access inside docker-git dev containers required manual, per-container steps over SSH — adding NVIDIA_DRIVER_CAPABILITIES=all/NVIDIA_VISIBLE_DEVICES=all to the env and recreating the container, plus hand-creating the glvnd EGL vendor ICD JSON. gpus: all alone only exposes compute, not graphics/display (EGL/GLX).

Fix (gated on the existing gpu === "all" flag):

  1. Compose — generated docker-compose.yml now emits NVIDIA_VISIBLE_DEVICES: "all" and NVIDIA_DRIVER_CAPABILITIES: "all" on the main service, so the NVIDIA runtime injects libGLX_nvidia/libEGL_nvidia at container creation.
  2. Dockerfile — generated image registers the NVIDIA EGL vendor ICD at /usr/share/glvnd/egl_vendor.d/10_nvidia.json.

Both source-of-truth template copies were updated (packages/lib → API controller, packages/app → CLI). CUDA/nvcc was intentionally not added — the issue investigation established it isn't needed for graphical access. Non-GPU projects are byte-for-byte unchanged (empty-fragment pattern).

Verification:

  • lib tests: 283 passed · app docker-git tests: 418 passed
  • typecheck clean (lib/app/api) · lint exit 0
  • New tests cover both compose env wiring and the Dockerfile EGL ICD, for gpu: "all" and gpu: "none"
  • Changeset added (minor bump for the published @prover-coder-ai/docker-git)
  • All 4 CI checks pass on the pushed commit (Check, Snapshot, Checking Dependencies, Final Build)

Note on CodeRabbit: its review was rate-limited and only saw the base placeholder commit (.gitkeep), so its "add nvidia-cuda-toolkit" suggestion is stale and contradicts the issue's own conclusion that CUDA isn't required. The valid feedback (PR title/description format) was addressed by rewriting the description to the repo template.

PR: #396 (ready for review, title feat: enable graphical GPU access out of the box for gpu=all projects).


This summary was automatically extracted from the AI working session output.

@konard

konard commented Jun 10, 2026

Contributor Author

🤖 Solution Draft Log

This log file contains the complete execution trace of the AI solution draft process.

💰 Cost: $6.894868

📊 Context and tokens usage:

Claude Opus 4.8: (2 sub-sessions)

  1. 116.8K / 1M (12%) input tokens, 30.5K / 128K (24%) output tokens
  2. 59.7K / 1M (6%) input tokens, 9.3K / 128K (7%) output tokens

Total: (11.5K new + 203.7K cache writes + 8.8M cache reads) input tokens, 47.2K output tokens, $6.894869 cost

🤖 Models used:

  • Tool: Anthropic Claude Code
  • Requested: opus
  • Model: Claude Opus 4.8 (claude-opus-4-8)

📎 Log file uploaded as Gist (3298KB)


The working session has now ended; feel free to review the solution draft and add any feedback.

@konard

konard commented Jun 10, 2026

Contributor Author

🟡 Ready for review

Hive Mind stopped automatic restart because the remaining failed check is an external review quota/credit limit, not a code failure it can fix.

Checks not executed:

  • CodeRabbit — Insufficient review credits

Checks completed successfully:

Action required:

  • Restore the external review credits/rate limit and rerun the review, or decide manually whether this PR can proceed.
  • No new AI session was started for this blocker.

Monitored by hive-mind with --auto-restart-until-mergeable flag.


Development

Successfully merging this pull request may close these issues.

Make GPU connection more convenient
