-
Notifications
You must be signed in to change notification settings - Fork 18
chore: github installations rfc #1017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+323
−0
Merged
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,323 @@ | ||
| --- | ||
| title: "RFC 0012: First-Class GitHub Installations" | ||
| --- | ||
|
|
||
| | Category | Status | Created | Author | | ||
| | ------------ | --------------------------------- | ---------- | ---------------- | | ||
| | Integrations | <Badge color="gray">Draft</Badge> | 2026-04-20 | Aditya Choudhari | | ||
|
|
||
| ## Summary | ||
|
|
||
| Introduce a first-class `github_installation` table that binds a GitHub App | ||
| installation to a ctrlplane workspace, with creation restricted to a verified | ||
| OAuth flow. This closes a multi-tenancy gap where any workspace on a shared | ||
| ctrlplane instance can currently impersonate any GitHub installation by typing | ||
| its numeric ID into a job agent config. The RFC covers the schema, the linking | ||
| flow, and the migration of the existing job agent integration to reference | ||
| installations by foreign key. | ||
|
|
||
| ## Motivation | ||
|
|
||
| ### The current state | ||
|
|
||
| Ctrlplane runs a single GitHub App (`GITHUB_BOT_APP_ID`, `GITHUB_BOT_PRIVATE_KEY`) | ||
| that users install on their GitHub org. The installation is represented inside | ||
| `job_agent.config` as an untyped JSON blob: | ||
|
|
||
| ```json | ||
| { | ||
| "type": "github-app", | ||
| "installationId": 12345678, | ||
| "owner": "acme-corp" | ||
| } | ||
| ``` | ||
|
|
||
| There is no dedicated schema for GitHub installations. The zod validator in | ||
| `packages/trpc/src/routes/job-agents.ts` accepts any `installationId: | ||
| z.number()` without cross-checking it against the calling workspace or the | ||
| calling user's GitHub identity. At runtime, the workspace engine mints an | ||
| installation token using the App's private key and whatever `installationId` | ||
| the job agent config carries — the server has no way to know whether the | ||
| workspace should legitimately have access to that installation. | ||
|
|
||
| ### The multi-tenancy gap | ||
|
|
||
| Installation IDs are not secret. They are visible in the URL of any GitHub App | ||
| installation settings page (`https://github.com/organizations/<org>/settings/installations/<id>`) | ||
| and are sequential integers. | ||
|
|
||
| On a shared ctrlplane instance, the following attack is trivial: | ||
|
|
||
| 1. A user in Workspace B discovers (or guesses) the installation ID of | ||
| Workspace A's GitHub org. | ||
| 2. The user creates a job agent in Workspace B with that installation ID. | ||
| 3. Ctrlplane's server, holding the App's private key, successfully mints an | ||
| installation token for Workspace A's org. | ||
| 4. Workspace B can now list repos, dispatch workflows, and read repo metadata | ||
| for Workspace A's org. | ||
|
|
||
| The only implicit defense today is "you have to know the installation ID," | ||
| which is not an auth boundary. | ||
|
|
||
| ### Why the fix has to happen at link-time | ||
|
|
||
| The workspace engine's GitHub dispatcher at | ||
| `apps/workspace-engine/pkg/jobagents/github/workflow_dispatcher.go` is a thin | ||
| wrapper that calls `gh.CreateClientForInstallation(ctx, cfg.InstallationId)` | ||
| with whatever `InstallationId` is in the config. Runtime validation in the | ||
| engine is the wrong layer — by then, the config has already been persisted and | ||
| the damage is done. The check must happen at the point of creation, i.e. | ||
| when a workspace first claims an installation. | ||
|
|
||
| ## Goals | ||
|
|
||
| - Give GitHub installations a first-class table scoped to a workspace. | ||
| - Eliminate free-text `installationId` input from every user-facing surface. | ||
| - Require GitHub-side proof of admin rights (via OAuth + `GET /user/installations`) | ||
| before a link is accepted. | ||
| - Allow one installation to be linked by multiple workspaces, as long as each | ||
| link is independently verified. | ||
| - Migrate the existing GitHub Actions job agent to reference installations by | ||
| FK instead of duplicating the ID in JSON. | ||
|
|
||
| ## Non-Goals | ||
|
|
||
| - A CLI flow for linking. The UI flow is the v1 surface; CLI is a follow-up | ||
| that plugs into the same backend handler. | ||
| - Supporting user-level GitHub OAuth as a general-purpose login mechanism. The | ||
| OAuth token acquired during linking is used only to verify installation | ||
| access for that single request and is discarded immediately after. | ||
| - Terraform-based creation of installation rows. Linking requires a human | ||
| browser session; a data-source or CLI-assisted workflow will follow. | ||
|
|
||
| ## Proposal | ||
|
|
||
| ### Schema | ||
|
|
||
| ```sql | ||
| CREATE TABLE github_installation ( | ||
| id uuid PRIMARY KEY DEFAULT gen_random_uuid(), | ||
| workspace_id uuid NOT NULL REFERENCES workspace(id) ON DELETE CASCADE, | ||
|
|
||
| installation_id bigint NOT NULL, -- GitHub's numeric ID | ||
| owner text NOT NULL, -- GitHub org or user login | ||
| account_type text NOT NULL, -- 'Organization' | 'User' | ||
|
|
||
| created_by_user_id uuid NOT NULL REFERENCES "user"(id), | ||
| created_at timestamptz NOT NULL DEFAULT now(), | ||
| updated_at timestamptz NOT NULL DEFAULT now(), | ||
|
|
||
| UNIQUE (workspace_id, installation_id) | ||
| ); | ||
|
|
||
| CREATE INDEX github_installation_installation_id_idx | ||
| ON github_installation(installation_id); | ||
| ``` | ||
|
|
||
| Key points: | ||
|
|
||
| - **`UNIQUE (workspace_id, installation_id)`** — one installation can be | ||
| linked to many workspaces, but each `(workspace, installation)` pair exists | ||
| at most once. A shared-org scenario (e.g., dev workspace and prod workspace | ||
| both watching the same repos) is supported natively. | ||
| - **No global uniqueness on `installation_id`** — multiple workspaces | ||
| legitimately share a real-world org installation. | ||
| - **`created_by_user_id`** — records which ctrlplane user completed the link. | ||
| Used for audit and for the per-user OAuth verification flow. | ||
|
|
||
| ### Linking flow | ||
|
|
||
| The **only** backend entrypoint that creates rows in `github_installation` is a | ||
| single Setup URL handler. No tRPC mutation, no REST endpoint, no job-agent | ||
| config path accepts a raw `installationId`. | ||
|
|
||
| ``` | ||
| apps/api/src/routes/github/setup.ts | ||
| ``` | ||
|
|
||
| **Step 1 — User initiates the link from the UI.** | ||
| A "Connect GitHub" action in workspace settings opens: | ||
|
|
||
| ``` | ||
| https://github.com/apps/<app-slug>/installations/new?state=<signedJWT> | ||
| ``` | ||
|
|
||
| The `state` is a short-lived (≤5 min), single-use, HMAC-signed token containing | ||
| `{ workspaceId, userId, nonce }`. This binds the subsequent redirect to the | ||
| initiating ctrlplane session and prevents CSRF. | ||
|
|
||
| **Step 2 — GitHub install UX happens on github.com.** | ||
| The user picks an org, selects which repos to grant access to, and installs | ||
| (or reconfigures) the App. GitHub redirects the browser to ctrlplane's Setup | ||
| URL with `?installation_id=<id>&setup_action=install&state=<jwt>`. | ||
|
|
||
| **Step 3 — Ctrlplane verifies state and requires GitHub user OAuth.** | ||
|
|
||
| 1. Verify `state` JWT signature, expiry, and single-use nonce. | ||
| 2. Confirm the signed-in ctrlplane session matches `userId` in the state. | ||
| 3. Redirect the user through the GitHub OAuth authorize endpoint to obtain a | ||
| GitHub user access token (scope: minimal, enough to call | ||
| `GET /user/installations`). The token is held only for the duration of the | ||
| current linking request — it is **not persisted**. Once verification | ||
| completes (success or failure), the token is discarded along with the | ||
| request context. | ||
| 4. Call `GET /user/installations` with the user's OAuth token. If the | ||
| `installation_id` from the redirect is not in the returned list, **reject | ||
| the link**. This is the teeth of the check: even if a malicious user | ||
| replays a redirect with someone else's `installation_id`, the API will not | ||
| list an installation they don't admin. | ||
|
|
||
| **Step 4 — Fetch installation metadata and insert.** | ||
| Call `GET /app/installations/<id>` with App JWT auth to fetch `account.login` | ||
| (`owner`), `account.type` (`accountType`), and other metadata. Insert one row | ||
| into `github_installation` scoped to the workspace. | ||
|
|
||
| ### Why this flow is sufficient | ||
|
|
||
| - **No free-text input path exists.** The API never accepts `installationId` | ||
| as user input. There is nothing to spoof. | ||
| - **`state` handles CSRF.** A user in Workspace B cannot mint a valid `state` | ||
| for Workspace A. | ||
| - **User-level OAuth handles authorization.** Even if a `state` is somehow | ||
| intercepted, the final `GET /user/installations` check fails unless the | ||
| signed-in user is an admin of the target org — in which case they could | ||
| install the App on that org themselves anyway. | ||
| - **`UNIQUE (workspace_id, installation_id)`** blocks duplicate claims within a | ||
| workspace and permits legitimate cross-workspace sharing. | ||
|
|
||
| ### Migration of the existing job agent | ||
|
|
||
| Today, `job_agent.config` for type `github-app` contains | ||
| `{ installationId, owner }` inline. After this RFC: | ||
|
|
||
| 1. Add column `github_installation_id uuid NULL REFERENCES github_installation(id)` | ||
| to `job_agent` (or continue storing it inside `config` as a UUID — TBD | ||
| during implementation). | ||
| 2. Update the zod schema in `packages/trpc/src/routes/job-agents.ts` to accept | ||
| `{ type: "github-app", githubInstallationId: z.string().uuid() }` instead | ||
| of `{ installationId, owner }`. The dropdown in the job agent creation UI | ||
| becomes "pick from your workspace's linked installations" rather than a | ||
| free-text ID field. | ||
| 3. Update the workspace-engine dispatcher to resolve the `installationId` | ||
| through a getter keyed on `githubInstallationId`, scoped to the job's | ||
| workspace. The engine never sees a raw installation ID supplied by user | ||
| input. | ||
|
|
||
| ### UI changes | ||
|
|
||
| - **Workspace settings → Integrations → GitHub**: a new page listing linked | ||
| installations, with "Connect GitHub" triggering the flow above and | ||
| "Disconnect" removing the workspace's row (without uninstalling the App on | ||
| GitHub). | ||
| - **Job agent creation (GitHub App type)**: the `installationId` free-text | ||
| input is removed. Replace with a dropdown populated from | ||
| `github_installation` rows in the current workspace. If the list is empty, | ||
| link to the GitHub integrations page. | ||
|
|
||
| ### Webhook handling | ||
|
|
||
| Unchanged by this RFC. The existing `workflow_run` handler at | ||
| `apps/api/src/routes/github/workflow_run.ts` continues to operate on | ||
| `installation.id` from the webhook payload. Since `installation.id` comes from | ||
| GitHub (not from a workspace), no trust change is needed there. | ||
|
|
||
| Lifecycle events (`installation.deleted`, `installation_repositories.*`) will | ||
| be handled alongside this RFC to keep `github_installation` rows in sync when | ||
| users uninstall the App or change repo access on GitHub. | ||
|
|
||
| ## Alternatives Considered | ||
|
|
||
| ### 1. Keep installation data in `job_agent.config`, add a validation step | ||
|
|
||
| Rejected. Layering validation on top of a free-text schema means every new | ||
| caller must remember to apply the check. A malformed or forgotten validation | ||
| in any future code path reopens the gap. Making the schema itself the source | ||
| of truth eliminates the class of bug. | ||
|
|
||
| ### 2. Use installation ID as a globally unique primary key | ||
|
|
||
| Rejected. Forbids the legitimate case of one GitHub org serving multiple | ||
| ctrlplane workspaces (dev/staging/prod separation, multi-team monorepos). | ||
| `UNIQUE (workspace_id, installation_id)` captures the right invariant. | ||
|
|
||
| ### 3. Rely on the `state` CSRF token alone, skip GitHub OAuth | ||
|
|
||
| Rejected. The `state` parameter proves the redirect was initiated from a | ||
| ctrlplane session but does not prove the ctrlplane user has admin rights on | ||
| the target org. Without the `GET /user/installations` check, a determined | ||
| user can still initiate an install flow, swap the `installation_id` in the | ||
| redirect (if they intercept it), and claim an installation they don't own. | ||
| OAuth verification is what makes the boundary real. | ||
|
|
||
| ### 4. Allow Terraform / API creation of installation rows | ||
|
|
||
| Rejected for v1. The verification fundamentally requires a browser session | ||
| for GitHub OAuth; any programmatic path would need to either skip the check | ||
| (defeating the purpose) or accept pre-issued OAuth tokens (significant added | ||
| complexity and credential-handling surface). IaC users can still manage | ||
| everything *downstream* of the installation (job agents, deployments, | ||
| deployment sources) via Terraform once the installation row exists. | ||
|
|
||
| ## Tradeoffs | ||
|
|
||
| ### Pros | ||
|
|
||
| - Closes the cross-workspace installation access gap. | ||
| - Existing `job_agent.config` shape is simplified. | ||
| - Clear audit trail: `created_by_user_id` + timestamp on every link. | ||
|
|
||
| ### Cons | ||
|
|
||
| - Introduces a user-level GitHub OAuth flow that ctrlplane did not previously | ||
| need. Additional client ID/secret management for the App's OAuth surface. | ||
| - Terraform and API users cannot create installations programmatically; they | ||
| must complete the UI flow once per workspace per installation. | ||
|
|
||
| ## Migration Strategy | ||
|
|
||
| Ctrlplane is still in an internal-only phase, so the set of existing GitHub | ||
| installations in the wild is small and known. There is no need for an | ||
| automated backfill. | ||
|
|
||
| 1. Add the `github_installation` table as a purely additive schema change. | ||
| Nothing else in the system references it yet. | ||
| 2. Ship the UI + Setup URL handler so new links go through the verified flow. | ||
| 3. **Manually re-link each existing installation through the UI.** For the | ||
| handful of existing internal workspaces that have a GitHub installation | ||
| in their `job_agent.config` today, an authorized user in each workspace | ||
| clicks "Connect GitHub" and completes the same verified flow. This | ||
| produces a `github_installation` row with correct `created_by_user_id` | ||
| and audit metadata — no raw SQL inserts, no nullable-column carve-outs. | ||
| 4. Once all known installations have corresponding `github_installation` | ||
| rows, flip the job agent config schema to require `githubInstallationId` | ||
| (the FK) and update the workspace-engine dispatcher to resolve via getter. | ||
| At this point the raw `installationId` free-text path is removed. | ||
| 5. Remove the old `{ installationId, owner }` JSON shape from `job_agent.config`. | ||
|
|
||
| Manual re-linking is acceptable precisely because the population is tiny and | ||
| each re-link takes less than a minute. If ctrlplane were public or had many | ||
| installations, we'd want an automated backfill instead; at current scale, | ||
| going through the UI is simpler than writing migration code. | ||
|
|
||
| ## Open Questions | ||
|
|
||
| 1. **OAuth client.** Does the GitHub App already have OAuth credentials | ||
| configured (`GITHUB_BOT_CLIENT_ID`/`_CLIENT_SECRET`), or do we need a | ||
| separate OAuth app registration? If the App's OAuth surface is sufficient, | ||
| we avoid an extra credential. | ||
| 2. **Unlink semantics.** When a workspace disconnects an installation, should | ||
| linked job agents be disabled, deleted, or left in a broken state pointing | ||
| at a now-missing FK? Soft-disable seems safest. | ||
| 3. **Reinstall handling.** If a user uninstalls the App on GitHub and | ||
| reinstalls it later, GitHub may issue a new `installation_id`. Should | ||
| ctrlplane detect this (via `installation.deleted` + subsequent | ||
| `installation.created` events from the same `account.id`) and offer a | ||
| "relink" UX, or require the user to re-run the flow manually? | ||
|
|
||
| ## Conclusion | ||
|
|
||
| The GitHub integration today conflates "the App can reach this installation" | ||
| with "this workspace is entitled to use this installation." Adding a | ||
| first-class `github_installation` table, with a verified link flow as the | ||
| only creation path, separates the two and closes a real multi-tenancy gap on | ||
| shared instances. The work is small, contained, and additive. | ||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolve
created_by_user_idnullability before implementation.created_by_user_idisNOT NULL(Line 106), but the migration plan uses manual inserts (Line 282) and the RFC still leaves nullability undecided (Line 305). This is a schema/migration conflict that should be decided in the RFC now to avoid blocking rollout or forcing fake attribution.Also applies to: 282-285, 305-307
🤖 Prompt for AI Agents