feat: automatic oss license audit by msukkari · Pull Request #1003 · sourcebot-dev/sourcebot

msukkari · 2026-03-13T20:05:24Z

Summary by CodeRabbit

New Features
- Automated Open Source Software (OSS) license auditing for pull requests to main.
- License audit results posted as PR comments with detailed summaries, including categorization of copyleft licenses and identification of unresolved licenses.
Chores
- Updated configuration files to support the auditing process.

github-actions · 2026-03-13T20:05:34Z

@msukkari your pull request is missing a changelog!

scripts/fetchLicenses.mjs

github-actions · 2026-03-13T20:08:57Z

License Audit

❌ Audit failed to produce results. Check the workflow logs for details.

coderabbitai · 2026-03-13T20:11:56Z

Walkthrough

Introduces a GitHub Actions workflow that audits open source software licenses in project dependencies. The workflow fetches license information from npm, summarizes results, uses Claude to categorize and validate licenses, and posts detailed audit results as PR comments.

Changes

| Cohort / File(s) | Summary |
|---|---|
| License Audit Workflow `.github/workflows/license-audit.yml` | New GitHub Actions workflow that triggers on PRs to main, executing multi-step license auditing with Node.js setup, dependency installation, license fetching/summarization, Claude-powered validation, result validation, and PR commenting with structured audit reports. |
| License Audit Support Scripts `scripts/fetchLicenses.mjs`, `scripts/summarizeLicenses.mjs` | New Node.js ES modules: fetchLicenses parses yarn.lock, fetches package licenses from npm registry with batching, applies license overrides, and outputs structured report; summarizeLicenses reads license data, aggregates by type, and generates summary statistics. |
| Configuration & Gitignore `scripts/npmLicenseMap.json`, `.gitignore` | Empty license mapping file for future license overrides, and updated .gitignore to exclude generated license audit artifacts (oss-licenses.json, oss-license-summary.json, license-audit-result.json). |
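As a rough illustration of the "aggregates by type" step, here is a minimal sketch. It is not the code from `scripts/summarizeLicenses.mjs`: the function name `summarizeByLicense`, the record shape `{ name, version, license }`, and the `"UNKNOWN"` bucket are assumptions made for this example, since the actual schema of `oss-licenses.json` is not shown on this page.

```javascript
// Hypothetical record shape: { name, version, license }.
// The real oss-licenses.json schema may differ.
function summarizeByLicense(packages) {
  const counts = new Map();
  for (const pkg of packages) {
    // Assumption: a missing or empty license string counts as "UNKNOWN".
    const hasLicense =
      typeof pkg.license === "string" && pkg.license.trim() !== "";
    const key = hasLicense ? pkg.license.trim() : "UNKNOWN";
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  // Sort by descending count, then by license name, so output is stable.
  const byLicense = [...counts.entries()]
    .map(([license, count]) => ({ license, count }))
    .sort((a, b) => b.count - a.count || a.license.localeCompare(b.license));

  return { totalPackages: packages.length, byLicense };
}

// Usage with made-up package entries.
const summary = summarizeByLicense([
  { name: "a", version: "1.0.0", license: "MIT" },
  { name: "b", version: "2.0.0", license: "MIT" },
  { name: "c", version: "0.1.0", license: "ISC" },
  { name: "d", version: "3.2.1" },
]);
console.log(JSON.stringify(summary, null, 2));
```

Keeping the aggregation in a pure function like this would make it easy to test without touching the filesystem.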

Sequence Diagram(s)

sequenceDiagram
    participant GHA as GitHub Actions
    participant YL as Yarn Lock
    participant NPM as npm Registry
    participant CLA as Claude API
    participant PR as PR Comment
    
    GHA->>YL: Parse yarn.lock
    GHA->>NPM: Fetch licenses for packages (batch, concurrency=10)
    NPM-->>GHA: License data + overrides applied
    GHA->>GHA: Summarize licenses by type
    GHA->>CLA: Send license data for validation & categorization
    CLA->>CLA: Identify unresolved & copyleft licenses
    CLA-->>GHA: license-audit-result.json
    GHA->>GHA: Validate results (unresolved/copyleft thresholds)
    GHA->>PR: Post audit summary with structured table
    PR-->>GHA: Comment created
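
The "batch, concurrency=10" fetch step in the diagram could be sketched roughly as follows. This is an assumption about the general technique, not the code in `scripts/fetchLicenses.mjs`: the helper name `mapInBatches` is hypothetical, and the worker is injected so the example runs without network access.

```javascript
// Processes items in fixed-size batches. Items within a batch run
// concurrently; the next batch starts only after the current one finishes.
async function mapInBatches(items, batchSize, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    const settled = await Promise.all(batch.map((item) => worker(item)));
    results.push(...settled);
  }
  return results;
}

// Usage: a stand-in worker instead of a real registry request.
// The package specs and the returned license value are made up.
const specs = ["left-pad@1.3.0", "lodash@4.17.21", "react@18.2.0"];
mapInBatches(specs, 10, async (spec) => ({ spec, license: "MIT" })).then(
  (rows) => console.log(rows.length, "packages processed")
);
```

Because `Promise.all` rejects on the first failure, a real fetcher would probably catch errors inside the worker and return a placeholder so one failed request does not abort the whole audit.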

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

feat(ci): add Claude automated bug fixer workflow #916: Adds a separate GitHub Actions workflow integrating the anthropics/claude-code-action for automated bug fixes, sharing similar CI/CD orchestration and Claude integration patterns.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: automatic oss license audit' directly and accurately describes the main change: introduction of an automated OSS license audit workflow and supporting scripts. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

scripts/fetchLicenses.mjs (1)
99-99: ⚠️ Potential issue | 🟡 Minor

Incomplete string replacement for scoped packages.

The .replace("%40", "@") only replaces the first occurrence. While scoped packages typically have only one @ at the start (e.g., @scope/package), using replaceAll or a regex with the global flag is more robust and clearer in intent.
✏️ Suggested fix
-    const url = `${NPM_REGISTRY}/${encodeURIComponent(name).replace("%40", "@")}/${version}`;
+    const url = `${NPM_REGISTRY}/${encodeURIComponent(name).replaceAll("%40", "@")}/${version}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/fetchLicenses.mjs` at line 99, The URL construction in
scripts/fetchLicenses.mjs uses encodeURIComponent(name).replace("%40", "@")
which only replaces the first encoded '@' and can miss additional occurrences;
update the replacement to be global (e.g., use replaceAll("%40","@") or
replace(/%40/g, "@")) when building the const url that uses NPM_REGISTRY so all
encoded '@' sequences in the package name are properly restored.
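
To make the difference concrete, here is a small sketch of how a string pattern behaves with `replace` versus `replaceAll`. The package name below is contrived (it contains two `@` characters purely to expose the behavior) and is not taken from the PR.

```javascript
// Contrived input with two "@" characters, chosen only to show the difference.
const name = "@scope/pkg@next";
const encoded = encodeURIComponent(name); // "%40scope%2Fpkg%40next"

// With a string pattern, replace() stops after the first match.
const firstOnly = encoded.replace("%40", "@"); // "@scope%2Fpkg%40next"

// replaceAll() (Node.js 15+) and a global regex both restore every match.
const viaReplaceAll = encoded.replaceAll("%40", "@"); // "@scope%2Fpkg@next"
const viaRegex = encoded.replace(/%40/g, "@"); // "@scope%2Fpkg@next"

console.log({ firstOnly, viaReplaceAll, viaRegex });
```

For a typical scoped name such as `@scope/package`, all three forms produce the same string, which is why the finding is rated minor.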

🧹 Nitpick comments (5)

.github/workflows/license-audit.yml (2)

43-47: Consider adding a timeout for the Claude action.

The Claude audit step could potentially run for an extended period when processing large dependency lists. Consider whether a timeout would be appropriate to prevent runaway CI costs.

You can add a timeout-minutes at the step level:

      - name: Audit licenses with Claude
        timeout-minutes: 15
        uses: anthropics/claude-code-action@v1

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/license-audit.yml around lines 43 - 47, The "Audit
licenses with Claude" GitHub Actions step (uses:
anthropics/claude-code-action@v1) lacks a timeout and may run indefinitely; add
a step-level timeout by inserting a timeout-minutes setting (e.g.,
timeout-minutes: 15) into that step to cap runtime and prevent runaway CI costs,
keeping the step name "Audit licenses with Claude" and the existing uses/with
configuration unchanged.

125-129: Add error handling for malformed JSON in validation step.

If license-audit-result.json exists but contains invalid JSON or is missing expected fields, the node -e commands will throw exceptions with potentially unclear error messages.

🛡️ Suggested defensive approach

-          STATUS=$(node -e "const r = require('./license-audit-result.json'); console.log(r.status)")
-          UNRESOLVED=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary.unresolvedCount)")
-          STRONG=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary.strongCopyleftCount)")
-          WEAK=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary.weakCopyleftCount)")
-          RESOLVED=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary.resolvedCount)")
+          STATUS=$(node -e "const r = require('./license-audit-result.json'); console.log(r.status ?? 'UNKNOWN')" 2>/dev/null || echo "UNKNOWN")
+          UNRESOLVED=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary?.unresolvedCount ?? 0)" 2>/dev/null || echo "0")
+          STRONG=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary?.strongCopyleftCount ?? 0)" 2>/dev/null || echo "0")
+          WEAK=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary?.weakCopyleftCount ?? 0)" 2>/dev/null || echo "0")
+          RESOLVED=$(node -e "const r = require('./license-audit-result.json'); console.log(r.summary?.resolvedCount ?? 0)" 2>/dev/null || echo "0")

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/license-audit.yml around lines 125 - 129, The current
shell step assigns STATUS, UNRESOLVED, STRONG, WEAK, and RESOLVED by running
inline node -e scripts that assume license-audit-result.json is valid; instead
replace each node -e invocation with a small defensive snippet that reads the
file, wraps JSON.parse in try/catch, verifies the expected fields (status,
summary.unresolvedCount, summary.strongCopyleftCount, summary.weakCopyleftCount,
summary.resolvedCount), and on parse/validation error prints a clear sentinel
(e.g. "INVALID_JSON" for STATUS and "0" for counts) and exits non‑zero or
returns defaults; update the assignments for STATUS, UNRESOLVED, STRONG, WEAK,
and RESOLVED to use that snippet so malformed JSON or missing fields are handled
gracefully.
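
One way to read the defensive approach described above is a single helper that parses the result once and returns every field with a fallback. The sketch below is an assumption, not code from the workflow: the name `readAuditSummary` is hypothetical, it takes the file contents as a string so it can be exercised without a file, and the `"PASS"` status in the usage line is a made-up value.

```javascript
// Parses license-audit-result.json content and returns safe defaults
// when the JSON is malformed or fields are missing.
function readAuditSummary(jsonText) {
  const fallback = {
    status: "INVALID_JSON",
    unresolvedCount: 0,
    strongCopyleftCount: 0,
    weakCopyleftCount: 0,
    resolvedCount: 0,
  };

  let parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch {
    return fallback;
  }
  if (parsed === null || typeof parsed !== "object") return fallback;

  const summary = parsed.summary ?? {};
  // Accept only non-negative integers; anything else becomes 0.
  const count = (v) => (Number.isInteger(v) && v >= 0 ? v : 0);

  return {
    status: typeof parsed.status === "string" ? parsed.status : "UNKNOWN",
    unresolvedCount: count(summary.unresolvedCount),
    strongCopyleftCount: count(summary.strongCopyleftCount),
    weakCopyleftCount: count(summary.weakCopyleftCount),
    resolvedCount: count(summary.resolvedCount),
  };
}

console.log(readAuditSummary("{not json"));
console.log(readAuditSummary('{"status":"PASS","summary":{"unresolvedCount":2}}'));
```

Defaulting counts to 0 keeps the step from crashing but can also hide a broken audit, so a stricter workflow might fail the job whenever the status comes back as `INVALID_JSON` or `UNKNOWN`.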

scripts/npmLicenseMap.json (1)

1-1: Add trailing newline for POSIX compliance.

The file does not end with a newline. Ending text files with a newline is a common convention, and POSIX defines a line as terminated by one.

✏️ Suggested fix

-{}
+{}
+

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/npmLicenseMap.json` at line 1, The JSON file npmLicenseMap.json
currently contains "{}" without a trailing newline; update npmLicenseMap.json by
adding a single newline character at EOF so the file ends with a newline
(POSIX-compliant), ensuring the existing content "{}" remains unchanged except
for the added newline.

scripts/fetchLicenses.mjs (1)

31-33: Consider edge cases in deepEqual implementation.

Using JSON.stringify for deep equality has known limitations: it's sensitive to key ordering and doesn't handle undefined values or circular references. This is acceptable for simple license object comparisons, but worth documenting or using a library if the license map grows complex.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/fetchLicenses.mjs` around lines 31 - 33, The current deepEqual
function (deepEqual) uses JSON.stringify which fails on key ordering, undefined
values, and circular refs; replace it with a robust deep equality check (e.g.,
import and use lodash/isEqual or deep-equal) and update calls to use that
function, or if you intentionally want the simple behavior, add a clear comment
above deepEqual explaining these limitations and when it is safe to use;
reference the deepEqual function name so you can locate and replace or annotate
it.
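
The limitations mentioned above are easy to reproduce. The sketch below is illustrative only: `viaStringify` mirrors the JSON.stringify approach the comment describes rather than the actual `deepEqual` in the script, `stableStringify` is a hypothetical key-order-insensitive alternative intended for plain JSON values only, and the `{ type, url }` license objects are made-up sample data.

```javascript
// Mirrors the JSON.stringify-based comparison described in the review comment.
const viaStringify = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Hypothetical alternative: serialize with sorted keys so key order is ignored.
// Intended for plain JSON values (strings, numbers, booleans, null, arrays, objects).
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`);
    return `{${body.join(",")}}`;
  }
  return JSON.stringify(value);
}
const stableEqual = (a, b) => stableStringify(a) === stableStringify(b);

const a = { type: "MIT", url: "https://example.com/license" };
const b = { url: "https://example.com/license", type: "MIT" };

console.log(viaStringify(a, b)); // false: same data, different key order
console.log(stableEqual(a, b)); // true
// JSON.stringify drops undefined properties, so these compare as equal.
console.log(viaStringify({ type: "MIT", url: undefined }, { type: "MIT" })); // true
```

If license values in the map are always plain strings, none of these cases arise, which matches the comment's view that the simple implementation is acceptable for now.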

scripts/summarizeLicenses.mjs (1)

30-31: Add error handling for JSON parsing.

If the input file contains malformed JSON, JSON.parse will throw an exception with a generic error. Consider wrapping this in a try-catch for a more user-friendly error message.

🛡️ Suggested fix

-    const data = JSON.parse(fs.readFileSync(inputPath, "utf-8"));
+    let data;
+    try {
+        data = JSON.parse(fs.readFileSync(inputPath, "utf-8"));
+    } catch (err) {
+        console.error(`Failed to parse ${inputPath}: ${err.message}`);
+        process.exit(1);
+    }
     const packages = data.packages || [];

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/summarizeLicenses.mjs` around lines 30 - 31, Wrap the
JSON.parse(fs.readFileSync(inputPath, "utf-8")) call in a try-catch to handle
malformed JSON: read the file into a string, then try JSON.parse and assign to
data, and on error log a clear, user-facing message that includes inputPath and
the error.message (use the same variable names data, packages, inputPath) and
exit with a non-zero code; ensure packages = data.packages || [] still runs only
after successful parsing so downstream code is not executed on parse failure.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/fetchLicenses.mjs`:
- Line 22: The current construction of LICENSE_MAP_PATH uses
path.join(path.dirname(new URL(import.meta.url).pathname), ...) which breaks on
Windows; import fileURLToPath from 'url' and replace the dirname/new URL usage
with path.dirname(fileURLToPath(import.meta.url)) so LICENSE_MAP_PATH is built
from a proper filesystem path; update any top-level imports to include
fileURLToPath and change the expression that computes the directory for
LICENSE_MAP_PATH accordingly.
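
To see why `.pathname` is not a reliable filesystem path, the sketch below feeds two sample module URLs through the same expression. The helper name `naivePathname` and both URLs are hypothetical, chosen for illustration; the snippet only demonstrates the problem and leaves the `fileURLToPath` fix as described in the comment above.

```javascript
// Returns what `new URL(import.meta.url).pathname` would yield
// for a given module URL.
function naivePathname(moduleUrl) {
  return new URL(moduleUrl).pathname;
}

// On Windows, module URLs carry the drive letter after the third slash.
const onWindows = naivePathname("file:///C:/repo/scripts/fetchLicenses.mjs");
// Any path containing spaces is percent-encoded in the URL.
const withSpace = naivePathname(
  "file:///home/dev/my%20repo/scripts/fetchLicenses.mjs"
);

console.log(onWindows); // "/C:/repo/scripts/fetchLicenses.mjs"
console.log(withSpace); // "/home/dev/my%20repo/scripts/fetchLicenses.mjs"
```

The first result has a leading slash before the drive letter and the second keeps `%20` instead of a space, so neither is a usable path as-is. The second case suggests the issue is not limited to Windows.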

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0b7afb64-3fbf-46ba-ba86-a09664a7870a

📥 Commits

Reviewing files that changed from the base of the PR and between 3036433 and afce27b.

📒 Files selected for processing (5)

.github/workflows/license-audit.yml
.gitignore
scripts/fetchLicenses.mjs
scripts/npmLicenseMap.json
scripts/summarizeLicenses.mjs

scripts/fetchLicenses.mjs

oss license audit

bebf44f

github-advanced-security bot found potential problems Mar 13, 2026

scripts/fetchLicenses.mjs (dismissed)

fix error in license audit

afce27b

coderabbitai bot reviewed Mar 13, 2026

scripts/fetchLicenses.mjs (resolved)

msukkari merged commit 5c4b0ce into main Mar 13, 2026
10 of 11 checks passed

github-actions bot mentioned this pull request Mar 13, 2026

Sourcebot Roadmap 🚀 #459

Open

msukkari merged 2 commits into main from msukkari/oss_licenses
