redesign: replace docs with new IA from Pixee-Marketing-OS PR #117 #256
b176d34 to
0c0b51f
Compare
daharmattan1
left a comment
Review: New IA Migration from PR #117
Reviewed across 7 dimensions: source fidelity, technical accuracy, redirect correctness, Docusaurus config, content tone, IA completeness, and migration archive.
Summary
Strong migration. All 71 source pages from PR #117 ported faithfully with correct frontmatter normalization. Redirects are comprehensive and all targets resolve. Content tone is clean — no marketing language leakage. The integrations restructure into scanners/ and scms/ subdirectories is a good IA improvement over the flat structure.
Blocking Issues
None.
Non-Blocking Issues
1. /running_on_public_github_repos redirect content gap (docusaurus.config.js)
The redirect to /configuration/repositories is functional but the target page doesn't cover the original content (step-by-step guide for running Pixee on public GitHub repos without tools). Neither does any other page in the new IA. Suggest adding a paragraph to /getting-started/github or the repositories config page covering this use case, or changing the redirect target to /getting-started/github which is a closer topical match.
Nits
2. Source sonarqube.md had duplicate frontmatter keys — PR correctly deduped title and slug that each appeared twice. Nice catch by the migration script. (No action needed, just noting.)
Dimension-by-Dimension Detail
Source Fidelity (5/5 pages sampled): fix-safety, security, agentic-security-engineering, sonarqube, enterprise-overview — all faithful. Body content identical. Frontmatter correctly normalized: meta_description → description, sidebar_position injected, track lowercased, duplicate keys deduped.
Technical Accuracy: Contrast page is well-structured, consistent with CodeQL and Semgrep integration pages. Tone is appropriate for docs.
Redirect Correctness (30+ rules validated): All redirect "to" targets confirmed to exist via slug frontmatter in the new file set. The expanded redirect set covers old top-level pages, code-scanning-tools/* → integrations/scanners/*, and flat integrations/<name> → integrations/scanners/<name> or integrations/scms/<name>. Comprehensive.
Docusaurus Config: Organization JSON-LD data looks correct. docusaurus-plugin-llms registered. _category_.json files sampled (integrations, how-it-works, platform) have correct labels, positions, and link references.
Content Tone (5 pages spot-checked): phased-rollout, faq-general, java, operations-config, commercial-scanners — all factual and neutral. No SEO keyword stuffing, no customer quotes, no JSON-LD in FAQ pages, no competitive comparison tables, no CTAs. Clean docs tone.
IA Completeness: Integrations restructured into scanners/ and scms/ subdirectories — good improvement. 5 new scanner pages added (DefectDojo, Fortify, GitLab SCA, Polaris, Trivy). Consolidated pages verified: operations-config.md covers scheduling + notifications + reporting.
Migration Archive: Located at repo root (migration/), not inside docs/. README has clear "Do not re-run" warnings with explanation of why scripts are destructive. ✅
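The frontmatter normalization noted under Source Fidelity can be pictured with a small sketch. This is not the PR's migrate.py: the helper name `normalize_frontmatter`, the plain `key: value` parsing, the first-occurrence-wins dedup rule, and the sample values are all assumptions made for illustration.

```python
def normalize_frontmatter(lines, sidebar_position):
    """Normalize simple `key: value` frontmatter lines (hypothetical helper).

    - renames meta_description to description
    - lowercases the track value
    - keeps only the first occurrence of a duplicated key
    - injects sidebar_position when it is absent
    """
    result = {}
    for line in lines:
        if ":" not in line:
            continue  # skip blank or malformed lines in this sketch
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if key == "meta_description":
            key = "description"
        if key == "track":
            value = value.lower()
        result.setdefault(key, value)  # first occurrence wins
    result.setdefault("sidebar_position", str(sidebar_position))
    return [f"{k}: {v}" for k, v in result.items()]


# Made-up sample with a duplicated title key, for illustration only.
sample = [
    "title: SonarQube",
    "slug: /integrations/scanners/sonarqube",
    "title: SonarQube",
    "meta_description: Connect SonarQube findings.",
    "track: DEV",
]
normalized = normalize_frontmatter(sample, 6)
```

A real migration would more likely parse the YAML properly; the line-based approach here only keeps the sketch dependency-free.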
| { from: "/open-pixee", to: "/open-source/overview" }, | ||
| { | ||
| to: "/code-scanning-tools/overview", | ||
| from: "/integrations", |
The redirect works, but the original page had specific setup instructions for public repos without tools that don't exist anywhere in the new IA. Consider adding a paragraph to /getting-started/github covering this use case, or retargeting to /getting-started/github as a closer match.
Thanks for the close read across all 7 dimensions, Victor. On the non-blocking item: agreed — the Pushing a follow-up commit that:
Will follow up here once the commit is in. The dedup nit needs no action — noted, thanks. |
Addresses Victor's review feedback on PR #256.

The pre-migration site had a /running_on_public_github_repos page that walked new users through setting up Pixee on a public GitHub repo with no existing scanner: enable Issues for the dashboard, pick a free-tier scanner (CodeQL via GHAS or SonarQube Cloud), install Pixeebot. The initial migration redirected that URL to /configuration/repositories, which is functional as a redirect but does not actually cover the original use case.

This commit:

1. Adds a "Public Repositories Without an Existing Scanner" section to docs/getting-started/github.md covering the three steps (enable Issues, connect a free scanner, install Pixeebot), re-toned to match the new docs voice. Cross-links to the CodeQL and SonarQube scanner integration pages for deeper detail.
2. Retargets the redirect: /running_on_public_github_repos now points at /getting-started/github (was /configuration/repositories).

Verification: yarn build clean. Redirect HTML correctly points at the new target. New section renders in the production build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
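A check of the kind described in the review (every redirect target resolves to a real page slug) might look like the sketch below. The function name `missing_redirect_targets` is hypothetical, and the slug list is deliberately incomplete, made-up data so that the check has something to report; it does not reflect the actual state of the site.

```python
def missing_redirect_targets(redirects, known_slugs):
    """Return the redirect rules whose `to` path is not a known page slug."""
    # Normalize trailing slashes so "/foo/" and "/foo" compare equal.
    slugs = {slug.rstrip("/") or "/" for slug in known_slugs}
    return [
        rule for rule in redirects
        if (rule["to"].rstrip("/") or "/") not in slugs
    ]


# Illustrative data only; real rules live in docusaurus.config.js.
redirects = [
    {"from": "/running_on_public_github_repos", "to": "/getting-started/github"},
    {"from": "/open-pixee", "to": "/open-source/overview"},
]
known_slugs = ["/", "/getting-started/github", "/configuration/repositories"]
unresolved = missing_redirect_targets(redirects, known_slugs)
```

Note that a check like this only proves the target exists. It cannot tell whether the target covers the old page's content, which is the gap the review raised.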
Follow-up landed in
CI green. Ready for another look when you have a minute. |
Replaces the existing ~9-page docs.pixee.ai with a 76-page redesigned site sourced from Pixee-Marketing-OS PR #117 (merged 2026-04-28), restructured and corrected against the canonical Pixee Enterprise Server install reference.

## Content

- 76 markdown pages across 11 sections: api, configuration, enterprise, faq, getting-started, how-it-works, integrations (with scms/ and scanners/ subcategories), languages, open-source, platform.
- Welcome doc lives at /. Pre-existing PixeeDocs React landing (src/pages/index.js + HomepageFeatures component) removed.
- /integrations/contrast and /integrations/scanners/gitlab-sca are newly authored content (PR #117 dropped Contrast from the IA; SCA coverage was missing entirely).

## Information architecture

- Integrations split into two subcategories: Source Control (4 pages: GitHub, GitLab, Azure DevOps, Bitbucket) and Scanning Tools (14 pages: CodeQL, Semgrep, Checkmarx, Veracode, Snyk Code, SonarQube, AppScan, Polaris, Fortify, Contrast, GitLab SAST, GitLab SCA, Trivy, DefectDojo). Each subcategory has a generated-index landing.
- Three consolidated wrapper pages from the original PR (commercial-scanners, oss-aggregator-scanners, scm-platform-reference) split into individual scanner / SCM pages.
- Pages /getting-started/<scm> and /integrations/scms/<scm> split by purpose: tutorial vs canonical reference. Permission tables live on the integration page only; install steps live on the getting-started page.
- Frontmatter normalized: numeric file prefixes dropped (replaced with sidebar_position), track field lowercased, duplicate keys deduped in 29 files, meta_description renamed to Docusaurus-standard description.

## Canonical-source corrections

Sourced from pixee/pixee-enterprise-server (charts/.../docs/src), used to correct several details that were understated, ahead of the docs, or plain wrong:

- GitHub: full 9-permission Repository table (was 6, with Repository contents incorrectly listed as Read-only; Pixee needs Read & Write to create fix branches), plus Organization and Account permissions and the 12 webhook events. GHES section now walks through the full custom GitHub App registration flow.
- GitLab: 8 PAT scopes (was 5; added ai_features, read_registry, read_virtual_registry); webhook URL format documented.
- Bitbucket: corrected to require username + email + API token (API tokens authenticate by email, Git ops by username; both are required); 6-scope table replaces the previous vague "API token + R/W." Bitbucket Server / Data Center separated as a distinct product.
- Azure DevOps: PAT requirement corrected to "custom scope with full Code access (not 'Full access')"; webhook user/password documented as optional. The previous Work-Item Linking section pulled until it can be re-added with a cited source.
- AppScan: full webhook setup walkthrough added: Basic Auth (preferred) vs deprecated webhook-secret mode, plus the two AppScan webhook registrations Pixee needs (Scan Execution Completed, New Patch Request) with JSON request bodies.

## Pixee CLI

The previous /getting-started/cli described a fictional `pixee fix --sarif` local-fix command. The actual CLI at github.com/pixee/pixee-cli is a thin client for the Pixee REST API. Rewritten:

- Install via Homebrew (brew tap pixee/pixee && brew install pixee) or binary download, not pip.
- Documents the real subcommands: pixee auth, pixee repo, pixee scan, pixee workflow, pixee api.
- Covers credential resolution (PIXEE_TOKEN/PIXEE_SERVER), exit codes (0/1/2/3), output formats (text/json), HAL link traversal.
- Mentions the bundled skills.sh skills for Claude Code / Codex.
- /getting-started/ci-cd correspondingly rewritten: removed every `pixee fix --sarif` and `pixee/pixee-action@v1` reference (both fictional). Replaced with the correct flow: scanners write to the SCM's code-scanning surface; Pixee ingests through the SCM integration; no separate "Pixee step" needed in pipelines.
- Welcome page table dropped the "CLI: ~5 min" row (CLI is not a platform setup option) and added a one-line pointer below.

## Public-repo onboarding

Added a "Public Repositories Without an Existing Scanner" section to /getting-started/github covering the legacy /running_on_public_github_repos content (enable Issues for the dashboard, pick a free-tier scanner, install Pixeebot). Retargeted the legacy redirect to /getting-started/github.

## Sidebar

Autogen sidebar with per-section _category_.json files providing curated ordering and clean labels (no track badges; tried [DEV]/[LEADER]/[BOTH] labels initially, dropped after eyeballing in dev). Each section's overview page is the category landing via link.id; how-it-works and the two integrations subcategories use generated-index landings.

## Redirects (~25 rules in docusaurus.config.js)

Every old URL maps to its closest new equivalent. Covers /intro, /installing, /faqs, /languages, /open-pixee, /supported-scms, /using-pixeebot, /running_on_public_github_repos, the entire /code-scanning-tools/* tree, the flat /integrations/<x> URLs from the mid-PR restructure, the removed consolidated wrappers, and the deleted /configuration/scheduling page. Pre-existing /integrations/* aliases flipped direction to point at the new IA.

## SEO additions (config-only, no React components)

- Site-wide Organization JSON-LD via headTags in docusaurus.config.js.
- docusaurus-plugin-llms generates llms.txt and llms-full.txt at build.
- static/robots.txt explicitly allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, OAI-SearchBot.
- Sitemap and canonical URLs verified on the new tree (default Docusaurus behavior).

Deferred to v2: AudienceBadge / SchemaOrg / FeedbackWidget React components, per-page FAQPage / HowTo JSON-LD, .md alternates for AI agents, Algolia DocSearch, HubSpot lead capture, GA4 custom events.

## Migration archive

migration/ at the repo root contains migrate.py, fixup_links.py, integrations_restructure.py, ASSESSMENT.md, and README.md (historical record only). The README has a clear DO-NOT-RE-RUN warning explaining why the scripts are now destructive (manually-authored Contrast and GitLab SCA pages, slug change on the welcome doc, layout changes).

## Dedup pass

A whole-page Jaccard / paragraph-level overlap audit found two redundancies that were collapsed: configuration/scheduling.md was a 90-line subset of operations-config.md (deleted, redirect added), and three near-identical paragraphs about the independent fix evaluator were duplicated between platform/remediation.md and how-it-works/fix-safety.md (kept fix-safety as canonical, summarized in remediation).

## Verification

- yarn build clean (76 docs processed, zero broken links/anchors).
- yarn serve verified all page slugs return 200, all category landings render, redirects emit correct meta-refresh + canonical, JSON-LD on every page, sitemap and robots.txt present in build output.
- yarn check-format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
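The dedup pass mentions a Jaccard / paragraph-level overlap audit. The audit script itself is not shown in this PR, so the following is only a minimal sketch of the general technique, assuming word-set tokens, blank-line paragraph splitting, and an arbitrary 0.8 threshold; the function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word set used as the unit of comparison."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two token sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def overlapping_paragraphs(page_a, page_b, threshold=0.8):
    """Return (index_a, index_b, score) for paragraph pairs at or above threshold."""
    paras_a = [p for p in page_a.split("\n\n") if p.strip()]
    paras_b = [p for p in page_b.split("\n\n") if p.strip()]
    hits = []
    for i, pa in enumerate(paras_a):
        for j, pb in enumerate(paras_b):
            score = jaccard(tokens(pa), tokens(pb))
            if score >= threshold:
                hits.append((i, j, score))
    return hits
```

Calling `jaccard(tokens(a), tokens(b))` on two whole pages gives the page-level score, while `overlapping_paragraphs` surfaces near-identical paragraphs of the kind the commit says were duplicated between two pages.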
1727b6c to
5348700
Compare
Squashed all 10 commits into a single commit ( What's changed since Victor's review at
CI green. Build is 76 docs, zero broken links/anchors. Diff vs |
Removes the CodeTF Specification page and every reference to CodeTF across the docs.

Affected:

- Deleted docs/api/codetf.md.
- Renumbered sidebar_position on the surviving API pages: sarif (3→2), webhooks (4→3), changelog (5→4). API overview stays at 1.
- Added redirect /api/codetf -> /api/overview in docusaurus.config.js.
- Stripped CodeTF mentions from api-overview, sarif, webhooks (frontmatter description, intro paragraphs, endpoint table, output-format section, related-pages list, FAQ).
- Stripped CodeTF mentions from oss-overview, codemodder, contributing, and custom-codemods. Replaced "CodeTF output" prose with neutral "structured JSON report" language; renamed example output filenames from codetf-output.json / codetf-results.json to results.json.

Verification: zero remaining 'codetf' references in docs/ (one in docusaurus.config.js: the redirect rule). yarn build clean (75 docs, down from 76). Redirect emits meta-refresh + canonical to /api/overview.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
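The "zero remaining references" verification could be reproduced with a scan along these lines. The commit does not say how the check was actually run (a plain grep would do the same job), so treat `find_references` as a hypothetical helper that assumes only .md and .mdx files matter.

```python
import os


def find_references(root, needle):
    """Return sorted (relative_path, line_number) pairs where needle occurs, case-insensitively."""
    needle = needle.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".md", ".mdx")):
                continue  # only documentation sources are scanned in this sketch
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line.lower():
                        hits.append((os.path.relpath(path, root), number))
    return sorted(hits)
```

Running `find_references("docs", "codetf")` from the repo root and getting an empty list would correspond to the verification claim above.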
@dhafley @daharmattan1 - I've started to review the pages and will compile my feedback here. There is a LOT of content here and unfortunately a fair amount of slop as well. To make these usable (digestible by a human), I think we need to cut down some and get rid of the slop. After I compile my feedback, I can try to come back through and propose some commits as well:
- Merge how-it-works/ + platform/ into single platform/ section (triage, remediation, sca, security; 4 duplicate page pairs → 4 pages)
- Move scanner-integration.md into platform/
- Delete how-it-works/ directory entirely
- Move getting-started/ci-cd.md → integrations/ci-cd.md
- Move getting-started/cli.md → api/cli.md
- Consolidate 3 FAQ files → single faq/faq.md (General/Enterprise/Troubleshooting)
- Delete faq-enterprise.md, faq-general.md, faq-troubleshooting.md
- Strip embedded FAQ sections from all 72 content pages
- Scanner connection marked REQUIRED throughout getting-started
- Scanner count corrected to 13 across all pages
- Fix broken how-it-works/* internal links site-wide
- Remove marketing adjectives, self-referential meta-commentary
- Rewrite page openings to 1-2 sentences throughout

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Replace file-path links with slug links site-wide:

- /integrations/integrations-overview → /integrations/overview
- /languages/languages-overview → /languages/overview
- /api/api-overview → /api/overview
- /configuration/config-overview → /configuration/overview
- /configuration/operations-config → /configuration/operations
- /enterprise/enterprise-overview → /enterprise/overview

Fix /faqs redirect target: faq.md slug → /faq/general

Build: zero broken links.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
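The link replacement in this commit amounts to a lookup-table rewrite of markdown link targets. The sketch below uses the mapping from the commit message; the regex and the `rewrite_links` helper are assumptions, not the script that was actually used, and links carrying a quoted title are left untouched.

```python
import re

# Mapping copied from the commit message: file-path style link -> slug link.
SLUG_FIXES = {
    "/integrations/integrations-overview": "/integrations/overview",
    "/languages/languages-overview": "/languages/overview",
    "/api/api-overview": "/api/overview",
    "/configuration/config-overview": "/configuration/overview",
    "/configuration/operations-config": "/configuration/operations",
    "/enterprise/enterprise-overview": "/enterprise/overview",
}

# Matches the target of a markdown link: ](target) or ](target#anchor)
LINK_TARGET = re.compile(r"\]\(([^)#\s]+)(#[^)\s]*)?\)")


def rewrite_links(markdown):
    """Replace known file-path link targets with their slug equivalents."""
    def swap(match):
        target, anchor = match.group(1), match.group(2) or ""
        return f"]({SLUG_FIXES.get(target, target)}{anchor})"
    return LINK_TARGET.sub(swap, markdown)
```

Matching on the whole link target, rather than doing a bare string replace, avoids rewriting a path that merely starts with one of the old prefixes.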
- Remove stale version callouts (SARIF 2.1.0, scanner counts, K8s/Helm versions)
- Collapse 4 SCM getting-started pages → single source-control.md
- New platform/what-pixee-fixes.md: superset vulnerability/fix-mode reference
- New platform/context-memory.md: context, memory & preferences
- Enterprise deployment pages nested under enterprise/deployment/ subdirectory
- security-architecture.md: add Pixee Trust Center link
- byom.md: expand to 7 provider platforms (AWS Bedrock, Google Vertex AI, OCI, etc.) + model families + recommendation section
- Language support: collapse to single overview page, delete all 6 individual language pages
- Delete api/changelog.md
- Update docusaurus.config.js redirects for deleted getting-started SCM pages

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…anup, language headers

- Remove all "MagicMod"/"MagicMods" references → "AI-powered fix(es)" throughout
- Expand What Pixee Fixes: add Open Redirect, Code Injection, Prototype Pollution, Template Injection, SSTI, Insecure File Upload, Missing Security Headers, CORS Misconfiguration, Race Conditions, Input Validation, Integer Overflow; add Secrets Detection section; add Custom Rules section
- Strip remaining tool-specific finding type callouts from scanner pages
- BYOM: remove Use Case column, replace model families table with indicative bullet list, abstract named routing tiers → generic hierarchical routing description
- Language overview: rename Notes column → "Common Frameworks (examples)"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove adaptive analyzer caching mechanism details → outcome-focused language
- Remove four-tier dataflow quality classification names (Strong Multi-File etc.)
- Remove tech stack disclosure (Java/Quarkus, Python/FastAPI, React/TypeScript)
- Abstract scanner-aware dispatcher specifics → generic normalization description
- Remove fix evaluation failure mode examples (keep rubric names)
- Remove SCA verification cache details
- Remove all specific codemod counts (51+, 60+, 120+) → "library of deterministic codemods"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…simplify Secrets, drop Notes on Coverage
- Remove scope-of-learning placeholder (TMI; repo-level detail not needed)
- Finalize observing improvement: "improved triage outcomes and higher merge rates"
- Document three override paths: individual triage result feedback, PIXEE.yaml, Organization Preferences UI
- Confirm override propagation description is accurate; remove placeholder
- Add Organization Preferences as the org-level mechanism for enterprise context configuration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Docs rewrite complete: v2 content ready for review
This PR now contains a full rewrite of the docs site. Summary of what changed:
Structure & IA
Content quality
IP protection
Pages updated
…im duplication

- Delete REFACTOR_AUDIT.md (was rendering in published site)
- Strip embedded FAQ section from trivy.md; fold container image note into body
- Replace stale "12 native integrations" count with "growing list" in 4 files
- Merge oss-overview.md + contributing.md into codemodder.md; delete absorbed pages
- Update redirects: /open-pixee, /open-source/overview, /open-source/contributing → /open-source/codemodder
- Trim sarif-universal.md field tables; link to api/sarif as authoritative field reference
- Fix custom-codemods.md contributing link to point to new anchor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add integrations/scanners/arnica.md: native SAST integration page
- Add integrations/scanners/datadog-sast.md: native SAST integration page
- Add both to integrations overview coverage matrix
- Expand Dashboard section in operations-config.md with primary views table (Overview, Findings, Pull Requests, Repositories, Reports) and Scanner filter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- faq.md: add to native scanner list
- platform/architecture.md: add to native handler list
- platform/remediation.md: update count 13 → 15, add names to table

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eleted oss-overview) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…egration Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Dedicated SaaS

Admonitions:

- getting-started.md: :::warning for required scanner connection
- first-fix.md: :::note wrapping the troubleshooting checklist
- platform/triage.md: :::info surfacing 98% FP reduction stat
- platform/remediation.md: :::info for fix evaluation gate
- enterprise/byom.md: :::info for governance controls section
- enterprise/air-gap.md: :::warning for license validation network requirement

Triage page:

- Simplify 6-column tier table to 4 columns; drops Auditability/LLM Cost as standalone columns, folds cost note into Strategy cell

Dedicated SaaS rename:

- Global find-and-replace of "Cloud SaaS" → "Dedicated SaaS" across all 17 occurrences
- deployment-overview.md: add single-tenant callout in Dedicated SaaS Architecture section
- security-architecture.md, compliance.md, platform/security.md, faq.md: updated labels

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Unnecessary to call out language support in the getting started prereqs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Replaces the existing ~9-page docs.pixee.ai with the 71-page redesigned IA from Pixee-Marketing-OS PR #117 (merged 2026-04-28).

- `meta_description` renamed to `description`, `sidebar_position` injected.
- `/`. Pre-existing React landing (`src/pages/index.js` + `HomepageFeatures` component) deleted. Sidebar's "Getting Started" category now lands at root.
- `/integrations/contrast` authored from scratch (PR #117 dropped Contrast from the IA; we kept it because it's still in the new sidebar). Drafted in the new docs voice from public Contrast Security docs + the existing 4-line stub. Worth a careful read before merge.
- `docusaurus.config.js` map every old URL to its closest new equivalent. Existing `/integrations/* → /code-scanning-tools/*` redirects flipped to point the new direction.
- `headTags`
- `docusaurus-plugin-llms` generates `llms.txt` + `llms-full.txt` at build
- `static/robots.txt` explicitly allows `GPTBot`, `ClaudeBot`, `PerplexityBot`, `Google-Extended`, etc.
- `migration/` archive at repo root contains `migrate.py`, `fixup_links.py`, `ASSESSMENT.md`, and `README.md` (historical record only). Do not re-run.

Deferred to v2 (per scope agreement, not in this PR): in-page audience badge, `<SchemaOrg>` per-page JSON-LD (FAQPage / HowTo), raw-`.md` alternates for AI agents, Algolia DocSearch, HubSpot lead capture, GA4 custom events.

What to review

- `docs/integrations/contrast.md`: wholly new content, needs technical accuracy check
- `docusaurus.config.js` redirects: confirm `/running_on_public_github_repos → /configuration/repositories` is the right target, otherwise we should change to `/getting-started/github`
- `migration/ASSESSMENT.md`: captures all decisions and tradeoffs

Test plan

- `yarn build` clean: 72 docs processed, zero broken links
- `yarn serve`: all 72 page slugs return 200, all 10 category landings render with correct titles
- `/intro → /`, `/code-scanning-tools/sonar → /integrations/sonarqube`, `/faqs → /faq/general`, etc.)
- `/running_on_public_github_repos` redirect target is correct

🤖 Generated with Claude Code