fix: harden CDN script inlining with linkedom#352
Conversation
jrusso1020
left a comment
There was a problem hiding this comment.
Approving — nice fix
The diagnosis is right and the fix is the right shape. String.prototype.replace(regex, stringReplacement) interprets $&, $`, $', $$, and $n as back-reference tokens, so piping third-party script bytes through it as the replacement argument is exactly how d3 (and any regex-heavy lib) gets silently mangled. Moving to el.replaceWith(...) on an actual DOM node removes the replacement-string surface area entirely, rather than papering over it with $ escaping — that would have been the wrong fix.
Side benefits worth calling out:
- Attribute preservation (
type="module",nomodule,crossorigin, etc.) — the old regex silently dropped these, which is a real correctness win for ESM CDN bundles. - Fixes a latent bug in the old regex path where duplicate CDN URLs would only have the first occurrence replaced (
el.replaceWithhandles each node independently now). Array.from(el.attributes)before iterating is the right defensive move against live-collection mutation.
Scope is tight, CI is fully green across all 24 checks including the render-compat regression shards, and the Raycast Showcase manual verification is the right gate for this class of change.
One ask before merge — please add a fragment-input test
inlineExternalScripts is exported and the new wrappedFragment ? document.body.innerHTML : document.toString() branch is only exercised when callers pass a fragment. The main compileForRender pipeline always hands it a full document (via ensureFullDocument upstream), so this new branch is currently only covered implicitly. Something like:
it("returns a fragment when the input has no <html>/<body> wrapper", async () => {
const originalFetch = globalThis.fetch;
globalThis.fetch = mock(async () => new Response("var d3 = {};", { status: 200 })) as any;
try {
const html = '<script src="https://cdn.example.com/d3.min.js"></script><div>tail</div>';
const result = await inlineExternalScripts(html);
expect(result).not.toMatch(/<!DOCTYPE|<html|<head|<body/i);
expect(result).toContain("<div>tail</div>");
expect(result).toContain("var d3 = {};");
} finally {
globalThis.fetch = originalFetch;
}
});would lock in the fragment-round-trip contract so a future refactor can't silently regress it.
Minor note (take or leave)
When external scripts exist but all downloads fail, the function now returns document.toString() rather than the original html — so attribute quote style and boolean-attr serialization could differ slightly from input for external consumers of this exported function. The producer's own pipeline round-trips through linkedom anyway, so this is a no-op in practice. If you wanted to be strict, a symmetric early-return html when all downloads reject would preserve byte-identity. Not blocking.
Approving now — add the fragment test and merge.
Summary
This fixes a renderer failure in the Raycast Showcase composition caused by our CDN script inlining path corrupting third-party JavaScript during HTML compilation.
HyperFrames intentionally inlines external scripts so compiled compositions stay self-contained and deterministic for local renders, CI, Docker, and offline execution. That path therefore needs to preserve arbitrary library source exactly, not mostly work for a subset of bundles.
Problem
inlineExternalScripts()previously rewrote<script src="...">tags with string replacement. That is unsafe for arbitrary JavaScript because replacement-string semantics treat sequences like$&,$',$\`` and$1` as interpolation tokens.In practice, D3 contained those byte patterns in its distributed source. During inlining, the replacement engine injected surrounding HTML into the middle of the fetched script, which produced browser-side
Invalid or unexpected tokenerrors during frame capture. Once the D3 bundle failed to parse, downstream code then failed withd3 is not defined.What changed
linkedomsrcscript attributes when replacing external tags with inline tags</script>escaping so embedded script text cannot terminate the wrapper earlydocument.body.innerHTMLReview follow-up
inlineExternalScripts()so the exported fragment-return branch is covered directlypackages/producer/tests/spanish-empire-cdn-inlineworld-atlasJSON locally inside the fixture so the regression remains deterministic and does not depend on a render-time network fetchWhy this approach
This is a small, targeted hardening of the compiler path that already exists for deterministic rendering. It avoids introducing a new policy change around external scripts while removing a brittle implementation detail.
Using a DOM rewrite is also the right abstraction boundary here: the problem is not "D3 is special" but "we were performing HTML structure edits with replacement-string semantics." Parsing once and mutating actual script nodes makes the inliner safe for any third-party bundle content, including regex-heavy and minified libraries.
Verification
$&,$', `$``) are preserved in downloaded script contentsrcscript attributes are preserved during inliningbun test packages/producer/src/services/htmlCompiler.test.tsbunx oxfmt packages/producer/src/services/htmlCompiler.ts packages/producer/src/services/htmlCompiler.test.tsbunx oxlint packages/producer/src/services/htmlCompiler.ts packages/producer/src/services/htmlCompiler.test.tsInvalid or unexpected tokenord3 is not definedfailurespackages/producer/src/regression-harness.tsScope
This change is intentionally narrow. It only affects the producer HTML compiler's external script inlining path and adds regression coverage around it; it does not change script ordering, CDN fetch policy, or composition execution semantics.