fix: parse Coder authorities with Punycode (xn--) domain labels#930
fix: parse Coder authorities with Punycode (xn--) domain labels#930
Conversation
parseRemoteAuthority splits the SSH host name on "--", which breaks any
domain containing a Punycode/IDNA label (xn--{encoded}). After the
existing split, walk the segments and merge each pair back together
while the prefix ends in ".xn" -- the cut landed inside an "xn--..."
label. This covers mid-domain, apex, and consecutive Punycode labels.
Refactors the parseRemoteAuthority describe block to it.each.
| // Reassemble Punycode labels (xn--...) the split broke apart: when the | ||
| // prefix ends in ".xn", the cut landed inside an "xn--..." label. | ||
| while (parts.length >= 2 && parts[0].endsWith(".xn")) { | ||
| parts.splice(0, 2, `${parts[0]}--${parts[1]}`); | ||
| } | ||
|
|
There was a problem hiding this comment.
Apparently double dashes are only disallowed as the third and fourth characters, and even then it can depend on the tld?
I imagine this is exceedingly rare, but it means a domain like https://test--ドメイン名例.jp (xn--test---8o4epknhtgo724awkl.jp, becomes a triple dash apparently) is actually valid. And also things like test--domain.com are valid.
So I think to fully fix any domain issues we might have to take more of a parser approach. Something like scan through until the last dot, then scan to the first -- that is not part of a ---, then split on everything after that index. Something like that lol
The current change still seems like an improvement though.
There was a problem hiding this comment.
Yes this was considered but it would tie us to Coderd semantics. Like this only works because usernames cannot have dots and thus we can keep appending the segments until there is no dot (this would work on all domains even if rare). Do you prefer this approach?
Summary
parseRemoteAuthority(src/util.ts) splits the SSH host name on--, which breaks any domain containing a Punycode label (xn--{encoded}). After the existing split, the loop merges segment pairs back while the prefix ends in.xn(the cut landed inside anxn--...label). This covers mid-domain, apex (xn--p1ai), and consecutive Punycode labels.The
.xncheck is structural: Punycode's ACE prefix is IANA-reserved. It does not depend on Coder's username, workspace, or agent character rules.The
parseRemoteAuthoritydescribe block is refactored toit.each.Closes
Closes #929. Builds on #917, extending the same idea to the apex case.
Scope
A deployment on a non-Punycode
--label (e.g.foo--bar.example.com) is not handled. Registries block--outsidexn--at the TLD level, leaving operator-chosen subdomains as the only path. Supporting that case would require slug-shape validation of the tail, which couples this code to Coder's name rules.