Conversation
A compliant HTTP/2 server advertises support for h2 in the TLS ALPN field. See https://datatracker.ietf.org/doc/html/rfc7540#section-3.1. grpc-go did not always check for the presence and validity of this extension. It did so in a relatively recent version. This didn't affect boulder, because neither a boulder client nor boulder was advertising the ALPN. However, now that we're trying to update Consul, we run into a problem with failing healthchecks. Consul now uses an upgraded grpc-go with ALPN enforcement. When Consul makes a gRPC healthcheck against boulder, it rejects the healthcheck due to the missing ALPN. Fix this problem by appropriately advertising the h2 ALPN. Normally, grpc-go would handle this for us, but because we're handling a raw tls.Config struct, we have to do it ourselves.
|
This is ready to be reviewed, but I still need to verify whether or not this requires a three legged deploy. So don't merge it just yet. |
grpc/server.go
Outdated
|
|
||
| // Advertise support for h2 in the ALPN, required by HTTP/2. grpc-go can't do | ||
| // this for us, since we're handling a raw tls.Config. | ||
| tlsConfig.NextProtos = []string{"h2"} |
There was a problem hiding this comment.
I somewhat prefer to do this in cmd/config.go TLSConfig.Load() where the *tls.Config is initially constructed:
return &tls.Config{
RootCAs: rootCAs,
ClientCAs: rootCAs,
ClientAuth: tls.RequireAndVerifyClientCert,
Certificates: []tls.Certificate{cert},
// Set the only acceptable TLS to v1.3.
MinVersion: tls.VersionTLS13,
NextProtos: []string{"h2"},
}, nil
The place where you've added NextProtos for the client side seems fine, though.
Also worth noting that WFE and SFE don't use TLSConfig.Load() for their HTTPS servers and so wouldn't be affected by this.
There was a problem hiding this comment.
Sounds good. I did try to hoist it out as far as I could but missed this :)
|
I'm quite confident this requires a three-legged deploy, just from a reading of the ALPN spec. https://www.rfc-editor.org/rfc/rfc7301.html#section-3.2 So consider during a deploy: if a newer boulder client advertises h2 to an older boulder server, the server will respond with no_application_protocol. We'll have to turn off ALPN enforcement via environment variable fleetwide first, before taking this change. I don't think this is something we can get around with feature flags. |
|
In https://pkg.go.dev/crypto/tls#Config:
(emphasis mine) I think this means we could set diff --git i/cmd/config.go w/cmd/config.go
index e006a12ad..8a32d6622 100644
--- i/cmd/config.go
+++ w/cmd/config.go
@@ -218,6 +218,7 @@ func (t *TLSConfig) Load(scope prometheus.Registerer) (*tls>
Certificates: []tls.Certificate{cert},
// Set the only acceptable TLS to v1.3.
MinVersion: tls.VersionTLS13,
+ NextProtos: []string{"h2"},
}, nil
} |
|
Fascinating. I even tried the reverse (client advertises while server doesn't). And that passes tests as well. I can coerce tests to fail by setting, say, the server to advertise Therefore indeed a single legged deploy ought to work. I wonder if we even need to take the client change at all then. Thanks for finding that! |
|
I like that plan. Still a three-sided change, but the first side (adding H2 on the server side) should unblock the Consul upgrade, and we can follow up with the client side change in a future tag. |
|
Ah hang on, #8696 is spitting at that change. |
|
Hmm, that seems to have been a flake. I reran the tests on both these PRs a few times now and haven't seen it again. Let's proceed? |
A compliant HTTP/2 server advertises support for h2 in the TLS ALPN
field. See https://datatracker.ietf.org/doc/html/rfc7540#section-3.1.
grpc-go did not always check for the presence and validity of this
extension. It did so in a relatively recent version.
This didn't affect boulder, because neither a boulder client nor boulder
server was advertising the ALPN.
However, now that we're trying to update Consul, we run into a problem
with failing healthchecks. Consul now uses an upgraded grpc-go with ALPN
enforcement. When Consul makes a gRPC healthcheck against boulder, it
rejects the healthcheck due to the missing ALPN.
Fix this problem by appropriately advertising the h2 ALPN.
Normally, grpc-go would handle this for us, but because we're handling
a raw tls.Config struct, we have to do it ourselves.