Add OpenTelemetry tracing spans to VM startup pipeline #60
Merged
Instrument the critical path through microvm.Run() with OTel trace spans so consumers can identify performance bottlenecks. When no TracerProvider is configured (the default), all tracing is no-op with zero overhead.

Spans added:
- microvm.Run (root) with image/name/cpus/memory attributes
- microvm.Preflight, microvm.ImagePull, microvm.RootfsClone
- microvm.RootfsHooks, microvm.BackendPrepare, microvm.NetworkStart
- microvm.VMSpawn, microvm.PostBoot
- microvm.image.CacheLookup/Fetch/Extract/CacheStore (image/pull.go)
- microvm.SSHWaitReady with per-probe events (ssh/client.go)
- microvm.preflight.RunAll + per-check spans (preflight/checker.go)
- microvm.backend.Start + ResolveRuntime/ResolveFirmware (backend.go)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Instrument the critical path through microvm.Run() with OpenTelemetry trace spans so consumers can identify performance bottlenecks in VM startup
- When no TracerProvider is configured (the default), all tracing is no-op with zero overhead
- Promote go.opentelemetry.io/otel and go.opentelemetry.io/otel/trace from indirect to direct dependencies (already in the dep tree via transitive deps)

Spans added
- microvm.go: root span + 8 child spans covering each sequential phase
  - microvm.Run (root) with image/name/cpus/memory attributes
  - microvm.Preflight, microvm.ImagePull, microvm.RootfsClone
  - microvm.RootfsHooks, microvm.BackendPrepare, microvm.NetworkStart
  - microvm.VMSpawn, microvm.PostBoot
- image/pull.go: sub-spans in PullWithFetcher()
  - microvm.image.CacheLookup (with cache_hit attribute)
  - microvm.image.Fetch, microvm.image.Extract (with layered attribute)
  - microvm.image.CacheStore
- ssh/client.go: span + events in WaitForReady()
  - microvm.SSHWaitReady with host/port/user attributes
  - ssh.probe_failed event per poll iteration with probe count
- preflight/checker.go: parent + per-check spans
  - microvm.preflight.RunAll with check count
  - microvm.preflight.Check per check with name/required attributes
- hypervisor/libkrun/backend.go: span in Start()
  - microvm.backend.Start with sub-spans for ResolveRuntime and ResolveFirmware

Motivation
Profiling brood-box startup showed "Sandbox ready" taking ~20s. The microvm.Run() call was a black box. With these spans, a consumer that configures a TracerProvider can now see exactly where time goes.

Test plan
- go test ./...: all existing tests pass (tracing is no-op without provider)
- golangci-lint run ./...: 0 issues
- --trace flag produces expected span hierarchy

🤖 Generated with Claude Code