
adds containerd-shim-lcow-v1 shim #2627

Open
rawahars wants to merge 4 commits into microsoft:main from rawahars:lcow-shim-draft

Conversation

Contributor

@rawahars rawahars commented Mar 10, 2026

Summary

This pull request introduces a new implementation of the containerd-shim-lcow-v1 for Windows, used for running Linux Containers on Windows.

The changes cover the main entrypoint, shim manager logic, plugin registration, the shared service implementation, and a manifest for proper Windows compatibility. The major changes are:

  • The main entrypoint and supporting logic to start + serve the shim server.
  • The stubs for Task Service and Shimdiag service.
  • The implementation for Sandbox service.
  • The implementation for VM Controller which manages the VM lifecycle and operations.

This shim follows the upstream pattern of using containerd/pkg/shim to manage its lifecycle. It uses the plugin mechanism to register its services with containerd, and it is initialized during startup.

This PR depends on PR #2612 and PR #2616.

@rawahars rawahars requested a review from a team as a code owner March 10, 2026 11:31
// the guest Linux stacks – to the logrus logger tagged with the sandbox ID.
// This provides an out-of-band diagnostic snapshot without requiring the shim
// to be paused or restarted.
func ETWCallback(sourceID guid.GUID, state etw.ProviderState, level etw.Level, matchAnyKeyword uint64, matchAllKeyword uint64, filterData uintptr) {
Contributor

If we do the ETW setup here, do we need to export this function, or can we just have the plugin do the ETW registration and the callback registration?

events chan interface{}

// sandboxID is the unique identifier for the sandbox managed by this Service instance.
sandboxID string
Contributor

In LCOW's case the sandbox ID is always the UVM ID, right? Can we just call it that here, or at least say so in the comment?

Contributor Author

So, the UVM ID is not this identifier, though it is derived from it. This sandboxID corresponds to the ID which is stored and used by containerd; the VM's ID is "SandboxID@vm".

Maybe we can add this to the comment:

For LCOW shim, sandboxID corresponds 1-1 with the UtilityVM managed by the shim.

sandboxID string

// vmHcsDocument holds the HCS compute-system document used to create the VM.
vmHcsDocument *hcsschema.ComputeSystem
Contributor

I thought the whole point of the VM controller was to have things like this hidden away, so they are always up to date, always ref-counted, etc., when a modification is made?

span.AddAttributes(
trace.StringAttribute("sandbox-id", request.SandboxID),
trace.StringAttribute("bundle", request.BundlePath),
trace.StringAttribute("net-ns-path", request.NetnsPath))
Contributor

The property is netns, one word. Should we use a constants file for these property names, for consistency and easier updates later?

// atomicState is a concurrency-safe VM state holder backed by an atomic int32.
// All reads and writes go through atomic operations, so no mutex is required
// for state itself.
type atomicState struct {
Contributor

This is fine, but seems excessive.
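For reference, the pattern under discussion is roughly the following (a minimal sketch using sync/atomic; the state names and field layout here are illustrative, not the PR's exact code):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// State enumerates illustrative VM lifecycle states.
type State int32

const (
	StateNotCreated State = iota
	StateCreated
	StateRunning
	StateStopped
)

// atomicState holds a State behind atomic loads/stores, so readers never
// need the Manager's mutex just to inspect the current state.
type atomicState struct {
	v atomic.Int32
}

func (s *atomicState) load() State    { return State(s.v.Load()) }
func (s *atomicState) store(st State) { s.v.Store(int32(st)) }

func main() {
	var s atomicState
	s.store(StateRunning)
	fmt.Println(s.load() == StateRunning)
}
```

The reviewer's point stands: if every access already happens under the Manager's mutex, a plain field is simpler and the atomics add nothing.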

// We will create the guest connection via GuestManager during StartVM.
c.guest = guestmanager.New(ctx, uvm)

c.vmState.store(StateCreated)
Contributor

See, you already hold the mutex here, so this store didn't need to be atomic.

c.mu.Lock()
defer c.mu.Unlock()

if c.vmState.load() == StateRunning {
Contributor

For these I'd prefer a single read, then handle the cases off that value, if we stay with atomics. If it's just a property guarded by the mutex, it's fine.

// use parent context to prevent the 2-minute timeout (set above) from overriding the terminate
// operation's timeout and erroring out prematurely
_ = c.uvm.Terminate(pCtx)
return fmt.Errorf("failed to start VM: %w", err)
Contributor

Shouldn't cases like this set the state to stopped as well? I wonder if you want any error from this method to set the state to stopped.

}

for _, driver := range drivers {
_ = driver
Contributor

Huh? This loop just discards each driver without doing anything.

In this commit, we are adding the initial draft of `containerd-shim-lcow-v1`. This shim is used for running Linux Containers on Windows.

As part of this commit, we are adding the following:
- The main entrypoint and supporting logic to start + serve the shim server.
- The stubs for Task Service and Shimdiag service.
- The implementation for `Sandbox` service.
- The implementation for `VM Controller` which manages the VM lifecycle and operations.

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
@rawahars rawahars force-pushed the lcow-shim-draft branch 5 times, most recently from 6777136 to 1bb1938 on March 15, 2026 11:31
Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
events chan interface{}

// sandboxID is the unique identifier for the sandbox managed by this Service instance.
// For LCOW shim, sandboxID corresponds 1-1 with the UtilityVM managed by the shim.
Contributor

are we not planning on adding multi-pod support?

Contributor Author

So multi-pod support will be added using podControllers, where each controller owns the lifecycle of a specific pod. The sandboxID corresponds to the UVM which hosts the pods.

The flow from CRI would be:

- Call SandboxController.Create -> Creates the UVM aka Sandbox to host all the pods
- Call NewTask while passing in the SandboxID via WithSandbox -> Creates a pod in the sandbox

This way, the io.kubernetes.cri.sandbox-id annotation on the CreateTask request identifies the pod to which the container needs to be added.

Comment on lines +25 to +29
func (s *Service) diagTasksInternal(ctx context.Context, request *shimdiag.TasksRequest) (*shimdiag.TasksResponse, error) {
_ = ctx
_ = request
return nil, errdefs.ErrNotImplemented
}
Contributor

Suggested change
func (s *Service) diagTasksInternal(ctx context.Context, request *shimdiag.TasksRequest) (*shimdiag.TasksResponse, error) {
_ = ctx
_ = request
return nil, errdefs.ErrNotImplemented
}
func (s *Service) diagTasksInternal(_ context.Context, _ *shimdiag.TasksRequest) (*shimdiag.TasksResponse, error) {
return nil, errdefs.ErrNotImplemented
}

Same for the two below, and for the methods in service_task_internal.go.

// diagExecInHostInternal is the implementation for DiagExecInHost.
//
// It is used to create an exec session into the hosting UVM.
func (s *Service) diagExecInHostInternal(ctx context.Context, request *shimdiag.ExecProcessRequest) (*shimdiag.ExecProcessResponse, error) {
Contributor

+1

// document used to create an LCOW Utility VM.
func BuildSandboxConfig(
ctx context.Context,
owner string,
Contributor

Why hard-code the shim name here and not pass it in from the cmd/containerd-shim-lcow-v2 code? We do set different values for the owner in uvmboot and internal test code.

Contributor Author

Right. Let me revert it back.

vmState State

// mu guards the concurrent access to the Manager's fields and operations.
mu sync.Mutex
Contributor

sync.RWMutex might make sense here

Contributor Author

Awesome suggestion. I have incorporated the same.
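The suggestion amounts to the following shape (a sketch; the field names are illustrative, not the PR's exact ones): read-only accessors take RLock and can run concurrently, while mutations keep the exclusive Lock.

```go
package main

import (
	"fmt"
	"sync"
)

// Manager sketches the sync.RWMutex split: many readers may hold RLock at
// once; a writer holding Lock excludes everyone.
type Manager struct {
	mu      sync.RWMutex
	vmState string
}

// State is read-only, so RLock lets concurrent callers read in parallel.
func (m *Manager) State() string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.vmState
}

// setState mutates, so it needs the exclusive lock.
func (m *Manager) setState(s string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.vmState = s
}

func main() {
	m := &Manager{}
	m.setState("running")
	fmt.Println(m.State())
}
```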

c.mu.Unlock()
return "", fmt.Errorf("cannot dump stacks: VM is in incorrect state %s", c.vmState)
}
c.mu.Unlock()
Contributor

probably best to keep c.mu locked while making the DumpStacks guest request so the uVM doesn't get mutated/terminated during the operation

Contributor Author

That makes sense. Implemented the change.

Comment on lines +67 to +68
Width = "width"
Height = "height"
Contributor

Do we use these anywhere besides the one location in service_task.go?

Contributor Author

Nope, they are not used elsewhere, but Justin was of the opinion that we should standardize them in logfields.

Version = "version"
ShimPid = "shim-pid"
TaskPid = "task-pid"
ExecSpanID = "exec-id"
Contributor

There's already an ExecID above, and ExecSpanID doesn't really make sense, since it's used to log the exec ID and not an associated trace/span.

Contributor Author

Makes sense. Implemented the change.

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
Contributor

@shreyanshjain7174 shreyanshjain7174 left a comment

Review of the critical items — each comment is a question for clarification.

Comment on lines +64 to +69
func NewController() *Manager {
return &Manager{
logOutputDone: make(chan struct{}),
vmState: StateNotCreated,
}
}
Contributor

So this returns *Manager instead of the Controller interface?

Contributor Author

Yes, this aligns with Go’s principle of “accept interfaces, return concrete types.”
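A minimal illustration of that principle (the names here are illustrative, not the PR's actual types): the constructor returns the concrete type, while consumers that only need the behavior accept the interface.

```go
package main

import "fmt"

// Controller is the behavior consumers depend on.
type Controller interface {
	Start() error
}

// Manager is the concrete implementation.
type Manager struct{ started bool }

func (m *Manager) Start() error { m.started = true; return nil }

// NewController returns the concrete *Manager ("return concrete types"),
// which keeps the full method set available to the package that owns it.
func NewController() *Manager { return &Manager{} }

// boot only needs the behavior, so it accepts the interface
// ("accept interfaces").
func boot(c Controller) error { return c.Start() }

func main() {
	m := NewController()
	fmt.Println(boot(m) == nil, m.started)
}
```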

close(c.logOutputDone)
}
}()
}
Contributor

The if _, ok := <-c.logOutputDone; ok { close(c.logOutputDone) } pattern looks broken — logOutputDone is an unbuffered chan struct{} that nobody ever sends on, only closes. So the receive here blocks forever until someone else closes it, at which point ok == false and the close branch is skipped.

This means Wait() will hang on WCOW because logOutputDone is never signaled from this goroutine?

Should this just be close(c.logOutputDone) after wg.Wait(), matching the LCOW pattern in vm_lcow.go?

Contributor Author

Nice catch. This seems broken in the old shim too.
Fixed in both places.
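The fixed pattern, sketched with illustrative names: a single goroutine closes the channel exactly once after wg.Wait(), and Wait() simply receives from it. No one ever sends on the channel, so the receive-then-conditionally-close form from the broken version is never needed.

```go
package main

import (
	"fmt"
	"sync"
)

// logForwarder sketches the corrected shutdown signaling.
type logForwarder struct {
	wg            sync.WaitGroup
	logOutputDone chan struct{}
}

func (f *logForwarder) start(numWorkers int) {
	f.logOutputDone = make(chan struct{})
	for i := 0; i < numWorkers; i++ {
		f.wg.Add(1)
		go func() {
			defer f.wg.Done()
			// ... forward one log stream ...
		}()
	}
	// Close exactly once, after every worker has returned.
	go func() {
		f.wg.Wait()
		close(f.logOutputDone)
	}()
}

// Wait blocks until all log output processing has completed; receiving from
// a closed channel returns immediately.
func (f *logForwarder) Wait() { <-f.logOutputDone }

func main() {
	f := &logForwarder{}
	f.start(3)
	f.Wait()
	fmt.Println("all log workers done")
}
```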

logrus.WithError(err).Fatal("failed to listen for windows logging connections")
}

// Use a WaitGroup to track active log processing goroutines.
Contributor

logrus.Fatal calls os.Exit(1) — this kills the entire shim process if the log listener fails to bind. Isn't this a recoverable/non-critical failure? Should this use logrus.Error + close(c.logOutputDone) instead, so Wait() doesn't block and the shim can continue operating without log forwarding?

Contributor Author

Added the same.
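A sketch of the suggested graceful degradation, using the stdlib log package and a TCP listener purely for illustration (the real code uses logrus and a named-pipe listener; setupLogListener is a hypothetical name): on bind failure, log the error and close the done channel so Wait() does not hang, and the shim keeps running without log forwarding.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// setupLogListener returns nil on failure instead of exiting the process.
func setupLogListener(addr string, done chan struct{}) net.Listener {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		// A Fatal here would os.Exit(1) the whole shim; instead, record
		// the error and signal done so Wait() returns immediately.
		log.Printf("failed to listen for logging connections: %v", err)
		close(done)
		return nil
	}
	return ln
}

func main() {
	done := make(chan struct{})
	// A malformed address forces the error path for demonstration.
	ln := setupLogListener("bad:addr:port", done)
	<-done // returns immediately: the error path closed the channel
	fmt.Println(ln == nil)
}
```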

Args: request.Args,
Workdir: request.Workdir,
Terminal: request.Terminal,
Stdin: request.Stdin,
Contributor

DumpStacks only reads vmState and calls guest.DumpStacks() — there's no state mutation here. Should this be mu.RLock() / mu.RUnlock() like ExecIntoHost and Wait do, to avoid unnecessarily blocking concurrent readers?

Contributor Author

Yes, sure. We can implement that.

return "", nil
}

// Wait blocks until the VM exits and all log output processing has completed.
Contributor

Stats() takes mu.Lock() because of the lazy vmmemProcess initialization, but this blocks all other operations (including ExecIntoHost, Wait, DumpStacks). Would it make sense to initialize vmmemProcess during StartVM, or use a separate sync.Once for the lazy init so Stats() can use RLock for the state check?

Contributor Author

I like the idea of RLock in Stats, but the vmmemProcess init should remain in this method, as that simplifies the LM scenario.
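The combination the thread converged on, keeping the lazy in-method init while allowing the read lock, can be sketched as follows (Manager, findVmmem, and the fields are hypothetical stand-ins): sync.Once serializes the one-time lookup internally, so Stats() only ever needs RLock and no longer blocks concurrent readers.

```go
package main

import (
	"fmt"
	"sync"
)

type Manager struct {
	mu        sync.RWMutex
	running   bool
	vmmemOnce sync.Once
	vmmemPid  int
}

// findVmmem is a stand-in for the real vmmem process lookup.
func (m *Manager) findVmmem() int { return 4242 }

// Stats checks state under RLock; the one-time init is guarded by Once
// (which has its own internal mutex), so no exclusive lock is needed and
// concurrent Stats/Wait/DumpStacks calls do not serialize.
func (m *Manager) Stats() (int, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	if !m.running {
		return 0, fmt.Errorf("VM not running")
	}
	m.vmmemOnce.Do(func() { m.vmmemPid = m.findVmmem() })
	return m.vmmemPid, nil
}

func main() {
	m := &Manager{running: true}
	pid, err := m.Stats()
	fmt.Println(err == nil, pid)
}
```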

// Create the serve command.
cmd, err := newCommand(ctx, id, opts.Address, address, f)
if err != nil {
return params, err
Contributor

This uses stdlib log.Fatalf (not logrus), which calls os.Exit(1). Shouldn't this return the error instead — return params, fmt.Errorf("failed to create event: %w", err) — so the caller can handle it gracefully?

Contributor Author

Yes, sure that works.

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>