Skip to content

[LCOW Builder] fix CPUGroup, VPMEM controller and StorageQoS parsing#2630

Open
rawahars wants to merge 1 commit intomicrosoft:mainfrom
rawahars:fix-lcow-builderV2
Open

[LCOW Builder] fix CPUGroup, VPMEM controller and StorageQoS parsing#2630
rawahars wants to merge 1 commit intomicrosoft:mainfrom
rawahars:fix-lcow-builderV2

Conversation

@rawahars
Copy link
Contributor

Summary

  • The CPUGroup was being set unconditionally while it needs to be set only if the value is non-empty.

  • StorageQoS was always set unconditionally but should only be set if IopsMaximum and BandwidthMaximum are both non-zero.

  • If VPMEM count is greater than 0, we must initialize the controller.

  • Unit tests are augmented to account for these changes.

@rawahars rawahars requested a review from a team as a code owner March 16, 2026 10:04
Copy link
Contributor

@shreyanshjain7174 shreyanshjain7174 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

@rawahars rawahars force-pushed the fix-lcow-builderV2 branch from afdedb3 to f10adfc Compare March 16, 2026 10:14
@rawahars rawahars changed the title fix CPUGroup, VPMEM controller and StorageQoS parsing [LCOW Builder] fix CPUGroup, VPMEM controller and StorageQoS parsing Mar 16, 2026
@shreyanshjain7174
Copy link
Contributor

shreyanshjain7174 commented Mar 16, 2026

Verified this PR against our parity tests (test/parity/vm/). Pulled the fixes into the parity/lcow-permutations branch and ran all 24 tests as admin.

Results: 23/24 PASS (up from 21/24 before this fix)

Gap test Before fix After fix
gap: no CPUGroupID FAIL PASS
gap: no StorageQoS FAIL PASS
gap: initrd boot (VPMem) FAIL FAIL (partial fix — see below)

The VPMem controller fix works correctly — v2 now allocates the controller even for initrd boot. The remaining initrd failure is a kernel cmdline leading space difference in the legacy builder:

  • Legacy: " initcall_blacklist=..." (leading space from create_lcow.go string concatenation)
  • V2: "initcall_blacklist=..." (no leading space)

This is a harmless legacy quirk, not a v2 issue. HCS trims whitespace from kernel args.

Otherwise LGTM.

@shreyanshjain7174
Copy link
Contributor

Adding the exact diff for the remaining initrd kernel cmdline failure:

Test case: gap: initrd boot (VPMem nil vs empty controller) with annotations PreferredRootFSType=initrd, KernelDirectBoot=true

Legacy KernelCmdLine:

" initcall_blacklist=virtio_vsock_init 8250_core.nr_uarts=0 panic=-1 quiet pci=off nr_cpus=2 brd.rd_nr=0 pmtmr=0 -- -e 1 /bin/vsockexec -e 109 /bin/gcs -v4 -log-format json -loglevel info"

V2 KernelCmdLine:

"initcall_blacklist=virtio_vsock_init 8250_core.nr_uarts=0 panic=-1 quiet pci=off nr_cpus=2 brd.rd_nr=0 pmtmr=0 -- -e 1 /bin/vsockexec -e 109 /bin/gcs -v4 -log-format json -loglevel info"

The only difference is the leading space in the legacy output. This happens in create_lcow.go because for initrd+KernelDirect boot, kernelArgs starts as an empty string (no root= prefix, no initrd=/ prefix), and then " initcall_blacklist=..." is appended with kernelArgs += " initcall_blacklist=..." — producing a leading space.

The v2 builder in kernel_args.go constructs the args without this leading space.

VPMem controller, StorageQoS, CpuGroup, and all other fields match perfectly after this fix PR.

@shreyanshjain7174 shreyanshjain7174 self-requested a review March 16, 2026 10:24
Copy link
Contributor

@shreyanshjain7174 shreyanshjain7174 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shreyanshjain7174
Copy link
Contributor

Adding the exact diff for the remaining initrd kernel cmdline failure:

Test case: gap: initrd boot (VPMem nil vs empty controller) with annotations PreferredRootFSType=initrd, KernelDirectBoot=true

Legacy KernelCmdLine:

" initcall_blacklist=virtio_vsock_init 8250_core.nr_uarts=0 panic=-1 quiet pci=off nr_cpus=2 brd.rd_nr=0 pmtmr=0 -- -e 1 /bin/vsockexec -e 109 /bin/gcs -v4 -log-format json -loglevel info"

V2 KernelCmdLine:

"initcall_blacklist=virtio_vsock_init 8250_core.nr_uarts=0 panic=-1 quiet pci=off nr_cpus=2 brd.rd_nr=0 pmtmr=0 -- -e 1 /bin/vsockexec -e 109 /bin/gcs -v4 -log-format json -loglevel info"

The only difference is the leading space in the legacy output. This happens in create_lcow.go because for initrd+KernelDirect boot, kernelArgs starts as an empty string (no root= prefix, no initrd=/ prefix), and then " initcall_blacklist=..." is appended with kernelArgs += " initcall_blacklist=..." — producing a leading space.

The v2 builder in kernel_args.go constructs the args without this leading space.

VPMem controller, StorageQoS, CpuGroup, and all other fields match perfectly after this fix PR.

As this is an improvement in the newer v2 shim we are ignoring it in testing.

shreyanshjain7174 pushed a commit to shreyanshjain7174/hcsshim that referenced this pull request Mar 16, 2026
…uirk

Add 19 permutation tests covering CPU, memory, boot mode, feature flags,
device interactions, cross-group combos, override priority, and kernel args.

Kernel cmdline leading space difference (legacy initrd+KernelDirect quirk)
is logged as a WARNING instead of failing — this is a known harmless legacy
behavior that v2 correctly omits.

3 gap tests expose v2 builder bugs (CpuGroup, StorageQoS, VPMem initrd)
which are being fixed in PR microsoft#2630.

Signed-off-by: Shreyansh Sancheti <shsancheti@microsoft.com>
Copy link
Contributor

@helsaawy helsaawy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, but lgtm overall

The CPUGroup was being set unconditionally while it needs to be set only if the value is non-empty.

StorageQoS should only be set if IopsMaximum and BandwidthMaximum are both non-zero.

If VPMEM count is greater than 0, we must initialize the controller.

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants