Linstor: fix create volume from snapshot on primary storage #13043
Open
Kukunin wants to merge 1 commit into apache:main from
When creating a volume from a snapshot on Linstor primary storage (with lin.backup.snapshots=false), the operation fails with:

"Only the following image types are currently supported: VHD, OVA, QCOW2, RAW (for PowerFlex and FiberChannel)"

Root cause: the Linstor driver does not handle SNAPSHOT -> VOLUME in its canCopy()/copyAsync() methods. This causes DataMotionServiceImpl to fall through to StorageSystemDataMotionStrategy (selected because Linstor advertises STORAGE_SYSTEM_SNAPSHOT=true). That strategy's verifyFormatWithPoolType() rejects RAW format for Linstor pools, since RAW is only allowed for PowerFlex and FiberChannel.

Additionally, VolumeOrchestrator.createVolumeFromSnapshot() attempts to back up the snapshot to secondary storage when the storage plugin does not advertise CAN_CREATE_TEMPLATE_FROM_SNAPSHOT. This backup fails because the snapshot only exists on Linstor primary storage.

Fix:
- Add CAN_CREATE_TEMPLATE_FROM_SNAPSHOT capability so the orchestrator skips the backup-to-secondary path
- Add canCopySnapshotToVolumeCond() to match SNAPSHOT -> VOLUME when both are on the same Linstor primary store
- Wire it into canCopy() to intercept at DataMotionServiceImpl before strategy selection, bypassing StorageSystemDataMotionStrategy
- Implement copySnapshotToVolume() which delegates to the existing createResourceFromSnapshot() for native Linstor snapshot restore

This follows the same pattern used by the StorPool plugin, which handles SNAPSHOT -> VOLUME directly in its driver rather than going through StorageSystemDataMotionStrategy.

Tested on CloudStack 4.22 with Linstor LVM_THIN storage, creating a volume from a 1TB CNPG Postgres database snapshot. Volume creates successfully with correct path and deletes cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Congratulations on your first Pull Request and welcome to the Apache CloudStack community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md)
Context
I run a private cloud on CloudStack 4.22 with Linstor primary storage, Kubernetes, and the CloudStack CSI driver with an additional `registry.k8s.io/sig-storage/csi-snapshotter:v8.2.1` sidecar and `snapshot-controller`.

I wanted to duplicate a PVC from kubectl by creating a snapshot and restoring another PVC from that snapshot. The main problem was that CloudStack tried to copy the snapshot to secondary storage, which is not what I wanted: secondary storage is slow and sits outside the network, so transferring a 1TB volume takes a long time for no benefit. I hit a chain of errors, identified their causes, and prepared a patch that resolved them. I built and pushed only `cloud-plugin-storage-volume-linstor-4.22.0.0.jar` to my servers, and after restarting both the management and agent services, the PVC copy via snapshots worked fine. I also modified the following CloudStack settings:

Description
When creating a volume from a snapshot on Linstor primary storage (with `lin.backup.snapshots=false`), the operation fails with:

"Only the following image types are currently supported: VHD, OVA, QCOW2, RAW (for PowerFlex and FiberChannel)"

Root cause: The Linstor driver does not handle SNAPSHOT → VOLUME in its `canCopy()`/`copyAsync()` methods. This causes `DataMotionServiceImpl` to fall through to `StorageSystemDataMotionStrategy` (selected because Linstor advertises `STORAGE_SYSTEM_SNAPSHOT=true`). That strategy's `verifyFormatWithPoolType()` rejects RAW format for Linstor pools, since RAW is only allowed for PowerFlex and FiberChannel.

Additionally, `VolumeOrchestrator.createVolumeFromSnapshot()` attempts to back up the snapshot to secondary storage when the storage plugin does not advertise `CAN_CREATE_TEMPLATE_FROM_SNAPSHOT`. This backup fails because the snapshot only exists on Linstor primary storage.

Fix:
- Add the `CAN_CREATE_TEMPLATE_FROM_SNAPSHOT` capability so the orchestrator skips the backup-to-secondary path
- Add `canCopySnapshotToVolumeCond()` to match SNAPSHOT → VOLUME when both are on the same Linstor primary store
- Wire it into `canCopy()` to intercept at `DataMotionServiceImpl` before strategy selection, bypassing `StorageSystemDataMotionStrategy` entirely
- Implement `copySnapshotToVolume()`, which delegates to the existing `createResourceFromSnapshot()` for native Linstor snapshot restore

This follows the same pattern used by the StorPool plugin, which handles SNAPSHOT → VOLUME directly in its driver rather than going through `StorageSystemDataMotionStrategy`.

Fixes: #11451
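The two decisions described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `LinstorCopySketch`, `DataObj`, `ObjType`, `StoreRole`, and the string-keyed capability map are hypothetical stand-ins for the CloudStack types, and the provider name `"Linstor"` is assumed for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-ins for CloudStack types; not the real API.
class LinstorCopySketch {
    enum ObjType { SNAPSHOT, VOLUME, TEMPLATE }
    enum StoreRole { PRIMARY, IMAGE }

    static final class DataObj {
        final ObjType type;
        final StoreRole role;
        final long storeId;
        final String provider;

        DataObj(ObjType type, StoreRole role, long storeId, String provider) {
            this.type = type;
            this.role = role;
            this.storeId = storeId;
            this.provider = provider;
        }
    }

    // Capabilities the driver advertises (keys assumed to mirror the names in the description).
    static Map<String, String> capabilities() {
        Map<String, String> caps = new HashMap<>();
        caps.put("STORAGE_SYSTEM_SNAPSHOT", "true");
        caps.put("CAN_CREATE_TEMPLATE_FROM_SNAPSHOT", "true");
        return caps;
    }

    // Orchestrator-side decision as described: back up to secondary only
    // when the capability is absent or not "true".
    static boolean needsBackupToSecondary(Map<String, String> caps) {
        return !Boolean.parseBoolean(caps.get("CAN_CREATE_TEMPLATE_FROM_SNAPSHOT"));
    }

    // SNAPSHOT -> VOLUME is handled by the driver only when both objects
    // live on the same Linstor primary store.
    static boolean canCopySnapshotToVolume(DataObj src, DataObj dst) {
        return src.type == ObjType.SNAPSHOT
                && dst.type == ObjType.VOLUME
                && src.role == StoreRole.PRIMARY
                && dst.role == StoreRole.PRIMARY
                && src.storeId == dst.storeId
                && "Linstor".equals(src.provider)
                && "Linstor".equals(dst.provider);
    }
}
```

Each rejected case (non-Linstor store, different primary store, Image destination) corresponds to one conjunct of the predicate failing, which is also how the unit tests listed under "How Has This Been Tested?" are organized.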
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
How Has This Been Tested?
Unit tests: 5 new tests added to `LinstorPrimaryDataStoreDriverImplTest`:
- `testGetCapabilitiesIncludesCreateTemplateFromSnapshot`: verifies the capability is advertised
- `testCanCopySnapshotToVolumeOnSamePrimary`: verifies `canCopy()` returns true for SNAPSHOT → VOLUME on the same Linstor primary
- `testCanCopySnapshotToVolumeRejectsNonLinstor`: verifies `canCopy()` returns false for non-Linstor storage
- `testCanCopySnapshotToVolumeRejectsCrossPrimary`: verifies `canCopy()` returns false across different primary stores
- `testCanCopySnapshotToVolumeRejectsImageDest`: verifies `canCopy()` returns false when the destination is an Image store

Integration test: Tested on CloudStack 4.22 with Linstor LVM_THIN storage (DRBD-replicated across 3 nodes), creating a volume from a 1TB CNPG Postgres database snapshot via the `createVolume` API:
`resourceSnapshotRestore` API)

How did you try to break this feature and the system with this change?
- Existing `canCopy()` paths (SNAPSHOT→SNAPSHOT to Image, TEMPLATE→TEMPLATE, VOLUME→VOLUME/TEMPLATE) are not affected by the new condition being checked first
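One way to see why check order does not matter here: the new condition only matches the SNAPSHOT → VOLUME type pair, which none of the existing paths use. The sketch below illustrates that argument with hypothetical names (`CanCopyOrderSketch`, `ObjType`, `EXISTING_PAIRS`); it looks only at source and destination types and leaves out the store-role checks, so it is a simplification rather than the driver's actual logic.

```java
// Hypothetical illustration; the real canCopy() also inspects store roles and providers.
class CanCopyOrderSketch {
    enum ObjType { SNAPSHOT, VOLUME, TEMPLATE }

    // The new condition, reduced to its type pair.
    static boolean isSnapshotToVolume(ObjType src, ObjType dst) {
        return src == ObjType.SNAPSHOT && dst == ObjType.VOLUME;
    }

    // Type pairs of the existing paths named in the PR text.
    static final ObjType[][] EXISTING_PAIRS = {
        { ObjType.SNAPSHOT, ObjType.SNAPSHOT },
        { ObjType.TEMPLATE, ObjType.TEMPLATE },
        { ObjType.VOLUME,   ObjType.VOLUME },
        { ObjType.VOLUME,   ObjType.TEMPLATE },
    };

    // True when the new condition matches none of the existing pairs,
    // so evaluating it first cannot change their outcome.
    static boolean newCheckIsDisjoint() {
        for (ObjType[] pair : EXISTING_PAIRS) {
            if (isSnapshotToVolume(pair[0], pair[1])) {
                return false;
            }
        }
        return true;
    }
}
```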