public inbox for pve-devel@lists.proxmox.com
* [pve-devel] [PATCH container 0/2] Improve volume deactivation
@ 2021-11-26 10:19 Aaron Lauterer
  2021-11-26 10:19 ` [pve-devel] [PATCH container 1/2] template_create: remove volume activation Aaron Lauterer
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Aaron Lauterer @ 2021-11-26 10:19 UTC
  To: pve-devel

While working on the reassign feature, we (F. Ebner & I) discovered
that it is possible, mainly with RBD volumes, to get into situations
where a volume cannot be removed because an old, orphaned RBD mapping
still exists.

This happens mainly when converting a container on RBD storage to a
template, and when adding a new MP to a container that is not running
and then reassigning that MP right away to another container.

Aaron Lauterer (2):
  template_create: remove volume activation
  apply_pending_mountpoint: deactivate volumes if not running

 src/PVE/LXC.pm        | 2 --
 src/PVE/LXC/Config.pm | 2 ++
 2 files changed, 2 insertions(+), 2 deletions(-)

-- 
2.30.2

* [pve-devel] [PATCH container 1/2] template_create: remove volume activation
  2021-11-26 10:19 [pve-devel] [PATCH container 0/2] Improve volume deactivation Aaron Lauterer
@ 2021-11-26 10:19 ` Aaron Lauterer
  2021-11-26 10:19 ` [pve-devel] [PATCH container 2/2] apply_pending_mountpoint: deactivate volumes if not running Aaron Lauterer
  2021-12-01 10:12 ` [pve-devel] [PATCH container 0/2] Improve volume deactivation Fabian Ebner
  2 siblings, 0 replies; 7+ messages in thread
From: Aaron Lauterer @ 2021-11-26 10:19 UTC
  To: pve-devel

This caused problems, especially with RBD volumes, as they would get
mapped under the current vm-xxx-disk-y name and never unmapped. After
the rename to base-xxx-disk-y, the old, now orphaned, mapping still
exists, which can then cause problems when removing that MP or the
whole container, as RBD won't let the image be removed. Only a manual
`rbd unmap` of the orphaned mapping or a reboot of the node helps.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
I am not sure why we activated the volumes at this step in the first
place, as we do not do that for VMs over in qemu-server. If we want to
make sure they are available, we should probably use
PVE::Storage::activate_storage or activate_storage_list instead.
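
For reference, a minimal sketch of that alternative, under the
assumption that having the storages online is all that is needed here;
the helpers mirror existing PVE::Storage/PVE::LXC::Config APIs, but
this snippet is not part of the patch:

    # Hypothetical: activate the underlying storages instead of
    # activating (and, for RBD, mapping) each individual volume.
    my $storage_list = [];
    PVE::LXC::Config->foreach_volume($conf, sub {
        my ($ms, $mountpoint) = @_;
        # bind and device mount points have no storage behind them
        my ($storeid) = PVE::Storage::parse_volume_id($mountpoint->{volume}, 1);
        push @$storage_list, $storeid if defined($storeid);
    });
    PVE::Storage::activate_storage_list($storecfg, $storage_list);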

 src/PVE/LXC.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/PVE/LXC.pm b/src/PVE/LXC.pm
index b07d986..ce10c55 100644
--- a/src/PVE/LXC.pm
+++ b/src/PVE/LXC.pm
@@ -1241,8 +1241,6 @@ sub template_create {
 
 	my $volid = $mountpoint->{volume};
 
-	PVE::Storage::activate_volumes($storecfg, [$volid]);
-
 	my $template_volid = PVE::Storage::vdisk_create_base($storecfg, $volid);
 	$mountpoint->{volume} = $template_volid;
 	$conf->{$ms} = PVE::LXC::Config->print_ct_mountpoint($mountpoint, $ms eq "rootfs");
-- 
2.30.2

* [pve-devel] [PATCH container 2/2] apply_pending_mountpoint: deactivate volumes if not running
  2021-11-26 10:19 [pve-devel] [PATCH container 0/2] Improve volume deactivation Aaron Lauterer
  2021-11-26 10:19 ` [pve-devel] [PATCH container 1/2] template_create: remove volume activation Aaron Lauterer
@ 2021-11-26 10:19 ` Aaron Lauterer
  2021-12-01 10:12 ` [pve-devel] [PATCH container 0/2] Improve volume deactivation Fabian Ebner
  2 siblings, 0 replies; 7+ messages in thread
From: Aaron Lauterer @ 2021-11-26 10:19 UTC
  To: pve-devel

Container volumes need to be activated to be formatted. Not
deactivating them afterwards, if the container is not currently
running, can lead to problems.
For example, RBD images will get mapped by krbd but not unmapped
afterwards. Reassigning that MP to another CT right away would then
leave behind a mapping that needs to be unmapped manually with `rbd
unmap` or by rebooting the node.

Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
---
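The resulting pattern, as a simplified sketch (not the literal function
body; the actual change and its error handling are in the diff below):

    # Sketch: activating for formatting is now paired with
    # deactivating when the container is not running.
    PVE::Storage::activate_volumes($storecfg, $vollist);
    # ... format and set up the new mount point volume ...
    if (!$running) {
        # nothing keeps the volume busy, so drop the activation
        # again (for RBD this unmaps the krbd device)
        PVE::Storage::deactivate_volumes($storecfg, $vollist);
    }
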
 src/PVE/LXC/Config.pm | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/PVE/LXC/Config.pm b/src/PVE/LXC/Config.pm
index 1e28a88..ddfeed8 100644
--- a/src/PVE/LXC/Config.pm
+++ b/src/PVE/LXC/Config.pm
@@ -1456,6 +1456,8 @@ sub apply_pending_mountpoint {
 		    $conf->{pending}->{$opt} = $original_value;
 		    die $err;
 		}
+	    } else {
+		PVE::Storage::deactivate_volumes($storecfg, $vollist);
 	    }
 	} else {
 	    die "skip\n" if $running && defined($old); # TODO: "changing" mount points?
-- 
2.30.2

* Re: [pve-devel] [PATCH container 0/2] Improve volume deactivation
  2021-11-26 10:19 [pve-devel] [PATCH container 0/2] Improve volume deactivation Aaron Lauterer
  2021-11-26 10:19 ` [pve-devel] [PATCH container 1/2] template_create: remove volume activation Aaron Lauterer
  2021-11-26 10:19 ` [pve-devel] [PATCH container 2/2] apply_pending_mountpoint: deactivate volumes if not running Aaron Lauterer
@ 2021-12-01 10:12 ` Fabian Ebner
  2021-12-01 16:27   ` Aaron Lauterer
  2 siblings, 1 reply; 7+ messages in thread
From: Fabian Ebner @ 2021-12-01 10:12 UTC
  To: pve-devel, Aaron Lauterer

Am 26.11.21 um 11:19 schrieb Aaron Lauterer:
> While working on the reassign feature, we (F. Ebner & I) discovered
> that it is possible, mainly with RBD volumes, to get into situations
> where a volume cannot be removed because an old, orphaned RBD mapping
> still exists.
> 
> This happens mainly when converting a container on RBD storage to a
> template, and when adding a new MP to a container that is not running
> and then reassigning that MP right away to another container.
> 

I feel like cleaning up such things should be the responsibility of the 
storage plugin itself. It knows best when a volume gets a new name and 
what needs to happen if there is still something using the old name around.

For example, after a full clone, volumes from both containers will be 
active and then reassigning or converting to template will lead to the 
issue again. There are likely other places where we don't cleanly 
deactivate. Of course we could try and hunt them all down ;), but 
quoting from [0]:

this is fundamentally how volume activation works in PVE - we activate 
(and skip the expensive parts if already activated) often, but are very 
careful about de-activating only where necessary (shared volumes when 
migrating) or clearly 100% right (error handling before removing a newly 
allocated volume for example).

[0]: https://bugzilla.proxmox.com/show_bug.cgi?id=3756#c3
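
To sketch what I mean (hypothetical code, not an existing change; the
method names are assumptions based on the current plugin interface):
the RBD plugin could drop the stale mapping of the old name itself
before renaming, e.g. in its create_base:

    # Hypothetical sketch inside PVE::Storage::RBDPlugin:
    sub create_base {
        my ($class, $storeid, $scfg, $volname) = @_;

        # unmap a stale krbd mapping of the old vm-<vmid>-disk-<n>
        # name; after the rename nothing would reference it anymore
        $class->deactivate_volume($storeid, $scfg, $volname);

        # ... existing rename/protect logic would follow here ...
    }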

> Aaron Lauterer (2):
>    template_create: remove volume activation
>    apply_pending_mountpoint: deactivate volumes if not running
> 
>   src/PVE/LXC.pm        | 2 --
>   src/PVE/LXC/Config.pm | 2 ++
>   2 files changed, 2 insertions(+), 2 deletions(-)
> 

* Re: [pve-devel] [PATCH container 0/2] Improve volume deactivation
  2021-12-01 10:12 ` [pve-devel] [PATCH container 0/2] Improve volume deactivation Fabian Ebner
@ 2021-12-01 16:27   ` Aaron Lauterer
  2021-12-02  7:40     ` Fabian Ebner
  0 siblings, 1 reply; 7+ messages in thread
From: Aaron Lauterer @ 2021-12-01 16:27 UTC
  To: Fabian Ebner, pve-devel



On 12/1/21 11:12, Fabian Ebner wrote:
> Am 26.11.21 um 11:19 schrieb Aaron Lauterer:
>> While working on the reassign feature, we (F. Ebner & I) discovered
>> that it is possible, mainly with RBD volumes, to get into situations
>> where a volume cannot be removed because an old, orphaned RBD mapping
>> still exists.
>>
>> This happens mainly when converting a container on RBD storage to a
>> template, and when adding a new MP to a container that is not running
>> and then reassigning that MP right away to another container.
>>
> 
> I feel like cleaning up such things should be the responsibility of the storage plugin itself. It knows best when a volume gets a new name and what needs to happen if there is still something using the old name around.
> 
> For example, after a full clone, volumes from both containers will be active and then reassigning or converting to template will lead to the issue again. There are likely other places where we don't cleanly deactivate. Of course we could try and hunt them all down ;), but quoting from [0]:
> 
> this is fundamentally how volume activation works in PVE - we activate (and skip the expensive parts if already activated) often, but are very careful about de-activating only where necessary (shared volumes when migrating) or clearly 100% right (error handling before removing a newly allocated volume for example).
> 
> [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=3756#c3

Hmm okay yeah, definitely valid regarding the second patch. But the first one would still be valid AFAIU because I don't understand why we activate the volumes when creating a template for containers only, but not for VMs if we don't need to do anything in the volume. So not activating it in the first place would help at least in that case.

> 
>> Aaron Lauterer (2):
>>    template_create: remove volume activation
>>    apply_pending_mountpoint: deactivate volumes if not running
>>
>>   src/PVE/LXC.pm        | 2 --
>>   src/PVE/LXC/Config.pm | 2 ++
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>

* Re: [pve-devel] [PATCH container 0/2] Improve volume deactivation
  2021-12-01 16:27   ` Aaron Lauterer
@ 2021-12-02  7:40     ` Fabian Ebner
  2021-12-02  7:49       ` Aaron Lauterer
  0 siblings, 1 reply; 7+ messages in thread
From: Fabian Ebner @ 2021-12-02  7:40 UTC
  To: Aaron Lauterer, pve-devel

Am 01.12.21 um 17:27 schrieb Aaron Lauterer:
> 
> 
> On 12/1/21 11:12, Fabian Ebner wrote:
>> Am 26.11.21 um 11:19 schrieb Aaron Lauterer:
>>> While working on the reassign feature, we (F. Ebner & I) discovered
>>> that it is possible, mainly with RBD volumes, to get into situations
>>> where a volume cannot be removed because an old, orphaned RBD mapping
>>> still exists.
>>>
>>> This happens mainly when converting a container on RBD storage to a
>>> template, and when adding a new MP to a container that is not running
>>> and then reassigning that MP right away to another container.
>>>
>>
>> I feel like cleaning up such things should be the responsibility of 
>> the storage plugin itself. It knows best when a volume gets a new name 
>> and what needs to happen if there is still something using the old 
>> name around.
>>
>> For example, after a full clone, volumes from both containers will be 
>> active and then reassigning or converting to template will lead to the 
>> issue again. There are likely other places where we don't cleanly 
>> deactivate. Of course we could try and hunt them all down ;), but 
>> quoting from [0]:
>>
>> this is fundamentally how volume activation works in PVE - we activate 
>> (and skip the expensive parts if already activated) often, but are 
>> very careful about de-activating only where necessary (shared volumes 
>> when migrating) or clearly 100% right (error handling before removing 
>> a newly allocated volume for example).
>>
>> [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=3756#c3
> 
> Hmm okay yeah, definitely valid regarding the second patch. But the 
> first one would still be valid AFAIU because I don't understand why we 
> activate the volumes when creating a template for containers only, but 
> not for VMs if we don't need to do anything in the volume. So not 
> activating it in the first place would help at least in that case.
> 

Sorry, I didn't mean to imply that the patches were wrong, just wanted 
to point out that they don't fully address the issue.

>>
>>> Aaron Lauterer (2):
>>>    template_create: remove volume activation
>>>    apply_pending_mountpoint: deactivate volumes if not running
>>>
>>>   src/PVE/LXC.pm        | 2 --
>>>   src/PVE/LXC/Config.pm | 2 ++
>>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>>

* Re: [pve-devel] [PATCH container 0/2] Improve volume deactivation
  2021-12-02  7:40     ` Fabian Ebner
@ 2021-12-02  7:49       ` Aaron Lauterer
  0 siblings, 0 replies; 7+ messages in thread
From: Aaron Lauterer @ 2021-12-02  7:49 UTC
  To: Fabian Ebner, pve-devel



On 12/2/21 08:40, Fabian Ebner wrote:
> Am 01.12.21 um 17:27 schrieb Aaron Lauterer:
>>
>>
>> On 12/1/21 11:12, Fabian Ebner wrote:
>>> Am 26.11.21 um 11:19 schrieb Aaron Lauterer:
>>>> While working on the reassign feature, we (F. Ebner & I) discovered
>>>> that it is possible, mainly with RBD volumes, to get into situations
>>>> where a volume cannot be removed because an old, orphaned RBD mapping
>>>> still exists.
>>>>
>>>> This happens mainly when converting a container on RBD storage to a
>>>> template, and when adding a new MP to a container that is not running
>>>> and then reassigning that MP right away to another container.
>>>>
>>>
>>> I feel like cleaning up such things should be the responsibility of the storage plugin itself. It knows best when a volume gets a new name and what needs to happen if there is still something using the old name around.
>>>
>>> For example, after a full clone, volumes from both containers will be active and then reassigning or converting to template will lead to the issue again. There are likely other places where we don't cleanly deactivate. Of course we could try and hunt them all down ;), but quoting from [0]:
>>>
>>> this is fundamentally how volume activation works in PVE - we activate (and skip the expensive parts if already activated) often, but are very careful about de-activating only where necessary (shared volumes when migrating) or clearly 100% right (error handling before removing a newly allocated volume for example).
>>>
>>> [0]: https://bugzilla.proxmox.com/show_bug.cgi?id=3756#c3
>>
>> Hmm okay yeah, definitely valid regarding the second patch. But the first one would still be valid AFAIU because I don't understand why we activate the volumes when creating a template for containers only, but not for VMs if we don't need to do anything in the volume. So not activating it in the first place would help at least in that case.
>>
> 
> Sorry, I didn't mean to imply that the patches were wrong, just wanted to point out that they don't fully address the issue.

No need to apologize, I came across as much harsher than I intended. :) I'll have to think about how to handle the second patch better, to avoid these issues of potentially deactivating a storage when we shouldn't.

>>>
>>>> Aaron Lauterer (2):
>>>>    template_create: remove volume activation
>>>>    apply_pending_mountpoint: deactivate volumes if not running
>>>>
>>>>   src/PVE/LXC.pm        | 2 --
>>>>   src/PVE/LXC/Config.pm | 2 ++
>>>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>>>
