* [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore
@ 2020-08-12 10:01 Fabian Ebner
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 2/3] restore_vma_archive: get rid of oldtimeout handling Fabian Ebner
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Fabian Ebner @ 2020-08-12 10:01 UTC (permalink / raw)
To: pve-devel
qcow2 images are allocated with --preallocation=metadata,
which can take a while for large images.
A 5 second timeout is set before reading the device map, so it's
necessary to restore the old timeout before calling print_devmap().
Time spent allocating now falls under that old timeout.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
Changes from v1:
* instead of increasing the allocation timeout,
get rid of it as Fabian suggested
PVE/QemuServer.pm | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index a9c0dac..7169006 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6274,15 +6274,13 @@ sub restore_vma_archive {
my ($dev_id, $size, $devname) = ($1, $2, $3);
$devinfo->{$devname} = { size => $size, dev_id => $dev_id };
} elsif ($line =~ m/^CTIME: /) {
- # we correctly received the vma config, so we can disable
- # the timeout now for disk allocation (set to 10 minutes, so
- # that we always timeout if something goes wrong)
- alarm(600);
- &$print_devmap();
- print $fifofh "done\n";
+ # we correctly received the vma config, so restore old timeout
my $tmp = $oldtimeout || 0;
$oldtimeout = undef;
alarm($tmp);
+
+ &$print_devmap();
+ print $fifofh "done\n";
close($fifofh);
}
};
--
2.20.1
* [pve-devel] [PATCH/RFC v2 qemu-server 2/3] restore_vma_archive: get rid of oldtimeout handling
2020-08-12 10:01 [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore Fabian Ebner
@ 2020-08-12 10:01 ` Fabian Ebner
2020-08-20 8:55 ` Thomas Lamprecht
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 3/3] restore_vma_archive: remove timeout for reading the device map Fabian Ebner
2020-08-20 8:56 ` [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore Thomas Lamprecht
2 siblings, 1 reply; 8+ messages in thread
From: Fabian Ebner @ 2020-08-12 10:01 UTC (permalink / raw)
To: pve-devel
Assume that the function is called within a worker not restricted by
any timeout. This is true currently, because the only path leading to
restore_vma_archive is via restore_file_archive being called within a
worker by the create_vm API call.
Avoid generic timeout error message.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
New in v2
PVE/QemuServer.pm | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 7169006..794819b 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6162,7 +6162,6 @@ sub restore_vma_archive {
$add_pipe->(['vma', 'extract', '-v', '-r', $mapfifo, $readfrom, $tmpdir]);
- my $oldtimeout;
my $timeout = 5;
my $devinfo = {};
@@ -6261,9 +6260,9 @@ sub restore_vma_archive {
local $SIG{QUIT} =
local $SIG{HUP} =
local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
- local $SIG{ALRM} = sub { die "got timeout\n"; };
+ local $SIG{ALRM} = sub { die "got timeout reading device map\n"; };
- $oldtimeout = alarm($timeout);
+ alarm($timeout);
my $parser = sub {
my $line = shift;
@@ -6274,11 +6273,7 @@ sub restore_vma_archive {
my ($dev_id, $size, $devname) = ($1, $2, $3);
$devinfo->{$devname} = { size => $size, dev_id => $dev_id };
} elsif ($line =~ m/^CTIME: /) {
- # we correctly received the vma config, so restore old timeout
- my $tmp = $oldtimeout || 0;
- $oldtimeout = undef;
- alarm($tmp);
-
+ alarm(0);
&$print_devmap();
print $fifofh "done\n";
close($fifofh);
@@ -6290,7 +6285,7 @@ sub restore_vma_archive {
};
my $err = $@;
- alarm($oldtimeout) if $oldtimeout;
+ alarm(0);
$restore_deactivate_volumes->($cfg, $devinfo);
--
2.20.1
* [pve-devel] [PATCH/RFC v2 qemu-server 3/3] restore_vma_archive: remove timeout for reading the device map
2020-08-12 10:01 [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore Fabian Ebner
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 2/3] restore_vma_archive: get rid of oldtimeout handling Fabian Ebner
@ 2020-08-12 10:01 ` Fabian Ebner
2020-08-20 8:53 ` Thomas Lamprecht
2020-08-20 8:56 ` [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore Thomas Lamprecht
2 siblings, 1 reply; 8+ messages in thread
From: Fabian Ebner @ 2020-08-12 10:01 UTC (permalink / raw)
To: pve-devel
If there is no serious problem, it shouldn't be possible to run into
this timeout anyways. It's just (extracting and) reading the header of
the (compressed) vma file. And if there is a serious problem, then the
commands will most likely fail for a different reason, e.g. unable to open,
corrupt vma, etc.
Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---
New in v2
Hope I'm not missing anything important.
PVE/QemuServer.pm | 8 --------
1 file changed, 8 deletions(-)
diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
index 794819b..342114d 100644
--- a/PVE/QemuServer.pm
+++ b/PVE/QemuServer.pm
@@ -6162,8 +6162,6 @@ sub restore_vma_archive {
$add_pipe->(['vma', 'extract', '-v', '-r', $mapfifo, $readfrom, $tmpdir]);
- my $timeout = 5;
-
my $devinfo = {};
my $rpcenv = PVE::RPCEnvironment::get();
@@ -6260,9 +6258,6 @@ sub restore_vma_archive {
local $SIG{QUIT} =
local $SIG{HUP} =
local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
- local $SIG{ALRM} = sub { die "got timeout reading device map\n"; };
-
- alarm($timeout);
my $parser = sub {
my $line = shift;
@@ -6273,7 +6268,6 @@ sub restore_vma_archive {
my ($dev_id, $size, $devname) = ($1, $2, $3);
$devinfo->{$devname} = { size => $size, dev_id => $dev_id };
} elsif ($line =~ m/^CTIME: /) {
- alarm(0);
&$print_devmap();
print $fifofh "done\n";
close($fifofh);
@@ -6285,8 +6279,6 @@ sub restore_vma_archive {
};
my $err = $@;
- alarm(0);
-
$restore_deactivate_volumes->($cfg, $devinfo);
unlink $mapfifo;
--
2.20.1
* Re: [pve-devel] [PATCH/RFC v2 qemu-server 3/3] restore_vma_archive: remove timeout for reading the device map
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 3/3] restore_vma_archive: remove timeout for reading the device map Fabian Ebner
@ 2020-08-20 8:53 ` Thomas Lamprecht
2020-08-20 9:22 ` Fabian Ebner
0 siblings, 1 reply; 8+ messages in thread
From: Thomas Lamprecht @ 2020-08-20 8:53 UTC (permalink / raw)
To: Proxmox VE development discussion, Fabian Ebner
On 12.08.20 12:01, Fabian Ebner wrote:
> If there is no serious problem, it shouldn't be possible to run into
> this timeout anyways. It's just (extracting and) reading the header of
> the (compressed) vma file. And if there is a serious problem, then the
> commands will most likely fail for a different reason, e.g. unable to open,
> corrupt vma, etc.
>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
>
> New in v2
>
> Hope I'm not missing anything important.
The process doing the read can hang on I/O for a long time; that can be pretty normal,
especially on network-attached storage under some I/O load.
I'm not saying that this isn't OK at all, but your commit message misleadingly suggests
that this would either be very short or an immediate error, which isn't exactly true.
>
> PVE/QemuServer.pm | 8 --------
> 1 file changed, 8 deletions(-)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 794819b..342114d 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6162,8 +6162,6 @@ sub restore_vma_archive {
>
> $add_pipe->(['vma', 'extract', '-v', '-r', $mapfifo, $readfrom, $tmpdir]);
>
> - my $timeout = 5;
> -
> my $devinfo = {};
>
> my $rpcenv = PVE::RPCEnvironment::get();
> @@ -6260,9 +6258,6 @@ sub restore_vma_archive {
> local $SIG{QUIT} =
> local $SIG{HUP} =
> local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
> - local $SIG{ALRM} = sub { die "got timeout reading device map\n"; };
> -
> - alarm($timeout);
>
> my $parser = sub {
> my $line = shift;
> @@ -6273,7 +6268,6 @@ sub restore_vma_archive {
> my ($dev_id, $size, $devname) = ($1, $2, $3);
> $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
> } elsif ($line =~ m/^CTIME: /) {
> - alarm(0);
> &$print_devmap();
> print $fifofh "done\n";
> close($fifofh);
> @@ -6285,8 +6279,6 @@ sub restore_vma_archive {
> };
> my $err = $@;
>
> - alarm(0);
> -
> $restore_deactivate_volumes->($cfg, $devinfo);
>
> unlink $mapfifo;
>
* Re: [pve-devel] [PATCH/RFC v2 qemu-server 2/3] restore_vma_archive: get rid of oldtimeout handling
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 2/3] restore_vma_archive: get rid of oldtimeout handling Fabian Ebner
@ 2020-08-20 8:55 ` Thomas Lamprecht
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Lamprecht @ 2020-08-20 8:55 UTC (permalink / raw)
To: Proxmox VE development discussion, Fabian Ebner
On 12.08.20 12:01, Fabian Ebner wrote:
> Assume that the function is called within a worker not restricted by
> any timeout. This is true currently, because the only path leading to
> restore_vma_archive is via restore_file_archive being called within a
> worker by the create_vm API call.
You could branch on, or maybe even assert, the result of the RESTEnvironment
is_worker() helper to tighten this assumption.
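A minimal sketch of what such an assertion might look like. MockEnv is a stand-in invented here so the snippet runs on its own, and assert_in_worker is a hypothetical helper name; only the is_worker() call comes from the suggestion above. In restore_vma_archive the object would presumably be the $rpcenv obtained from PVE::RPCEnvironment::get().

```perl
use strict;
use warnings;

# Stand-in for the real environment object (assumption: it offers an
# is_worker() method that returns true inside a worker task).
package MockEnv {
    sub new {
        my ($class, $is_worker) = @_;
        return bless { worker => $is_worker }, $class;
    }
    sub is_worker {
        my ($self) = @_;
        return $self->{worker};
    }
}

# Hypothetical helper: refuse to continue outside of a worker, so the
# "no outer timeout" assumption is checked instead of implied.
sub assert_in_worker {
    my ($env) = @_;
    die "restore_vma_archive must be called from within a worker\n"
        if !$env->is_worker();
    return 1;
}
```

Dying early keeps the assumption explicit; branching instead (for example, keeping the old timeout handling when not in a worker) would be the softer variant.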
>
> Avoid generic timeout error message.
>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
>
> New in v2
>
> PVE/QemuServer.pm | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index 7169006..794819b 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6162,7 +6162,6 @@ sub restore_vma_archive {
>
> $add_pipe->(['vma', 'extract', '-v', '-r', $mapfifo, $readfrom, $tmpdir]);
>
> - my $oldtimeout;
> my $timeout = 5;
>
> my $devinfo = {};
> @@ -6261,9 +6260,9 @@ sub restore_vma_archive {
> local $SIG{QUIT} =
> local $SIG{HUP} =
> local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
> - local $SIG{ALRM} = sub { die "got timeout\n"; };
> + local $SIG{ALRM} = sub { die "got timeout reading device map\n"; };
>
> - $oldtimeout = alarm($timeout);
> + alarm($timeout);
>
> my $parser = sub {
> my $line = shift;
> @@ -6274,11 +6273,7 @@ sub restore_vma_archive {
> my ($dev_id, $size, $devname) = ($1, $2, $3);
> $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
> } elsif ($line =~ m/^CTIME: /) {
> - # we correctly received the vma config, so restore old timeout
> - my $tmp = $oldtimeout || 0;
> - $oldtimeout = undef;
> - alarm($tmp);
> -
> + alarm(0);
> &$print_devmap();
> print $fifofh "done\n";
> close($fifofh);
> @@ -6290,7 +6285,7 @@ sub restore_vma_archive {
> };
> my $err = $@;
>
> - alarm($oldtimeout) if $oldtimeout;
> + alarm(0);
>
> $restore_deactivate_volumes->($cfg, $devinfo);
>
>
* Re: [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore
2020-08-12 10:01 [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore Fabian Ebner
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 2/3] restore_vma_archive: get rid of oldtimeout handling Fabian Ebner
2020-08-12 10:01 ` [pve-devel] [PATCH/RFC v2 qemu-server 3/3] restore_vma_archive: remove timeout for reading the device map Fabian Ebner
@ 2020-08-20 8:56 ` Thomas Lamprecht
2020-08-20 9:21 ` Fabian Ebner
2 siblings, 1 reply; 8+ messages in thread
From: Thomas Lamprecht @ 2020-08-20 8:56 UTC (permalink / raw)
To: Proxmox VE development discussion, Fabian Ebner
On 12.08.20 12:01, Fabian Ebner wrote:
> qcow2 images are allocated with --preallocation=metadata,
> which can take a while for large images.
>
> A 5 second timeout is set before reading the device map, so it's
s/seconds/minutes/ ?
> necessary to restore the old timeout before calling print_devmap().
> Time spent allocating now falls under that old timeout.
>
> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
> ---
>
> Changes from v1:
> * instead of increasing the allocation timeout,
> get rid of it as Fabian suggested
>
> PVE/QemuServer.pm | 10 ++++------
> 1 file changed, 4 insertions(+), 6 deletions(-)
>
> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
> index a9c0dac..7169006 100644
> --- a/PVE/QemuServer.pm
> +++ b/PVE/QemuServer.pm
> @@ -6274,15 +6274,13 @@ sub restore_vma_archive {
> my ($dev_id, $size, $devname) = ($1, $2, $3);
> $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
> } elsif ($line =~ m/^CTIME: /) {
> - # we correctly received the vma config, so we can disable
> - # the timeout now for disk allocation (set to 10 minutes, so
> - # that we always timeout if something goes wrong)
> - alarm(600);
> - &$print_devmap();
> - print $fifofh "done\n";
> + # we correctly received the vma config, so restore old timeout
> my $tmp = $oldtimeout || 0;
> $oldtimeout = undef;
> alarm($tmp);
> +
> + &$print_devmap();
> + print $fifofh "done\n";
> close($fifofh);
> }
> };
>
* Re: [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore
2020-08-20 8:56 ` [pve-devel] [PATCH v2 qemu-server 1/3] Fix #2816: remove timeout for allocation on restore Thomas Lamprecht
@ 2020-08-20 9:21 ` Fabian Ebner
0 siblings, 0 replies; 8+ messages in thread
From: Fabian Ebner @ 2020-08-20 9:21 UTC (permalink / raw)
To: Proxmox VE development discussion; +Cc: Thomas Lamprecht
On 20.08.20 at 10:56, Thomas Lamprecht wrote:
> On 12.08.20 12:01, Fabian Ebner wrote:
>> qcow2 images are allocated with --preallocation=metadata,
>> which can take a while for large images.
>>
>> A 5 second timeout is set before reading the device map, so it's
>
> s/seconds/minutes/ ?
>
No, $timeout = 5; is defined further up, see the second patch for the
relevant code pieces.
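For readers following along, the save-and-restore pattern around that 5 second timer can be sketched as follows. This is a standalone illustration under stated assumptions, not the qemu-server code: with_short_timeout is a hypothetical helper, and the only facts relied on are that Perl's alarm() returns the seconds left on the previously set timer and that alarm(0) cancels the timer.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code under a short timeout, then re-arm
# whatever timer the caller had pending before.
sub with_short_timeout {
    my ($seconds, $code) = @_;

    my $prev_alarm = 0;
    my $res = eval {
        local $SIG{ALRM} = sub { die "got timeout\n" };
        # alarm() returns the seconds remaining on the previous timer
        $prev_alarm = alarm($seconds);
        my $r = $code->();
        alarm(0); # work finished in time, cancel the short timer
        $r;
    };
    my $err = $@;

    # Restore the caller's timer (0 means there was none).
    alarm($prev_alarm);

    die $err if $err;
    return $res;
}
```

Note that the time spent inside the helper is not subtracted from the restored timer; the value is simply re-armed, in the same way the patch re-arms $oldtimeout.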
>> necessary to restore the old timeout before calling print_devmap().
>> Time spent allocating now falls under that old timeout.
>>
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>
>> Changes from v1:
>> * instead of increasing the allocation timeout,
>> get rid of it as Fabian suggested
>>
>> PVE/QemuServer.pm | 10 ++++------
>> 1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>> index a9c0dac..7169006 100644
>> --- a/PVE/QemuServer.pm
>> +++ b/PVE/QemuServer.pm
>> @@ -6274,15 +6274,13 @@ sub restore_vma_archive {
>> my ($dev_id, $size, $devname) = ($1, $2, $3);
>> $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
>> } elsif ($line =~ m/^CTIME: /) {
>> - # we correctly received the vma config, so we can disable
>> - # the timeout now for disk allocation (set to 10 minutes, so
>> - # that we always timeout if something goes wrong)
>> - alarm(600);
>> - &$print_devmap();
>> - print $fifofh "done\n";
>> + # we correctly received the vma config, so restore old timeout
>> my $tmp = $oldtimeout || 0;
>> $oldtimeout = undef;
>> alarm($tmp);
>> +
>> + &$print_devmap();
>> + print $fifofh "done\n";
>> close($fifofh);
>> }
>> };
>>
>
* Re: [pve-devel] [PATCH/RFC v2 qemu-server 3/3] restore_vma_archive: remove timeout for reading the device map
2020-08-20 8:53 ` Thomas Lamprecht
@ 2020-08-20 9:22 ` Fabian Ebner
0 siblings, 0 replies; 8+ messages in thread
From: Fabian Ebner @ 2020-08-20 9:22 UTC (permalink / raw)
To: Thomas Lamprecht, Proxmox VE development discussion
On 20.08.20 at 10:53, Thomas Lamprecht wrote:
> On 12.08.20 12:01, Fabian Ebner wrote:
>> If there is no serious problem, it shouldn't be possible to run into
>> this timeout anyways. It's just (extracting and) reading the header of
>> the (compressed) vma file. And if there is a serious problem, then the
>> commands will most likely fail for a different reason, e.g. unable to open,
>> corrupt vma, etc.
>>
>> Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
>> ---
>>
>> New in v2
>>
>> Hope I'm not missing anything important.
>
> The process doing the read can hang on I/O for a long time; that can be pretty normal,
> especially on network-attached storage under some I/O load.
> I'm not saying that this isn't OK at all, but your commit message misleadingly suggests
> that this would either be very short or an immediate error, which isn't exactly true.
>
True, so this timeout can trigger in such cases. The question is whether we
want to die after 5 seconds in that case (current behavior) or whether we
just give it as much time as it needs and let the user cancel if
something hangs completely.
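The two options can be contrasted in a small standalone sketch. It deliberately uses IO::Select rather than the alarm()-based approach in the patch, purely because that makes the "no timeout" case a one-argument change; wait_readable is a hypothetical helper and not part of qemu-server.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical helper: wait until $fh has data to read.
# With $timeout undefined this blocks for as long as it takes (the user
# would have to cancel the task); with a number it dies after that many
# seconds without data.
sub wait_readable {
    my ($fh, $timeout) = @_;

    my $sel = IO::Select->new($fh);
    my @ready = $sel->can_read($timeout);
    die "got timeout reading device map\n" if !@ready;

    return 1;
}
```

Calling wait_readable($fh, 5) corresponds to the current behavior, while wait_readable($fh) corresponds to giving the read as much time as it needs.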
>>
>> PVE/QemuServer.pm | 8 --------
>> 1 file changed, 8 deletions(-)
>>
>> diff --git a/PVE/QemuServer.pm b/PVE/QemuServer.pm
>> index 794819b..342114d 100644
>> --- a/PVE/QemuServer.pm
>> +++ b/PVE/QemuServer.pm
>> @@ -6162,8 +6162,6 @@ sub restore_vma_archive {
>>
>> $add_pipe->(['vma', 'extract', '-v', '-r', $mapfifo, $readfrom, $tmpdir]);
>>
>> - my $timeout = 5;
>> -
>> my $devinfo = {};
>>
>> my $rpcenv = PVE::RPCEnvironment::get();
>> @@ -6260,9 +6258,6 @@ sub restore_vma_archive {
>> local $SIG{QUIT} =
>> local $SIG{HUP} =
>> local $SIG{PIPE} = sub { die "interrupted by signal\n"; };
>> - local $SIG{ALRM} = sub { die "got timeout reading device map\n"; };
>> -
>> - alarm($timeout);
>>
>> my $parser = sub {
>> my $line = shift;
>> @@ -6273,7 +6268,6 @@ sub restore_vma_archive {
>> my ($dev_id, $size, $devname) = ($1, $2, $3);
>> $devinfo->{$devname} = { size => $size, dev_id => $dev_id };
>> } elsif ($line =~ m/^CTIME: /) {
>> - alarm(0);
>> &$print_devmap();
>> print $fifofh "done\n";
>> close($fifofh);
>> @@ -6285,8 +6279,6 @@ sub restore_vma_archive {
>> };
>> my $err = $@;
>>
>> - alarm(0);
>> -
>> $restore_deactivate_volumes->($cfg, $devinfo);
>>
>> unlink $mapfifo;
>>
>