* [pve-devel] [RFC/PATCH qemu] PVE-Backup: avoid segfault issues upon backup-cancel
@ 2022-05-24 11:30 Fabian Ebner
  2022-05-25  8:10 ` Fabian Ebner
  0 siblings, 1 reply; 2+ messages in thread
From: Fabian Ebner @ 2022-05-24 11:30 UTC (permalink / raw)
  To: pve-devel

When canceling a backup in PVE via a signal, it's easy to run into a
situation where the job is already failing when the backup_cancel QMP
command comes in. With a bit of unlucky timing on top, it can happen
that job_exit() runs between the scheduling of job_cancel_bh() and its
execution. But job_cancel_sync() does not expect that the job is
already finalized (in fact, the job might've been freed already, but
even if it isn't, job_cancel_sync() would try to dereference job->txn,
which would be NULL at that point).
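
To make the ordering concrete, here is a minimal stand-alone
illustration of the shape of the race (not QEMU code; the fake_*
helpers only mimic the relevant behavior):

    #include <stdlib.h>

    typedef struct Txn { int refcnt; } Txn;
    typedef struct Job { Txn *txn; } Job;

    /* plays the role of finalization via job_exit(): txn is gone afterwards */
    static void fake_finalize(Job *job)
    {
        free(job->txn);
        job->txn = NULL;
    }

    /* like job_cancel_sync(), assumes the job is NOT finalized yet */
    static void fake_cancel_sync(Job *job)
    {
        job->txn->refcnt++; /* NULL dereference if finalization won the race */
    }

    int main(void)
    {
        Job job = { .txn = calloc(1, sizeof(Txn)) };
        fake_finalize(&job);    /* job_exit() runs first ... */
        fake_cancel_sync(&job); /* ... then the scheduled cancel: crash */
        return 0;
    }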

It is not possible to simply use job_cancel() (which is advertised as
being async, but isn't in all cases) in qmp_backup_cancel() for the
same reason job_cancel_sync() cannot be used. Namely, it can invoke
job_finish_sync() (which uses AIO_WAIT_WHILE and thus hangs if called
from a coroutine). This happens when there are multiple jobs in the
transaction and job->deferred_to_main_loop is true (which is set before
scheduling job_exit()), or if the job was not started yet.

Fix the issue by selecting the job to cancel in job_cancel_bh() itself,
using the first job that's not yet completed. This is not necessarily
the first job in the list, because pvebackup_co_complete_stream()
might not yet have removed a completed job when job_cancel_bh() runs.

An alternative would be to continue using only the first job and
checking against JOB_STATUS_CONCLUDED|JOB_STATUS_NULL to decide whether
it's still necessary and possible to cancel, but the approach of using
the first non-completed job seemed more robust. A sketch of the
alternative follows.
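
For completeness, the rejected alternative would have looked roughly
like this (untested sketch; the status check is the one mentioned
above, everything else mirrors the existing job_cancel_bh() code):

    static void cancel_first_job_if_not_concluded(Job *job)
    {
        /* untested sketch of the rejected alternative: skip cancelling
         * once the job has reached a terminal status */
        if (job && job->status != JOB_STATUS_CONCLUDED &&
            job->status != JOB_STATUS_NULL) {
            AioContext *job_ctx = job->aio_context;
            aio_context_acquire(job_ctx);
            job_cancel_sync(job, true);
            aio_context_release(job_ctx);
        }
    }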

Signed-off-by: Fabian Ebner <f.ebner@proxmox.com>
---

Intended to be ordered after
0038-PVE-Backup-Don-t-block-on-finishing-and-cleanup-crea.patch
or could also be squashed into that (while lifting the commit message
to the main repo). Of course I can also send that directly if this is
ACKed.

 pve-backup.c | 72 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 52 insertions(+), 20 deletions(-)

diff --git a/pve-backup.c b/pve-backup.c
index 6f05796fad..d0da6b63be 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -345,15 +345,45 @@ static void pvebackup_complete_cb(void *opaque, int ret)
 
 /*
  * job_cancel(_sync) does not like to be called from coroutines, so defer to
- * main loop processing via a bottom half.
+ * main loop processing via a bottom half. Assumes that caller holds
+ * backup_mutex and called job_ref on all jobs in backup_state.di_list.
  */
 static void job_cancel_bh(void *opaque) {
     CoCtxData *data = (CoCtxData*)opaque;
-    Job *job = (Job*)data->data;
-    AioContext *job_ctx = job->aio_context;
-    aio_context_acquire(job_ctx);
-    job_cancel_sync(job, true);
-    aio_context_release(job_ctx);
+
+    /*
+     * It's enough to cancel one job in the transaction, the rest will follow
+     * automatically.
+     */
+    bool canceled = false;
+
+    /*
+     * Be careful to pick a valid job to cancel:
+     * 1. job_cancel_sync() does not expect the job to be finalized already.
+     * 2. job_exit() might run between scheduling and running job_cancel_bh()
+     *    and pvebackup_co_complete_stream() might not have removed the job from
+     *    the list yet (in fact, cannot, because it waits for the backup_mutex).
+     * Requiring !job_is_completed() ensures that no finalized job is picked.
+     */
+    GList *bdi = g_list_first(backup_state.di_list);
+    while (bdi) {
+        if (bdi->data) {
+            BlockJob *bj = ((PVEBackupDevInfo *)bdi->data)->job;
+            if (bj) {
+                Job *job = &bj->job;
+                if (!canceled && !job_is_completed(job)) {
+                    AioContext *job_ctx = job->aio_context;
+                    aio_context_acquire(job_ctx);
+                    job_cancel_sync(job, true);
+                    aio_context_release(job_ctx);
+                    canceled = true;
+                }
+                job_unref(job);
+            }
+        }
+        bdi = g_list_next(bdi);
+    }
+
     aio_co_enter(data->ctx, data->co);
 }
 
@@ -374,23 +404,25 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
         proxmox_backup_abort(backup_state.pbs, "backup canceled");
     }
 
-    /* it's enough to cancel one job in the transaction, the rest will follow
-     * automatically */
     GList *bdi = g_list_first(backup_state.di_list);
-    BlockJob *cancel_job = bdi && bdi->data ?
-        ((PVEBackupDevInfo *)bdi->data)->job :
-        NULL;
-
-    if (cancel_job) {
-        CoCtxData data = {
-            .ctx = qemu_get_current_aio_context(),
-            .co = qemu_coroutine_self(),
-            .data = &cancel_job->job,
-        };
-        aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
-        qemu_coroutine_yield();
+    while (bdi) {
+        if (bdi->data) {
+            BlockJob *bj = ((PVEBackupDevInfo *)bdi->data)->job;
+            if (bj) {
+                /* ensure job is not freed before job_cancel_bh() runs */
+                job_ref(&bj->job);
+            }
+        }
+        bdi = g_list_next(bdi);
     }
 
+    CoCtxData data = {
+        .ctx = qemu_get_current_aio_context(),
+        .co = qemu_coroutine_self(),
+    };
+    aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+    qemu_coroutine_yield();
+
     qemu_co_mutex_unlock(&backup_state.backup_mutex);
 }
 
-- 
2.30.2






* Re: [pve-devel] [RFC/PATCH qemu] PVE-Backup: avoid segfault issues upon backup-cancel
  2022-05-24 11:30 [pve-devel] [RFC/PATCH qemu] PVE-Backup: avoid segfault issues upon backup-cancel Fabian Ebner
@ 2022-05-25  8:10 ` Fabian Ebner
  0 siblings, 0 replies; 2+ messages in thread
From: Fabian Ebner @ 2022-05-25  8:10 UTC (permalink / raw)
  To: pve-devel

There might still be an edge case where completion and cancellation
race (I didn't run into it in practice yet, but at first glance it
seems possible):

1. job_exit -> job_completed -> job_finalize_single starts
2. pvebackup_co_complete_stream gets spawned in the completion callback
3. job_finalize_single finishes -> job's refcount hits zero -> job is freed
4. qmp_backup_cancel comes in and locks backup_state.backup_mutex before
pvebackup_co_complete_stream can remove the job from the di_list
5. qmp_backup_cancel/job_cancel_bh will operate on already-freed memory

It /would/ be fine if pvebackup_co_complete_stream were guaranteed to
run/take the backup_mutex before qmp_backup_cancel. It *is* spawned
earlier, so maybe it is, but I haven't looked into ordering guarantees
for coroutines yet, and it does have another yield point when taking
&backup_state.stat.lock, so I'm not so sure.

Possible fix: ref jobs when adding them to di_list and unref them when
removing them from di_list (instead of the ref/unref proposed in this
patch); a rough sketch follows.
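
Rough sketch of that alternative (hypothetical placement; di is the
PVEBackupDevInfo in question):

    /* hypothetical sketch: where the job is created and di is added to
     * backup_state.di_list, take a reference owned by the list */
    di->job = job;
    job_ref(&di->job->job);

    /* ... and in pvebackup_co_complete_stream(), when removing di: */
    backup_state.di_list = g_list_remove(backup_state.di_list, di);
    if (di->job) {
        job_unref(&di->job->job);
        di->job = NULL;
    }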


Yet another issue (not directly related, but thematically close): in
create_backup_jobs_bh, in the error case, job_cancel_sync is called for
each job. But since the jobs are part of a transaction, the first call
will already cancel and free all of them, leading to segfaults in
scenarios where creation of a non-first job fails. And even if the
first one fails, the job_unref there is also wrong, since the job was
already freed during job_cancel_sync. See the paraphrased sketch below.
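
Paraphrased, the error path has roughly this shape (not the literal
source, just the pattern described above):

    GList *l = backup_state.di_list;
    while (l) {
        PVEBackupDevInfo *di = (PVEBackupDevInfo *)l->data;
        l = g_list_next(l);
        if (di->job) {
            /* the first call cancels and frees ALL jobs in the transaction */
            job_cancel_sync(&di->job->job, true);
            /* so this unref (and every later iteration) touches freed memory */
            job_unref(&di->job->job);
        }
    }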




