From: Stefan Reiter
To: pve-devel@lists.proxmox.com
Date: Thu, 29 Oct 2020 14:10:34 +0100
Message-Id: <20201029131036.11786-5-s.reiter@proxmox.com>
In-Reply-To: <20201029131036.11786-1-s.reiter@proxmox.com>
References: <20201029131036.11786-1-s.reiter@proxmox.com>
Subject: [pve-devel] [PATCH v2 qemu 4/6] PVE: Don't call job_cancel in coroutines

...because it hangs on cancelling other jobs in the txn if you do.

Signed-off-by: Stefan Reiter
---

v2:
* use new CoCtxData
* use aio_co_enter vs aio_co_schedule for BH return
* cache job_ctx since job_cancel_sync might switch the job to a different
  context (when iothreads are in use) thus making us drop the wrong AioContext
  if we access job->aio_context again. This is incidentally the same bug I once
  fixed for upstream, almost made it in again...

 pve-backup.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/pve-backup.c b/pve-backup.c
index 92eaada0bc..0466145bec 100644
--- a/pve-backup.c
+++ b/pve-backup.c
@@ -332,6 +332,20 @@ static void pvebackup_complete_cb(void *opaque, int ret)
     aio_co_enter(qemu_get_aio_context(), co);
 }
 
+/*
+ * job_cancel(_sync) does not like to be called from coroutines, so defer to
+ * main loop processing via a bottom half.
+ */
+static void job_cancel_bh(void *opaque) {
+    CoCtxData *data = (CoCtxData*)opaque;
+    Job *job = (Job*)data->data;
+    AioContext *job_ctx = job->aio_context;
+    aio_context_acquire(job_ctx);
+    job_cancel_sync(job);
+    aio_context_release(job_ctx);
+    aio_co_enter(data->ctx, data->co);
+}
+
 static void coroutine_fn pvebackup_co_cancel(void *opaque)
 {
     Error *cancel_err = NULL;
@@ -357,7 +371,13 @@ static void coroutine_fn pvebackup_co_cancel(void *opaque)
         NULL;
 
     if (cancel_job) {
-        job_cancel(&cancel_job->job, false);
+        CoCtxData data = {
+            .ctx = qemu_get_current_aio_context(),
+            .co = qemu_coroutine_self(),
+            .data = &cancel_job->job,
+        };
+        aio_bh_schedule_oneshot(data.ctx, job_cancel_bh, &data);
+        qemu_coroutine_yield();
     }
 
     qemu_co_mutex_unlock(&backup_state.backup_mutex);
-- 
2.20.1