Message-ID: <02dea46e-870d-a3b8-fa72-d7e5bca5a7ee@proxmox.com>
Date: Thu, 18 Mar 2021 10:25:19 +0100
From: Dominik Csapak
To: Dietmar Maurer, Proxmox Backup Server development discussion
References: <20210317133810.24041-1-d.csapak@proxmox.com> <1769832677.243.1616046831798@webmail.proxmox.com>
In-Reply-To: <1769832677.243.1616046831798@webmail.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup] api2/tape/backup: wait indefinitely for lock in scheduled backup jobs

On 3/18/21 06:53, Dietmar Maurer wrote:
>> -    // early check/lock before starting worker
>> -    let drive_lock = lock_tape_device(&drive_config, &setup.drive)?;
>> +    // for scheduled jobs we acquire the lock later in the worker
>> +    let drive_lock = if schedule.is_some() {
>> +        None
>> +    } else {
>> +        Some(lock_tape_device(&drive_config, &setup.drive)?)
>> +    };
>
> What is the reason for the different locking times? Can't we always lock
> later in the worker?
>

Yes, we can, but for the non-scheduled backup we only have one try at
the lock, and I did not want to start a task for that, since it would
generate a task entry that is preventable by checking beforehand.

On 3/18/21 07:04, Dietmar Maurer wrote:
>
>> +    let (job_result, summary) = match try_block!({
>> +        if schedule.is_some() {
>> +            // for scheduled tape backup jobs, we wait indefinitely for the lock
>> +            task_log!(worker, "waiting for drive lock...");
>> +            loop {
>> +                if let Ok(lock) = lock_tape_device(&drive_config, &setup.drive) {
>> +                    drive_lock = Some(lock);
>> +                    break;
>> +                } // ignore errors
>
> I would prefer to add a timeout parameter to lock_tape_device() call.
> The question is if the lock call gets interrupted by task abort?

I did it this way because an abort would not trigger while we are just
waiting on a lock, so I decided to try the lock in a loop with the
check_abort()? call below (so one can abort this).

We could of course add a timeout parameter, but that would not change
the code here (and it currently has no real upside, since we probably
would not use different timeouts anyway?).

>
>> +
>> +            worker.check_abort()?;
>> +        }
>> +    }
>> +    set_tape_device_state(&setup.drive, &worker.upid().to_string())?;
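
For illustration, the retry-with-abort-check pattern discussed above can be sketched outside of proxmox-backup using only the standard library. This is a minimal sketch, not the patch itself: `wait_for_lock`, `WaitError`, the `AtomicBool` abort flag, and the explicit poll interval are all hypothetical stand-ins for `lock_tape_device()` and `worker.check_abort()`, and a `std::sync::Mutex` stands in for the drive lock. Unlike the quoted loop, which ignores every lock error, the sketch treats a poisoned mutex as fatal.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Mutex, MutexGuard, TryLockError};
use std::thread;
use std::time::Duration;

/// Why the wait ended without a lock (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum WaitError {
    /// The abort flag was set while we were still waiting.
    Aborted,
    /// A previous lock holder panicked; retrying would not help.
    Poisoned,
}

/// Poll `lock` until it can be acquired, checking `abort` between attempts.
///
/// Assumption: a non-blocking try-lock plus a short sleep stands in for a
/// lock call that returns an error when the lock is currently held.
pub fn wait_for_lock<'a, T>(
    lock: &'a Mutex<T>,
    abort: &AtomicBool,
    poll_interval: Duration,
) -> Result<MutexGuard<'a, T>, WaitError> {
    loop {
        match lock.try_lock() {
            // Got the lock: hand the guard to the caller.
            Ok(guard) => return Ok(guard),
            // Lock is held by someone else: fall through and retry.
            Err(TryLockError::WouldBlock) => {}
            Err(TryLockError::Poisoned(_)) => return Err(WaitError::Poisoned),
        }

        // Equivalent of the abort check in the loop: gives the user a
        // chance to cancel a job that is only waiting for the lock.
        if abort.load(Ordering::SeqCst) {
            return Err(WaitError::Aborted);
        }

        thread::sleep(poll_interval);
    }
}
```

Because the abort flag is checked after every failed attempt, the longest a cancel request can go unnoticed is one poll interval (or, with a blocking lock call, one lock timeout), which is the property the loop in the patch relies on.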