From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <d.csapak@proxmox.com>
Message-ID: <02dea46e-870d-a3b8-fa72-d7e5bca5a7ee@proxmox.com>
Date: Thu, 18 Mar 2021 10:25:19 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101
 Thunderbird/87.0
Content-Language: en-US
To: Dietmar Maurer <dietmar@proxmox.com>,
 Proxmox Backup Server development discussion <pbs-devel@lists.proxmox.com>
References: <20210317133810.24041-1-d.csapak@proxmox.com>
 <1769832677.243.1616046831798@webmail.proxmox.com>
From: Dominik Csapak <d.csapak@proxmox.com>
In-Reply-To: <1769832677.243.1616046831798@webmail.proxmox.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [pbs-devel] [PATCH proxmox-backup] api2/tape/backup: wait
 indefinitely for lock in scheduled backup jobs

On 3/18/21 06:53, Dietmar Maurer wrote:
>> -    // early check/lock before starting worker
>> -    let drive_lock = lock_tape_device(&drive_config, &setup.drive)?;
>> +    // for scheduled jobs we acquire the lock later in the worker
>> +    let drive_lock = if schedule.is_some() {
>> +        None
>> +    } else {
>> +        Some(lock_tape_device(&drive_config, &setup.drive)?)
>> +    };
> 
> What is the reason for the different locking times? Can't we always lock
> later in the worker?
> 

Yes, we can, but a non-scheduled backup only tries the lock once, and
I did not want to start a task for that, since a failed lock would
leave behind a task entry that we can avoid by checking beforehand.
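
To illustrate the split, here is a minimal sketch of the decision made
before the worker is spawned. DriveLock, early_lock and the try_lock
closure are hypothetical stand-ins (the real code calls
lock_tape_device() and uses its own error type), so treat this as an
approximation of the idea rather than the patch itself:

```rust
/// Hypothetical stand-in for the guard returned by lock_tape_device().
#[derive(Debug, PartialEq)]
pub struct DriveLock;

/// Decide whether to take the drive lock before the worker task starts.
///
/// - scheduled job: returns Ok(None), the worker takes the lock later
/// - manual job:    returns Ok(Some(lock)), or the lock error, so that
///                  no task entry is created for a busy drive
pub fn early_lock<F>(scheduled: bool, try_lock: F) -> Result<Option<DriveLock>, String>
where
    F: FnOnce() -> Result<DriveLock, String>,
{
    if scheduled {
        // defer locking to the worker, which can wait for the drive
        Ok(None)
    } else {
        // single attempt; a failure is reported before any task exists
        try_lock().map(Some)
    }
}
```

With this shape a manual job on a busy drive fails right away, while a
scheduled job always gets as far as the worker.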

On 3/18/21 07:04, Dietmar Maurer wrote:
 >
 >> +            let (job_result, summary) = match try_block!({
 >> +                if schedule.is_some() {
 >> +                    // for scheduled tape backup jobs, we wait indefinitely for the lock
 >> +                    task_log!(worker, "waiting for drive lock...");
 >> +                    loop {
 >> +                        if let Ok(lock) = lock_tape_device(&drive_config, &setup.drive) {
 >> +                            drive_lock = Some(lock);
 >> +                            break;
 >> +                        } // ignore errors
 >
 > I would prefer to add a timeout parameter to lock_tape_device() call.
 > The question is if the lock call gets interrupted by task abort?

I did it this way because an abort would not trigger while we are just
waiting on a lock, so I decided to try the lock in a loop with the
check_abort()? call below (so that one can abort the wait).

We could of course add a timeout parameter, but that would not change
the code here (and it currently has no real upside, since we probably
would not use different timeouts anyway?).
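
A rough, self-contained sketch of the loop, to show why the abort check
sits between lock attempts. DriveLock, wait_for_lock and both closures
are hypothetical stand-ins for lock_tape_device() and
worker.check_abort(). The pause parameter is also an assumption, added
only so the sketch does not spin; in the patch each lock attempt is
assumed to block for a while on its own:

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for the guard returned by lock_tape_device().
#[derive(Debug, PartialEq)]
pub struct DriveLock;

/// Retry try_lock until it succeeds, checking for an abort request
/// after every failed attempt so that the wait stays interruptible.
pub fn wait_for_lock<L, A>(
    mut try_lock: L,
    mut check_abort: A,
    pause: Duration,
) -> Result<DriveLock, String>
where
    L: FnMut() -> Result<DriveLock, String>,
    A: FnMut() -> Result<(), String>,
{
    loop {
        if let Ok(lock) = try_lock() {
            return Ok(lock);
        } // lock errors are ignored, we simply try again

        // an abort request ends the wait with an error
        check_abort()?;

        // assumption: short pause between attempts (not in the patch)
        sleep(pause);
    }
}
```

A timeout parameter on the lock call would only change how long one
attempt blocks; the loop and the abort check would stay the same.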

 >
 >> +
 >> +                        worker.check_abort()?;
 >> +                    }
 >> +                }
 >> +                set_tape_device_state(&setup.drive, &worker.upid().to_string())?;