Message-ID: <39504c85-1fbb-44a3-850c-0d8132c01c09@proxmox.com>
Date: Thu, 19 Sep 2024 09:53:53 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird Beta
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
References: <mailman.275.1726558174.414.pve-devel@lists.proxmox.com>
Content-Language: en-US
From: Dominik Csapak <d.csapak@proxmox.com>
In-Reply-To: <mailman.275.1726558174.414.pve-devel@lists.proxmox.com>
Subject: Re: [pve-devel] [Veeam] Veeam change requests?
Reply-To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>
Cc: Andreas Neufert <Andreas.Neufert@veeam.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"; Format="flowed"

On 9/17/24 09:20, Andreas Neufert via pve-devel wrote:
> 
> Hi Proxmox Dev team,
> 

Hi,

> Tim Marx mentioned that you have some insights and change wishes for the Veeam backup processing and that we should reach out to this list. We would be happy to get this feedback here to be able to address it in our code or join a call if this helps.

Thanks for reaching out!

During some (very basic and short) testing, I discovered a few things that are problematic from our
point of view:

* During backup, there is often a long-running connection held open to the QMP socket of running VMs
   (/var/run/qemu-server/XXXX.qmp, where XXXX is the vmid). This blocks our management stack from
   doing certain tasks, like start/stop (which probably does not matter during a backup), but also
   things like the VNC console, etc.

   A better way would be to close each connection as soon as possible instead of keeping it
   open; see the sketch below. (Alternatively, using our API/CLI could also work, but I don't
   know which exact QMP commands you're running.)

   If you absolutely need a longer-running socket, please open a bug report on
   https://bugzilla.proxmox.com so we can discuss and track there how we could make
   a socket available that is not used by our stack.
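
   To illustrate the first point, here is a minimal, untested sketch of such a short-lived
   QMP session in Perl. The socket path and the QMP handshake ('qmp_capabilities' first,
   then regular commands) are real; the vmid and the 'query-status' command are just
   placeholders for illustration:

       #!/usr/bin/perl
       use strict;
       use warnings;
       use IO::Socket::UNIX;
       use JSON;

       my $vmid = 100; # placeholder vmid

       # connect only when a command actually needs to be sent ...
       my $sock = IO::Socket::UNIX->new(
           Peer => "/var/run/qemu-server/$vmid.qmp",
       ) or die "cannot connect to QMP socket: $!\n";

       my $greeting = <$sock>; # QMP greeting banner

       # enter command mode
       print $sock encode_json({ execute => 'qmp_capabilities' }), "\n";
       my $ack = <$sock>;

       # run the actual command (QMP replies are newline-delimited JSON)
       print $sock encode_json({ execute => 'query-status' }), "\n";
       print scalar <$sock>;

       # ... and close the connection again right away
       close($sock);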

* Another thing I noticed is that it's not really visible whether a backup is running
   for a particular VM, so users might accidentally shut them down (or pause them, etc.). I think
   it's especially bad if the VM is placed under an HA policy that has 'stopped' as its target, as
   that will try to stop the VM by itself. (Though this might be a configuration error in itself?)

   A quick way to fix this would be to set a (custom) lock on our VMs. For longer-running tasks
   that block a guest, we have a line 'lock: XXXX' in the config that prevents our stack
   from performing most operations.

   Setting that would be a very short call into our Perl code that locks the config locally
   (`PVE::QemuConfig->lock_config($vmid, $updatefn)`), checks for existing locks,
   updates the config with a new (custom) lock, and writes it back; see the sketch below.

   Though I must admit I'm not sure whether custom locks outside of our defined ones would work,
   but I'm sure we could add a 'custom' lock that you could use, should my suggested
   approach not work properly.
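
   For reference, a rough, untested sketch of that approach, reusing our existing
   'backup' lock value (the vmid and the error handling are placeholders):

       use PVE::QemuConfig;

       my $vmid = 100; # placeholder vmid

       # set the lock before the backup starts
       PVE::QemuConfig->lock_config($vmid, sub {
           my $conf = PVE::QemuConfig->load_config($vmid);
           die "VM $vmid is already locked ($conf->{lock})\n" if $conf->{lock};
           $conf->{lock} = 'backup'; # writes a 'lock: backup' line to the config
           PVE::QemuConfig->write_config($vmid, $conf);
       });

       # ... do the backup ...

       # and drop the lock again when the backup is done
       PVE::QemuConfig->lock_config($vmid, sub {
           my $conf = PVE::QemuConfig->load_config($vmid);
           delete $conf->{lock};
           PVE::QemuConfig->write_config($vmid, $conf);
       });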

* Also, I noticed that when a guest is started from your stack, you modify the QEMU command line a
   bit, namely removing some options that would be necessary to start the VM during the backup.
   Is there a specific reason why you do it this way instead of starting the VM through
   our API/CLI (examples below)?
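
   For reference, starting a guest through our stack is a single command, e.g. on the CLI:

       qm start <vmid>

   or via the API (here through pvesh):

       pvesh create /nodes/<node>/qemu/<vmid>/status/start

   Both build the full QEMU command line exactly as the rest of our stack expects it.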


One more general question: what is the process for our/your users and for us if they/we
find a bug? Where can bugs be reported to you?

I hope this helps

Best regards
Dominik

