From: Marco Gaiarin <gaio@lilliput.linux.it>
To: pve-user@lists.proxmox.com
Subject: Replication error: dataset is busy
Date: Fri, 13 Feb 2026 09:43:08 +0100
Message-ID: <qp436m-odc.ln1@leia.lilliput.linux.it>


Situation: a pair of PVE nodes, with a direct link between the two to
handle replication and migration.
Due to a hardware failure, the NIC on one of the servers (cnpve2) died, and
the server had to be powered off.

After node cnpve2 rebooted, all replication jobs recovered, apart from one
(running on cnpve2):

 2026-02-13 09:26:01 121-0: start replication job
 2026-02-13 09:26:01 121-0: guest => VM 121, running => 4345
 2026-02-13 09:26:01 121-0: volumes => local-zfs:vm-121-disk-0,rpool-data:vm-121-disk-0,rpool-data:vm-121-disk-1
 2026-02-13 09:26:04 121-0: freeze guest filesystem
 2026-02-13 09:26:06 121-0: create snapshot '__replicate_121-0_1770971161__' on local-zfs:vm-121-disk-0
 2026-02-13 09:26:06 121-0: create snapshot '__replicate_121-0_1770971161__' on rpool-data:vm-121-disk-0
 2026-02-13 09:26:06 121-0: create snapshot '__replicate_121-0_1770971161__' on rpool-data:vm-121-disk-1
 2026-02-13 09:26:06 121-0: thaw guest filesystem
 2026-02-13 09:26:06 121-0: using insecure transmission, rate limit: 10 MByte/s
 2026-02-13 09:26:06 121-0: incremental sync 'local-zfs:vm-121-disk-0' (__replicate_121-0_1770876001__ => __replicate_121-0_1770971161__)
 2026-02-13 09:26:06 121-0: using a bandwidth limit of 10000000 bytes per second for transferring 'local-zfs:vm-121-disk-0'
 2026-02-13 09:26:08 121-0: send from @__replicate_121-0_1770876001__ to rpool/data/vm-121-disk-0@__replicate_121-0_1770971161__ estimated size is 2.76G
 2026-02-13 09:26:08 121-0: total estimated size is 2.76G
 2026-02-13 09:26:08 121-0: TIME        SENT   SNAPSHOT rpool/data/vm-121-disk-0@__replicate_121-0_1770971161__
 2026-02-13 09:26:08 121-0: 663540 B 648.0 KB 0.69 s 964531 B/s 941.92 KB/s
 2026-02-13 09:26:08 121-0: write: Broken pipe
 2026-02-13 09:26:08 121-0: warning: cannot send 'rpool/data/vm-121-disk-0@__replicate_121-0_1770971161__': signal received
 2026-02-13 09:26:08 121-0: cannot send 'rpool/data/vm-121-disk-0': I/O error
 2026-02-13 09:26:08 121-0: command 'zfs send -Rpv -I __replicate_121-0_1770876001__ -- rpool/data/vm-121-disk-0@__replicate_121-0_1770971161__' failed: exit code 1
 2026-02-13 09:26:08 121-0: [cnpve1] cannot receive incremental stream: dataset is busy
 2026-02-13 09:26:08 121-0: [cnpve1] command 'zfs recv -F -- rpool/data/vm-121-disk-0' failed: exit code 1
 2026-02-13 09:26:08 121-0: delete previous replication snapshot '__replicate_121-0_1770971161__' on local-zfs:vm-121-disk-0
 2026-02-13 09:26:08 121-0: delete previous replication snapshot '__replicate_121-0_1770971161__' on rpool-data:vm-121-disk-0
 2026-02-13 09:26:08 121-0: delete previous replication snapshot '__replicate_121-0_1770971161__' on rpool-data:vm-121-disk-1
 2026-02-13 09:26:08 121-0: end replication job with error: failed to run insecure migration: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=cnpve1' -o 'UserKnownHostsFile=/etc/pve/nodes/cnpve1/ssh_known_hosts' -o 'GlobalKnownHostsFile=none' root@10.10.251.21 -- pvesm import local-zfs:vm-121-disk-0 zfs tcp://10.10.251.0/24 -with-snapshots 1 -snapshot __replicate_121-0_1770971161__ -allow-rename 0 -base __replicate_121-0_1770876001__' failed: exit code 255
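
Since cnpve1 was never rebooted, my first suspicion is a 'zfs recv' (or the
'pvesm import' wrapping it, visible in the log above) that was cut off when
the NIC died and is still holding the dataset open. A quick check to run on
the receiving node (just a sketch):

 root@cnpve1:~# ps aux | grep -v grep | grep -E 'zfs recv|pvesm import'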

On the rebooted node there are no holds:

 root@cnpve2:~# zfs list -t snapshot | grep 121
 rpool-data/vm-121-disk-0@__replicate_121-0_1770876001__      5.99G      -  2.49T  -
 rpool-data/vm-121-disk-1@__replicate_121-0_1770876001__       115M      -  22.1G  -
 rpool/data/vm-121-disk-0@__replicate_121-0_1770876001__      1.57G      -  35.4G  -
 root@cnpve2:~# zfs holds rpool-data/vm-121-disk-0@__replicate_121-0_1770876001__
 NAME                                                     TAG  TIMESTAMP
 root@cnpve2:~# zfs holds rpool-data/vm-121-disk-1@__replicate_121-0_1770876001__
 NAME                                                     TAG  TIMESTAMP
 root@cnpve2:~# zfs holds rpool/data/vm-121-disk-0@__replicate_121-0_1770876001__
 NAME                                                     TAG  TIMESTAMP

There are none on the opposite node either:

 root@cnpve1:~# zfs list -t snapshot | grep 121
 rpool-data/vm-121-disk-0@__replicate_121-0_1770876001__         0B      -  2.49T  -
 rpool-data/vm-121-disk-1@__replicate_121-0_1770876001__         0B      -  22.1G  -
 rpool/data/vm-121-disk-0@__replicate_121-0_1770876001__         0B      -  35.4G  -
 root@cnpve1:~# zfs holds rpool-data/vm-121-disk-0@__replicate_121-0_1770876001__
 NAME                                                     TAG  TIMESTAMP
 root@cnpve1:~# zfs holds rpool-data/vm-121-disk-1@__replicate_121-0_1770876001__
 NAME                                                     TAG  TIMESTAMP
 root@cnpve1:~# zfs holds rpool/data/vm-121-disk-0@__replicate_121-0_1770876001__
 NAME                                                     TAG  TIMESTAMP
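
For completeness, the same hold check can be done over all snapshots of the
VM in one pass ('zfs holds' accepts a list of snapshots, so xargs can feed
it the grep result; a sketch, same filter as above):

 root@cnpve1:~# zfs list -H -t snapshot -o name | grep 121 | xargs -r zfs holds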

It is clear that something remains 'locked' on the non-rebooted node
(cnpve1), but how can I identify and unlock it?
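
One more thing I can think of checking on cnpve1 is whether an interrupted
resumable receive left partial state behind; if 'receive_resume_token'
shows anything other than '-', 'zfs receive -A' should discard that state
(a sketch, untested here, and I am not sure PVE replication uses resumable
sends at all):

 root@cnpve1:~# zfs get -H -o value receive_resume_token rpool/data/vm-121-disk-0
 root@cnpve1:~# zfs receive -A rpool/data/vm-121-disk-0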


Thanks.
