* [PVE-User] CT replication error
From: Adam Weremczuk @ 2020-09-03 12:56 UTC
To: pve-user
Hi all,
I have a dual-host setup running PVE 6.2-6.
All containers replicate fine except for 102, which gives the following:
Sep 3 13:49:00 node1 systemd[1]: Starting Proxmox VE replication runner...
Sep 3 13:49:02 node1 zed: eid=7290 class=history_event
pool_guid=0x33A69221E174DDE9
Sep 3 13:49:03 node1 pvesr[6852]: send/receive failed, cleaning up
snapshot(s)..
Sep 3 13:49:03 node1 pvesr[6852]: 102-0: got unexpected replication job
error - command 'set -o pipefail && pvesm export
zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -snapshot
__replicate_102-0_1599137341__ | /usr/bin/ssh -e none -o 'BatchMode=yes'
-o 'HostKeyAlias=node2' root@192.168.100.2 -- pvesm import
zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0'
failed: exit code 255
Sep 3 13:49:03 node1 zed: eid=7291 class=history_event
pool_guid=0x33A69221E174DDE9
Any idea what the problem is and how to fix it?
Regards,
Adam
* Re: [PVE-User] CT replication error
From: Fabian Ebner @ 2020-09-09 7:51 UTC
To: pve-user
Hi,
could you check the replication log itself? There might be more
information there. Do the working replications use the same storages as
the failing one?
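For example, assuming a default setup (the job ID 102-0 is taken from your log; the exact path may differ):
  # on the source node: state and last error of each replication job
  pvesr status
  # full log of the most recent run of this particular job
  cat /var/log/pve/replicate/102-0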
* Re: [PVE-User] CT replication error
From: Adam Weremczuk @ 2020-09-09 18:11 UTC
To: pve-user
Hi Fabian,
Yes, all replications use the same shared ZFS storage.
Are you referring to the /var/log/pve/replicate/102-0 file?
It seems to only hold information about the last run.
Anyway, my problem turned out to be node2 still holding
zfs-pool/subvol-102-disk-0 from a previous container.
I had deleted the old container from the web GUI before creating a new
one in its place (ID 102).
For some reason node2 still had the old disk. Once I removed it from the
shell on node2, replication started working for CT 102.
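In case anyone hits the same thing, the fix boils down to something like
this on the target node (dataset name taken from the error above; verify
with zfs list before destroying anything):
  # on node2: confirm the stale dataset left over from the old container
  zfs list -r zfs-pool | grep subvol-102
  # remove it so the next replication run can send a fresh copy
  # (zfs destroy needs -r if old replication snapshots still hang off it)
  zfs destroy zfs-pool/subvol-102-disk-0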
Regards,
Adam
* Re: [PVE-User] CT replication error
From: Fabian Ebner @ 2020-09-10 8:58 UTC
To: pve-user
Hi,
glad you were able to resolve the issue. Did you use 'purge' to remove
the container? Doing that does not (yet) clean up the replicated volumes
on the remote nodes. We're probably going to change that.
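Until that changes, it is worth checking the target node by hand after
removing a replicated container, for example (storage and pool names as
used in this thread):
  # on the replication target: leftover volumes belonging to the removed CT
  pvesm list zfs-pool | grep subvol-102
  # or directly at the ZFS level
  zfs list -r zfs-pool | grep subvol-102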