public inbox for pve-user@lists.proxmox.com
* [PVE-User] CT replication error
@ 2020-09-03 12:56 Adam Weremczuk
  2020-09-09  7:51 ` Fabian Ebner
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Weremczuk @ 2020-09-03 12:56 UTC (permalink / raw)
  To: pve-user

Hi all,

I have a dual-host setup, PVE 6.2-6.

All containers replicate fine except for 102 giving the following:

Sep  3 13:49:00 node1 systemd[1]: Starting Proxmox VE replication runner...
Sep  3 13:49:02 node1 zed: eid=7290 class=history_event 
pool_guid=0x33A69221E174DDE9
Sep  3 13:49:03 node1 pvesr[6852]: send/receive failed, cleaning up 
snapshot(s)..
Sep  3 13:49:03 node1 pvesr[6852]: 102-0: got unexpected replication job 
error - command 'set -o pipefail && pvesm export 
zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -snapshot 
__replicate_102-0_1599137341__ | /usr/bin/ssh -e none -o 'BatchMode=yes' 
-o 'HostKeyAlias=node2' root@192.168.100.2 -- pvesm import 
zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' 
failed: exit code 255
Sep  3 13:49:03 node1 zed: eid=7291 class=history_event 
pool_guid=0x33A69221E174DDE9

Any idea what the problem is and how to fix it?

Regards,
Adam





* Re: [PVE-User] CT replication error
  2020-09-03 12:56 [PVE-User] CT replication error Adam Weremczuk
@ 2020-09-09  7:51 ` Fabian Ebner
  2020-09-09 18:11   ` Adam Weremczuk
  0 siblings, 1 reply; 4+ messages in thread
From: Fabian Ebner @ 2020-09-09  7:51 UTC (permalink / raw)
  To: pve-user

Hi,
could you check the replication log itself? There might be more 
information there. Do the working replications use the same storages as 
the failing one?
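
For reference, a rough sketch of how to pull that up on the source node
(the job ID 102-0 is taken from your syslog; paths and commands assume a
stock PVE 6.x install):

  # summary of all replication jobs and their last result
  pvesr status

  # full log of the most recent run of job 102-0
  cat /var/log/pve/replicate/102-0

  # schedule the job to run again right away and produce a fresh log
  pvesr schedule-now 102-0

The same log is also available in the GUI via the container's
Replication panel.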

On 03.09.20 at 14:56, Adam Weremczuk wrote:
> Hi all,
> 
> I have a dual-host setup, PVE 6.2-6.
> 
> All containers replicate fine except for 102 giving the following:
> 
> Sep  3 13:49:00 node1 systemd[1]: Starting Proxmox VE replication runner...
> Sep  3 13:49:02 node1 zed: eid=7290 class=history_event 
> pool_guid=0x33A69221E174DDE9
> Sep  3 13:49:03 node1 pvesr[6852]: send/receive failed, cleaning up 
> snapshot(s)..
> Sep  3 13:49:03 node1 pvesr[6852]: 102-0: got unexpected replication job 
> error - command 'set -o pipefail && pvesm export 
> zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -snapshot 
> __replicate_102-0_1599137341__ | /usr/bin/ssh -e none -o 'BatchMode=yes' 
> -o 'HostKeyAlias=node2' root@192.168.100.2 -- pvesm import 
> zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -allow-rename 0' 
> failed: exit code 255
> Sep  3 13:49:03 node1 zed: eid=7291 class=history_event 
> pool_guid=0x33A69221E174DDE9
> 
> Any idea what the problem is and how to fix it?
> 
> Regards,
> Adam
> 
> 
> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user





* Re: [PVE-User] CT replication error
  2020-09-09  7:51 ` Fabian Ebner
@ 2020-09-09 18:11   ` Adam Weremczuk
  2020-09-10  8:58     ` Fabian Ebner
  0 siblings, 1 reply; 4+ messages in thread
From: Adam Weremczuk @ 2020-09-09 18:11 UTC (permalink / raw)
  To: pve-user

Hi Fabian,

Yes, all replications use the same shared ZFS storage.

Are you referring to the /var/log/pve/replicate/102-0 file?

It seems to only hold information about the last run.

Anyway, my problem turned out to be node2 still holding 
zfs-pool/subvol-102-disk-0 from the previous container.

I had deleted the old container from the web GUI before creating a new 
one in its place (id 102).

For some reason node2 still had the old disk. Once I rm'ed it from the 
shell of node2, replication started working for CT 102.
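
In case it helps anyone else, a sketch of that cleanup on the target
node (dataset name taken from the error above; double-check with
zfs list before destroying anything):

  # on node2: confirm the leftover dataset and any stale replication snapshots
  zfs list -t all | grep subvol-102-disk-0

  # remove the stale dataset together with its snapshots;
  # the next replication run then starts over with a full sync
  zfs destroy -r zfs-pool/subvol-102-disk-0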

Regards,
Adam

On 09/09/2020 08:51, Fabian Ebner wrote:
> Hi,
> could you check the replication log itself? There might be more 
> information there. Do the working replications use the same storages 
> as the failing one?
>
> On 03.09.20 at 14:56, Adam Weremczuk wrote:
>> Hi all,
>>
>> I have a dual-host setup, PVE 6.2-6.
>>
>> All containers replicate fine except for 102 giving the following:
>>
>> Sep  3 13:49:00 node1 systemd[1]: Starting Proxmox VE replication 
>> runner...
>> Sep  3 13:49:02 node1 zed: eid=7290 class=history_event 
>> pool_guid=0x33A69221E174DDE9
>> Sep  3 13:49:03 node1 pvesr[6852]: send/receive failed, cleaning up 
>> snapshot(s)..
>> Sep  3 13:49:03 node1 pvesr[6852]: 102-0: got unexpected replication 
>> job error - command 'set -o pipefail && pvesm export 
>> zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -snapshot 
>> __replicate_102-0_1599137341__ | /usr/bin/ssh -e none -o 
>> 'BatchMode=yes' -o 'HostKeyAlias=node2' root@192.168.100.2 -- pvesm 
>> import zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 
>> -allow-rename 0' failed: exit code 255
>> Sep  3 13:49:03 node1 zed: eid=7291 class=history_event 
>> pool_guid=0x33A69221E174DDE9
>>
>> Any idea what the problem is and how to fix it?
>>
>> Regards,
>> Adam
>>
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>
>
> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user




* Re: [PVE-User] CT replication error
  2020-09-09 18:11   ` Adam Weremczuk
@ 2020-09-10  8:58     ` Fabian Ebner
  0 siblings, 0 replies; 4+ messages in thread
From: Fabian Ebner @ 2020-09-10  8:58 UTC (permalink / raw)
  To: pve-user

Hi,
glad you were able to resolve the issue. Did you use 'purge' to remove 
the container? Doing that does not (yet) clean up the replicated volumes 
on the remote nodes. We're probably going to change that.
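
Until then, one way to spot such leftovers before reusing a VMID is to
check the target node directly (storage and pool names as used in this
thread):

  # list the volumes the target node still has on the replication storage
  pvesm list zfs-pool | grep 102

  # or look at the ZFS datasets directly
  zfs list -t all | grep subvol-102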

On 09.09.20 at 20:11, Adam Weremczuk wrote:
> Hi Fabian,
> 
> Yes, all replications use the same shared ZFS storage.
> 
> Are you referring to the /var/log/pve/replicate/102-0 file?
> 
> It seems to only hold information about the last run.
> 
> Anyway, my problem turned out to be node2 still holding 
> zfs-pool/subvol-102-disk-0 from the previous container.
> 
> I had deleted the old container from the web GUI before creating a new 
> one in its place (id 102).
> 
> For some reason node2 still had the old disk. Once I rm'ed it from the 
> shell of node2, replication started working for CT 102.
> 
> Regards,
> Adam
> 
> On 09/09/2020 08:51, Fabian Ebner wrote:
>> Hi,
>> could you check the replication log itself? There might be more 
>> information there. Do the working replications use the same storages 
>> as the failing one?
>>
>> On 03.09.20 at 14:56, Adam Weremczuk wrote:
>>> Hi all,
>>>
>>> I have a dual-host setup, PVE 6.2-6.
>>>
>>> All containers replicate fine except for 102 giving the following:
>>>
>>> Sep  3 13:49:00 node1 systemd[1]: Starting Proxmox VE replication 
>>> runner...
>>> Sep  3 13:49:02 node1 zed: eid=7290 class=history_event 
>>> pool_guid=0x33A69221E174DDE9
>>> Sep  3 13:49:03 node1 pvesr[6852]: send/receive failed, cleaning up 
>>> snapshot(s)..
>>> Sep  3 13:49:03 node1 pvesr[6852]: 102-0: got unexpected replication 
>>> job error - command 'set -o pipefail && pvesm export 
>>> zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 -snapshot 
>>> __replicate_102-0_1599137341__ | /usr/bin/ssh -e none -o 
>>> 'BatchMode=yes' -o 'HostKeyAlias=node2' root@192.168.100.2 -- pvesm 
>>> import zfs-pool:subvol-102-disk-0 zfs - -with-snapshots 1 
>>> -allow-rename 0' failed: exit code 255
>>> Sep  3 13:49:03 node1 zed: eid=7291 class=history_event 
>>> pool_guid=0x33A69221E174DDE9
>>>
>>> Any idea what the problem is and how to fix it?
>>>
>>> Regards,
>>> Adam
>>>
>>>
>>> _______________________________________________
>>> pve-user mailing list
>>> pve-user@lists.proxmox.com
>>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
>>
>>
>> _______________________________________________
>> pve-user mailing list
>> pve-user@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user
> 
> _______________________________________________
> pve-user mailing list
> pve-user@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user





end of thread  [~2020-09-10  8:59 UTC]

Thread overview: 4+ messages
-- links below jump to the message on this page --
2020-09-03 12:56 [PVE-User] CT replication error Adam Weremczuk
2020-09-09  7:51 ` Fabian Ebner
2020-09-09 18:11   ` Adam Weremczuk
2020-09-10  8:58     ` Fabian Ebner
