public inbox for pbs-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [pbs-devel] Some problems with fullstorage and cleanup
@ 2020-08-31 11:14 Harald Leithner
  2020-08-31 11:39 ` Fabian Grünbichler
  0 siblings, 1 reply; 6+ messages in thread
From: Harald Leithner @ 2020-08-31 11:14 UTC (permalink / raw)
  To: pbs-devel


[-- Attachment #1.1: Type: text/plain, Size: 1231 bytes --]

Hi,

my test storage ran out of disk space.

This happened on version 0.8.11.

I tried to forget the oldest snapshots, but this didn't change the disk
usage. After that I started a manual GC (I assume that's garbage
collection). It failed after phase one with an error message; I didn't
copy the error message... (I miss having a log panel in the GUI.)

I also configured GC and prune schedules at the same time, and told the
VM that backs up to the PBS to keep only 20 copies.

After a while I came back to the GUI and had only 3 snapshots left and
one backup in progress (that's maybe correct because of 1 yearly, 1
monthly, 1 daily and 2 last).
The Statistics tab still says 100% usage, and "zfs list" shows the same
usage.

Starting a manual GC now ends with the error message:

unable to parse active worker status
'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:'
- not a valid user id

Maybe because a backup is running?

Is there a way to free up the unused disk space manually?

thx

Harald




-- 
ITronic

Harald Leithner
Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
Tel: +43-1-545 0 604
Mobil: +43-699-123 78 4 78
Mail: leithner@itronic.at | itronic.at


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 659 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pbs-devel] Some problems with fullstorage and cleanup
  2020-08-31 11:14 [pbs-devel] Some problems with fullstorage and cleanup Harald Leithner
@ 2020-08-31 11:39 ` Fabian Grünbichler
  2020-08-31 12:32   ` Harald Leithner
  0 siblings, 1 reply; 6+ messages in thread
From: Fabian Grünbichler @ 2020-08-31 11:39 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On August 31, 2020 1:14 pm, Harald Leithner wrote:
> Hi,
> 
> my test stroage run out of diskspace.
> 
> This happens at Version 0.8-11
> 
> I tried to forget the olders snapshots but this doesn't change the disk
> usage. After this I started a manual GC (I assume it's garbage
> collection). It failed after phase one with an error message. I didn't
> copied the error message... (I miss the log panel in the gui).
> 
> I also configured GC and Prune Schedules at the same time and told the
> vm that backups to the pbs to keep only 20 copies.
> 
> After a while I came back to the gui and has only 3 snapshots left and
> one backup in progress (Thats maybe correct because of 1 yearly 1
> monthly 1 daily and 2 last).
> The Statistics Tab still says 100% usage and "zfs list" lists the same
> usage.
> 
> Starting now a manual GC ends in the error message:
> 
> unable to parse active worker status
> 'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:'
> - not a valid user id

this was a known issue that should be fixed on upgrade to 0.8.11-1. can 
you run 'grep termproxy /var/log/apt/term.log' on the PBS server?

you can fix up the task index by running the sed command from

/var/lib/dpkg/info/proxmox-backup-server.postinst

which replaces the invalid user 'root' with the correct 'root@pam'.
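
A minimal sketch of that kind of fixup, run here against a throwaway file rather than the real task index (see the postinst script above for the authoritative command and file path; the exact regex there may differ):

```shell
#!/bin/sh
# Sketch of the termproxy user-id fixup, against a throwaway file instead of
# the real task index. The authoritative command lives in
# /var/lib/dpkg/info/proxmox-backup-server.postinst.
active=$(mktemp)
printf '%s\n' \
  'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:' \
  > "$active"
# Rewrite the bare user id at the end of termproxy UPIDs to a full
# user@realm id, mirroring what the package postinst does.
sed -i -e 's/:termproxy::\([^@:]*\):/:termproxy::\1@pam:/' "$active"
cat "$active"
```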

> maybe because a backup is running?
> 
> Is there a way to give the unused diskspace free manually?

GC is the way to go, after fixing the issue above. note that only chunks 
older than 24h (or older than the start of the least recent still-running 
backup, if that is even longer ago) will be GCed.
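
To illustrate just the atime cutoff rule with find(1) — a local sketch only, the real check happens inside proxmox-backup itself:

```shell
#!/bin/sh
# Local illustration of the GC atime cutoff: chunks whose atime is older
# than 24h count as removal candidates. This only mimics the rule with
# find(1); it is not the actual GC implementation.
store=$(mktemp -d)
mkdir -p "$store/.chunks/135e"
touch "$store/.chunks/135e/old-chunk" "$store/.chunks/135e/new-chunk"
# Pretend old-chunk was last accessed two days ago.
touch -a -d '2 days ago' "$store/.chunks/135e/old-chunk"
# Reference file whose mtime marks the 24h cutoff.
cutoff=$(mktemp)
touch -d '24 hours ago' "$cutoff"
# Everything whose atime is older than the cutoff would be eligible.
candidates=$(find "$store/.chunks" -type f ! -anewer "$cutoff")
echo "$candidates"
```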

> thx
> 
> Harald




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pbs-devel] Some problems with fullstorage and cleanup
  2020-08-31 11:39 ` Fabian Grünbichler
@ 2020-08-31 12:32   ` Harald Leithner
  2020-08-31 13:10     ` Fabian Grünbichler
  0 siblings, 1 reply; 6+ messages in thread
From: Harald Leithner @ 2020-08-31 12:32 UTC (permalink / raw)
  To: pbs-devel


[-- Attachment #1.1: Type: text/plain, Size: 2976 bytes --]



On 2020-08-31 13:39, Fabian Grünbichler wrote:
> On August 31, 2020 1:14 pm, Harald Leithner wrote:
>> Hi,
>>
>> my test stroage run out of diskspace.
>>
>> This happens at Version 0.8-11
>>
>> I tried to forget the olders snapshots but this doesn't change the disk
>> usage. After this I started a manual GC (I assume it's garbage
>> collection). It failed after phase one with an error message. I didn't
>> copied the error message... (I miss the log panel in the gui).
>>
>> I also configured GC and Prune Schedules at the same time and told the
>> vm that backups to the pbs to keep only 20 copies.
>>
>> After a while I came back to the gui and has only 3 snapshots left and
>> one backup in progress (Thats maybe correct because of 1 yearly 1
>> monthly 1 daily and 2 last).
>> The Statistics Tab still says 100% usage and "zfs list" lists the same
>> usage.
>>
>> Starting now a manual GC ends in the error message:
>>
>> unable to parse active worker status
>> 'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:'
>> - not a valid user id
> 
> this was a known issue that should be fixed on upgrade to 0.8.11-1. can 
> you run 'grep termproxy /var/log/apt/term.log' on the PBS server?
> 

the only entry is "Fixing up termproxy user id in task log..."
btw, my first version was 0.8.9-1, not 0.8.11; I upgraded to this
version later.

> you can fixup the task index by running the sed command from 
> 
> /var/lib/dpkg/info/proxmox-backup-server.postinst
> 
> which replaces the invalid user 'root' with the correct 'root@pam'
> 

ok, after running the sed command manually the GC works again.

but it complains about no disk space:

2020-08-31T14:29:16+02:00: WARN: warning: unable to access chunk
135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63,
required by "/test3/vm/3011/2020-08-30T22:00:02Z/drive-scsi1.img.fidx" -
update atime failed for chunk
"/test3/.chunks/135e/135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63"
- ENOSPC: No space left on device

>> maybe because a backup is running?
>>
>> Is there a way to give the unused diskspace free manually?
> 
> GC is the way to go, after fixing the issue above. note that only chunks 
> older than 24h (or since the least recent, still running backup if that 
> is even longer ago) will be GCed.
> 

Ok, so is there any way I can get disk space back so I can run GC?

Also I have 3 backup entries (for the same VM) in the Storage view with
a spinning circle in the size column. (The backups are stopped on the server.)


>> thx
>>
>> Harald
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 

-- 
ITronic

Harald Leithner
Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
Tel: +43-1-545 0 604
Mobil: +43-699-123 78 4 78
Mail: leithner@itronic.at | itronic.at


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 659 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pbs-devel] Some problems with fullstorage and cleanup
  2020-08-31 12:32   ` Harald Leithner
@ 2020-08-31 13:10     ` Fabian Grünbichler
  2020-09-01 11:06       ` Harald Leithner
  0 siblings, 1 reply; 6+ messages in thread
From: Fabian Grünbichler @ 2020-08-31 13:10 UTC (permalink / raw)
  To: Proxmox Backup Server development discussion

On August 31, 2020 2:32 pm, Harald Leithner wrote:
> 
> 
> On 2020-08-31 13:39, Fabian Grünbichler wrote:
>> On August 31, 2020 1:14 pm, Harald Leithner wrote:
>>> Hi,
>>>
>>> my test stroage run out of diskspace.
>>>
>>> This happens at Version 0.8-11
>>>
>>> I tried to forget the olders snapshots but this doesn't change the disk
>>> usage. After this I started a manual GC (I assume it's garbage
>>> collection). It failed after phase one with an error message. I didn't
>>> copied the error message... (I miss the log panel in the gui).
>>>
>>> I also configured GC and Prune Schedules at the same time and told the
>>> vm that backups to the pbs to keep only 20 copies.
>>>
>>> After a while I came back to the gui and has only 3 snapshots left and
>>> one backup in progress (Thats maybe correct because of 1 yearly 1
>>> monthly 1 daily and 2 last).
>>> The Statistics Tab still says 100% usage and "zfs list" lists the same
>>> usage.
>>>
>>> Starting now a manual GC ends in the error message:
>>>
>>> unable to parse active worker status
>>> 'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:'
>>> - not a valid user id
>> 
>> this was a known issue that should be fixed on upgrade to 0.8.11-1. can 
>> you run 'grep termproxy /var/log/apt/term.log' on the PBS server?
>> 
> 
> the only entry is "Fixing up termproxy user id in task log..."
> btw. my first version was 0.8.9-1 not 0.8.11 I upgraded later to this
> version
> 
>> you can fixup the task index by running the sed command from 
>> 
>> /var/lib/dpkg/info/proxmox-backup-server.postinst
>> 
>> which replaces the invalid user 'root' with the correct 'root@pam'
>> 
> 
> ok after running the sed command manually the GC works again.
> 
> but is complains about no diskspace:
> 
> 2020-08-31T14:29:16+02:00: WARN: warning: unable to access chunk
> 135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63,
> required by "/test3/vm/3011/2020-08-30T22:00:02Z/drive-scsi1.img.fidx" -
> update atime failed for chunk
> "/test3/.chunks/135e/135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63"
> - ENOSPC: No space left on device

if you don't have enough space to touch a chunk, that is rather bad.. 
you can attempt to free up some more space by deleting the backup metadata 
of snapshots you no longer need, either by 'rm'-ing the directory that 
represents them, or by using 'forget' in the GUI if that works..

what does 'df -m /test3' report? and/or the equivalent command for 
whatever storage the datastore is on (e.g. 'zfs list -o space 
path/to/dataset').
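
As a sketch of the 'rm' route (hypothetical paths, modelled on the snapshot path in the GC log above — on a real datastore, double-check the directory before deleting anything):

```shell
#!/bin/sh
# Sketch of manually dropping a snapshot's metadata directory. Paths and
# file names are hypothetical, modelled on the datastore layout seen in the
# GC log above; verify the exact directory on a real system before rm -rf.
store=$(mktemp -d)
snap="$store/vm/3011/2020-08-30T22:00:02Z"
mkdir -p "$snap"
touch "$snap/index.json.blob" "$snap/drive-scsi1.img.fidx"
# Removing the directory frees only the small index files immediately;
# the chunks they referenced become reclaimable on the next successful GC.
rm -rf "$snap"
```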




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pbs-devel] Some problems with fullstorage and cleanup
  2020-08-31 13:10     ` Fabian Grünbichler
@ 2020-09-01 11:06       ` Harald Leithner
  2020-09-08  6:58         ` Harald Leithner
  0 siblings, 1 reply; 6+ messages in thread
From: Harald Leithner @ 2020-09-01 11:06 UTC (permalink / raw)
  To: pbs-devel


[-- Attachment #1.1: Type: text/plain, Size: 3456 bytes --]

Hi,

after some time the PBS was able to recover the disk space of the deleted
snapshots.

I get some error reports after the backup, but I think that's because the
PVE doesn't have the latest backup client and tries 'root' instead of 'root@pam'.

Anyway, thx for the help; I think better "disk full" handling would
be useful ;-)

Harald

On 2020-08-31 15:10, Fabian Grünbichler wrote:
> On August 31, 2020 2:32 pm, Harald Leithner wrote:
>>
>>
>> On 2020-08-31 13:39, Fabian Grünbichler wrote:
>>> On August 31, 2020 1:14 pm, Harald Leithner wrote:
>>>> Hi,
>>>>
>>>> my test stroage run out of diskspace.
>>>>
>>>> This happens at Version 0.8-11
>>>>
>>>> I tried to forget the olders snapshots but this doesn't change the disk
>>>> usage. After this I started a manual GC (I assume it's garbage
>>>> collection). It failed after phase one with an error message. I didn't
>>>> copied the error message... (I miss the log panel in the gui).
>>>>
>>>> I also configured GC and Prune Schedules at the same time and told the
>>>> vm that backups to the pbs to keep only 20 copies.
>>>>
>>>> After a while I came back to the gui and has only 3 snapshots left and
>>>> one backup in progress (Thats maybe correct because of 1 yearly 1
>>>> monthly 1 daily and 2 last).
>>>> The Statistics Tab still says 100% usage and "zfs list" lists the same
>>>> usage.
>>>>
>>>> Starting now a manual GC ends in the error message:
>>>>
>>>> unable to parse active worker status
>>>> 'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:'
>>>> - not a valid user id
>>>
>>> this was a known issue that should be fixed on upgrade to 0.8.11-1. can 
>>> you run 'grep termproxy /var/log/apt/term.log' on the PBS server?
>>>
>>
>> the only entry is "Fixing up termproxy user id in task log..."
>> btw. my first version was 0.8.9-1 not 0.8.11 I upgraded later to this
>> version
>>
>>> you can fixup the task index by running the sed command from 
>>>
>>> /var/lib/dpkg/info/proxmox-backup-server.postinst
>>>
>>> which replaces the invalid user 'root' with the correct 'root@pam'
>>>
>>
>> ok after running the sed command manually the GC works again.
>>
>> but is complains about no diskspace:
>>
>> 2020-08-31T14:29:16+02:00: WARN: warning: unable to access chunk
>> 135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63,
>> required by "/test3/vm/3011/2020-08-30T22:00:02Z/drive-scsi1.img.fidx" -
>> update atime failed for chunk
>> "/test3/.chunks/135e/135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63"
>> - ENOSPC: No space left on device
> 
> if you don't have enough space to touch a chunk, that is rather bad.. 
> you can attempt to free up some more space by deleting backup metadata 
> of snapshots you no longer needed, either by 'rm'-ing the directory that 
> represents them, or by using 'forget' on the GUI if that works..
> 
> what does 'df -m /test3' report? and/or the equivalent command for 
> whatever storage the datastore is on (e.g., zfs list -o space 
> path/to/dataset).
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 

-- 
ITronic

Harald Leithner
Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
Tel: +43-1-545 0 604
Mobil: +43-699-123 78 4 78
Mail: leithner@itronic.at | itronic.at


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 659 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [pbs-devel] Some problems with fullstorage and cleanup
  2020-09-01 11:06       ` Harald Leithner
@ 2020-09-08  6:58         ` Harald Leithner
  0 siblings, 0 replies; 6+ messages in thread
From: Harald Leithner @ 2020-09-08  6:58 UTC (permalink / raw)
  To: pbs-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 4124 bytes --]

Hi,

I set the GC schedule to 2,22:30 (i.e. daily at 02:30 and 22:30), but it 
seems the garbage collection doesn't run automatically...

After pressing the "Start GC" button in the UI it begins to clean up the 
storage. Any idea why the scheduled run doesn't happen?

I'm not sure how this scheduled job is managed, because I see neither a 
cron entry nor a systemd timer.

Harald


Am 01.09.2020 um 13:06 schrieb Harald Leithner:
> Hi,
> 
> after sometime the pbs was able to recover the diskspace of the deleted
> snapshots.
> 
> I get some error reports after the backup but I think thats because the
> pve has not the latest backup client and tries root and not root@pam.
> 
> Anyway thx for the help and I think a better "disk full handling" would
> be useful ;-)
> 
> Harald
> 
> On 2020-08-31 15:10, Fabian Grünbichler wrote:
>> On August 31, 2020 2:32 pm, Harald Leithner wrote:
>>>
>>>
>>> On 2020-08-31 13:39, Fabian Grünbichler wrote:
>>>> On August 31, 2020 1:14 pm, Harald Leithner wrote:
>>>>> Hi,
>>>>>
>>>>> my test stroage run out of diskspace.
>>>>>
>>>>> This happens at Version 0.8-11
>>>>>
>>>>> I tried to forget the olders snapshots but this doesn't change the disk
>>>>> usage. After this I started a manual GC (I assume it's garbage
>>>>> collection). It failed after phase one with an error message. I didn't
>>>>> copied the error message... (I miss the log panel in the gui).
>>>>>
>>>>> I also configured GC and Prune Schedules at the same time and told the
>>>>> vm that backups to the pbs to keep only 20 copies.
>>>>>
>>>>> After a while I came back to the gui and has only 3 snapshots left and
>>>>> one backup in progress (Thats maybe correct because of 1 yearly 1
>>>>> monthly 1 daily and 2 last).
>>>>> The Statistics Tab still says 100% usage and "zfs list" lists the same
>>>>> usage.
>>>>>
>>>>> Starting now a manual GC ends in the error message:
>>>>>
>>>>> unable to parse active worker status
>>>>> 'UPID:backup-erlach3-test:00000400:0000036A:00000000:5F4CD753:termproxy::root:'
>>>>> - not a valid user id
>>>>
>>>> this was a known issue that should be fixed on upgrade to 0.8.11-1. can
>>>> you run 'grep termproxy /var/log/apt/term.log' on the PBS server?
>>>>
>>>
>>> the only entry is "Fixing up termproxy user id in task log..."
>>> btw. my first version was 0.8.9-1 not 0.8.11 I upgraded later to this
>>> version
>>>
>>>> you can fixup the task index by running the sed command from
>>>>
>>>> /var/lib/dpkg/info/proxmox-backup-server.postinst
>>>>
>>>> which replaces the invalid user 'root' with the correct 'root@pam'
>>>>
>>>
>>> ok after running the sed command manually the GC works again.
>>>
>>> but is complains about no diskspace:
>>>
>>> 2020-08-31T14:29:16+02:00: WARN: warning: unable to access chunk
>>> 135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63,
>>> required by "/test3/vm/3011/2020-08-30T22:00:02Z/drive-scsi1.img.fidx" -
>>> update atime failed for chunk
>>> "/test3/.chunks/135e/135e565dc79f80d3a9980688bfe161409bf229fb4d11ab7290b5b1e58b27bc63"
>>> - ENOSPC: No space left on device
>>
>> if you don't have enough space to touch a chunk, that is rather bad..
>> you can attempt to free up some more space by deleting backup metadata
>> of snapshots you no longer needed, either by 'rm'-ing the directory that
>> represents them, or by using 'forget' on the GUI if that works..
>>
>> what does 'df -m /test3' report? and/or the equivalent command for
>> whatever storage the datastore is on (e.g., zfs list -o space
>> path/to/dataset).
>>
>>
>> _______________________________________________
>> pbs-devel mailing list
>> pbs-devel@lists.proxmox.com
>> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
>>
> 
> 
> _______________________________________________
> pbs-devel mailing list
> pbs-devel@lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pbs-devel
> 

-- 
ITronic

Harald Leithner
Wiedner Hauptstraße 120/5.1, 1050 Wien, Austria
Tel: +43-1-545 0 604
Mobil: +43-699-123 78 4 78
Mail: leithner@itronic.at | itronic.at

[-- Attachment #1.1.2: OpenPGP_0x473B935C49EA1B08.asc --]
[-- Type: application/pgp-keys, Size: 3671 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 665 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-09-08  6:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-31 11:14 [pbs-devel] Some problems with fullstorage and cleanup Harald Leithner
2020-08-31 11:39 ` Fabian Grünbichler
2020-08-31 12:32   ` Harald Leithner
2020-08-31 13:10     ` Fabian Grünbichler
2020-09-01 11:06       ` Harald Leithner
2020-09-08  6:58         ` Harald Leithner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal