From: Dominik Csapak <d.csapak@proxmox.com>
To: aderumier@odiso.com,
	Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
	pve-devel <pve-devel@pve.proxmox.com>
Subject: Re: [pve-devel] ceph create pool with min_size=1 not possible anymore with last gui wizard
Date: Mon, 7 Jun 2021 09:14:13 +0200
Message-ID: <c1890524-65c2-206f-0add-747c7d1979a0@proxmox.com>
In-Reply-To: <1536e582aba69f723b2a306d9c4176da502f5588.camel@odiso.com>

On 6/7/21 08:57, aderumier@odiso.com wrote:
> On Friday, 04 June 2021 at 15:23 +0200, Dominik Csapak wrote:
>> On 6/4/21 04:47, aderumier@odiso.com wrote:
>>> Hi,
>>>
>>
>>
>> Hi,
>>
>>> I was running a training week with students,
>>>
>>> and I saw that the new ceph pool creation wizard doesn't allow setting
>>> min_size=1 anymore.
>>>
>>> It currently displays a warning "min_size <= size/2 can lead to
>>> data loss, incomplete PGs or unfound objects",
>>>
>>> that's ok, but it's also blocking the validation button.
>>>
>>
>> yes, in our experience, setting min_size to 1 is always a bad idea
>> and most likely not what you want
>>
>> what is possible though is to either create the pool on the cli,
>> or to change the min_size to 1 after creation (this is not blocked)
>>
> yes, sure. It would be great to be able to change size/min_size from the
> gui too.
> 
> 

this should already be possible in current versions, but as i said
not for pool creation, only afterwards
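
for reference, a minimal sketch of the 'afterwards' path. it only builds the `ceph osd pool set` command lines for an existing pool and prints them, nothing is executed. the pool name `mypool` is borrowed from the example quoted further down, and the helper name `pool_set_commands` is hypothetical:

```python
import shlex
from typing import List


def pool_set_commands(pool: str, size: int, min_size: int) -> List[List[str]]:
    """Build (but do not run) the ceph commands that change size/min_size
    of an already existing pool."""
    if not 1 <= min_size <= size:
        raise ValueError("min_size must be between 1 and size")
    return [
        ["ceph", "osd", "pool", "set", pool, "size", str(size)],
        ["ceph", "osd", "pool", "set", pool, "min_size", str(min_size)],
    ]


if __name__ == "__main__":
    # Print the command lines so they can be reviewed before running them by hand.
    for cmd in pool_set_commands("mypool", size=2, min_size=1):
        print(shlex.join(cmd))
```

the warnings about min_size=1 from earlier in the thread still apply, the sketch only shows where the values would be set.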

> 
>>>
>>>
>>> Some users with small clusters/budgets want to use only size=2,
>>>
>>> so with min_size=2, the cluster will go read-only as soon as any osd
>>> is down.
>>>
>>> It would be great to allow at least min_size=1 when size=2 is used.
>>>
>>
>> "great" but very dangerous
>>
>>>
>>> also,
>>> Other setups like size=4, min_size=2 also display the warning, but
>>> allow the form to be validated.
>>>
>>> I'm not sure this warning is correct in this case, since as of octopus
>>> min_size is computed automatically when the pool is created, and a simple
>>>
>>> ceph osd pool create mypool 128 --size=4
>>>
>>> creates a pool with min_size=2 by default.
>>>
>>>
>>
>> the rationale behind this decision was (i think) because
>> if you have exactly 50% min_size of size (e.g. 4/2)
>> you can get inconsistent pgs, with no quorum as to
>> which pg is correct?
>> (though don't quote me on that)
>>
>> so i think it's always better to have > 50% min_size of size
>>
> Well, afaik, there is currently no "quorum" on pg consistency for repair.
> If a pg is corrupt, ceph simply copies the data from a pg copy whose
> checksum is ok,
> and if no checksum is available, it takes a random copy (maybe it needs a
> manual pg_repair in this case).
> But there is nothing like "these 2 copies hold the majority
> (quorum) of checksums".
>
> (Maybe I'm wrong, but 1 or 2 years ago, Sage confirmed this on the
> ceph mailing list)
> 
> 

i was thinking more about 'inconsistent' pgs, maybe i am wrong,
but how does ceph cope with multiple 'valid' objects (all checksums are
ok) that have different content (e.g., when there's a power cut
during a write)? i assumed that a 'majority' must be
established there?

although i did not find any document to support that, and [0]
only mentions that it will take the authoritative copy

i'll discuss this with my colleagues, and check more sources to maybe
relax the '> 50%' rule a little for the warning
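
to make the numbers from this thread concrete, a small sketch. `default_min_size` follows the formula the ceph docs give for `osd_pool_default_min_size = 0` (size - size/2, integer division), and `warns` is only the warning condition as quoted above ("min_size <= size/2"), not the actual gui code. both function names are made up for illustration:

```python
def default_min_size(size: int) -> int:
    """min_size ceph derives when osd_pool_default_min_size is 0:
    size - size/2, using integer division."""
    return size - size // 2


def warns(size: int, min_size: int) -> bool:
    """Warning condition as quoted in this thread: min_size <= size/2."""
    return min_size <= size / 2


if __name__ == "__main__":
    for size in (2, 3, 4):
        m = default_min_size(size)
        print(f"size={size} default min_size={m} warning={warns(size, m)}")
```

with these definitions the derived default for size=2 (1) and size=4 (2) already meets the warning condition, while size=3 (default 2) does not, which is the 4/2 case raised above.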

thanks :)

[0]: https://docs.ceph.com/en/latest/rados/operations/pg-repair




