public inbox for pbs-devel@lists.proxmox.com
From: Aaron Lauterer <a.lauterer@proxmox.com>
To: Thomas Lamprecht <t.lamprecht@proxmox.com>,
	Proxmox Backup Server development discussion
	<pbs-devel@lists.proxmox.com>
Subject: Re: [pbs-devel] [PATCH proxmox-backup 2/2] docs/scanrefs: fix handling if ref is same as headline
Date: Mon, 8 Feb 2021 17:06:38 +0100
Message-ID: <fbc0dffa-6045-34e9-e0aa-cad026d1c937@proxmox.com>
In-Reply-To: <2632760d-8024-0ede-0cf1-cd9140e450e2@proxmox.com>



On 2/6/21 9:22 AM, Thomas Lamprecht wrote:
> On 05.02.21 16:10, Aaron Lauterer wrote:
>> If the ref is named the same as the headline (once normalized), sphinx
>> will return a 'idX' value in node['ids'][1] which we use for the label
>> ID. The headline is always present at index 0.
>>
>> Checking for that and using index 0 in case we do get a 'idX' helps us
>> to avoid using the 'idX' as keys in our OnlineHelpInfo.js and actually
>> use the intended key.
>>
>> Signed-off-by: Aaron Lauterer <a.lauterer@proxmox.com>
>> ---
>>   docs/_ext/proxmox-scanrefs.py | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/docs/_ext/proxmox-scanrefs.py b/docs/_ext/proxmox-scanrefs.py
>> index 1b3c0615..0d626561 100644
>> --- a/docs/_ext/proxmox-scanrefs.py
>> +++ b/docs/_ext/proxmox-scanrefs.py
>> @@ -90,7 +90,18 @@ class ReflabelMapper(Builder):
>>                   if hasattr(node, 'expect_referenced_by_id') and len(node['ids']) > 1: # explicit labels
>>                       filename = self.env.doc2path(docname)
>>                       filename_html = re.sub('.rst', '.html', filename)
>> -                    labelid = node['ids'][1] # [0] is predefined by sphinx, we need [1] for explicit ones
>> +
>> +                    # node['ids'][0] contains a normalized version of the
>> +                    # headline.  If the ref and headline are the same
>> +                    # (normalized) sphinx will set the node['ids'][1] to a
>> +                    # generic id in the format `idX` where X is numeric. If the
>> +                    # ref and headline are not the same, the ref name will be
>> +                    # stored in node['ids'][1]
> 
> can you point me from where you derived that?
> 
> Because I think there are always two refs in such cases where we set one
> above a heading: the implicit heading one and the explicit from us.
> They always get normalized, but the implicit has a fallback if there's a ref
> conflict with an explicit or even another implicit one, when a title is
> reused in the same chapter or so?
> 
> Do we also have access to the chapter id/name here?
> Then we could enforce that explicit ones must have that prefixed.

I derived that from comparing the output of the debug prints for the different situations. Unfortunately the Sphinx docs are a bit sparse on that, or my search-fu is not good enough ;)

Here is the output when the explicit ref matches the implicit one from the headline (I shortened the 'children' element):

{'attributes': {'backrefs': [],
                 'classes': [],
                 'dupnames': [],
                 'ids': ['creating-backups', 'id1'],
                 'names': ['creating backups', 'creating_backups']},
  'children': [<title: <#text: 'Creating Backups'>>,
               <paragraph: <#text: 'This section e ...'>>,
               [.....]
               <literal_block: <#text: '# proxmox-back ...'>>,
               <section "excluding files/folders from a backup": <title...><paragraph...><paragraph...><paragraph...><par ...>],
  'document': <document: <section "backup client usage"...>>,
  'expect_referenced_by_id': {'creating-backups': <target: >},
  'expect_referenced_by_name': {'creating_backups': <target: >},


And now if the explicit ref is different from the headline:


{'attributes': {'backrefs': [],
                 'classes': [],
                 'dupnames': [],
                 'ids': ['creating-backups', 'client-creating-backups'],
                 'names': ['creating backups', 'client_creating_backups']},
  'children': [<title: <#text: 'Creating Backups'>>,
               <paragraph: <#text: 'This section e ...'>>,
               [...]
               <literal_block: <#text: '# proxmox-back ...'>>,
               <section "excluding files/folders from a backup": <title...><paragraph...><paragraph...><paragraph...><par ...>],
  'document': <document: <section "backup client usage"...>>,
  'expect_referenced_by_id': {'client-creating-backups': <target: >},
  'expect_referenced_by_name': {'client_creating_backups': <target: >},



You can see the difference in the 'attributes.ids' array.

One thing I observed, though, is that 'expect_referenced_by_id' contains the actual key used for the ref, AFAICT. So we could use that and not worry about checking whether 'attributes.ids[1]' is a string matching 'id[0-9]'. If I set the explicit ref to 'idX', with X being a number, that is then also present in the 'expect_referenced_by_id' field.
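
A minimal sketch of that idea, using a plain stand-in object instead of a real docutils section node. The `FakeSection` class and the `label_id_for` helper are hypothetical names for illustration, and the values are copied from the two debug dumps above; it also assumes only one explicit label per section:

```python
class FakeSection:
    """Hypothetical stand-in for a docutils section node."""

    def __init__(self, ids, expect_referenced_by_id):
        self._attributes = {'ids': ids}
        self.expect_referenced_by_id = expect_referenced_by_id

    def __getitem__(self, key):
        # Mimics node['ids'] style attribute access.
        return self._attributes[key]


def label_id_for(node):
    """Return the id of the explicit label, or None if there is none."""
    by_id = getattr(node, 'expect_referenced_by_id', None)
    if not by_id:
        return None
    # Assumption: a single explicit label per section, as in the dumps above.
    return next(iter(by_id))


# Explicit ref equals the normalized headline: sphinx puts 'id1' at index 1.
same = FakeSection(['creating-backups', 'id1'],
                   {'creating-backups': object()})
# Explicit ref is prefixed and therefore differs from the headline.
prefixed = FakeSection(['creating-backups', 'client-creating-backups'],
                       {'client-creating-backups': object()})

print(label_id_for(same))
print(label_id_for(prefixed))
```

In both cases the helper returns the intended key without looking at the generated 'idX' value at all.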


On an additional note: right now we do not have any explicit references that match the headlines they reference, because they are all prefixed or unique in some other way. We could add a check here that emits a warning or aborts with an error if the explicit ref id matches the normalized headline, to avoid any ambiguity in the refs in the future.

e.g. (pseudo code)
if node['ids'][0] in node.expect_referenced_by_id:
     exit('reference is matching implicit headline ref, consider adding a prefix')
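
Spelled out a bit more, such a check could look roughly like the sketch below. It works on plain lists and dicts shaped like the debug dumps above rather than on real docutils nodes, and `find_ambiguous_labels` is a hypothetical helper name; how the builder would report the result (warning or hard error) is left open:

```python
def find_ambiguous_labels(ids, expect_referenced_by_id):
    """Return the explicit label ids that equal the implicit headline id.

    Assumes ids[0] is the normalized headline, as seen in the dumps above.
    """
    implicit = ids[0]
    return [label for label in expect_referenced_by_id if label == implicit]


# Explicit ref identical to the headline: flagged.
clash = find_ambiguous_labels(['creating-backups', 'id1'],
                              {'creating-backups': None})
# Prefixed explicit ref: nothing to complain about.
ok = find_ambiguous_labels(['creating-backups', 'client-creating-backups'],
                           {'client-creating-backups': None})

for label in clash:
    print('reference {!r} matches the implicit headline ref, '
          'consider adding a prefix'.format(label))
```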

> 
>> +                    if re.match('^id[0-9]*$', node['ids'][1]):
> 
> should be a + not * op? we want to avoid clashes with real possible refs
> as much as possible..
> 
> What happens if I set now one to id1 and there would be already an id1?
> 
> I just really do not want to revisit this again, and losing references
> is a no-go, the docs must work.

See above note, I think that addresses it.
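
For reference, a small sketch of how the two quantifiers differ, using only Python's `re` module. The candidate strings are made up for illustration. Note that neither pattern can tell a generated 'id1' apart from an explicit ref that someone named 'id1', which is why relying on 'expect_referenced_by_id' seems more robust to me:

```python
import re

loose = re.compile(r'^id[0-9]*$')   # pattern from the patch
strict = re.compile(r'^id[0-9]+$')  # pattern with '+' as suggested

candidates = ('id', 'id1', 'id42', 'identity', 'client-id1')
results = {
    name: (bool(loose.match(name)), bool(strict.match(name)))
    for name in candidates
}

for name, (is_loose, is_strict) in results.items():
    print('{:<12} loose={} strict={}'.format(name, is_loose, is_strict))
```

Only the bare string 'id' is treated differently: '*' accepts it, '+' does not.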

> 
>> +                        labelid = node['ids'][0]
>> +                    else:
>> +                        labelid = node['ids'][1]
>> +
>>                       title = cast(nodes.title, node[0])
>>                       logger.info('traversing section {}'.format(title.astext()))
>>                       ref_name = getattr(title, 'rawsource', title.astext())
>>
> 



