Re: [PATCH log-tracker 2/2] fix #7702: improve postfix QID parsing

all lists on lists.proxmox.com
 help / color / mirror / Atom feed

From: Dominik Csapak <d.csapak@proxmox.com>
To: Stoiko Ivanov <s.ivanov@proxmox.com>
Cc: pmg-devel@lists.proxmox.com
Subject: Re: [PATCH log-tracker 2/2] fix #7702: improve postfix QID parsing
Date: Wed, 17 Jun 2026 08:44:51 +0200	[thread overview]
Message-ID: <b7ccd925-5779-4d5e-8e1c-a073d37cbac7@proxmox.com> (raw)
In-Reply-To: <20260616182650.61fcee21@rosa.proxmox.com>



On 6/16/26 6:25 PM, Stoiko Ivanov wrote:
> Thanks for tackling this and the analysis and benchmarks!
> 
> fwiw the code reads nicer to me with the added parsing!- a few
> questions/suggestions inline:
> 
> On Tue, 16 Jun 2026 17:24:07 +0200
> Dominik Csapak <d.csapak@proxmox.com> wrote:
> 
>> since adding support for postfix's long QIDs, the criteria was that the
>> QIDs had to be alphanumeric (since no built-in base52 check exists).
> this could be added - but to use it we'd need to split short-qid and
> long-qid parsing as you did here anyways!
>> ..snip..
>> While at it, we can remove the max length parameter from the
>> parse_qid{_prefix} functions, since they were all called with the same
>> length limit anyway.
> the dropping of the parameter could be a separate patch imho (but fine to
> apply as is if it's cumbersome to split)


sure, if i'll send a new version i split that out

> 
> 
>>
>>   /// Maximum length of a postfix queue ID. With `enable_long_queue_ids = yes`
>>   /// (see http://www.postfix.org/postconf.5.html#enable_long_queue_ids) the ID is
>>   /// the Unix-epoch seconds (6 base-52 chars, a 7th from year 2596+), 4 for the
>> @@ -2130,39 +2151,114 @@ fn open_logfile(path: &Path) -> std::io::Result<Box<dyn BufRead>> {
>>   /// inode), so up to 24 chars; 25 has headroom and the delimiter ends it first.
>>   const POSTFIX_QID_MAX_LEN: usize = 25;
>>   
>> -/// Parse a queue ID and return a tuple of (qid, remaining_text) or None.
> <> +/// A short postfix QID can only have between 6 and 21 characters, 5 from the timestamp in usec and
> nit - that's the usec portion of the timestamp (not the whole), based on
> the comment in:
> https://github.com/vdukhovni/postfix/blob/5f26cb870afd39245203703984f54643a001cf2e/postfix/src/global/mail_queue.h#L153

sure, that's what i meant, just not very well expressed^^

>> +/// between 1 and 16 from the inode nr printed as '%lX'
>> +const SHORT_QID_RANGE: RangeInclusive<usize> = 6..=21;
>> +
>> +/// A long postfix QID must be at least 12 characters: time in seconds -> 6+ chars + time in
>> +/// microseconds -> 4 chars + 'z' + at least one char for the inode
>> +const LONG_QID_MIN: usize = 12;
> not sure - but would it be more consistent/readable if we use a range here
> as well (despite the upper bound already being enforced by
> POSTFIX_QID_MAX_LEN) - at least I wondered for a second why we use 2
> bounds for the short id's, and only a lower for the long ones below?

since the maximum of the long qid is the overall limit, it makes no
sense to check it later again? (the position can't be greater than
that?)

> 
>> +
>> +/// Parse a queue ID from postfix and return a tuple of (qid, remaining_text) or None.
>>   ///
>>   /// Queue IDs are alphanumeric (`[0-9A-Za-z]`): legacy postfix queue IDs are
>>   /// hexadecimal, postfix long queue IDs (`enable_long_queue_ids`) use a base-52
>> -/// alphabet, and pmg-smtp-filter IDs are likewise alphanumeric. The scan stops at
>> -/// the first non-alphanumeric byte (the `:` or `)` delimiter that normally follows
>> -/// the queue ID); a run longer than `max` with no delimiter is truncated to `max`.
>> -fn parse_qid(data: &[u8], max: usize) -> Option<(&[u8], &[u8])> {
>> +/// alphabet. The scan tries to determine if it's a short or long QID and cancels
>> +/// the scan if it detects either the end or an invalid state. (e.g. a vowel in a
>> +/// long QID). It assumes that only non alphanumeric characters are the delimiter,
>> +/// so e.g. a long QID followed by a vowel would be detected as invalid.
>> +fn parse_qid(data: &[u8]) -> Option<(&[u8], &[u8])> {
>>       // to simplify limit max to data.len()
>> -    let max = max.min(data.len());
>> -    // take at most max, find the first non-alphanumeric byte
>> -    match data
>> -        .iter()
>> -        .take(max)
>> -        .position(|b| !b.is_ascii_alphanumeric())
>> -    {
>> -        // if there were less than 5 return nothing
>> -        // the QID always has at least 5 characters for the microseconds (see
>> -        // http://www.postfix.org/postconf.5.html#enable_long_queue_ids)
>> -        Some(n) if n < 5 => None,
>> -        // otherwise split at the first non-alphanumeric byte
>> -        Some(n) => Some(data.split_at(n)),
>> -        // or return 'max' length QID if no non-alphanumeric byte is found
>> -        None => Some(data.split_at(max)),
>> -    }
>> +    let max = POSTFIX_QID_MAX_LEN.min(data.len());
>> +
>> +    let mut z_seen = false;
>> +    let mut variant = QidVariant::Unknown;
>> +
>> +    let position = data.iter().take(max).position(|b| match variant {
>> +        QidVariant::Unknown => {
>> +            if b.is_ascii_hexdigit() {
> pre-existing - and also in your patch 1/2 (pmg-smtp-filter also uses
> %X as format-placeholder) - but the hexdigits only use upper-case A-F -
> could make sense to restrict this here too (and add is_uppercase_hexdigit
> above)?

there is no 'is_uppercase_hexdigit' in the std lib, so we'd have to add 
our own... but yeah we could do it. i don't think it gains us much though

> 
>> +                if is_vowel(*b) {
>> +                    // since vowels are forbidden in long qids, it must be a short one
>> +                    variant = QidVariant::Short;
>> +                }
>> +                return false;
>> +            }
>> +
>> +            if b.is_ascii_alphanumeric() {
>> +                if is_vowel(*b) {
>> +                    // it's a non hexadecimal vowel, so it can't be a short or a long qid
>> +                    variant = QidVariant::Invalid;
>> +                    return true;
>> +                }
>> +
>> +                if *b == b'z' {
>> +                    z_seen = true;
>> +                }
> this would only happen if 'z' is the first character in a queue-id? - that
> would be invalid as well?
> 

no, because as long as it's not clear if it's a short or long qid we
stay in the unknown state, for example:

1111111111z222

would be a valid long qid, where the first non hexdigit char is the 'z'

>> +
>> +                variant = QidVariant::Long;
>> +                return false;
>> +            }
>> +
>> +            true
>> +        }
>> +        QidVariant::Short => !b.is_ascii_hexdigit(),
>> +        QidVariant::Long => {
>> +            if !b.is_ascii_alphanumeric() {
>> +                return true;
>> +            }
>> +
>> +            if is_vowel(*b) {
>> +                // vowel detected in a long qid, must be invalid
>> +                variant = QidVariant::Invalid;
>> +                return true;
>> +            }
>> +
>> +            if *b == b'z' {
>> +                z_seen = true;
>> +            }
>> +
>> +            false
>> +        }
>> +        QidVariant::Invalid => true,
>> +    });
>> +
>> +    let pos = match (variant, z_seen, position) {
>> +        // these must be short variants that don't have any vowels in them
>> +        (QidVariant::Unknown, false, None) if SHORT_QID_RANGE.contains(&max) => Some(max),
>> +        (QidVariant::Unknown, false, Some(n)) if SHORT_QID_RANGE.contains(&n) => Some(n),
>> +
>> +        // short variants
>> +        (QidVariant::Short, false, None) if SHORT_QID_RANGE.contains(&max) => Some(max),
>> +        (QidVariant::Short, false, Some(n)) if SHORT_QID_RANGE.contains(&n) => Some(n),
>> +
>> +        (QidVariant::Long, true, None) if max >= LONG_QID_MIN => Some(max),
>> +        (QidVariant::Long, true, Some(n)) if n >= LONG_QID_MIN => Some(n),
>> +
>> +        // short variant can't have z -> invalid
>> +        (QidVariant::Short, true, _) => None,
>> +
>> +        // long variant without z is invalid
>> +        (QidVariant::Long, false, _) => None,
>> +
>> +        // this can't happen since we can have only a z_seen in Long or when changing to Long
>> +        (QidVariant::Unknown, true, _) => {
>> +            eprintln!("internal error, unknown QID variant with 'z'");
>> +            None
>> +        }
>> +
>> +        // everything else is also invalid
>> +        _ => None,
>> +    };
>> +
>> +    pos.map(|pos| data.split_at(pos))
>>   }
>>   
>> ..snip..
> 
> the rest looks ok to me - and from a quick check the test-case explains
> the issue quite well.
>

     prev parent reply	other threads:[~2026-06-17  6:45 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16 15:24 [PATCH log-tracker 1/2] split out pmg-smtp-filter qid parsing Dominik Csapak
2026-06-16 15:24 ` [PATCH log-tracker 2/2] fix #7702: improve postfix QID parsing Dominik Csapak
2026-06-16 16:26   ` Stoiko Ivanov
2026-06-17  6:44     ` Dominik Csapak [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b7ccd925-5779-4d5e-8e1c-a073d37cbac7@proxmox.com \
    --to=d.csapak@proxmox.com \
    --cc=pmg-devel@lists.proxmox.com \
    --cc=s.ivanov@proxmox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.

Service provided by Proxmox Server Solutions GmbH | Privacy | Legal