From: Stoiko Ivanov <s.ivanov@proxmox.com>
To: Dominik Csapak <d.csapak@proxmox.com>
Cc: pmg-devel@lists.proxmox.com
Subject: Re: [PATCH log-tracker 2/2] fix #7702: improve postfix QID parsing
Date: Tue, 16 Jun 2026 18:26:50 +0200 [thread overview]
Message-ID: <20260616182650.61fcee21@rosa.proxmox.com> (raw)
In-Reply-To: <20260616152451.3684124-2-d.csapak@proxmox.com>
Thanks for tackling this and the analysis and benchmarks!
fwiw the code reads nicer to me with the added parsing!- a few
questions/suggestions inline:
On Tue, 16 Jun 2026 17:24:07 +0200
Dominik Csapak <d.csapak@proxmox.com> wrote:
> since adding support for postfix's long QIDs, the criteria was that the
> QIDs had to be alphanumeric (since no built-in base52 check exists).
this could be added - but to use it we'd need to split short-qid and
long-qid parsing as you did here anyways!
>..snip..
> While at it, we can remove the max length parameter from the
> parse_qid{_prefix} functions, since they were all called with the same
> length limit anyway.
the dropping of the parameter could be a separate patch imho (but fine to
apply as is if it's cumbersome to split)
>
> /// Maximum length of a postfix queue ID. With `enable_long_queue_ids = yes`
> /// (see http://www.postfix.org/postconf.5.html#enable_long_queue_ids) the ID is
> /// the Unix-epoch seconds (6 base-52 chars, a 7th from year 2596+), 4 for the
> @@ -2130,39 +2151,114 @@ fn open_logfile(path: &Path) -> std::io::Result<Box<dyn BufRead>> {
> /// inode), so up to 24 chars; 25 has headroom and the delimiter ends it first.
> const POSTFIX_QID_MAX_LEN: usize = 25;
>
> -/// Parse a queue ID and return a tuple of (qid, remaining_text) or None.
<> +/// A short postfix QID can only have between 6 and 21 characters, 5 from the timestamp in usec and
nit - that's the usec portion of the timestamp (not the whole), based on
the comment in:
https://github.com/vdukhovni/postfix/blob/5f26cb870afd39245203703984f54643a001cf2e/postfix/src/global/mail_queue.h#L153
> +/// between 1 and 16 from the inode nr printed as '%lX'
> +const SHORT_QID_RANGE: RangeInclusive<usize> = 6..=21;
> +
> +/// A long postfix QID must be at least 12 characters: time in seconds -> 6+ chars + time in
> +/// microseconds -> 4 chars + 'z' + at least one char for the inode
> +const LONG_QID_MIN: usize = 12;
not sure - but would it be more consistent/readable if we use a range here
as well (despite the upper bound already being enforced by
POSTFIX_QID_MAX_LEN) - at least I wondered for a second why we use 2
bounds for the short id's, and only a lower for the long ones below?
> +
> +/// Parse a queue ID from postfix and return a tuple of (qid, remaining_text) or None.
> ///
> /// Queue IDs are alphanumeric (`[0-9A-Za-z]`): legacy postfix queue IDs are
> /// hexadecimal, postfix long queue IDs (`enable_long_queue_ids`) use a base-52
> -/// alphabet, and pmg-smtp-filter IDs are likewise alphanumeric. The scan stops at
> -/// the first non-alphanumeric byte (the `:` or `)` delimiter that normally follows
> -/// the queue ID); a run longer than `max` with no delimiter is truncated to `max`.
> -fn parse_qid(data: &[u8], max: usize) -> Option<(&[u8], &[u8])> {
> +/// alphabet. The scan tries to determine if it's a short or long QID and cancels
> +/// the scan if it detects either the end or an invalid state. (e.g. a vowel in a
> +/// long QID). It assumes that only non alphanumeric characters are the delimiter,
> +/// so e.g. a long QID followed by a vowel would be detected as invalid.
> +fn parse_qid(data: &[u8]) -> Option<(&[u8], &[u8])> {
> // to simplify limit max to data.len()
> - let max = max.min(data.len());
> - // take at most max, find the first non-alphanumeric byte
> - match data
> - .iter()
> - .take(max)
> - .position(|b| !b.is_ascii_alphanumeric())
> - {
> - // if there were less than 5 return nothing
> - // the QID always has at least 5 characters for the microseconds (see
> - // http://www.postfix.org/postconf.5.html#enable_long_queue_ids)
> - Some(n) if n < 5 => None,
> - // otherwise split at the first non-alphanumeric byte
> - Some(n) => Some(data.split_at(n)),
> - // or return 'max' length QID if no non-alphanumeric byte is found
> - None => Some(data.split_at(max)),
> - }
> + let max = POSTFIX_QID_MAX_LEN.min(data.len());
> +
> + let mut z_seen = false;
> + let mut variant = QidVariant::Unknown;
> +
> + let position = data.iter().take(max).position(|b| match variant {
> + QidVariant::Unknown => {
> + if b.is_ascii_hexdigit() {
pre-existing - and also in your patch 1/2 (pmg-smtp-filter also uses
%X as format-placeholder) - but the hexdigits only use upper-case A-F -
could make sense to restrict this here too (and add is_uppercase_hexdigit
above)?
> + if is_vowel(*b) {
> + // since vowels are forbidden in long qids, it must be a short one
> + variant = QidVariant::Short;
> + }
> + return false;
> + }
> +
> + if b.is_ascii_alphanumeric() {
> + if is_vowel(*b) {
> + // it's a non hexadecimal vowel, so it can't be a short or a long qid
> + variant = QidVariant::Invalid;
> + return true;
> + }
> +
> + if *b == b'z' {
> + z_seen = true;
> + }
this would only happen if 'z' is the first character in a queue-id? - that
would be invalid as well?
> +
> + variant = QidVariant::Long;
> + return false;
> + }
> +
> + true
> + }
> + QidVariant::Short => !b.is_ascii_hexdigit(),
> + QidVariant::Long => {
> + if !b.is_ascii_alphanumeric() {
> + return true;
> + }
> +
> + if is_vowel(*b) {
> + // vowel detected in a long qid, must be invalid
> + variant = QidVariant::Invalid;
> + return true;
> + }
> +
> + if *b == b'z' {
> + z_seen = true;
> + }
> +
> + false
> + }
> + QidVariant::Invalid => true,
> + });
> +
> + let pos = match (variant, z_seen, position) {
> + // these must be short variants that don't have any vowels in them
> + (QidVariant::Unknown, false, None) if SHORT_QID_RANGE.contains(&max) => Some(max),
> + (QidVariant::Unknown, false, Some(n)) if SHORT_QID_RANGE.contains(&n) => Some(n),
> +
> + // short variants
> + (QidVariant::Short, false, None) if SHORT_QID_RANGE.contains(&max) => Some(max),
> + (QidVariant::Short, false, Some(n)) if SHORT_QID_RANGE.contains(&n) => Some(n),
> +
> + (QidVariant::Long, true, None) if max >= LONG_QID_MIN => Some(max),
> + (QidVariant::Long, true, Some(n)) if n >= LONG_QID_MIN => Some(n),
> +
> + // short variant can't have z -> invalid
> + (QidVariant::Short, true, _) => None,
> +
> + // long variant without z is invalid
> + (QidVariant::Long, false, _) => None,
> +
> + // this can't happen since we can have only a z_seen in Long or when changing to Long
> + (QidVariant::Unknown, true, _) => {
> + eprintln!("internal error, unknown QID variant with 'z'");
> + None
> + }
> +
> + // everything else is also invalid
> + _ => None,
> + };
> +
> + pos.map(|pos| data.split_at(pos))
> }
>
>..snip..
the rest looks ok to me - and from a quick check the test-case explains
the issue quite well.
next prev parent reply other threads:[~2026-06-16 16:27 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-16 15:24 [PATCH log-tracker 1/2] split out pmg-smtp-filter qid parsing Dominik Csapak
2026-06-16 15:24 ` [PATCH log-tracker 2/2] fix #7702: improve postfix QID parsing Dominik Csapak
2026-06-16 16:26 ` Stoiko Ivanov [this message]
2026-06-17 6:44 ` Dominik Csapak
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260616182650.61fcee21@rosa.proxmox.com \
--to=s.ivanov@proxmox.com \
--cc=d.csapak@proxmox.com \
--cc=pmg-devel@lists.proxmox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox