public inbox for pmg-devel@lists.proxmox.com
 help / color / mirror / Atom feed
* [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing
@ 2026-06-17  7:47 Dominik Csapak
  2026-06-17  7:47 ` [PATCH log-tracker v2 2/3] parse_qid{_prefix}: remove 'max' parameter Dominik Csapak
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Dominik Csapak @ 2026-06-17  7:47 UTC (permalink / raw)
  To: pmg-devel

By splitting out the parsing of the pmg-smtp-filter QIDs, we can more
rigorously parse them since the parser does not have to overlap with
postfix's QIDs'. We have a bit of code duplication, but more clarity
when reading the code and better checks.

Since we know how long the pmg-smtp-filter QIDs must be, we can also
filter by the length.

Since we know the lengths of a pmg-smtp-filter QID, we don't need to
pass the max-len from outside.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
no changes from v1
 src/main.rs | 67 ++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 61 insertions(+), 6 deletions(-)

diff --git a/src/main.rs b/src/main.rs
index 0b6e92f..89d2768 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -111,7 +111,7 @@ fn main() -> Result<(), Error> {
 // we match 4 entries, all beginning with a QID
 // accept mail, move mail, block mail and the processing time
 fn handle_pmg_smtp_filter_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
-    let (qid, data) = match parse_qid_prefix(msg, 25) {
+    let (qid, data) = match parse_pmg_smtp_filter_qid_prefix(msg) {
         Some((q, m)) => (q, m),
         None => return,
     };
@@ -167,7 +167,7 @@ fn handle_pmg_smtp_filter_message(msg: &[u8], parser: &mut Parser, complete_line
             None => return,
         };
         let data = &data[qid_index + 13..];
-        let (qid, _) = match parse_qid(data, 25) {
+        let (qid, _) = match parse_pmg_smtp_filter_qid(data) {
             Some(t) => t,
             None => return,
         };
@@ -407,7 +407,7 @@ fn handle_lmtp_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
         }
 
         // this is the QID of the associated pmg-smtp-filter
-        let (qid, _) = match parse_qid(data, 25) {
+        let (qid, _) = match parse_pmg_smtp_filter_qid(data) {
             Some(t) => t,
             None => return,
         };
@@ -543,7 +543,7 @@ fn handle_smtpd_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
             return;
         }
         let data = &data[14..];
-        if let Some((qid, data)) = parse_qid(data, 25) {
+        if let Some((qid, data)) = parse_pmg_smtp_filter_qid(data) {
             let fe = get_or_create_fentry(&mut parser.fentries, qid);
             // set the FEntry to before-queue filtered
             fe.borrow_mut().is_bq = true;
@@ -597,7 +597,7 @@ fn handle_smtpd_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
 
         if let Some(qid_index) = find(data, b"(") {
             let data = &data[qid_index + 1..];
-            if let Some((qid, _)) = parse_qid(data, 25) {
+            if let Some((qid, _)) = parse_pmg_smtp_filter_qid(data) {
                 let fe = get_or_create_fentry(&mut parser.fentries, qid);
                 // set the FEntry to before-queue filtered
                 fe.borrow_mut().is_bq = true;
@@ -2194,6 +2194,42 @@ fn parse_number(data: &[u8], max_digits: usize) -> Option<(usize, &[u8])> {
     }
 }
 
+/// pmg-smtp-filter QIDs must be at least 7 hex characters (see src/PMG/MailQueue.pm in pmg-api)
+const PMG_SMTP_FILTER_QID_MIN_LEN: usize = 7;
+
+const PMG_SMTP_FILTER_QID_MAX_LEN: usize = 25;
+
+/// Parse a queue ID followed by the `": "` delimiter, returning (qid, remaining_text) or None.
+///
+/// Requiring the delimiter keeps foreign lines, like the output a custom check script logs under
+/// the pmg-smtp-filter identifier, from panicking the parser or being recorded under a bogus ID.
+fn parse_pmg_smtp_filter_qid_prefix(msg: &[u8]) -> Option<(&[u8], &[u8])> {
+    let (qid, data) = parse_pmg_smtp_filter_qid(msg)?;
+    Some((qid, data.strip_prefix(b": ")?))
+}
+
+/// Parse a queue ID from pmg-smtp-filter and return a tuple of (qid, remaining_text) or None.
+///
+/// Queue IDs are hexadecimal (`[0-9A-Fa-f]`) so only parse until the first non-hexadecimal
+/// character is encountered.
+fn parse_pmg_smtp_filter_qid(data: &[u8]) -> Option<(&[u8], &[u8])> {
+    // to simplify limit max to data.len()
+    let max = PMG_SMTP_FILTER_QID_MAX_LEN.min(data.len());
+
+    if max < PMG_SMTP_FILTER_QID_MIN_LEN {
+        return None;
+    }
+    // take at most max, find the first non-hexdigit byte
+    match data.iter().take(max).position(|b| !b.is_ascii_hexdigit()) {
+        // too short
+        Some(n) if n < PMG_SMTP_FILTER_QID_MIN_LEN => None,
+        // otherwise split at the first non-hexadecimal character
+        Some(n) => Some(data.split_at(n)),
+        // or return 'max' length QID if no non-hexadecimal character is found
+        None => Some(data.split_at(max)),
+    }
+}
+
 /// Parse time. Returns a tuple of (parsed_time, remaining_text) or None.
 fn parse_time(
     data: &'_ [u8],
@@ -2396,6 +2432,8 @@ fn find_lowercase(data: &[u8], needle: &[u8]) -> Option<usize> {
 
 #[cfg(test)]
 mod tests {
+    use crate::parse_pmg_smtp_filter_qid;
+
     use super::{POSTFIX_QID_MAX_LEN, parse_qid, parse_qid_prefix, rotated_logfile};
 
     #[test]
@@ -2427,11 +2465,28 @@ mod tests {
         );
     }
 
+    #[test]
+    fn parse_filter_ids() {
+        assert_eq!(
+            parse_pmg_smtp_filter_qid(b"3802E45DFA503808B06"),
+            Some((&b"3802E45DFA503808B06"[..], &b""[..])),
+        );
+    }
+
+    #[test]
+    fn parse_filter_id_failing() {
+        // these tests should all return None so not parsing
+        assert_eq!(parse_pmg_smtp_filter_qid(b"3"), None);
+        assert_eq!(parse_pmg_smtp_filter_qid(b""), None);
+        assert_eq!(parse_pmg_smtp_filter_qid(b"Warning"), None);
+        assert_eq!(parse_pmg_smtp_filter_qid(b"ThisIsAtest"), None);
+    }
+
     #[test]
     fn parse_filter_id_terminated_by_paren() {
         // pmg-smtp-filter ID, terminated by ')'
         assert_eq!(
-            parse_qid(b"3802E45DFA503808B06)", 25),
+            parse_pmg_smtp_filter_qid(b"3802E45DFA503808B06)"),
             Some((&b"3802E45DFA503808B06"[..], &b")"[..])),
         );
     }
-- 
2.47.3





^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH log-tracker v2 2/3] parse_qid{_prefix}: remove 'max' parameter
  2026-06-17  7:47 [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Dominik Csapak
@ 2026-06-17  7:47 ` Dominik Csapak
  2026-06-17  7:47 ` [PATCH log-tracker v2 3/3] fix #7702: improve postfix QID parsing Dominik Csapak
  2026-06-17 11:51 ` applied: [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Stoiko Ivanov
  2 siblings, 0 replies; 4+ messages in thread
From: Dominik Csapak @ 2026-06-17  7:47 UTC (permalink / raw)
  To: pmg-devel

it was always called with the same value anyway, so just use the
constant directly in parse_qid and avoid having to pass it from outside.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
new in v2
 src/main.rs | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/src/main.rs b/src/main.rs
index 89d2768..3b815cb 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -269,7 +269,7 @@ fn handle_postscreen_message(msg: &[u8], parser: &mut Parser, complete_line: &[u
 // these only appear in the 'after-queue filter' case or when the mail is
 // 'accepted' in the 'before-queue filter' case
 fn handle_qmgr_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
-    let (qid, data) = match parse_qid_prefix(msg, POSTFIX_QID_MAX_LEN) {
+    let (qid, data) = match parse_qid_prefix(msg) {
         Some(t) => t,
         None => return,
     };
@@ -329,7 +329,7 @@ fn handle_lmtp_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
         return;
     }
 
-    let (qid, data) = match parse_qid_prefix(msg, POSTFIX_QID_MAX_LEN) {
+    let (qid, data) = match parse_qid_prefix(msg) {
         Some((q, t)) => (q, t),
         None => return,
     };
@@ -638,7 +638,7 @@ fn handle_smtpd_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
 
     // with none of the other messages matching, we try for a QID to match the
     // corresponding QEntry to the SEntry
-    let (qid, data) = match parse_qid_prefix(msg, POSTFIX_QID_MAX_LEN) {
+    let (qid, data) = match parse_qid_prefix(msg) {
         Some(t) => t,
         None => return,
     };
@@ -668,7 +668,7 @@ fn handle_smtpd_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
 // happens before the mail is passed to qmgr (after-queue or before-queue
 // accepted only)
 fn handle_cleanup_message(msg: &[u8], parser: &mut Parser, complete_line: &[u8]) {
-    let (qid, data) = match parse_qid_prefix(msg, POSTFIX_QID_MAX_LEN) {
+    let (qid, data) = match parse_qid_prefix(msg) {
         Some(t) => t,
         None => return,
     };
@@ -2137,9 +2137,9 @@ const POSTFIX_QID_MAX_LEN: usize = 25;
 /// alphabet, and pmg-smtp-filter IDs are likewise alphanumeric. The scan stops at
 /// the first non-alphanumeric byte (the `:` or `)` delimiter that normally follows
 /// the queue ID); a run longer than `max` with no delimiter is truncated to `max`.
-fn parse_qid(data: &[u8], max: usize) -> Option<(&[u8], &[u8])> {
+fn parse_qid(data: &[u8]) -> Option<(&[u8], &[u8])> {
     // to simplify limit max to data.len()
-    let max = max.min(data.len());
+    let max = POSTFIX_QID_MAX_LEN.min(data.len());
     // take at most max, find the first non-alphanumeric byte
     match data
         .iter()
@@ -2161,8 +2161,8 @@ fn parse_qid(data: &[u8], max: usize) -> Option<(&[u8], &[u8])> {
 ///
 /// Requiring the delimiter keeps foreign lines, like the output a custom check script logs under
 /// the pmg-smtp-filter identifier, from panicking the parser or being recorded under a bogus ID.
-fn parse_qid_prefix(msg: &[u8], max: usize) -> Option<(&[u8], &[u8])> {
-    let (qid, data) = parse_qid(msg, max)?;
+fn parse_qid_prefix(msg: &[u8]) -> Option<(&[u8], &[u8])> {
+    let (qid, data) = parse_qid(msg)?;
     Some((qid, data.strip_prefix(b": ")?))
 }
 
@@ -2434,13 +2434,13 @@ fn find_lowercase(data: &[u8], needle: &[u8]) -> Option<usize> {
 mod tests {
     use crate::parse_pmg_smtp_filter_qid;
 
-    use super::{POSTFIX_QID_MAX_LEN, parse_qid, parse_qid_prefix, rotated_logfile};
+    use super::{parse_qid, parse_qid_prefix, rotated_logfile};
 
     #[test]
     fn parse_short_hex_qid() {
         // legacy, short hexadecimal postfix queue ID
         assert_eq!(
-            parse_qid(b"0022C3801B5: removed", POSTFIX_QID_MAX_LEN),
+            parse_qid(b"0022C3801B5: removed"),
             Some((&b"0022C3801B5"[..], &b": removed"[..])),
         );
     }
@@ -2450,7 +2450,7 @@ mod tests {
         // postfix long queue ID (enable_long_queue_ids = yes) drawn from the
         // base-52 alphabet, i.e. containing non-hex letters
         assert_eq!(
-            parse_qid(b"4Zk8mP2gqRz: removed", POSTFIX_QID_MAX_LEN),
+            parse_qid(b"4Zk8mP2gqRz: removed"),
             Some((&b"4Zk8mP2gqRz"[..], &b": removed"[..])),
         );
     }
@@ -2460,7 +2460,7 @@ mod tests {
         // worst case long queue ID: 7 sec + 4 usec + 'z' + 12-char base-51 inode
         // (64-bit) = 24 chars; it must not be truncated by POSTFIX_QID_MAX_LEN
         assert_eq!(
-            parse_qid(b"4Zk8mP72gqRzLp7Wn3Yt8Kc5: removed", POSTFIX_QID_MAX_LEN),
+            parse_qid(b"4Zk8mP72gqRzLp7Wn3Yt8Kc5: removed"),
             Some((&b"4Zk8mP72gqRzLp7Wn3Yt8Kc5"[..], &b": removed"[..])),
         );
     }
@@ -2494,13 +2494,13 @@ mod tests {
     #[test]
     fn reject_too_short_qid() {
         // fewer than 5 leading queue-id characters is not a valid queue ID
-        assert_eq!(parse_qid(b"ab: x", POSTFIX_QID_MAX_LEN), None);
+        assert_eq!(parse_qid(b"ab: x"), None);
     }
 
     #[test]
     fn qid_prefix_parses_regular_entries() {
         assert_eq!(
-            parse_qid_prefix(b"0022C3801B5: removed", POSTFIX_QID_MAX_LEN),
+            parse_qid_prefix(b"0022C3801B5: removed"),
             Some((&b"0022C3801B5"[..], &b"removed"[..])),
         );
     }
@@ -2509,14 +2509,11 @@ mod tests {
     fn qid_prefix_rejects_lines_without_delimiter() {
         // foreign lines under a matched syslog identifier, like the output of a custom check
         // script, must not be mistaken for an entry: a bare delimiter at the end of the line ...
-        assert_eq!(parse_qid_prefix(b"DEBUG:", 25), None);
+        assert_eq!(parse_qid_prefix(b"DEBUG:"), None);
         // ... a delimiter without the following space ...
-        assert_eq!(
-            parse_qid_prefix(b"DEBUG:Module loader, version 1088", 25),
-            None
-        );
+        assert_eq!(parse_qid_prefix(b"DEBUG:Module loader, version 1088"), None);
         // ... and no delimiter at all
-        assert_eq!(parse_qid_prefix(b"DEBUGOUTPUT", 25), None);
+        assert_eq!(parse_qid_prefix(b"DEBUGOUTPUT"), None);
     }
 
     #[test]
-- 
2.47.3





^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH log-tracker v2 3/3] fix #7702: improve postfix QID parsing
  2026-06-17  7:47 [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Dominik Csapak
  2026-06-17  7:47 ` [PATCH log-tracker v2 2/3] parse_qid{_prefix}: remove 'max' parameter Dominik Csapak
@ 2026-06-17  7:47 ` Dominik Csapak
  2026-06-17 11:51 ` applied: [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Stoiko Ivanov
  2 siblings, 0 replies; 4+ messages in thread
From: Dominik Csapak @ 2026-06-17  7:47 UTC (permalink / raw)
  To: pmg-devel

since adding support for postfix's long QIDs, the criteria was that the
QIDs had to be alphanumeric (since no built-in base52 check exists).

This lead to false positives in some logs that had lines such as:
 postfix/smtpd[xxxx]: warning: hostname example.org does not resolve to address x.y.z.w

Where the 'warning:' was interpreted as QIDs.
(There are other things postfix can print here that were too interpreted
as QIDs, this is just an example).

To fix it, add a simple state machine during parsing that keeps track of
what should be encountered in a long/short QID and abort if there is an
impossible combination (e.g. vowels in a long QID).

Since adding new checks here could be performance critical, I
benchmarked it and it showed no significant performance changes.

Setup:
A ~2.5 GiB large syslog (on ram; otherwise we're likely IO bottlenecked)
filtered for some time range and a search for a from address:

Benchmark 1: current-master
  Time (mean ± σ):      4.017 s ±  0.013 s    [User: 3.765 s, System: 0.252 s]
  Range (min … max):    4.004 s …  4.041 s    10 runs

Benchmark 2: this-patch
  Time (mean ± σ):      3.998 s ±  0.034 s    [User: 3.752 s, System: 0.245 s]
  Range (min … max):    3.938 s …  4.073 s    10 runs

Summary
  this-patch ran
    1.00 ± 0.01 times faster than current-master

Need to adapt one test that was not a valid long qid.

Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
---
changes from v1:
* adapted to split out parameter removal
* improved comment
* updated commit message
* added a minimum length check at the beginning of parse_qid

 src/main.rs                        | 143 ++++++++++++++++++++++++-----
 tests/test_input_misleading_lines  |  20 ++++
 tests/test_output_misleading_lines |  38 ++++++++
 tests/tests_misleading_lines.rs    |  39 ++++++++
 4 files changed, 219 insertions(+), 21 deletions(-)
 create mode 100644 tests/test_input_misleading_lines
 create mode 100644 tests/test_output_misleading_lines
 create mode 100644 tests/tests_misleading_lines.rs

diff --git a/src/main.rs b/src/main.rs
index 3b815cb..cd5ad7e 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,6 +1,7 @@
 use std::cell::RefCell;
 use std::collections::HashMap;
 use std::ffi::CString;
+use std::ops::RangeInclusive;
 use std::rc::{Rc, Weak};
 
 use std::fs::File;
@@ -2123,6 +2124,26 @@ fn open_logfile(path: &Path) -> std::io::Result<Box<dyn BufRead>> {
     }
 }
 
+fn is_vowel(c: u8) -> bool {
+    c == b'A'
+        || c == b'E'
+        || c == b'a'
+        || c == b'e'
+        || c == b'I'
+        || c == b'i'
+        || c == b'O'
+        || c == b'o'
+        || c == b'U'
+        || c == b'u'
+}
+
+enum QidVariant {
+    Unknown,
+    Short,
+    Long,
+    Invalid,
+}
+
 /// Maximum length of a postfix queue ID. With `enable_long_queue_ids = yes`
 /// (see http://www.postfix.org/postconf.5.html#enable_long_queue_ids) the ID is
 /// the Unix-epoch seconds (6 base-52 chars, a 7th from year 2596+), 4 for the
@@ -2130,31 +2151,111 @@ fn open_logfile(path: &Path) -> std::io::Result<Box<dyn BufRead>> {
 /// inode), so up to 24 chars; 25 has headroom and the delimiter ends it first.
 const POSTFIX_QID_MAX_LEN: usize = 25;
 
-/// Parse a queue ID and return a tuple of (qid, remaining_text) or None.
+/// A short postfix QID can only have between 6 and 21 characters, 5 from usec of the timestamp and
+/// between 1 and 16 from the inode nr printed as '%lX'
+const SHORT_QID_RANGE: RangeInclusive<usize> = 6..=21;
+
+/// A long postfix QID must be at least 12 characters: time in seconds -> 6+ chars + time in
+/// microseconds -> 4 chars + 'z' + at least one char for the inode
+const LONG_QID_MIN: usize = 12;
+
+/// Parse a queue ID from postfix and return a tuple of (qid, remaining_text) or None.
 ///
 /// Queue IDs are alphanumeric (`[0-9A-Za-z]`): legacy postfix queue IDs are
 /// hexadecimal, postfix long queue IDs (`enable_long_queue_ids`) use a base-52
-/// alphabet, and pmg-smtp-filter IDs are likewise alphanumeric. The scan stops at
-/// the first non-alphanumeric byte (the `:` or `)` delimiter that normally follows
-/// the queue ID); a run longer than `max` with no delimiter is truncated to `max`.
+/// alphabet. The scan tries to determine if it's a short or long QID and cancels
+/// the scan if it detects either the end or an invalid state. (e.g. a vowel in a
+/// long QID). It assumes that only non alphanumeric characters are the delimiter,
+/// so e.g. a long QID followed by a vowel would be detected as invalid.
 fn parse_qid(data: &[u8]) -> Option<(&[u8], &[u8])> {
+    // check the minimum length first
+    if data.len() < *SHORT_QID_RANGE.start() {
+        return None;
+    }
+
     // to simplify limit max to data.len()
     let max = POSTFIX_QID_MAX_LEN.min(data.len());
-    // take at most max, find the first non-alphanumeric byte
-    match data
-        .iter()
-        .take(max)
-        .position(|b| !b.is_ascii_alphanumeric())
-    {
-        // if there were less than 5 return nothing
-        // the QID always has at least 5 characters for the microseconds (see
-        // http://www.postfix.org/postconf.5.html#enable_long_queue_ids)
-        Some(n) if n < 5 => None,
-        // otherwise split at the first non-alphanumeric byte
-        Some(n) => Some(data.split_at(n)),
-        // or return 'max' length QID if no non-alphanumeric byte is found
-        None => Some(data.split_at(max)),
-    }
+
+    let mut z_seen = false;
+    let mut variant = QidVariant::Unknown;
+
+    let position = data.iter().take(max).position(|b| match variant {
+        QidVariant::Unknown => {
+            if b.is_ascii_hexdigit() {
+                if is_vowel(*b) {
+                    // since vowels are forbidden in long qids, it must be a short one
+                    variant = QidVariant::Short;
+                }
+                return false;
+            }
+
+            if b.is_ascii_alphanumeric() {
+                if is_vowel(*b) {
+                    // it's a non hexadecimal vowel, so it can't be a short or a long qid
+                    variant = QidVariant::Invalid;
+                    return true;
+                }
+
+                if *b == b'z' {
+                    z_seen = true;
+                }
+
+                variant = QidVariant::Long;
+                return false;
+            }
+
+            true
+        }
+        QidVariant::Short => !b.is_ascii_hexdigit(),
+        QidVariant::Long => {
+            if !b.is_ascii_alphanumeric() {
+                return true;
+            }
+
+            if is_vowel(*b) {
+                // vowel detected in a long qid, must be invalid
+                variant = QidVariant::Invalid;
+                return true;
+            }
+
+            if *b == b'z' {
+                z_seen = true;
+            }
+
+            false
+        }
+        QidVariant::Invalid => true,
+    });
+
+    let pos = match (variant, z_seen, position) {
+        // these must be short variants that don't have any vowels in them
+        (QidVariant::Unknown, false, None) if SHORT_QID_RANGE.contains(&max) => Some(max),
+        (QidVariant::Unknown, false, Some(n)) if SHORT_QID_RANGE.contains(&n) => Some(n),
+
+        // short variants
+        (QidVariant::Short, false, None) if SHORT_QID_RANGE.contains(&max) => Some(max),
+        (QidVariant::Short, false, Some(n)) if SHORT_QID_RANGE.contains(&n) => Some(n),
+
+        (QidVariant::Long, true, None) if max >= LONG_QID_MIN => Some(max),
+        (QidVariant::Long, true, Some(n)) if n >= LONG_QID_MIN => Some(n),
+
+        // short variant can't have z -> invalid
+        (QidVariant::Short, true, _) => None,
+
+        // long variant without z is invalid
+        (QidVariant::Long, false, _) => None,
+
+        // this can't happen since we can have only a z_seen in Long or when changing to Long
+        (QidVariant::Unknown, true, _) => {
+            eprintln!("internal error, unknown QID variant with 'z'");
+            None
+        }
+
+        // everything else is also invalid
+        _ => None,
+    };
+
+    pos.map(|pos| data.split_at(pos))
 }
 
 /// Parse a queue ID followed by the `": "` delimiter, returning (qid, remaining_text) or None.
@@ -2450,8 +2551,8 @@ mod tests {
         // postfix long queue ID (enable_long_queue_ids = yes) drawn from the
         // base-52 alphabet, i.e. containing non-hex letters
         assert_eq!(
-            parse_qid(b"4Zk8mP2gqRz: removed"),
-            Some((&b"4Zk8mP2gqRz"[..], &b": removed"[..])),
+            parse_qid(b"4Zk8mP2gqRz1: removed"),
+            Some((&b"4Zk8mP2gqRz1"[..], &b": removed"[..])),
         );
     }
 
diff --git a/tests/test_input_misleading_lines b/tests/test_input_misleading_lines
new file mode 100644
index 0000000..2512ec0
--- /dev/null
+++ b/tests/test_input_misleading_lines
@@ -0,0 +1,20 @@
+2026-06-16T02:26:09.196080+02:00 pmg postfix/smtpd[3971220]: warning: hostname example.com does not resolve to address 1.2.3.4
+2026-06-16T02:26:09.196262+02:00 pmg postfix/smtpd[3971220]: connect from unknown[1.2.3.4]
+2026-06-16T02:26:10.026205+02:00 pmg postfix/smtpd[3971220]: Anonymous TLS connection established from unknown[1.2.3.4]: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange secp256r1 server-signature RSA-PSS (4096 bits) server-digest SHA256
+2026-06-16T02:26:10.464588+02:00 pmg postfix/smtpd[3971220]: 123412341234: client=unknown[1.2.3.4]
+2026-06-16T02:26:10.588801+02:00 pmg postfix/postscreen[407424]: CONNECT from [1.2.3.4]:44340 to [10.10.10.10]:25
+2026-06-16T02:26:10.589017+02:00 pmg postfix/postscreen[407424]: PASS OLD [1.2.3.4]:44340
+2026-06-16T02:26:10.591164+02:00 pmg postfix/smtpd[3894722]: warning: hostname example.com does not resolve to address 1.2.3.4
+2026-06-16T02:26:10.591339+02:00 pmg postfix/smtpd[3894722]: connect from unknown[1.2.3.4]
+2026-06-16T02:26:10.662240+02:00 pmg postfix/cleanup[3949936]: 123412341234: message-id=<xyz>
+2026-06-16T02:26:10.672845+02:00 pmg postfix/qmgr[407346]: 123412341234: from=<test@pmg.local>, size=1337, nrcpt=1 (queue active)
+2026-06-16T02:26:10.673044+02:00 pmg postfix/smtpd[3971220]: disconnect from unknown[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=1 bdat=2 quit=1 commands=8
+2026-06-16T02:26:10.673820+02:00 pmg pmg-smtp-filter[4049085]: 2026/06/16-02:26:10 CONNECT TCP Peer: "[127.0.0.1]:33030" Local: "[127.0.0.1]:10024"
+2026-06-16T02:26:10.727199+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: new mail message-id=<xyz>
+2026-06-16T02:26:11.460707+02:00 pmg postfix/smtpd[3894722]: Anonymous TLS connection established from unknown[1.2.3.4]: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange secp256r1 server-signature RSA-PSS (4096 bits) server-digest SHA256
+2026-06-16T02:26:11.753241+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: SA score=9999/5 time=0.328 bayes=undefined autolearn=disabled hits=SOME_RULE(9999)
+2026-06-16T02:26:11.767261+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: moved mail for <test@pmg.local> to spam quarantine - 432143214321 (rule: Quarantine)
+2026-06-16T02:26:11.770191+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: processing time: 2.054 seconds (0.908, 0.079, 0)
+2026-06-16T02:26:11.770749+02:00 pmg postfix/lmtp[3918143]: 123412341234: to=<test@pmg.local>, relay=127.0.0.1[127.0.0.1]:10024, delay=1.3, delays=0.21/0/0.04/1.1, dsn=2.5.0, status=sent (250 2.5.0 OK (123412341234))
+2026-06-16T02:26:11.771119+02:00 pmg postfix/qmgr[407346]: 123412341234: removed
+2026-06-16T02:26:12.123422+02:00 pmg postfix/smtpd[3894722]: disconnect from unknown[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=1 bdat=2 quit=1 commands=8
diff --git a/tests/test_output_misleading_lines b/tests/test_output_misleading_lines
new file mode 100644
index 0000000..b54ca32
--- /dev/null
+++ b/tests/test_output_misleading_lines
@@ -0,0 +1,38 @@
+# LogReader: 3647914
+# Query options
+# Start: 2020-12-18 15:00:00 (1608300000)
+# End: 2026-06-16 15:40:00 (1781617200)
+# End Query Options
+
+QENTRY: 123412341234
+CTIME: 6A309823
+SIZE: 1337
+CLIENT: unknown[1.2.3.4]
+MSGID: <xyz>
+TO:6A309823:123412341234:Q: from <test@pmg.local> to <test@pmg.local> (432143214321)
+SMTP:
+L00000001 2026-06-16T02:26:09.196080+02:00 pmg postfix/smtpd[3971220]: warning: hostname example.com does not resolve to address 1.2.3.4
+L00000002 2026-06-16T02:26:09.196262+02:00 pmg postfix/smtpd[3971220]: connect from unknown[1.2.3.4]
+L00000003 2026-06-16T02:26:10.026205+02:00 pmg postfix/smtpd[3971220]: Anonymous TLS connection established from unknown[1.2.3.4]: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange secp256r1 server-signature RSA-PSS (4096 bits) server-digest SHA256
+L00000004 2026-06-16T02:26:10.464588+02:00 pmg postfix/smtpd[3971220]: 123412341234: client=unknown[1.2.3.4]
+L0000000B 2026-06-16T02:26:10.673044+02:00 pmg postfix/smtpd[3971220]: disconnect from unknown[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=1 bdat=2 quit=1 commands=8
+FILTER: 123412341234
+L0000000D 2026-06-16T02:26:10.727199+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: new mail message-id=<xyz>
+L0000000F 2026-06-16T02:26:11.753241+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: SA score=9999/5 time=0.328 bayes=undefined autolearn=disabled hits=SOME_RULE(9999)
+L00000010 2026-06-16T02:26:11.767261+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: moved mail for <test@pmg.local> to spam quarantine - 432143214321 (rule: Quarantine)
+L00000011 2026-06-16T02:26:11.770191+02:00 pmg pmg-smtp-filter[4049085]: 123412341234: processing time: 2.054 seconds (0.908, 0.079, 0)
+QMGR:
+L00000009 2026-06-16T02:26:10.662240+02:00 pmg postfix/cleanup[3949936]: 123412341234: message-id=<xyz>
+L0000000A 2026-06-16T02:26:10.672845+02:00 pmg postfix/qmgr[407346]: 123412341234: from=<test@pmg.local>, size=1337, nrcpt=1 (queue active)
+L00000012 2026-06-16T02:26:11.770749+02:00 pmg postfix/lmtp[3918143]: 123412341234: to=<test@pmg.local>, relay=127.0.0.1[127.0.0.1]:10024, delay=1.3, delays=0.21/0/0.04/1.1, dsn=2.5.0, status=sent (250 2.5.0 OK (123412341234))
+L00000013 2026-06-16T02:26:11.771119+02:00 pmg postfix/qmgr[407346]: 123412341234: removed
+
+SMTPD: T6A309822L00000004
+CTIME: 6A309824
+CLIENT: unknown[1.2.3.4]
+LOGS:
+L00000007 2026-06-16T02:26:10.591164+02:00 pmg postfix/smtpd[3894722]: warning: hostname example.com does not resolve to address 1.2.3.4
+L00000008 2026-06-16T02:26:10.591339+02:00 pmg postfix/smtpd[3894722]: connect from unknown[1.2.3.4]
+L0000000E 2026-06-16T02:26:11.460707+02:00 pmg postfix/smtpd[3894722]: Anonymous TLS connection established from unknown[1.2.3.4]: TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange secp256r1 server-signature RSA-PSS (4096 bits) server-digest SHA256
+L00000014 2026-06-16T02:26:12.123422+02:00 pmg postfix/smtpd[3894722]: disconnect from unknown[1.2.3.4] ehlo=2 starttls=1 mail=1 rcpt=1 bdat=2 quit=1 commands=8
+
diff --git a/tests/tests_misleading_lines.rs b/tests/tests_misleading_lines.rs
new file mode 100644
index 0000000..fe6a652
--- /dev/null
+++ b/tests/tests_misleading_lines.rs
@@ -0,0 +1,39 @@
+use std::fs::File;
+use std::io::BufReader;
+use std::process::Command;
+mod utils;
+
+// Some postfix lines look like they could be a QID, like 'warning: [..];' but
+// aren't. Test that they don't produce any stray messages on stderr when multiple
+// like this exist (meaning they would have been parsed as QIDs)
+#[test]
+fn misleading_lines_are_ignored() {
+    let output = Command::new("faketime")
+        .env("TZ", "Europe/Vienna")
+        .arg("2020-12-31 23:59:59")
+        .arg(utils::log_tracker_path())
+        .arg("-vv")
+        .arg("-s")
+        .arg("2020-12-18 15:00:00")
+        .arg("-e")
+        .arg("2026-06-16 15:40:00")
+        .arg("-i")
+        .arg("tests/test_input_misleading_lines")
+        .output()
+        .expect("failed to execute pmg-log-tracker");
+
+    assert_eq!(String::from_utf8_lossy(&output.stderr), "".to_string());
+
+    assert!(
+        output.status.success(),
+        "pmg-log-tracker failed: {}",
+        String::from_utf8_lossy(&output.stderr)
+    );
+
+    let expected_file =
+        File::open("tests/test_output_misleading_lines").expect("failed to open test_output");
+
+    let expected_output = BufReader::new(&expected_file);
+    let output_reader = BufReader::new(&output.stdout[..]);
+    utils::compare_output(output_reader, expected_output);
+}
-- 
2.47.3





^ permalink raw reply related	[flat|nested] 4+ messages in thread

* applied: [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing
  2026-06-17  7:47 [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Dominik Csapak
  2026-06-17  7:47 ` [PATCH log-tracker v2 2/3] parse_qid{_prefix}: remove 'max' parameter Dominik Csapak
  2026-06-17  7:47 ` [PATCH log-tracker v2 3/3] fix #7702: improve postfix QID parsing Dominik Csapak
@ 2026-06-17 11:51 ` Stoiko Ivanov
  2 siblings, 0 replies; 4+ messages in thread
From: Stoiko Ivanov @ 2026-06-17 11:51 UTC (permalink / raw)
  To: pmg-devel, Dominik Csapak

Thanks for the quick iteration on this and the explanations on my
comments!

Gave it another look, ran some more benchmarks on some live syslogs, and
additionally compared the (sorted) output with your patch and without with
those logs - looked ok.

On Wed, 17 Jun 2026 09:47:48 +0200, Dominik Csapak wrote:
> By splitting out the parsing of the pmg-smtp-filter QIDs, we can more
> rigorously parse them since the parser does not have to overlap with
> postfix's QIDs'. We have a bit of code duplication, but more clarity
> when reading the code and better checks.
> 
> Since we know how long the pmg-smtp-filter QIDs must be, we can also
> filter by the length.
> 
> [...]

Applied, thanks!

[1/3] split out pmg-smtp-filter qid parsing
      commit: e90400ddadb396197a5b677109c86d5681a75e64
[2/3] parse_qid{_prefix}: remove 'max' parameter
      commit: 129d533dedff1a554c04fa3c113564c23df98415
[3/3] fix #7702: improve postfix QID parsing
      commit: 83c77b7da75bf9e9d41c3526401c9a2024ad0c60





^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-06-17 11:52 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-17  7:47 [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Dominik Csapak
2026-06-17  7:47 ` [PATCH log-tracker v2 2/3] parse_qid{_prefix}: remove 'max' parameter Dominik Csapak
2026-06-17  7:47 ` [PATCH log-tracker v2 3/3] fix #7702: improve postfix QID parsing Dominik Csapak
2026-06-17 11:51 ` applied: [PATCH log-tracker v2 1/3] split out pmg-smtp-filter qid parsing Stoiko Ivanov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox
Service provided by Proxmox Server Solutions GmbH | Privacy | Legal