From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <t.lamprecht@proxmox.com>
Received: from firstgate.proxmox.com (firstgate.proxmox.com [212.224.123.68])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits))
 (No client certificate requested)
 by lists.proxmox.com (Postfix) with ESMTPS id 2DF0161A0D
 for <pve-devel@lists.proxmox.com>; Wed, 19 Aug 2020 09:01:14 +0200 (CEST)
Received: from firstgate.proxmox.com (localhost [127.0.0.1])
 by firstgate.proxmox.com (Proxmox) with ESMTP id 1F048191F4
 for <pve-devel@lists.proxmox.com>; Wed, 19 Aug 2020 09:00:44 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (proxmox-new.maurer-it.com
 [212.186.127.180])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256)
 (No client certificate requested)
 by firstgate.proxmox.com (Proxmox) with ESMTPS id 7B296191E6
 for <pve-devel@lists.proxmox.com>; Wed, 19 Aug 2020 09:00:43 +0200 (CEST)
Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1])
 by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 4C87C446BE
 for <pve-devel@lists.proxmox.com>; Wed, 19 Aug 2020 09:00:43 +0200 (CEST)
To: Proxmox VE development discussion <pve-devel@lists.proxmox.com>,
 Dominik Csapak <d.csapak@proxmox.com>
References: <20200730090410.3651-1-d.csapak@proxmox.com>
From: Thomas Lamprecht <t.lamprecht@proxmox.com>
Message-ID: <3c746675-fbe6-19fb-3721-cb4a8b323450@proxmox.com>
Date: Wed, 19 Aug 2020 09:00:42 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:80.0) Gecko/20100101
 Thunderbird/80.0
MIME-Version: 1.0
In-Reply-To: <20200730090410.3651-1-d.csapak@proxmox.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: [pve-devel] applied: [PATCH common v2] run_command: improve
 performance for logging and long lines
X-BeenThere: pve-devel@lists.proxmox.com
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Proxmox VE development discussion <pve-devel.lists.proxmox.com>
List-Unsubscribe: <https://lists.proxmox.com/cgi-bin/mailman/options/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=unsubscribe>
List-Archive: <http://lists.proxmox.com/pipermail/pve-devel/>
List-Post: <mailto:pve-devel@lists.proxmox.com>
List-Help: <mailto:pve-devel-request@lists.proxmox.com?subject=help>
List-Subscribe: <https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel>, 
 <mailto:pve-devel-request@lists.proxmox.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Aug 2020 07:01:14 -0000

On 30.07.20 11:04, Dominik Csapak wrote:
> to call out/err/logfunc with each line, we search for a newline and call
> outfunc/logfunc with everything before that
> 
> since we do a select/read (with 4096 size) in a loop, this means
> that if we have very long lines, we search for a newline in an
> ever growing buffer (for which we know does not contain a newline)
> 
> so instead, only search the new data for newlines
> 
> Signed-off-by: Dominik Csapak <d.csapak@proxmox.com>
> ---
> changes from v1:
> * keep the substitution instead of a match, making the diff a little smaller
>   this fixes a bug in the v1, when there were multiple lines in one read
>  src/PVE/Tools.pm | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
>
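
For readers following along, the idea from the commit message can be sketched roughly as below. This is only an illustration under assumptions, not the code from the patch: the helper name feed(), the pending state key and the callback are hypothetical. The already-seen partial line is kept apart from the freshly read chunk, so only the new data is ever scanned for a line terminator.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PVE::Tools): consume one freshly read
# chunk, calling $cb once per completed line.
sub feed {
    my ($state, $chunk, $cb) = @_;

    # $state->{pending} is known to contain no terminator, so only the
    # new chunk is searched here.
    while ($chunk =~ s/^([^\010\r\n]*)(?:\n|(?:\010)+|\r\n?)//) {
        $cb->($state->{pending} . $1);
        $state->{pending} = '';
    }

    # Whatever is left has no terminator yet; remember it for the next read.
    # Limitation of this sketch: a \r\n split across two chunks is not
    # recognized as a single terminator.
    $state->{pending} .= $chunk;
}

my %state = (pending => '');
my @got;
feed(\%state, $_, sub { push @got, $_[0] }) for ("foo", "bar\nba", "z\n");
print "$_\n" for @got;
```

The cost per read then depends on the chunk size instead of on the length of the line accumulated so far.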

applied, thanks! I added the followup below, which changes two things that were
both already present before your optimization, so just FYI:
* non-capturing group for things we do not use
* fix matching of the \r\n sequence: the alternation tried a lone \r before \r\n,
  so a \r\n was matched as two line endings, one for the \r and one for the \n.
  But the regex clearly indicates that this wasn't intended.
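
A small standalone check of the \r\n behaviour, in case it helps. The two regexes are the ones from the diff; the split_lines() helper and the sample input are made up for illustration and are not part of PVE::Tools.

```perl
use strict;
use warnings;

# Hypothetical helper: repeatedly strip one line from the front of $buf
# with the given regex and collect what the first group captured.
sub split_lines {
    my ($re, $buf) = @_;
    my @lines;
    while ($buf =~ s/$re//) {
        push @lines, $1;
    }
    return @lines;
}

# Old pattern: the lone \r alternative is tried before \r\n.
my $old = qr/^([^\010\r\n]*)(\r|\n|(\010)+|\r\n)/;
# New pattern: \r\n? consumes the optional \n together with the \r.
my $new = qr/^([^\010\r\n]*)(?:\n|(?:\010)+|\r\n?)/;

my @old_lines = split_lines($old, "foo\r\nbar\n");
my @new_lines = split_lines($new, "foo\r\nbar\n");

print "old: ", scalar(@old_lines), " lines\n";
print "new: ", scalar(@new_lines), " lines\n";
```

With the old pattern the \n left over after the \r is consumed on its own, so the callback sees an additional empty line between "foo" and "bar".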

Also, I dropped the s modifier for the $h eq $error case as well, for
consistency (it does not matter much, as we do not use . here)

diff --git a/src/PVE/Tools.pm b/src/PVE/Tools.pm
index d9c69e3..f9270d9 100644
--- a/src/PVE/Tools.pm
+++ b/src/PVE/Tools.pm
@@ -497,7 +497,7 @@ sub run_command {
                if ($h eq $reader) {
                    if ($outfunc || $logfunc) {
                        eval {
-                           while ($buf =~ s/^([^\010\r\n]*)(\r|\n|(\010)+|\r\n)//) {
+                           while ($buf =~ s/^([^\010\r\n]*)(?:\n|(?:\010)+|\r\n?)//) {
                                my $line = $outlog . $1;
                                $outlog = '';
                                &$outfunc($line) if $outfunc;
@@ -518,7 +518,7 @@ sub run_command {
                } elsif ($h eq $error) {
                    if ($errfunc || $logfunc) {
                        eval {
-                           while ($buf =~ s/^([^\010\r\n]*)(\r|\n|(\010)+|\r\n)//s) {
+                           while ($buf =~ s/^([^\010\r\n]*)(?:\n|(?:\010)+|\r\n?)//) {
                                my $line = $errlog . $1;
                                $errlog = '';
                                &$errfunc($line) if $errfunc;