* [pve-devel] [PATCH proxmox-i18n] use xgettext to extract translatable strings
@ 2023-12-01 14:25 Maximiliano Sandoval
From: Maximiliano Sandoval @ 2023-12-01 14:25 UTC
To: pve-devel
xgettext is a robust tool to extract translatable strings from source
code.
Using msgcat to concatenate pot files is not recommended, hence we
switch to xgettext for that step as well. msgcat also added garbage
when there were comments.
What do we get for free:
- It de-escapes strings. There are 3 cases in our code base where
single-quoted strings were used and the `'` inside them had to be
escaped; these were not de-escaped properly when presented to
translators. This is one such example:
```diff
#: proxmox-widget-toolkit/src/panel/EmailRecipientPanel.js:39
-msgid "The notification will be sent to the user\\'s configured mail address"
+#, fuzzy
+msgid "The notification will be sent to the user's configured mail address"
msgstr "La notificación sera enviada a el correo configurado del usuario"
```
- xgettext can detect when strings use a certain style of substitutions
and flags them accordingly, but I was not able to determine the
conditions that trigger this, and it only affects a single string in
the entire code base.
```diff
#: proxmox-widget-toolkit/src/Utils.js:995
+#, javascript-format
msgid "{0}% of {1}"
msgstr "{0}% de {1}"
```
- Correct POT-Creation-Date. Note how the new one matches the
PO-Revision-Date's format.
```diff
@@ -7,7 +7,7 @@ msgid ""
msgstr ""
"Project-Id-Version: proxmox translations\n"
"Report-Msgid-Bugs-To: <support@proxmox.com>\n"
-"POT-Creation-Date: Wed Nov 22 18:17:30 2023\n"
+"POT-Creation-Date: 2023-12-01 11:25+0100\n"
"PO-Revision-Date: 2023-11-27 16:43+0100\n"
"Last-Translator: Maximiliano Sandoval <m.sandoval@proxmox.com>\n"
"Language-Team: Spanish\n"
```
- Extraction of strings using ngettext, pgettext, etc. Even if we don't
have js wrappers for these at the moment, they are critical to provide
good-quality translations and could be added in the future.
- We can extract comments from the source code with `xgettext -c`.
Newly added comments won't mark strings as fuzzy but can provide
helpful context to translators.
Comments are additive: if, for example, two sources contain the same
string with different comments and it appears a third time without
comments, all three sources and both comments will be shown to
translators.
These are a few examples that could be implemented in our codebase:
It is not clear whether "Prune Options" prunes the options or
configures pruning.
```js
// TRANSLATORS: Opens the panel that allows configuring how Pruning works
let s = gettext("Prune Options");
```
Adding a source for a concept or its expanded name can help
translators decide what the gender of a word is in their language.
```js
// TRANSLATORS: TOTP stands for Time-based one-time password
let s = gettext("Add a TOTP login factor");
```
Some strings are not marked for translation to avoid translating
certain parts of them. This is a change that could be made:
```diff
-fieldLabel: 'Crush Rule', // do not localize
+// TRANSLATORS: Do not translate 'Crush', it's a proper name
+fieldLabel: gettext('Crush Rule'),
```
Or simply to give more context when substitutions are involved:
```js
// TRANSLATORS: For example 'Join CLUSTER_NAME'
return Ext.String.format(gettext('Join {0}'), `'${cn}'`);
```
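The js wrappers for ngettext and pgettext mentioned above could look
roughly like the following. This is only a minimal sketch: the catalog
layout, the `pluralIndex` helper and the Spanish entries are
assumptions made for illustration and do not reflect how the generated
language files are structured today.

```js
// Hypothetical in-memory catalog; its layout is an assumption of this sketch.
// gettext joins msgctxt and msgid with an EOT character in compiled catalogs,
// the same convention is reused here for the lookup key.
const CTX_SEP = '\u0004';

const catalog = {
    'Add': ['Añadir'],
    ['button' + CTX_SEP + 'Add']: ['Agregar'],
    '{0} node': ['{0} nodo', '{0} nodos'],
};

// Plural rule for languages using "nplurals=2; plural=(n != 1)".
// Other languages need their own rule.
function pluralIndex(n) {
    return n !== 1 ? 1 : 0;
}

// Plain lookup, falls back to the untranslated msgid.
function gettext(msgid) {
    const entry = catalog[msgid];
    return entry && entry[0] ? entry[0] : msgid;
}

// Picks the plural form for n, falls back to the untranslated strings.
function ngettext(singular, plural, n) {
    const entry = catalog[singular];
    const idx = pluralIndex(n);
    if (entry && entry[idx]) {
        return entry[idx];
    }
    return n === 1 ? singular : plural;
}

// Same msgid, disambiguated by a context string.
function pgettext(context, msgid) {
    const entry = catalog[context + CTX_SEP + msgid];
    return entry && entry[0] ? entry[0] : msgid;
}

console.log(gettext('Add'));
console.log(pgettext('button', 'Add'));
console.log(ngettext('{0} node', '{0} nodes', 3));
```

Keeping the function names identical to the gettext ones means the
extraction step would not need extra keyword configuration, assuming
xgettext's default keyword set for JavaScript is used.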
Cons:
- In total 3 translations were marked as fuzzy. Translators will have to
review and mark them as translated again.
- If using -c, xgettext can't tell whether the comment above a string is
meant for translators. The common practice is to add a `TRANSLATORS:`
tag to these comments.
- The reordering of sources for each message will create an
unnecessarily massive (yet ultimately harmless) diff (approx. 50k
insertions(+), 50k deletions(-)).
Signed-off-by: Maximiliano Sandoval <m.sandoval@proxmox.com>
Thomas: Should this be merged, please run `make do_update` and commit the
changes to each .po{,t} file. I am not sure if it is possible to even
send an email with over 100k lines of text.
---
Makefile | 10 +++-
jsgettext.pl | 135 ---------------------------------------------------
2 files changed, 8 insertions(+), 137 deletions(-)
delete mode 100755 jsgettext.pl
diff --git a/Makefile b/Makefile
index 1d7af6e..4776e02 100644
--- a/Makefile
+++ b/Makefile
@@ -97,7 +97,13 @@ pbs-lang-%.js: %.po
# parameter 1 is the name
# parameter 2 is the directory
define potupdate
- ./jsgettext.pl -p "$(1) $(shell cd $(2);git rev-parse HEAD)" -o $(1).pot $(2)
+ find . -iname "*.js" -path "./$(2)*" | xargs xgettext -c -s \
+ --from-code="UTF-8" \
+ --package-name="$(1)" \
+ --package-version="$(shell cd $(2);git rev-parse HEAD)" \
+ --msgid-bugs-address="<support@proxmox.com>" \
+ --copyright-holder="Copyright (C) Proxmox Server Solutions GmbH <support@proxmox.com> & the translation contributors." \
+ --output="$(1)".pot
endef
.PHONY: update update_pot do_update
@@ -124,7 +130,7 @@ init-%.po: messages.pot
.INTERMEDIATE: messages.pot
messages.pot: proxmox-widget-toolkit.pot proxmox-mailgateway.pot pve-manager.pot proxmox-backup.pot
- msgcat $^ > $@
+ xgettext $^ --msgid-bugs-address="<support@proxmox.com>" -o $@
.PHONY: distclean
distclean: clean
diff --git a/jsgettext.pl b/jsgettext.pl
deleted file mode 100755
index 7f758fd..0000000
--- a/jsgettext.pl
+++ /dev/null
@@ -1,135 +0,0 @@
-#!/usr/bin/perl
-
-use strict;
-use warnings;
-
-use Encode;
-use Getopt::Long;
-use Locale::PO;
-use Time::Local;
-
-my $options = {};
-GetOptions($options, 'o=s', 'b=s', 'p=s') or die "unable to parse options\n";
-
-my $dirs = [@ARGV];
-
-die "no directory specified\n" if !scalar(@$dirs);
-
-foreach my $dir (@$dirs) {
- die "no such directory '$dir'\n" if ! -d $dir;
-}
-
-my $projectId = $options->{p} || die "missing project ID\n";
-
-my $basehref = {};
-if (my $base = $options->{b}) {
- my $aref = Locale::PO->load_file_asarray($base) ||
- die "unable to load '$base'\n";
-
- my $charset;
- my $hpo = $aref->[0] || die "no header";
- my $header = $hpo->dequote($hpo->msgstr);
- if ($header =~ m|^Content-Type:\s+text/plain;\s+charset=(\S+)$|im) {
- $charset = $1;
- } else {
- die "unable to get charset\n" if !$charset;
- }
-
- foreach my $po (@$aref) {
- my $qmsgid = decode($charset, $po->msgid);
- my $msgid = $po->dequote($qmsgid);
- $basehref->{$msgid} = $po;
- }
-}
-
-sub find_js_sources {
- my ($base_dirs) = @_;
-
- my $find_cmd = 'find ';
- # shell quote heuristic, with the (here safe) assumption that the dirs don't contain single-quotes
- $find_cmd .= join(' ', map { "'$_'" } $base_dirs->@*);
- $find_cmd .= ' -name "*.js"';
- open(my $find_cmd_output, '-|', "$find_cmd | sort") or die "Failed to execute command: $!";
-
- my $sources = [];
- while (my $line = <$find_cmd_output>) {
- chomp $line;
- print "F: $line\n";
- push @$sources, $line;
- }
- close($find_cmd_output);
-
- return $sources;
-}
-
-my $header = <<'__EOD';
-Proxmox message catalog.
-
-Copyright (C) Proxmox Server Solutions GmbH
-
-This file is free software: you can redistribute it and/or modify it under the terms of the GNU
-Affero General Public License as published by the Free Software Foundation, either version 3 of the
-License, or (at your option) any later version.
--- Proxmox Support Team <support\@proxmox.com>
-__EOD
-
-my $ctime = scalar localtime;
-
-my $href = {
- '' => Locale::PO->new(
- -msgid => '',
- -comment => $header,
- -fuzzy => 1,
- -msgstr => "Project-Id-Version: $projectId\n"
- ."Report-Msgid-Bugs-To: <support\@proxmox.com>\n"
- ."POT-Creation-Date: $ctime\n"
- ."PO-Revision-Date: YEAR-MO-DA HO:MI +ZONE\n"
- ."Last-Translator: FULL NAME <EMAIL\@ADDRESS>\n"
- ."Language-Team: LANGUAGE <support\@proxmox.com>\n"
- ."MIME-Version: 1.0\n"
- ."Content-Type: text/plain; charset=UTF-8\n"
- ."Content-Transfer-Encoding: 8bit\n",
- ),
-};
-
-sub extract_msg {
- my ($filename, $linenr, $line) = @_;
-
- my $count = 0;
-
- while(1) {
- my $text;
- if ($line =~ m/\bgettext\s*\((("((?:[^"\\]++|\\.)*+)")|('((?:[^'\\]++|\\.)*+)'))\)/g) {
- $text = $3 || $5;
- }
- last if !$text;
- return if $basehref->{$text};
- $count++;
-
- my $ref = "$filename:$linenr";
-
- if (my $po = $href->{$text}) {
- $po->reference($po->reference() . " $ref");
- } else {
- $href->{$text} = Locale::PO->new(-msgid=> $text, -reference=> $ref, -msgstr=> '');
- }
- }
- die "can't extract gettext message in '$filename' line $linenr\n" if !$count;
- return;
-}
-
-my $sources = find_js_sources($dirs);
-
-foreach my $s (@$sources) {
- open(my $SRC_FH, '<', $s) || die "unable to open file '$s' - $!\n";
- while(defined(my $line = <$SRC_FH>)) {
- if ($line =~ m/gettext\s*\(/ && $line !~ m/^\s*function gettext/) {
- extract_msg($s, $., $line);
- }
- }
- close($SRC_FH);
-}
-
-my $filename = $options->{o} // "messages.pot";
-Locale::PO->save_file_fromhash($filename, $href);
-
--
2.39.2
* Re: [pve-devel] [PATCH proxmox-i18n] use xgettext to extract translatable strings
From: Alexander Zeidler @ 2023-12-04 14:43 UTC
To: Proxmox VE development discussion
From a brief look at it:
I also think it's a good idea to provide more information for translators
(where it actually adds value and doesn't just bloat code).
> Cons:
> - In total 3 translations were marked as fuzzy. Translators will have to
> review and mark them as translated again.
since ~12k translations are already marked as fuzzy ...
> - If using -c, gettext can't distinguish if the comment above is useful
> for translators. The common practice is to add a `TRANSLATORS:` tag to
> these comments.
There's also "-cTAG", which only extracts comment blocks starting with TAG.
> + find . -iname "*.js" -path "./$(2)*" | xargs xgettext -c -s \
rather: -name