Use gettext for recurring phrases!!!!! #684
base: lektor
Some notes. I'm really sweating right now after slaving away for a solid 3 hours of my life that I'll never get back but am happy to give away, so I need to step away from this for now. Updating all the PO files by hand is going to be hard, so @freakboy3742 or any adults w/ a credit card who signed up for a DeepL key: maybe we should just DeepL all the strings? Otherwise I'll copy them in tomorrow, or after dinner today.
@@ -83,7 +83,7 @@ locale = fa_IR
 lektor-github-repos = 0.1.1
 lektor-gravatar = 0.1.3
 lektor-markdown-admonition = 0.3.1
-git+https://github.com/beeware/lektor-i18n-plugin@v0.5.4 =
+../lektor-i18n-plugin =
This needs to be replaced with the new version after beeware/lektor-i18n-plugin#6 gets merged. That is just for extra safety; it seems to work fine without it, but you never know when the internal implementations of any of those programs will change...
../ is my local checkout I used to update the pos
@@ -1,2 +1,3 @@
 [jinja2: **/templates/**.html]
 encoding = utf-8
+trimmed = True
@@ -9,6 +9,9 @@
@pass_context
def translate(context, string, bag_name="translate"):
    if bag_name == 'translate':
        raise RuntimeError("Use the new gettext system instead")
Over the course of this PR, I will eventually replace all the other bags as well.
@@ -8,16 +8,16 @@
<div class="container">
<p>{{ breadcrumbs(this) }}</p>
<h1>{{ this.title }}</h1>
<p>{{ "posted_by"|trans }}
{% if this.mastodon_handle %}
{% set author_link %}
[sweating intensifies]
templates/event.html
Outdated
{% elif this.event_type == "keynote" %}
<p>{% trans title=this.title, url=this.url, talk_title=this.talk_title %}{{ speakers_list }} will be keynoting at {{ title }}, giving a presentation entitled "<a href="{{ url }}">{{ talk_title }}</a>".{% endtrans %}</p>
{% elif this.event_type == "tutorial" %}
<p>{% trans title=this.title, url=this.url, talk_title=this.talk_title %}{{ speakers_list }} will be presenting a tutorial at {{ title }} entitled "<a href="{{ url }}">{{ talk_title }}</a>".{% endtrans %}</p>
sorry for the duplication here... but I realize I can't do away with it.
@HalfWhitt I apologize that some code you wrote for translation bags might get deleted. The bird has flown away... (continuing the extended metaphor in the other thread)
Yikes... it looks like if you call _(name) where name is a variable, the string does not get extracted. This happens with the badge macro, which means this PR must be reviewed extremely carefully for such cases.
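A minimal sketch of why a variable argument is invisible to extraction, assuming the extractor works statically (it reads the source without running it). This is not Babel's actual implementation, and it parses plain Python rather than a Jinja template; `extract_literal_messages` and the sample strings are hypothetical names used only for illustration.

```python
import ast


def extract_literal_messages(source: str, func_name: str = "_") -> list[str]:
    """Collect string literals passed directly as the first argument of func_name(...)."""
    messages = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == func_name
            and node.args
        ):
            first = node.args[0]
            # Only a literal string is known without executing the code;
            # a variable such as `name` has no value at extraction time.
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                messages.append(first.value)
    return messages


# Hypothetical sample: one literal call and one call with a variable.
SAMPLE = '''
name = "Gold Member"
print(_("Silver Member"))
print(_(name))
'''

found = extract_literal_messages(SAMPLE)
print(found)
```

Only the literal call contributes a message here, so any string that reaches `_()` through a variable has to be marked for extraction somewhere else.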
FYI -- I marked some Arabic strings as fuzzy b/c I cannot figure out where to put the periods correctly... I'm using TextEdit to edit the PO files directly; I spent some time yak shaving to use Emacs po-mode but realized I don't know how to use Emacs...
Yikes. More strings are not being extracted out of event.html...
Isolating event.html into a separate directory and deleting random components shows that deleting
will cause Babel to extract properly.
Progress: python-babel/babel#1216 reports a simple MWE for this situation.
Given that this seems to be a bug, I'm going to work around it by moving the logic into a separate macro. To pass Python dicts around, I will put filters that convert to and from JSON in our plugin. [sweats]
And... now the messages are extracted! Running lektor build to update the translation files and then committing.
FYI beeware/lektor-i18n-plugin#5 caused a huge diff in 42a8f3c because xgettext seems to format its output differently. A cursory diff shows that this is actually good: the format of the POT and PO files is now identical, apart from the translated strings.
@freakboy3742 @HalfWhitt (the latter since you started the freeform) I'd like some preliminary comments on this before I go copy all the strings into the PO files, plus suggestions on how to work around the issue linked in #684 (comment), 'cause this is going to produce huge, multithousand-line diffs. Do read the above comments if you have time, especially the last one. Thanks
Maybe CI is using Ubuntu 24.04 with gettext 0.21 while I'm using gettext 0.25 on macOS, and we're hitting the test case at https://github.com/translate/translate/pull/5439/files, which differentiates b/w <=0.23 and >0.23... See: https://launchpad.net/ubuntu/noble/+package/gettext OK... time to go yak shaving tomorrow to install gettext 0.21...
Hmm... maybe let me try my Ubuntu 22.04 VM tomorrow and see how that works.
#: https://beeware.org/ (content/contents+en.lr:button-block.label)
> FYI beeware/lektor-i18n-plugin#5 caused a huge diff in 42a8f3c because of the way xgettext seems to output stuff differently, but a cursory diff will get you the conclusion that it's actually good because now the difference between the format of the pot and the po file is 0, sans the translated strings.
Not in the specific hash but looking at these changes
This is resolved now.
What? I can translate on Weblate with no conflicts except the header
Co-authored-by: AmiMohammad Dehghan <[email protected]>
FWIW this is finalized. I'd need to go through and check the copy-pasting, but that's it. Sometimes the translation somehow does not show up if you use lektor server, so you'd have to run lektor clean, lektor build, lektor clean, and then lektor server for some reason...
More from #516
Fixes #683.
Notes about this PR
(aka. digest of my 70ish-message monologue)
I will keep editing this section...
REVISION July 3, 02:32 UTC
Deps
When beeware/lektor-i18n-plugin#6 is merged, I will modify the dependency to not depend on a local checkout but instead the finished v0.5.5. It adds some improvements, listed there in the changelog.
Therefore DO NOT APPLY PREVIEW before I update that dep.
Fuzziness
Since the old translations split up sentences, and mistakes regarding context were being made in Chinese Simplified and Chinese Traditional, they're likely to occur in other languages as well. In addition, universally using ; and . (the English variants) probably caused some typography errors in other scripts.
Therefore, after much monologue, Ctrl + U-ing, and contemplation, I've decided to mark anything satisfying any of the following as fuzzy:
The only things that are now not fuzzy should be simple phrases or complete sentences that were stored as a whole in the databags.
Let me know if you want all the strings I've added marked fuzzy.
Huge Diffs on PO / POT files
Changes to the headers of PO files
Side effect of second bullet point is that there's a trivial inconsistency. When you initialize new po files now using that plugin, all pos will have
On top of it; I've not added this since it'd take so much time and is purely cosmetic.
Why is pgettext used?
- `,` which is for separator; Chinese uses `、` for that, and other conventions exist in other parts of the world. It's probably unclear what `,` is per se.
- `_(...)` seems to strip whitespace, which is problematic when we have stuff like `_("Superpower: ")` where there's a trailing space. Using `pgettext` seems to work around this, and can also be used to tell translators that the space is free to remove if your language does not require an additional space in these circumstances. Example: In Chinese, `:` doesn't need a space after.
How did Labels Get Handled?
All labels are stored in a dict, with a function to query it, in templates/macros/labels.html. Some labels do not need translation, so they're stored as-is.
The file also includes the function for getting the list separator (comma-space) string.
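A minimal sketch of how a context-tagged lookup can serve the list separator and a label with a trailing space, using Python's standard `gettext` module rather than the Jinja macros in this PR. `DictTranslations`, `join_names`, the context strings, and the Chinese entries are all hypothetical and only illustrate the idea; a real catalog would come from a compiled PO file.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy catalog keyed by (context, msgid); stands in for a compiled .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._dict_catalog = catalog

    def pgettext(self, context, message):
        # Fall back to the untranslated message when no entry exists.
        return self._dict_catalog.get((context, message), message)


# Hypothetical entries: the context tells the translator what the string is for.
zh = DictTranslations({
    ("list separator", ", "): "、",
    ("label followed by a value", "Superpower: "): "超能力：",
})
en = DictTranslations({})


def join_names(names, translations):
    """Join names with the locale's list separator."""
    sep = translations.pgettext("list separator", ", ")
    return sep.join(names)


english = join_names(["Alice", "Bob"], en)
chinese = join_names(["Alice", "Bob"], zh)
label = zh.pgettext("label followed by a value", "Superpower: ")
print(english, chinese, label)
```

Because the msgid is looked up together with its context, the translator can drop the trailing space or swap the separator character without the template having to know.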
How did I handle plurals?
There are a few strings requiring pluralization -- I've marked the sprint helping and sprint helping plural strings as pluralized, and also Gold Member to Gold Members on the front page, just in case we'll have more. If a plural form previously existed for the ENTIRE string, I copy-pasted the regular form into the cases for <=1 and the plural form into all cases >1, and marked it fuzzy. If no plural existed but the ORIGINAL STRING existed ENTIRELY, I machine translated the plural, pasted it into all cases >1, and marked it fuzzy.
That said, maybe the giving talk etc. strings also need plural forms; I'd need to work on that later (possibly in another PR) since this is taking too much time (6 days of continuous work).
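A minimal sketch of plural selection, assuming Python's standard `gettext` module with no catalog loaded, so only the English fallback rule (singular when the count is 1, plural otherwise) is exercised. `member_heading` is a hypothetical helper; the templates in this PR use Jinja's own plural support instead.

```python
import gettext

# With no catalog loaded, NullTranslations applies the English fallback rule.
translations = gettext.NullTranslations()


def member_heading(count: int) -> str:
    """Pick the singular or plural heading for the given member count."""
    return translations.ngettext("Gold Member", "Gold Members", count)


headings = [member_heading(n) for n in (1, 2)]
print(headings)
```

With a real catalog, the number of plural cases and the rule that selects between them come from the `Plural-Forms` header of each PO file, which is why the plural entries had to be filled in per language.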
To and From JSON Filters in the Plugin
Babel seems to choke on the speaker names part and not extract any strings from the file containing it -- therefore I extracted that code as a macro into a separate file. However, you can't pass Python objects around in Jinja functions (macros), so I devised this JSON plan to pass objects around in macros...
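A minimal sketch of what such a pair of filters could look like, assuming they are thin wrappers around the standard `json` module. The names `to_json` and `from_json` and the speaker fields are hypothetical; the plugin's real filter names and data shape may differ.

```python
import json


def to_json(value) -> str:
    """Serialize a dict or list to a string that can be handed to a macro."""
    # ensure_ascii=False keeps non-ASCII names readable in the serialized string.
    return json.dumps(value, ensure_ascii=False)


def from_json(text: str):
    """Rebuild the original dict or list inside the macro."""
    return json.loads(text)


# Hypothetical speaker data, for illustration only.
speakers = [
    {"name": "Alice", "handle": "@alice"},
    {"name": "Bob", "handle": None},
]

encoded = to_json(speakers)
decoded = from_json(encoded)
print(encoded)

# In a Jinja environment the functions would be registered as filters, e.g.
#   jinja_env.filters["to_json"] = to_json
#   jinja_env.filters["from_json"] = from_json
```

The round trip only preserves JSON-compatible values (dicts, lists, strings, numbers, booleans, and None), so anything else would need to be converted before being passed along.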
What things did I manually Edit?
- Traditional Chinese is missing a bunch of coverage, so I tried filling some strings in by converting from the Simplified Chinese strings. However, I marked them as fuzzy, and everything I did had an identical result to Google Translate, so there's not so much to worry about.
- Modified some Simplified Chinese strings with all the new context that I was able to get.
- For Arabic, the only change I made is to correct the semicolon in `by AUTHOR; published DATE` into the Arabic semicolon; I didn't change anything else since I realized those things would be picked up by Weblate anyway. This string is also fuzzy.
- For the former label.ini strings in Italian, capitalization is normalized (the first letter is capitalized when the rest of the strings in the same category are). Those strings are all fuzzy.
- For strings like {Silver, Gold, Platinum, Individual etc} Member: if the silver, gold, platinum, etc. strings existed in the databags but the word member didn't, a machine translation of the word Member is used and concatenated onto the silver, gold, platinum, etc. If those silver, gold, platinum etc. words don't exist but the word member does, no effort is made to complete the strings.
- See also the pluralization part.
Misc Notes
I didn't touch the strings that overlapped with the site content at all, except for Persian -- I just left them in their existing machine-translated state, fuzzied. I only touched the entirely new strings.
What I'm Working on Now
Double and Triple checking that everything got copied + pasted correctly.
Questions
https://github.com/beeware/beeware.github.io/blob/lektor/BeeWare.lektorproject#L17-L20 (EDIT -- normalized by me already)
Tell me if I'm worrying about all of this way too much.
Easter Egg
Bee-fore I started this PR, the number of missing strings in Chinese Simplified in Weblate was 256 = 2^8. Now it's 254, since I translated some more in Weblate and resolved the conflicts.
PR Checklist: