June 29, 2018

Continuous translation process with GITLab and WebLate. 


Initial state: 

Our growing business drew the attention of international customers, allowing us to enter new markets. The internationalization of our application was one of the primary requirements in order to perform in this new environment.

The ZEND framework is used for the backend, so .po-files were the obvious choice for the translation mechanism. Offering our software in only one additional language (English) was not too difficult, and it was possible to manage the .po-files manually. The well-known POEdit was our tool of choice. The developers wrote their code, committed it and pushed it into our GIT repository. We then defined a manual process to update the .po-files, which allowed us to either translate them in-house or send them to a translation office.

As demand for other languages rose, this process quickly became ineffective, because our team lacked the linguistic expertise to deal with additional languages. Handling this volume of translations with our limited resources became a painstaking process, which inevitably increased our error rate. It was time to think it over and define a better process.

 

Target state:

The idea was to create a process in the following way:

Every developer codes the same way as before. Any user-facing string in our code is wrapped in the ZEND translation mechanism:

$this->translate ("Some text")

We didn’t want our developers to think about translations anymore, because their main focus should be on the business logic. The .po-files should be updated automatically when the code is committed and merged into the repo. We wanted to give our translation office a visual interface for translations, and we wanted to synchronize everything without any manual effort. This led to the following requirements:

1. Developers mark translatable strings in their code, and our git-process remains untouched.
2. The creation of translation files should be fully automatic.
3. We need a translation tool with common features, such as a translations db and online lookup.
4. The exchange of strings awaiting translation and of finished translations should also be automatic.

 

Solution:

We defined the following process:

1. The developer implements the code as usual. The code is merged into the master branch after review.
2. The .po-files are automatically refreshed and checked into the master branch when new code is merged into it.
3. A WebLate-server is the tool we want to use to enable translation offices to work with us. WebLate offers all features for professional translation work and can handle git-repositories.
4. The WebLate server must receive the new .po-files automatically.
5. The new .po-files must be checked in automatically after the translators have finished their work.

We turned this process into a real workflow, described in the following sections:

Prepare GITLab

We use pipelines with jobs to automate processes with GITLab, so it is necessary to install a runner that executes the jobs. GITLab provides good documentation here: https://docs.gitlab.com/runner/install/

The essentials for our GITLab installation on AWS resources:

Download:
# Linux x86-64
sudo wget -O /usr/local/bin/gitlab-runner https://gitlab-runner-downloads.s3.amazonaws.com/latest/binaries/gitlab-runner-linux-amd64
Permissions:
sudo chmod +x /usr/local/bin/gitlab-runner
Create a GitLab-User:
sudo useradd --comment 'GitLab Runner' --create-home gitlab-runner --shell /bin/bash
Install and run the service:
sudo gitlab-runner install --user=gitlab-runner --working-directory=/home/gitlab-runner
sudo gitlab-runner start
Register the Runner:
sudo gitlab-runner register

Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com )
https://yourdomain.de

Please enter the gitlab-ci token for this runner
xxx

Please enter the gitlab-ci description for this runner
[hostname] my-runner
Choose the executor - we use "shell":
Please enter the executor: ssh, docker+machine, docker-ssh+machine, kubernetes, docker, parallels, virtualbox, docker-ssh, shell:
shell
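
The interactive dialog can also be scripted. The following is only a minimal sketch under assumptions: register_runner is a hypothetical helper, the URL, token and description are the placeholders from the dialog above, and --registration-token belongs to the registration flow of that time (newer GitLab versions hand out runner authentication tokens instead). By default the helper only prints the command instead of running it.

```shell
# register_runner URL TOKEN DESCRIPTION (hypothetical helper)
# Builds a non-interactive "gitlab-runner register" call with the shell executor.
# DRY_RUN defaults to 1, so the command is printed, not executed.
register_runner() (
  url=$1
  token=$2
  description=$3
  # Reuse the positional parameters as the argument list of the command.
  set -- gitlab-runner register --non-interactive \
    --url "$url" --registration-token "$token" \
    --description "$description" --executor shell
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    sudo "$@"
  fi
)

register_runner https://yourdomain.de xxx my-runner
```

Setting DRY_RUN=0 in front of the call would perform the actual registration on a host where gitlab-runner is installed.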

The runner must then be enabled in the settings of your GITLab-project. We disabled shared runners and activated our own runner for the project. You have to enable the runner for every single project.

 

Prepare the project with job description:

The runner can execute pipelines containing jobs. To let it push changes, we first created a GITLab-user called "SDGitlabRunner" (meep meep, nobody is faster) and then assigned this user to every project that needed it. The easiest way is to give this user the Master role, but you can also set up a more restrictive process and grant only Developer rights on a special branch.

 

You must create an SSH key pair for this user before you can assign the GITLab-Runner-User. This is the only setting that leaves me a little uncomfortable. The public key is assigned to the GITLab-User as usual. For the script in the next chapter, however, the private key has to be stored as a "secret variable" named "SSH_PRIVATE_KEY" in every project. Not even Google could find a better solution. I don’t particularly like it this way, but we need it to push our updated .po-files back to the repo.

 

The .gitlab-ci.yml-File

The runner with all its jobs is controlled by a single file, so we created a .gitlab-ci.yml-file in the root directory of every project that needs translations. Don’t forget to enable the runner for every such project.

The yml-file we ended up with looks like this:

stages:
  - git-checkout
  - find-files
  - xgettext
  - msmerge
  - push-back-to-repo

before_script:
  - echo "Preparing GIT, SSH-key and remote."
  - eval $(ssh-agent -s) && ssh-add ~/.ssh/gitlab_runner
  - '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config'
  - git remote set-url origin git@gitlabserver.whatever:group/project.git
  - git remote set-url --push origin git@gitlabserver.whatever:group/project.git
  - git config --global user.email "your@mailaddress.de" && git config --global user.name "SDGitlabRunner"
  - git config --global push.default simple

git-checkout:
  stage: git-checkout
  only:
    - your_desired_branch
  script:
    - echo "GIT fetch and checkout hard."
    - git fetch
    - git checkout master
    - git reset --hard origin/master
  cache:
    paths:
      - ./

find-files:
  stage: find-files
  only:
    - your_desired_branch
  script:
    - echo "Looking for files to collect them in a list."
    - find ./module/ -name '*.phtml' > potfiles.txt || true
    - cat potfiles.txt
  cache:
    paths:
      - ./

xgettext:
  stage: xgettext
  only:
    - your_desired_branch
  script:
    - echo "Crawling through the list and create a .pot-File with translations."
    - xgettext -f potfiles.txt -o temporary.pot --from-code="UTF-8" --keyword=translate --language=PHP --add-comments=TRANSLATORS --add-comments=translators --force-po --no-wrap --package-name=SimplyDeliveryProject --msgid-bugs-address=your@mailaddress.de --foreign-user
  cache:
    paths:
      - ./

msmerge:
  stage: msmerge
  only:
    - your_desired_branch
  script:
    - echo "Merge the files. Add more language-files here."
    - |
      if [ -f language/en_EN.po ] && [ "$(wc -l < temporary.pot)" -gt 0 ]; then
        echo "Merge en_EN.po"
        msgmerge -N -o language/en_EN.po language/en_EN.po temporary.pot --lang=en_EN --no-wrap
      fi
  cache:
    paths:
      - ./

push-back-to-repo:
  stage: push-back-to-repo
  only:
    - your_desired_branch
  script:
    - echo "Cleanup and push changed files back to repo if necessary."
    - rm potfiles.txt
    - rm temporary.pot
    - git status
    - FILESTOPUSH=$(git diff --numstat | wc -l)
    - |
      if [ "$FILESTOPUSH" != 0 ]; then
        echo "Found $FILESTOPUSH changed translation files, push now back to repo"
        git add .
        git commit -m "Language-Files updated"
        git push origin master
      else
        echo "No changed files, nothing to commit and push."
      fi
  cache:
    paths:
      - ./

So what’s going on here? Let’s go through it step by step.

First we define stages to run all the jobs as a sequence, which makes it much easier to control the runner.
stages:
- git-checkout
- find-files
- xgettext
- msmerge
- push-back-to-repo

The before_script is executed before every job. We start an ssh-agent and add the private key, which originates from the secret variable, to it. We also needed a known_hosts entry for the GITLab host, for example via
ssh-keyscan [hostname] >> ~/.ssh/known_hosts (ssh-keygen -R [hostname] only removes a stale entry), or alternatively StrictHostKeyChecking no, but the first way is the better way. In the before_script we also set the right GIT-remotes, the username and the e-mail-address, and switch to push.default simple.

before_script:
- echo "Preparing GIT, SSH-key and remote."
- eval $(ssh-agent -s) && ssh-add ~/.ssh/gitlab_runner
- '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config'
- git remote set-url origin git@gitlabserver.whatever:group/project.git
- git remote set-url --push origin git@gitlabserver.whatever:group/project.git
- git config --global user.email "your@mailaddress.de" && git config --global user.name "SDGitlabRunner"
- git config --global push.default simple
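
The listing loads the key from ~/.ssh/gitlab_runner, while the key content lives in the secret variable SSH_PRIVATE_KEY. How the two were connected on our runner is not shown here; one possible bridge, as a sketch only, is a small helper that writes the variable into that file with owner-only permissions. write_private_key is a hypothetical name, and stripping carriage returns is a precaution against keys pasted with Windows line endings.

```shell
# write_private_key KEY_CONTENT TARGET_FILE (hypothetical helper)
# Writes the key material to TARGET_FILE so that ssh-add can read it.
write_private_key() (
  key_content=$1
  target=$2
  # Runs in a subshell, so the restrictive umask does not leak to the caller.
  umask 077
  mkdir -p "$(dirname "$target")"
  # Drop carriage returns and make sure the file ends with a newline.
  printf '%s\n' "$key_content" | tr -d '\r' > "$target"
  # Also covers the case where the file already existed with wider permissions.
  chmod 600 "$target"
)
```

In a job this could be called as write_private_key "$SSH_PRIVATE_KEY" ~/.ssh/gitlab_runner before the ssh-add line.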

The next job checks out the current code. We have to keep the files between the jobs, so we activate the cache. We use a hard reset after the checkout to avoid any merge conflicts.

git-checkout:
  stage: git-checkout
  only:
    - your_desired_branch
  script:
    - echo "GIT fetch and checkout hard."
    - git fetch
    - git checkout master
    - git reset --hard origin/master
  cache:
    paths:
      - ./

Now we search the source-code-directories with find and write every file path we want to consider into a file called potfiles.txt.

find-files:
  stage: find-files
  only:
    - your_desired_branch
  script:
    - echo "Looking for files to collect them in a list."
    - find ./module/ -name '*.phtml' > potfiles.txt || true
    - cat potfiles.txt
  cache:
    paths:
      - ./
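
One detail of this step is easy to get wrong: the pattern passed to -name should be quoted. Without quotes the shell may expand *.phtml against the current directory before find ever sees it, so a single stray .phtml file in the project root would silently narrow the search. A small sketch of the quoted form, where list_templates is a hypothetical helper name and the sort is only there to make the output order stable:

```shell
# list_templates DIR (hypothetical helper)
# Prints every .phtml file below DIR, one path per line.
list_templates() {
  # The single quotes hand the pattern to find untouched.
  find "$1" -name '*.phtml' | sort
}
```

Redirecting the output, as in list_templates ./module/ > potfiles.txt, yields the same kind of list the job produces.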

The next job runs xgettext over the collected files and extracts the translation calls ("$this->translate") from the code. Often xgettext is not installed on the GITLab-Server, especially if you use default AMIs from AWS. Just install it with:

sudo apt-get update
sudo apt-get install gettext

The result is written in the temporary.pot-file.

xgettext:
  stage: xgettext
  only:
    - your_desired_branch
  script:
    - echo "Crawling through the list and create a .pot-File with translations."
    - xgettext -f potfiles.txt -o temporary.pot --from-code="UTF-8" --keyword=translate --language=PHP --add-comments=TRANSLATORS --add-comments=translators --force-po --no-wrap --package-name=SimplyDeliveryProject --msgid-bugs-address=your@mailaddress.de --foreign-user
  cache:
    paths:
      - ./
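
To try the extraction outside the pipeline, a reduced call is enough. The sketch below keeps only the options that matter for finding the strings; extract_strings is a hypothetical helper name, and --language=PHP is needed because xgettext does not know the .phtml extension on its own.

```shell
# extract_strings LIST_FILE OUT_FILE (hypothetical helper)
# Reads the file paths listed in LIST_FILE and writes a .pot template
# containing the arguments of translate("...") calls.
extract_strings() {
  xgettext -f "$1" -o "$2" --from-code=UTF-8 --language=PHP \
    --keyword=translate --force-po --no-wrap
}
```

For a view containing <?php echo $this->translate("Some text"); ?>, the resulting template should contain an entry with msgid "Some text".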

Now we merge the existing .po-files with the new translations we found in the source code. Just to be sure, we check that the target .po-file exists. If you have more languages, just add them.

msmerge:
  stage: msmerge
  only:
    - your_desired_branch
  script:
    - echo "Merge the files. Add more language-files here."
    - |
      if [ -f language/en_EN.po ] && [ "$(wc -l < temporary.pot)" -gt 0 ]; then
        echo "Merge en_EN.po"
        msgmerge -N -o language/en_EN.po language/en_EN.po temporary.pot --lang=en_EN --no-wrap
      fi
  cache:
    paths:
      - ./
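
A note on the guard in front of msgmerge: wc -l prints the file name after the count when it is given a file argument, and that output cannot be compared with -gt. Feeding the file through stdin yields the bare number. The check can be sketched as two small helpers; count_lines and should_merge are hypothetical names:

```shell
# count_lines FILE (hypothetical helper)
# Prints only the number of lines in FILE.
count_lines() {
  # Reading from stdin keeps the file name out of the output.
  wc -l < "$1"
}

# should_merge PO_FILE POT_FILE (hypothetical helper)
# Succeeds only if both files exist and the template is not empty.
should_merge() {
  [ -f "$1" ] && [ -f "$2" ] && [ "$(count_lines "$2")" -gt 0 ]
}
```

With these, the job body would read: if should_merge language/en_EN.po temporary.pot; then msgmerge ...; fi. Further languages can reuse the same check with their own .po-file.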

Last step: we check the new .po-files and push them back to the repository. First we remove the files we no longer need. Then we count the changed files to avoid "empty" pushes to the repo, which would cause failed job states in the GITLab pipeline.

push-back-to-repo:
  stage: push-back-to-repo
  only:
    - your_desired_branch
  script:
    - echo "Cleanup and push changed files back to repo if necessary."
    - rm potfiles.txt
    - rm temporary.pot
    - git status
    - FILESTOPUSH=$(git diff --numstat | wc -l)
    - |
      if [ "$FILESTOPUSH" != 0 ]; then
        echo "Found $FILESTOPUSH changed translation files, push now back to repo"
        git add .
        git commit -m "Language-Files updated"
        git push origin master
      else
        echo "No changed files, nothing to commit and push."
      fi
  cache:
    paths:
      - ./
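
The counting trick relies on git diff --numstat printing one line per modified tracked file. It does not list untracked files, which is fine here because the .po-files are already part of the repository. A minimal sketch of the check in isolation, with count_changed_files as a hypothetical helper name:

```shell
# count_changed_files (hypothetical helper)
# Prints the number of tracked files with uncommitted modifications
# in the current repository.
count_changed_files() {
  git diff --numstat | wc -l
}
```

A result of 0 means there is nothing to commit, so the job can skip the push.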

Now the .po-files are updated automatically whenever the code changes. Developers don’t have to take care of it; they just write their code.

 

WebLate and done

The next step is to inform our WebLate-Server about the changes. That’s pretty easy – we just add a web hook from WebLate to every project in GITLab. Go to „settings->integrations“ in the GITLab-project and configure a Webhook – that’s it!

Some good additional documentation can be found here:
https://docs.gitlab.com/ee/ci/pipelines.html
https://docs.gitlab.com/ee/ci/yaml/

 

 

Way back from WebLate to GITLab

The last step to complete this process is to enable WebLate to push changes back to the repository. We do it the same way again: we create a user in GitLab called "WebLateUser" and attach this user to the desired projects. Again we need an SSH-key, which we assign to this user in GITLab and configure on the WebLate-side in the admin-panel. You can read more here:
https://docs.weblate.org/en/weblate-2.20/admin/projects.html#private

 

JavaScript

Of course there are JavaScript elements in our application. For our Angular applications we use separate repos, so we can apply the mechanism described here in a similar way.

For mixed views composed with our ZEND Framework we wrote a little script, the "translation collector", which is triggered before the find-files stage. It extracts the translations into a .phtml-file before we run the "find".

 

Author

Dipl.-Ing. (BA) Ronny Rohland
Co-Founder & Lead Developer SimplyDelivery
Mail: rr@simplydelivery.de
Skype: ronny.rohland

If you have any questions, don’t hesitate to contact us.

 
