Internationalization in Kross scripts


As I said in the previous post, internationalization in Kross scripts is a subject that deserves its own post. Again, this is based on my own experience, and I’m by no means an expert in scripting languages or Kross, so don’t treat this information as a universal truth 😉

Also note that internationalization (from now on, i18n) has a very broad scope. Here we’ll talk only about string i18n (allowing the messages to be translated into other languages). If you are interested in other i18n areas you’ll have to keep looking 😉

So let’s see if I can give some useful information after all 😉

Translation systems

First of all, what systems do we have to translate strings? The de facto standard in the free software world is GNU gettext. It provides tools to extract the messages to be translated from the source code, a file format to store the translations, a library to translate the strings at runtime… Most popular programming languages support gettext, whether through their standard library (as in Python) or through 3rd party projects (as in Ruby).

But it is not the only system available. For example, Qt has its own translation system as part of its i18n support. Ruby also has an i18n module that contains a translation system (although it can use gettext files by switching the backend).

What does KDE use? Although KDE uses a lot of Qt infrastructure, for string translation it uses gettext instead. Well, an enhanced version of gettext with some very interesting features. For example, the semantic markup helps translators to better understand how a string should be translated, and it also benefits users, as it provides a richer and more consistent appearance for messages. Another interesting feature is Transcript, which helps to get correct translations in languages with grammatical cases (among other things).

Moreover, KDE libraries take care of gettext initialization, catalogs and so on, so you just have to mark the strings to be translated without worrying about setting everything up.

What translation systems can be used from Kross scripts? Thanks to the Translation Module, the KDE translation system can be used in Kross scripts, no matter what programming language the script is written in. But, as Kross only acts as a bridge between the script interpreter and C++ code, the translation systems available for the programming language of the script can be used from Kross. So the i18n module in Ruby or gettext in Python could be used. And if Qt bindings are available for the language, even the Qt translation system could be used!

Anyway, in my humble opinion, the best choice is to use the Kross Translation Module (if we know for sure that the script will be executed from Kross). The Translation Module is available no matter which programming language was used to write the script, and it is provided by Kross, so it doesn’t depend on external libraries. Moreover, as it just forwards the function calls to the KDE translation system, all the fancy features of that system are available for free in the scripts. And, finally, the messages can be extracted from the scripts just like they are extracted from C++ sources, storing them in the same file (so translators don’t have to care where the strings came from) and accessing them at runtime without any special configuration.

Message extraction

Translators need to know which strings they have to translate, and to make this possible the translatable strings must be extracted from the source code. To do this, an extraction utility analyzes the source code looking for those strings. The programmer must mark the translatable strings so the extraction utility knows which strings have to be extracted and which do not.

Marking a string usually just means wrapping it in a function or method call that will also translate that string to the appropriate language when the application is running.
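As a rough illustration of that double role, here is a minimal Ruby sketch. The function name mark, the CATALOG hash and its Spanish entry are all made up for the example; a real translation system would look the string up in its own catalogs instead.

```ruby
# Hypothetical in-memory catalog standing in for a real translation catalog.
CATALOG = {
  "Hello world" => "Hola mundo"
}

# Wrapping a literal in mark() flags it for an extraction tool and, at
# runtime, returns its translation (or the original string if none exists).
def mark(message)
  CATALOG.fetch(message, message)
end

puts mark("Hello world")
puts mark("Goodbye world")
```

The second call has no entry in the catalog, so the original string is returned untouched, which is also how gettext-style systems usually behave for untranslated messages.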

To extract the strings, gettext provides xgettext, which understands source code in a lot of programming languages, including Python. However, it doesn’t yet explicitly support JavaScript or Ruby. When a programming language is unknown, xgettext falls back (as far as I know) to C-like strings and functions. So, although neither JavaScript nor Ruby is explicitly supported, double-quoted, C-like strings can apparently be extracted without problems. But it doesn’t support single-quoted strings, custom delimited strings or function calls without parentheses.

See the following Ruby code as an example (note that mark is just an example function name):

mark("Hello world double quotes") # Extracted by xgettext
object.mark("Hello world method") # Extracted by xgettext

mark('Hello world single quotes') # Not extracted by xgettext
mark(%/Hello world custom quotes/) # Not extracted by xgettext
mark "Hello world without parenthesis" # Not extracted by xgettext
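To get a feeling for why only the first two calls are picked up, here is a toy extractor in Ruby. It is not how xgettext works internally, just a sketch that assumes a C-like scanner looks for the function name, an opening parenthesis and a double-quoted string. The names C_STYLE_CALL, extract_c_style and SAMPLE are invented for the example.

```ruby
# Toy pattern: the function name, an opening parenthesis and a
# double-quoted string (escaped characters are allowed inside it).
C_STYLE_CALL = /\bmark\s*\(\s*"((?:[^"\\]|\\.)*)"/

# Returns every string that the C-style pattern manages to capture.
def extract_c_style(source)
  source.scan(C_STYLE_CALL).flatten
end

# The same five calls used in the example above, kept as plain text.
SAMPLE = <<~'RUBY'
  mark("Hello world double quotes")
  object.mark("Hello world method")
  mark('Hello world single quotes')
  mark(%/Hello world custom quotes/)
  mark "Hello world without parenthesis"
RUBY

puts extract_c_style(SAMPLE)
```

Only the double-quoted strings passed in parentheses are printed; the single-quoted, custom delimited and parenthesis-free variants never match the pattern.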

xgettext also has a very useful feature: it lets you specify how the strings were marked in the source code. So it supports the default gettext function names, but can also extract strings marked with custom functions. For example, you can say “wherever you find a function called i18nc with 2 arguments, treat the first argument as the string context and the second argument as the string to be translated”. This is used by KDE, as the KDE i18n functions have different names than the canonical gettext functions.
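A sketch of how such a call could be assembled is shown below, written in Ruby so it can sit next to the scripts. The --keyword, --language, --from-code and --output options are real xgettext options, and “i18nc:1c,2” is the keyword spec for the i18nc example just described. Everything else is an assumption made for illustration: the file names, the helper name xgettext_command, forcing the C++ parser for every file, and the extra i18np and i18ncp entries (plural variants that, as far as I know, KDE also provides).

```ruby
# Keyword specs in xgettext's "name:argument positions" syntax; a "c"
# suffix marks the argument that holds the context.
KDE_KEYWORDS = {
  "i18n"   => "1",
  "i18nc"  => "1c,2",
  "i18np"  => "1,2",
  "i18ncp" => "1c,2,3"
}

# Builds the argument list only; it does not run xgettext.
def xgettext_command(sources, output)
  ["xgettext", "--language=C++", "--from-code=UTF-8"] +
    KDE_KEYWORDS.map { |name, spec| "--keyword=#{name}:#{spec}" } +
    ["--output=#{output}"] + sources
end

# Hypothetical file names, used only to show the resulting command line.
puts xgettext_command(["main.cpp", "tutorial.rb"], "messages.pot").join(" ")
```

Keeping the keywords in one table makes it easy to see which functions the extraction knows about, and to pass the resulting list to system(*command) once the paths are real.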

Note that the file created by xgettext when the messages are extracted uses a format understood by gettext (although other translation systems may be able to read those files). Anyway, this is not a problem when using the Kross Translation Module because, as already said, KDE uses (an enhanced) gettext as its translation system.

In fact, the extraction of strings in KDE is done with xgettext, and it is automatic if the application is part of KDE’s subversion repository, although it can also be simplified a lot with a script for 3rd party applications, as shown in i18n Build System. However, the scripts for subversion and 3rd party applications given in that article only support C/C++ files. They must be adjusted to also look for messages in Kross scripts. Take a look at the changes in KTutorial commit #65 as an example.

The gettext project for Ruby provides rgettext, which is an xgettext tailored for Ruby source code. However, it only works for Ruby source code, and as far as I can tell it doesn’t allow you to specify how the strings are marked (maybe the -r argument lets you extend the behavior of rgettext, but I haven’t tested it), so you can only use the canonical gettext function names. That is, it can’t extract messages from a Ruby script that uses the Kross Translation Module (as its function names are the same as those used in the KDE libraries).

So, in the end, in order to be able to extract messages from Kross scripts that use the translation module you just have to include the scripts among the files scanned for strings, and keep a C/C++ style in JavaScript and Ruby strings and translation module calls.

Runtime translation, catalog location…

Another section that starts with bold letters… It means that there is a lot to say about this subject, right? Wrong. That’s one of the nicest things about using the translation module: once the messages from the scripts are extracted along with the messages from the C++ code you are done. They will be treated like the C++ internationalized strings when translated, merged, installed and used at runtime. Does string translation already work in C++? Then it will work for Kross scripts too 🙂

Well, I lied. There is one case where this is not so easy: when the scripts aren’t part of the application. If a script is provided as an add-on for the application by 3rd parties, the messages from that script probably weren’t extracted with the rest of the application’s messages. So the script must also provide its own translation files as additional data (for example, packaging the script and the translations in a tar.gz file or something like that). The translation files would need to be loaded at runtime, which would require some infrastructure.

Anyway, I haven’t explored this scenario as it seems very strange, at least for tutorials. If someone makes a nice scripted tutorial for an application, that tutorial is likely to end up as part of the application itself. It would be very strange if the tutorial was kept as a separate entity, and even stranger if it got its own translations (getting localized seems like a sign that it is good enough to become part of the application 😉 ).