Tuesday 16 September 2014

Do you play English? Part 2

In this post I will continue to write about translating games for the ScummVM project. This is the second part of a three-part series.

Note: This post contains an embedded source code example that is not visible on the RSS feed.

Part 2: Improve the original translation of a game


Sometimes the official translation of a game could be mistaken for the output you would have gotten from AltaVista's machine translation in the 90s. And I am not exaggerating.

The French version of Drascula was an example of this, and I am told the Italian version was not better - but my limited knowledge of the Italian language does not allow me to confirm. While this was hilarious in its own way, it distracted from the game, so we decided to provide an improved translation for both Italian and French. I didn't work at all on the Italian translation, so the examples I will take are all from the French translation. But most of the explanation would work for the other translations.

Here the strings are partly in the game data files and partly in the original executable. We extracted the strings from the original executable, and in ScummVM they live instead in the drascula.dat file that we provide with ScummVM. So improving the translation meant modifying both this drascula.dat file and the game data files. Sometimes it also meant adding new strings, as for example the subtitles for some languages were missing in the Von Braun cutscenes.

Modifying the strings in the drascula.dat file is easy. The strings are hardcoded in the source code of the tool used to generate that file, so we just need to modify that source code. The only small difficulty is that non-ASCII characters (e.g. accented characters, and we have a lot of those in French) use the Code Page 850 encoding, and in a C string literal we need to write each of them as a backslash followed by its octal value. For example, è has the decimal value 138 in CP850, which is 212 in octal, so it is written as '\212'. Therefore to get "Chèvre" ("Goat" in English, I think my brain was permanently damaged by working on Broken Sword) I would need to write "Ch\212vre".
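To make the octal escape concrete, here is a minimal sketch in C. The names goat_cp850 and dump_bytes are made up for this illustration and do not come from the ScummVM tool; only the string "Ch\212vre" is taken from the text above.

```c
#include <stdio.h>
#include <string.h>

/* "Chèvre" with the è written as its CP850 byte (octal 212 = decimal 138).
 * Hypothetical variable name, used only for this illustration. */
static const char goat_cp850[] = "Ch\212vre";

/* Print every byte of a string as a decimal number, which makes it easy
 * to check that an escape sequence produced the intended CP850 value. */
void dump_bytes(const char *s) {
    for (size_t i = 0; s[i] != '\0'; ++i)
        printf("%u ", (unsigned)(unsigned char)s[i]);
    printf("\n");
}
```

Calling dump_bytes(goat_cp850) lists six byte values, the third of which is the CP850 code for è. An octal escape in C reads at most three digits, so the 'v' that follows \212 is never swallowed by the escape.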

Modifying the data files is not much harder. Those files are actually ARJ archives, so you can easily decompress them using a tool that supports this format. The files with strings are those with the extension CAL (which contain the dialogs) and ALD (which contain the hotspots). But you cannot edit them directly; that would be too simple. They use a simple encryption: each byte is XORed with 0xFF.

For example the letter A in ASCII has a value of 65. In binary this gives 01000001.
When you XOR it with 0xFF (11111111 in binary) this gives: 10111110 (190 in decimal).
To get back to the original text you just need to XOR it again with 0xFF.

So I quickly wrote a simple C program to decrypt and re-encrypt the files:
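The embedded listing does not appear in this text version of the post. As a stand-in, here is a minimal sketch of what such a tool could look like. It is not my original program: the function names xor_ff and xor_ff_file are invented for this illustration, and it assumes that every byte of the file is XORed with 0xFF, as described above.

```c
#include <stdio.h>
#include <stddef.h>

/* XOR every byte of the buffer with 0xFF. Applying the function twice
 * restores the original bytes, so it both decrypts and re-encrypts. */
void xor_ff(unsigned char *buf, size_t len) {
    for (size_t i = 0; i < len; ++i)
        buf[i] ^= 0xFF;
}

/* Read in_path, XOR each byte with 0xFF and write the result to out_path.
 * Returns 0 on success and -1 on any I/O error. */
int xor_ff_file(const char *in_path, const char *out_path) {
    FILE *in = fopen(in_path, "rb");
    if (!in)
        return -1;
    FILE *out = fopen(out_path, "wb");
    if (!out) {
        fclose(in);
        return -1;
    }

    unsigned char buf[4096];
    size_t n;
    int status = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        xor_ff(buf, n);
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;
    fclose(in);
    if (fclose(out) != 0)
        status = -1;
    return status;
}
```

Because the operation is its own inverse, the same call is used in both directions: once on an extracted CAL or ALD file to get editable text, and once more on the edited text to produce the file the game expects.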


To give you an idea of how bad it was, here are some of the hotspots from the original version and the corresponding ones from my improved version.


| Original | Improved | Comment |
|---|---|---|
| PUIT | PUITS | A simple typo you might think. Maybe, if it had been the only one... |
| CIMTEIERE | CIMETIERE | Anagrams now? Maybe that was actually designed as a puzzle? |
| CAISSON | TIROIR | Where did that come from??? Canadian French maybe? |
| CERVEAUS | CERVEAUX | You may need a brain to know that the plural of words ending in 'eau' takes an X and not an S. |
| TRONC | COFFRE | Maybe my favorite. It makes me think that the "translator" may have been working from the English text and not the Spanish one. TRONC is a tree trunk. COFFRE is a chest... or a car trunk. |
| ARMARIO | ARMOIRE | OK, they forgot to translate that one. |
| BAUL | COFFRE | And that one. |
| ESPEJO | MIROIR | And also that one. |
| PUERTA | PORTE | Did I download the Spanish version by mistake? |

And there are many more like this. The dialogs were not much better. For those who understand French, here are a few examples of original dialogs:
  • Quelle merde de jeu dans lequel le personnage principal meurt! Un instant, qu'y a-t-il de mon dernier désir?
  • Et bien merci et au revoir. Que tu la dormes bien.
  • Non rien. Je m'es allais déjà.
  • Comment peux-je tuer un vampire?
  • Qu'est-ce qu'on suppose que tu fais?
I will stop there, but I could fill pages like that. So if you speak French and fancy a good laugh, feel free to download the original French version (not the updated one) from our web site and play the game.

Another game for which we improved an existing translation is Mortville Manor. This is a French game that was also released in German and English, except that the DOS version was never released in English. Strangerke (one of the developers who worked on the engine in ScummVM) extracted the English strings from the Amiga and Atari versions, but all the dialogs were still missing. Strangerke created a Google Doc spreadsheet with the French and English strings and, with a ScummVM user named Hugo, we started fixing the existing English translation and translating the missing strings. Then we implemented a small tool to generate a data file from these strings (mort.dat, which is distributed with ScummVM) so that users can play in English using the game data files from the DOS French or German version.

For Mortville Manor, we actually also bundled the French and German strings and the data for the menu in the mort.dat data file. That way we can easily improve those languages as well. For now they have not been improved, and only the original French and German versions are available. I have been told the German one is not perfect. So if you like this game, speak German, would like to improve the German translation, and have a lot of free time on your hands, you can contact me ;-)


See you tomorrow for part 3.
