Help:Extension:Translate/Components/cs

From Linux Web Expert

Revision as of 03:46, 6 December 2023 by imported>FuzzyBot (Updating to match new version of source page)

The Translate extension is extensible in many ways. The most likely ways to extend Translate are to add support for new file formats or new message groups. Sometimes it is also useful to write a new message checker or to extend Translate using hooks. Sometimes you can get by with just the existing web API.

In addition to the concepts already mentioned, Translate has many other important concepts and classes that are useful to understand when working with it. This page aims to describe all the components of Translate in detail.

Primary extensible components

WebAPI

  • Detailed documentation about the API

In addition to hooks and interfaces, which can only be used from PHP code, the WebAPI provides access to many kinds of information and actions related to message groups and translations. It is built on the MediaWiki API framework and supports many output formats, such as JSON and XML.
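As a rough illustration, a WebAPI request is an ordinary MediaWiki API query. The sketch below only builds a request URL for listing message groups as JSON. The wiki address is a placeholder, and the `meta=messagegroups` module name should be checked against the API documentation of your installed version.

```php
<?php
// Build a MediaWiki API URL that asks for the list of message groups as JSON.
// The endpoint passed in below is a placeholder, not a real wiki.
function buildMessageGroupsUrl( string $endpoint ): string {
	$params = [
		'action' => 'query',
		'meta' => 'messagegroups',
		'format' => 'json',
	];
	// Pass the separator explicitly so php.ini settings cannot change it.
	return $endpoint . '?' . http_build_query( $params, '', '&' );
}

echo buildMessageGroupsUrl( 'https://wiki.example.org/w/api.php' ), "\n";
```

Switching `format` to `xml` would request the same data in XML.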

File format support (FFS)

The Translate extension supports translating non-wiki content, such as software interface messages, through File Format Support (FFS) classes. These classes implement the FFS interface and abstract the parsing and generation of file contents. FFS classes are used by the FileBasedMessageGroup class through YAML configuration files.
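To make the idea concrete, here is a minimal sketch of the two jobs a file format class performs: parsing file contents into a map of message keys to texts, and generating file contents from such a map. The class name `KeyValueFormat`, its method names and the `key=value` format are all invented for illustration and do not reproduce the actual FFS interface.

```php
<?php
// Hypothetical stand-in for a file format class; not the real FFS interface.
class KeyValueFormat {
	/** Parse "key=value" lines into an array of message key => message text. */
	public function read( string $contents ): array {
		$messages = [];
		foreach ( preg_split( '/\R/', $contents ) as $line ) {
			$line = trim( $line );
			// Skip blank lines and comment lines.
			if ( $line === '' || $line[0] === '#' ) {
				continue;
			}
			$parts = explode( '=', $line, 2 );
			if ( count( $parts ) === 2 ) {
				$messages[trim( $parts[0] )] = trim( $parts[1] );
			}
		}
		return $messages;
	}

	/** Generate file contents from an array of message key => message text. */
	public function write( array $messages ): string {
		$out = '';
		foreach ( $messages as $key => $text ) {
			$out .= "$key=$text\n";
		}
		return $out;
	}
}

$format = new KeyValueFormat();
$messages = $format->read( "# greetings\nhello=Hei\nbye = Hei hei\n" );
echo $format->write( $messages );
```

Keeping parsing and generation behind one small class is what lets the rest of the code treat every file format the same way.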

Message groups

Message groups bring together a collection of messages. They come in various types: translatable pages, SVG files or software interface messages stored in various file formats. Each message group instance has a unique identifier, name and description. In the code, message groups are primarily referenced by their identifier, and the MessageGroups class can be used to get the instance for a given identifier. Message groups can also control many actions related to the translation process, such as the allowed translation languages and the message group workflow states. Usually these behaviors fall back to the global defaults.

The two primary ways to register message groups to Translate are the TranslatePostInitGroups hook and YAML configuration.

Translation aids (helpers)

Translation aids are little modules that provide helpful and necessary information for the translator when translating. Different aids can provide suggestions from translation memory and machine translation, documentation about the message or even such a basic thing as the message definition – the text that needs to be translated.

Translate comes with many aid classes. Currently there is no hook to add new classes. Each class that extends the TranslationAid class only needs to implement one method called getData. It should return the information in a structured format (nested arrays), which is then exposed via the ApiQueryTranslationAids WebAPI module. In addition to the aid class, changes are needed in the translation editor(s) to actually use the provided data.
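The following sketch shows the general shape of such a class: one method that returns nested arrays. So that it runs without MediaWiki, it uses a stand-in base class; `AidSketch`, `DefinitionStatsAid` and the array keys are hypothetical names, not part of Translate.

```php
<?php
// Stand-in base class so the sketch runs without MediaWiki (hypothetical).
abstract class AidSketch {
	/** @var string The text that needs to be translated. */
	protected $definition;

	public function __construct( string $definition ) {
		$this->definition = $definition;
	}

	/** Return the aid's information as nested arrays. */
	abstract public function getData(): array;
}

// Hypothetical aid that reports simple statistics about the definition.
class DefinitionStatsAid extends AidSketch {
	public function getData(): array {
		return [
			'value' => $this->definition,
			'stats' => [
				'bytes' => strlen( $this->definition ),
				'words' => str_word_count( $this->definition ),
			],
		];
	}
}

$aid = new DefinitionStatsAid( 'Save changes' );
// Nested arrays serialize directly into an API-friendly format such as JSON.
echo json_encode( $aid->getData() ), "\n";
```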

Machine translation services are a special case of translation aids. See the next section.

Web services

Adding more machine translation services can easily be done by extending the TranslationWebService class. See the webservices subdirectory for examples. You will need some basic information to implement such a class:

  • URL for the service
  • What language pairs are supported
  • Whether they use language codes that differ from the codes used in MediaWiki
  • Whether the service needs an API key

When you have this information, it is straightforward to write the mapCode, doPairs and doRequest methods. You should use the TranslationWebServiceException to signal errors. The errors are automatically logged and tracked, and if the service goes down, it will automatically be suspended to avoid unnecessary requests to it. The suggestions will automatically be displayed in the translation editor via the MachineTranslationAid class and the ApiQueryTranslationAids WebAPI module. See also $wgTranslateTranslationServices to see how those services are registered.
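As an illustration of the language code question, the sketch below maps MediaWiki language codes to the codes of an imaginary service and checks whether a language pair is supported. The function names, the code table and the pair list are assumptions made for this example; they are not the actual signatures of mapCode or doPairs.

```php
<?php
// Hypothetical table: MediaWiki language code => code used by the service.
const SERVICE_CODE_MAP = [
	'zh-hans' => 'zh-CN',
	'zh-hant' => 'zh-TW',
	'nb' => 'no',
];

// Translate a MediaWiki code to the service's code; unknown codes pass through.
function mapCodeSketch( string $code ): string {
	return SERVICE_CODE_MAP[$code] ?? $code;
}

// $pairs lists the service's supported pairs as source code => [ target codes ].
function isPairSupported( array $pairs, string $from, string $to ): bool {
	$targets = $pairs[mapCodeSketch( $from )] ?? [];
	return in_array( mapCodeSketch( $to ), $targets, true );
}

$pairs = [ 'en' => [ 'fi', 'zh-CN' ] ];
var_dump( isPairSupported( $pairs, 'en', 'zh-hans' ) );
var_dump( isPairSupported( $pairs, 'en', 'nb' ) );
```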

Message checkers

We use computers to catch simple errors in translations, such as unbalanced parentheses or a missing variable placeholder. These checkers can emit warnings, which are displayed in the translation editor and updated continuously. Any warning present in a saved translation will also mark the translation as outdated (fuzzy in jargon). Each message group determines which checks it uses.
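The two checks mentioned above can be sketched in a few lines. This is a standalone illustration, not the checker code that ships with Translate; the function names are hypothetical, and the placeholder syntax is assumed to be `$1`, `$2` and so on.

```php
<?php
// Return true if every opening parenthesis has a matching closing one.
function hasBalancedParentheses( string $text ): bool {
	$depth = 0;
	$length = strlen( $text );
	for ( $i = 0; $i < $length; $i++ ) {
		if ( $text[$i] === '(' ) {
			$depth++;
		} elseif ( $text[$i] === ')' ) {
			$depth--;
			if ( $depth < 0 ) {
				// A closing parenthesis appeared before its opening one.
				return false;
			}
		}
	}
	return $depth === 0;
}

// Return placeholders such as $1 that the definition uses but the translation lacks.
function missingPlaceholders( string $definition, string $translation ): array {
	preg_match_all( '/\$\d+/', $definition, $inDefinition );
	preg_match_all( '/\$\d+/', $translation, $inTranslation );
	$missing = array_diff( $inDefinition[0], $inTranslation[0] );
	return array_values( array_unique( $missing ) );
}

var_dump( hasBalancedParentheses( 'Muokkaa (osio' ) );
print_r( missingPlaceholders( '$1 edited $2', '$1 muokkasi' ) );
```

A checker would turn each failed check into a warning for the translation editor.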

Other core components

Message collection

Message collection provides access to the list of messages for a message group. It is used to load a set of messages for a certain group in a certain language. It provides paging and filtering functionality.

There is currently a limitation that all messages in a collection must be in the same namespace. This prevents the creation of aggregate groups that include groups which have messages in different namespaces.

Here is a short example of how to use a message collection to load all Finnish translations of the group core and print the first ten of them:

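// Look up the message group by its identifier.
$group = MessageGroups::getGroup( 'core' );
// Create a collection of the group's messages for Finnish.
$collection = $group->initCollection( 'fi' );
// Leave out messages that the group marks as ignored.
$collection->filter( 'ignored' );
// Keep only messages that have a translation.
$collection->filter( 'translated', false );
$collection->loadTranslations();
// Keep only the first ten messages.
$collection->slice( 0, 10 );
// Print the page title and the translation of each remaining message.
foreach ( $collection->keys() as $mkey => $title ) {
	echo $title->getPrefixedText() . ': ';
	echo $collection[$mkey]->translation() . "\n\n";
}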

Message

Helper classes

Font finder

When rendering bitmap graphics, suitable fonts are needed for each language or script. To solve this problem, the FCFontFinder class was written. It uses the fc-match command of the package fontconfig (so this doesn't work on Windows) to find a suitable font. Many additional fonts should be installed on the server to make this useful. It can either return a path to a font file or the name of the font, whichever is more suitable.
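The sketch below shows one way to ask fontconfig for a font file that covers a given language. It is not the FCFontFinder implementation; the function names are hypothetical, and it assumes a Unix-like system where `fc-match` is installed.

```php
<?php
// Build an fc-match command line asking fontconfig for a font that covers
// the given language. The %{file} format prints the path of the best match.
function buildFontMatchCommand( string $languageCode ): string {
	$pattern = ':lang=' . $languageCode;
	return 'fc-match -f ' . escapeshellarg( '%{file}' ) . ' ' . escapeshellarg( $pattern );
}

// Run the command. Returns null when nothing was printed, for example when
// fc-match is not installed.
function findFontFile( string $languageCode ): ?string {
	$output = shell_exec( buildFontMatchCommand( $languageCode ) . ' 2>/dev/null' );
	$path = is_string( $output ) ? trim( $output ) : '';
	return $path === '' ? null : $path;
}

echo buildFontMatchCommand( 'fi' ), "\n";
```

Using `%{family}` instead of `%{file}` in the format string would return the font name rather than the path.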

Message group cache

The messages of file-based message groups are stored in CDB files. Each language of each group has its own CDB cache file. There are two reasons for the cache files.

First, they provide constant and efficient access to message data, avoiding the potentially expensive parsing of files in various places. For example, the list of message keys for each group can be loaded efficiently when rebuilding a message index.

Second, the cache files are used together with the translations in the wiki to process external message changes. Having a snapshot of the state of translations in files and in the wiki (hopefully consistent at that point) allows us to automatically deduce whether something has been changed in the wiki or externally and to make intelligent choices. This leaves only real conflicts (messages changed both externally and on the wiki since the last snapshot) to be resolved by the translation administrator.
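The comparison described above amounts to a three-way check per message. The sketch below is a simplified illustration under the assumption that each state is a plain message text; the function name and the result labels are hypothetical and do not come from Translate.

```php
<?php
// Classify one message by comparing the cached snapshot with the current
// file and wiki values. Each argument is a message text, or null if absent.
function classifyChange( ?string $snapshot, ?string $file, ?string $wiki ): string {
	$fileChanged = $file !== $snapshot;
	$wikiChanged = $wiki !== $snapshot;

	if ( !$fileChanged && !$wikiChanged ) {
		return 'unchanged';
	}
	if ( $fileChanged && !$wikiChanged ) {
		// Only the file differs from the snapshot: an external change.
		return 'external change';
	}
	if ( !$fileChanged && $wikiChanged ) {
		// Only the wiki differs from the snapshot: a change made in the wiki.
		return 'wiki change';
	}
	// Both sides changed: harmless if they agree, otherwise a real conflict.
	return $file === $wiki ? 'same change on both sides' : 'conflict';
}

echo classifyChange( 'Tallenna', 'Tallenna sivu', 'Tallenna' ), "\n";
echo classifyChange( 'Tallenna', 'Tallenna sivu', 'Talleta' ), "\n";
```

Only the last case needs a human decision; the others can be handled automatically.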

Message group utilities

Message index

Message index is a reverse map of all known messages. It provides efficient answers to two questions: is this a known message, and which groups does this message belong to? It needs to be fast for both single and multiple message key lookups. Several backends are implemented, with different trade-offs.

  • A serialized file is fast to parse, but does not provide random access and is very memory inefficient when the number of keys grows.
  • A CDB file takes more disk space, but provides random access and reasonably fast lookups, while loading everything into memory is slower.
  • The database backend provides efficient random access and full loads at the expense of slightly slower individual lookups. It also does not need to write to any files, which avoids permission problems.
  • Memory backends (memcached, apc) are also provided. They could be useful alternatives to the database backend in multi-server setups, to reduce database contention.
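Whatever the backend, the data structure is the same reverse map. A minimal in-memory sketch, with a hypothetical function name and made-up group ids and message keys:

```php
<?php
// Build a reverse map: message key => list of group ids that contain it.
// $groups maps each group id to the list of its message keys.
function buildMessageIndex( array $groups ): array {
	$index = [];
	foreach ( $groups as $groupId => $keys ) {
		foreach ( $keys as $key ) {
			$index[$key][] = $groupId;
		}
	}
	return $index;
}

$index = buildMessageIndex( [
	'core' => [ 'mainpage', 'edit' ],
	'ext-example' => [ 'edit', 'example-desc' ],
] );

// Is this a known message?
var_dump( isset( $index['edit'] ) );
// Which groups does this message belong to?
print_r( $index['edit'] );
```

Note that the whole map is rebuilt from the full list of groups, which is why rebuilds become costly as the number of groups and keys grows.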

Message index does not support incremental rebuilds. Rebuilding the index therefore becomes relatively resource intensive as the number of message groups and message keys increases. Depending on the message group, this might involve parsing files or doing database queries and loading the definitions, which can take a lot of memory. The message index rebuild is triggered in various places in Translate, and by default it is executed immediately during the request. As it gets slower, it can be deferred to the MediaWiki job queue and run outside of web requests.

Message table

Metadata table

Revtag

Stats code

String matcher/mangler

Ttmserver (translation memory)

Ttmserver is the name of the translation memory interface. It supports multiple backends for inserting and querying translation suggestions. The code is located under the ttmserver directory.

Misc stuff: RC integration, preferences, toolbox, jobs

Repository layout

Files in the root of the repository include:

  • Standard MediaWiki extension files like Translate.php, translations and some documentation files like hooks.txt and README, which includes change notes.
  • Major translate classes like MessageCollection and Message and some misc utilities not yet moved under utils.

The rest of the code is under subdirectories. Each major part has its own subdirectory:

  • api - for WebAPI code
  • ffs - for file format support code
  • messagegroups - for message groups
  • scripts - for command line scripts
  • tag - for page translation code
  • ttmserver - for translation memory code
  • specials - for all special pages
  • tests - for all PHP unit tests

Most of the remaining code is under utils. There are also some additional folders for non-code files:

  • data - for miscellaneous data files
  • libs - for bundled library dependencies
  • resources - for all css, scripts and images
  • sql - for all SQL table definitions