Help:Content translation/Translating/Translation quality/zh

From Linux Web Expert

Note: When you edit this page, you agree to release your contribution under the CC0. See Public Domain Help Pages for more info.

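When creating a translation, it is essential to review the content before publishing it. You need to make sure that the resulting content preserves the meaning of the original and reads naturally in the target language. The initial machine translation is a useful aid that helps speed up the translation process, and the tool encourages users to review and heavily edit that initial content.

Several mechanisms are in place to make sure that translators edit the initial machine translation adequately. The translation editor tracks how much of the initial machine translation the user has modified and applies different limits on that basis: it either prevents publishing or encourages the user to review the content further.

In this way, the tool lets users take full advantage of machine translation while preventing the creation of low-quality content that has not been substantially reviewed. The sections below describe how these limits work, how they can be adjusted to the needs of each language, and how the quality of content produced with the tool can be measured.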

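Limits to encourage reviewing the translation

The system checks the percentage of the initial machine translation that the user has modified. From this it knows how many words of the initial translation were modified, removed, or added. The check is applied at two levels: each paragraph and the whole document. The different thresholds at each level are explained next.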
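The word-level comparison described above can be illustrated with a minimal sketch. It is not the tool's actual algorithm: the function name `unmodified_ratio`, whitespace tokenization, and the use of Python's `difflib` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def unmodified_ratio(machine_translation: str, edited: str) -> float:
    """Fraction of the initial MT words that survive unchanged in the edited text.

    Assumption: words are split on whitespace and compared exactly.
    """
    mt_words = machine_translation.split()
    edited_words = edited.split()
    if not mt_words:
        return 0.0
    matcher = SequenceMatcher(a=mt_words, b=edited_words, autojunk=False)
    # Matching blocks are runs of words present, in order, in both texts.
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    return unchanged / len(mt_words)
```

Because the denominator is the number of machine-translated words, appending new text without touching the machine translation leaves the ratio at 1.0 in this sketch.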

Limits for the whole document

File:Cx-limits-publish.png
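An error message is shown when trying to publish a machine translation with too much unmodified content. This threshold was adjusted for Indonesian based on feedback from editors.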

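Publishing is prevented if 95% or more of the whole document consists of unmodified machine translation. This limit prevents the publication of near-raw machine translation and of obvious vandalism. It also prevents users from only adding content without editing the machine-translated parts. As described below, this limit can be adjusted for each language.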
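As a rough illustration, the whole-document check with a per-language override might look like the sketch below. The names and the override table are hypothetical; the 70% value for Indonesian (`id`) is taken from the adjustment example later on this page, and treating the boundary value itself as blocked is an assumption.

```python
# Default limit from this page: 95% or more unmodified MT blocks publishing.
DEFAULT_DOCUMENT_LIMIT = 0.95

# Hypothetical override table keyed by target language code.
DOCUMENT_LIMIT_OVERRIDES = {"id": 0.70}


def can_publish_document(unmodified_ratio: float, target_language: str) -> bool:
    """Return True when the document is below the limit for its language."""
    limit = DOCUMENT_LIMIT_OVERRIDES.get(target_language, DEFAULT_DOCUMENT_LIMIT)
    # Assumption: a ratio exactly at the limit is treated as blocked.
    return unmodified_ratio < limit
```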

Limits for each paragraph

File:Cx-limits-paragraph.png
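A warning is shown for a specific paragraph in which unmodified machine translation exceeds the limit.

The percentage of user modifications is also measured for each paragraph. A paragraph is considered problematic when it contains more than 85% of the initial machine translation (or, when the content was copied from the source document, more than 60% of unmodified content).

The translation editor shows a warning for each paragraph considered problematic, encouraging the user to edit it further. In some cases users can still publish, but the resulting page may be added to a tracking category of potentially unreviewed translations for the community to review. In other cases users may not be allowed to publish at all.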
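The per-paragraph check described above can be sketched as follows. The `Paragraph` type and its field names are hypothetical; only the 85% and 60% thresholds come from this page.

```python
from dataclasses import dataclass

MT_PARAGRAPH_LIMIT = 0.85      # initial content came from machine translation
SOURCE_PARAGRAPH_LIMIT = 0.60  # initial content was copied from the source


@dataclass
class Paragraph:
    unmodified_ratio: float  # share of the initial content left unchanged
    origin: str              # "mt" or "source" (hypothetical labels)


def is_problematic(paragraph: Paragraph) -> bool:
    """A paragraph is problematic when it exceeds the limit for its origin."""
    if paragraph.origin == "mt":
        limit = MT_PARAGRAPH_LIMIT
    else:
        limit = SOURCE_PARAGRAPH_LIMIT
    return paragraph.unmodified_ratio > limit


def count_problematic(paragraphs: list[Paragraph]) -> int:
    """Number of paragraphs that would show a warning."""
    return sum(1 for p in paragraphs if is_problematic(p))
```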

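The following are some of the factors considered when determining whether a user is allowed to publish (some of them are still in development):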

  • The number of problematic paragraphs. Users are prevented from publishing translations with 50 or more problematic paragraphs.

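Translations with fewer than 50 problematic paragraphs can be published, but those with 10 to 49 problematic paragraphs are added to a tracking category of potentially unreviewed translations for the community to review.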

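  • Previously deleted translations. To prevent recurring problems, the tool identifies users whose published translations were deleted in the last 30 days, and imposes much stricter limits on their subsequent translations.

For such users, translations with 10 or more problematic paragraphs are prevented from being published, and translations with 9 or fewer problematic paragraphs are added to the tracking category of potentially unreviewed translations for the community to review.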

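  • User confirmation. A less strict threshold is considered for paragraphs that a user marks as resolved, which is taken as a signal that the user reviewed and confirmed the status of the translation.

For paragraphs that show the unmodified-content warning but that the user marks as resolved, a less strict threshold applies (accepting up to 95% of machine translation or 75% of source content). This accommodates cases where the automatic translation is very good, while still avoiding potential abuse of the feature (that is, the user's confirmation is not followed blindly).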
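The factors listed above can be combined into a small decision sketch. All names are hypothetical, and one point is an assumption: this page does not say what happens to a translation with zero problematic paragraphs from a user with recently deleted translations, so the sketch publishes it without tracking.

```python
from enum import Enum


class Outcome(Enum):
    PUBLISH = "publish"
    PUBLISH_AND_TRACK = "publish_and_track"
    BLOCK = "block"


def paragraph_limit(origin: str, marked_resolved: bool) -> float:
    """Unmodified-content limit for a paragraph; relaxed once marked as resolved."""
    if origin == "mt":
        return 0.95 if marked_resolved else 0.85
    return 0.75 if marked_resolved else 0.60


def publishing_outcome(problematic_paragraphs: int, had_recent_deletion: bool) -> Outcome:
    """Decide what happens on publish from the number of problematic paragraphs."""
    if had_recent_deletion:
        # Stricter limits for users with translations deleted in the last 30 days.
        if problematic_paragraphs >= 10:
            return Outcome.BLOCK
        if problematic_paragraphs >= 1:
            return Outcome.PUBLISH_AND_TRACK
        return Outcome.PUBLISH  # assumption: nothing problematic, nothing to track
    if problematic_paragraphs >= 50:
        return Outcome.BLOCK
    if problematic_paragraphs >= 10:
        return Outcome.PUBLISH_AND_TRACK
    return Outcome.PUBLISH
```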

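Content not affected by the limits

Some content is not expected to be heavily edited and is therefore not considered when applying the limits above. Very short section titles, citations, and lists of references are excluded from the checks. Otherwise, users could get misleading warnings about content that is not expected to be translated, such as book titles in a reference or other proper names.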

Limits on the mobile experience

For the mobile experience, the initial set of limits follows a simpler approach. At the moment, only the overall percentage of unmodified machine translation for the whole translation is considered. On mobile, the whole translation consists of just one section of the article.

In particular, a warning is shown when the percentage of unmodified machine translation is over 85% for the whole section, and publishing is prevented when the percentage of unmodified machine translation is over 95%.
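The two mobile thresholds can be expressed as a short sketch. The function name and the returned labels are hypothetical; only the 85% and 95% values come from this page.

```python
def mobile_section_status(unmodified_ratio: float) -> str:
    """Classify a translated section on mobile by its share of unmodified MT."""
    if unmodified_ratio > 0.95:
        return "blocked"  # publishing is prevented
    if unmodified_ratio > 0.85:
        return "warning"  # publishing is allowed, but a warning is shown
    return "ok"
```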

Feedback on how the limits system works in the mobile context would be very useful to determine how to evolve this initial approach.

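Adjusting the limits

The limits described above provide a general set of mechanisms, but they may need to be adjusted to the particular needs of each wiki. Based on initial evaluations, the amount of modification that an initial machine translation requires can range from 10% to 70%, depending on the language pair. On some wikis the default limits may be too strict, producing unnecessary noise or preventing the publication of perfectly valid translations. On other wikis the limits may not be strict enough, allowing the publication of translations that have not been edited enough.

Adjusting the different thresholds allows each wiki to tailor the tool's limits to its particular needs. Feedback from native speakers is essential to adjust the limits properly. If, based on your experience creating or reviewing translations, the current limits do not seem to work well, please share your feedback and we can explore how to better adjust them.

When providing feedback about adjusting the thresholds, we recommend that you first create several example translations (make sure to check the publishing options if your test is not intended to be published as regular content). When testing how the limits work for your language, it is useful to keep the following in mind: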

  • Check for both cases. Make sure to check how the limits work both for translations where the content has not been edited enough and for translations where it has.

In this way, you can more easily find the right balance for the tool's limits. Checking only one type of problem may push the thresholds too far in the opposite direction.

  • Check different content. Content in our wikis is highly diverse, and machine translation may work much better for some cases compared to others.

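For example, content full of numeric data or technical names may require less editing than content with more descriptive text. Make sure to test by translating a variety of article types of different lengths and with different kinds of content.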

  • Prepare to iterate. Adjusting the thresholds is an iterative process.

It may require custom adjustments to the thresholds or improvements to the general approach. In any case, after each change, further testing may be needed to verify the improvements made.

Adjusting the limits in collaboration with editors has proven to be effective. For example, initial results show that the Indonesian community was able to significantly reduce the number of problematic translations they were receiving by restricting the publication of translations with more than 70% of unmodified machine translation content. Similar adjustments have been made for Telugu and Assamese language wikis. There is no automatic tool that is infallible, and these limits are not an exception.

The process of content review by the community is still essential, but these limits provide communities with a tool to reduce the number of translations they have to focus on, making the review process much more effective. Please share your feedback and we can explore how to better adjust them.

Tracking potentially unreviewed translations

A tracking category with the name "cx-unreviewed-translation-category" is provided for communities to easily find articles that have been published with some content exceeding the recommended limits.

You can find this category in the list of tracking categories on each wiki. Using it, you can track articles that passed the limits preventing publication, but that still had some paragraphs that were edited less than expected. For example, the Indonesian Wikipedia's category includes articles that have less than 40% of machine translation overall, but which have some paragraphs with more than 80% of unmodified machine translation.

Measuring translation quality

Automatically evaluating the quality of content is not trivial. Deletion ratios provide a useful estimate of whether the content created was good enough for the editor's community not to delete it. Based on the analysis of deletion ratios, articles that are created as translations are less likely to be deleted when compared with articles created from scratch. This suggests that it may not be practical to set the limits for participation through translating much higher than those set for other ways of article creation.

Find published translations

Content translation adds a contenttranslation edit tag to the published translations. This allows communities to use Recent changes, and similar tools, to focus on pages created using the translation tool. In addition, data on published translations and the statistics for machine translation use are available for anyone to analyze.
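As one possible way to use the edit tag programmatically, the sketch below builds a MediaWiki Action API request for recent page creations carrying the contenttranslation tag. The helper name is hypothetical, and the `/w/api.php` path is an assumption that holds on Wikimedia wikis but may differ elsewhere.

```python
from urllib.parse import urlencode


def recent_translations_url(wiki_host: str, limit: int = 50) -> str:
    """Build an Action API URL listing new pages tagged as translations."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctag": "contenttranslation",  # the edit tag named on this page
        "rctype": "new",                # page creations only
        "rcprop": "title|user|timestamp|tags",
        "rclimit": limit,
        "format": "json",
    }
    # Assumption: the API endpoint lives at /w/api.php on the given host.
    return f"https://{wiki_host}/w/api.php?{urlencode(params)}"
```

For example, `recent_translations_url("id.wikipedia.org", limit=10)` returns a URL that can be opened in a browser or fetched with any HTTP client.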

Inspect a specific translation

File:Translation debugger example.webm
Translation debugger example

The Translation debugger is a tool that allows the inspection of some metadata for a given translation, including the percentage of machine translation used for the whole document, and the translation service used for each paragraph. For specific types of content such as templates, the Content Translation Server API can be queried to check how templates will be transferred across languages.

Additional restrictions based on user rights

File:Cx-limits-user-expertise.png
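Error shown when publishing is restricted based on user rights. This example reflects the English Wikipedia community's decision to restrict direct publication of articles in the main namespace to extended confirmed users.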

Some wikis have implemented additional restrictions on translation based on user rights to reduce the creation of low-quality translations. For example, English Wikipedia requires users to be extended confirmed, which means they need to make 500 edits on English Wikipedia before they are allowed to publish a translation as an article. Newer editors can still publish translated articles in the User: or Draft: namespaces, and then move the article to the mainspace.

This restriction was created before the system of limits described in this page was available, and it is not the recommended approach to encourage the creation of good quality translations.

Before adding restrictions that do not take into account the content created, consider going through the process of adjusting the limits of unmodified content as described above. The limits can be made as strict as needed to prevent low-quality translations, while still allowing publication by editors making good translations.