Module:Formatnum
This module is rated as ready for general use. It has reached a mature form and is thought to be bug-free and ready for use wherever appropriate. It is ready to mention on help pages and other resources as an option for new users to learn. To reduce server load and bad output, it should be improved by sandbox testing rather than repeated trial-and-error editing.
This module is subject to page protection. It is a highly visible module in use by a very large number of pages. Because vandalism or mistakes would affect many pages, and even trivial editing might cause substantial load on the servers, it is protected from editing.
This module formats a number in the local format, based on MediaWiki data. This module is used by Template:Formatnum. For usage instructions, please take a look at Template:Formatnum.
Current limitations:
- Numbers with more than 14 decimals are not supported.
- It still does not allow more than 20 different "KnownLanguages" (in MediaWiki's core library for Lua) on one page. This means a page can contain at most 20 module transclusions that each format a number in a different language.
TODO:
- Improve these limits by not using mw.formatNum() — and fix the various bugs in that MediaWiki module (including missing or incorrect data for some languages).
- See /engines/LuaCommon/LanguageLibrary.php in Scribunto PHP library implementing the MediaWiki module.
- See /languages/Language.php in MediaWiki core library with the default support code for all known languages
- See /languages/classes/Language*.php in MediaWiki core library for replacement code specific to some languages
- See /resources/src/mediawiki.language/mediawiki.language.numbers.js for similar function in MediaWiki client-side JavaScript library
- Support more languages and localized digits (we need more complete mappings of languages to their numeric scripts for digits, more data for localized separators and the number of digits in groups).
- Properly handle rounding of the least significant figure when there are decimals in excess for the specified precision, instead of truncating them.
- Reference data:
- See By-Type Chart: Core Data: Numbering Systems in Unicode CLDR data (mappings of languages to numbering systems)
- /common/supplemental/numberingSystems.xml in Unicode CLDR data (digits and algorithms for each numbering system)
- Rule-Based Number Formatting in Unicode TR35: LDML (Locale Data Markup Language)
- icu::RuleBasedNumberFormat Class Reference: Detailed Description in ICU documentation, for the RBNF syntax used in CLDR data for implementing algorithmic number systems.
- See By-Type Chart: Numbers: Symbols in Unicode CLDR data (separators, signs, etc.)
- See By-Type Chart: Core Data: Number Formatting Patterns in Unicode CLDR data (formatting numbers using those numbering systems)
- Number Format Patterns in Unicode TR35: LDML (Locale Data Markup Language)
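The digit mapping mentioned in the TODO list can be illustrated with a minimal sketch in plain Lua. It assumes a lookup table keyed by ASCII digit and a single pattern pass, which differs from the module's own loop of ten substitutions; the names thai_digits and localize_digits are hypothetical, and the Thai digits are the ones already listed in the module's substitution table.

```lua
-- Hypothetical lookup table: ASCII digit -> Thai digit
-- (same glyphs as the "th" entry of the module's substitution table).
local thai_digits = {
    ['0'] = '๐', ['1'] = '๑', ['2'] = '๒', ['3'] = '๓', ['4'] = '๔',
    ['5'] = '๕', ['6'] = '๖', ['7'] = '๗', ['8'] = '๘', ['9'] = '๙',
}

-- Replace every ASCII digit in text using the given table.
-- Separators and other characters are left untouched.
local function localize_digits(text, digits)
    -- The extra parentheses drop gsub's second return value (the match count).
    return (string.gsub(text, '%d', digits))
end

print(localize_digits('12,345.00', thai_digits)) -- ๑๒,๓๔๕.๐๐
```

A table keyed by digit keeps the data for each numbering system in one place, so adding a system would only mean adding one more table of ten entries.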
Using this module from templates
Usage:
{{formatnum|1=value|2=lang|prec=prec|sep=compact}}
Parameters:
- lang — language code as a string (e.g. 'en', 'de', etc.). If the language is not specified (nil or empty string) or not supported, the current user's language (as reported by MediaWiki's {{int:Lang}}) will be used.
- The lang named parameter is also a supported alias of the 2nd parameter.
- The value "arabic-indic" is also currently supported as an alias, and replaced by a supported language code.
- See formatNum() below for the expected values and the description of other parameter values.
Note:
- This template internally uses {{#invoke:Formatnum|main}} to pass its parameters indirectly to the main() function of this module, which reads them from the parent frame.
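How main() picks up the template's parameters from the parent frame can be sketched outside MediaWiki with a stand-in frame object. This is only an illustration under stated assumptions: mock_frame and read_args are hypothetical names, and a real Scribunto frame is not a plain table.

```lua
-- Hypothetical stand-in for the frame object that Scribunto passes to main().
local function mock_frame(template_args)
    local parent = { args = template_args }
    return { getParent = function(self) return parent end }
end

-- Read the template parameters the way main() does, including the
-- "number" and "lang" aliases of the two positional parameters.
local function read_args(frame)
    local args = frame:getParent().args
    return {
        number  = args[1] or args.number or '',
        lang    = args[2] or args.lang or '',
        prec    = args.prec or '',
        -- any non-empty "sep" value switches grouping separators off
        compact = (args.sep or '') ~= '',
    }
end

local parsed = read_args(mock_frame({ '12345.123', lang = 'fr', prec = '2', sep = 'compact' }))
print(parsed.number, parsed.lang, parsed.prec, parsed.compact)
```

Values arrive as strings, which is why main() leaves the conversion of number and prec to formatNum().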
formatNum()
This function converts a value into a localized number.
Unlike "Language:formatNum()" in MediaWiki's core libraries for Lua, it correctly supports numbers using exponential notation such as 1e15 (MediaWiki's core function is currently broken and drops the exponent for some languages, so that its formatted numbers are incorrectly scaled and display wrong values).
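The exponent handling can be sketched in plain Lua as follows. The helper name split_exponent is hypothetical; the string.find and string.sub steps mirror what formatNum() does before it formats the mantissa and the exponent separately. The sketch takes the ASCII text directly, whereas the module first normalizes its input with tostring(tonumber(number)).

```lua
-- Split the ASCII form of a number into mantissa and exponent.
-- Returns the text unchanged and an empty exponent when no 'e'/'E' is present.
local function split_exponent(text)
    local pos = string.find(text, '[Ee]')
    if pos == nil then
        return text, ''
    end
    return string.sub(text, 1, pos - 1), string.sub(text, pos + 1)
end

local mantissa, exponent = split_exponent('1.5e+20')
print(mantissa, exponent) -- mantissa is "1.5", exponent is "+20"
```

Because the exponent is kept apart, only the mantissa goes through grouping and decimal handling, and the exponent is appended afterwards behind an 'E'.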
Usage:
formatted_string = formatnum.formatNum(value, lang, prec, compact)
Parameters:
- value — as an ASCII-only number or string. If the string cannot be converted to a number with Lua's tonumber(), that string will be returned as is.
- lang — language code as a string (e.g. 'en', 'de', etc.). If that language is not supported, localized digits and separators will not be used (except for a few languages)
- prec — if not nil and not negative, this is the number of digits in displayed decimals (by truncating the decimals in excess or by adding zeroes). Valid range: 0 to 14.
- when prec is not specified or nil, the decimal separator is shown only if there are 1 or more visible decimals;
- when prec is negative or not a number, it is treated like nil; a non-integer value is rounded down to the nearest integer;
- when prec is 0, there will never be any decimal separator or any displayed decimals;
- when prec is positive, a decimal separator will always be present before this number of decimals, but when prec is higher than 14, it is treated like 14.
- compact — if this option is not nil and not false, don't return any localized grouping separators (the localized decimal separator and digits are still used).
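The rules for prec can be condensed into a short sketch in plain Lua. It follows the checks in the module's code, where a fractional value is rounded down with math.floor; the names normalize_prec and MAX_PREC are hypothetical.

```lua
local MAX_PREC = 14 -- upper bound stated in the parameter description

-- Return a usable precision (integer from 0 to MAX_PREC),
-- or nil when no precision should be applied.
local function normalize_prec(prec)
    prec = tonumber(prec)       -- nil for '', 'abc', or a missing value
    if prec == nil then
        return nil
    end
    prec = math.floor(prec)     -- 2.7 becomes 2
    if prec < 0 then
        return nil              -- negative values are discarded
    elseif prec > MAX_PREC then
        return MAX_PREC         -- values above the limit are clamped
    end
    return prec
end

print(normalize_prec('2.7'), normalize_prec(20), normalize_prec(-1))
```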
Examples:
formatted_string = formatnum.formatNum(12345.123) -- convert to user's language, using localized digits and separators
formatted_string = formatnum.formatNum(12345.123,'') -- same thing (language code not supported)
formatted_string = formatnum.formatNum(12345.123,'default') -- same thing (language code not supported)
formatted_string = formatnum.formatNum(12345.123,'en') -- convert to English: "12,345.123"
formatted_string = formatnum.formatNum(12345.123,'fr') -- convert to French: "12 345,123"
formatted_string = formatnum.formatNum(12345.123,'fr',2) -- same thing but limit to 2 decimals: "12 345,12"
formatted_string = formatnum.formatNum(12345,'fr',2) -- same thing (2 null decimals are padded): "12 345,00"
formatted_string = formatnum.formatNum(12345,'fr',2,true) -- same thing but without grouping separators: "12345,00"
Limitations:
- Same limit of 20 "KnownLanguages" (because it still depends on mw.Language module in order to detect localized digits and separators).
- When specifying the "prec" parameter, decimals in excess are just truncated, and the least significant digit is not rounded.
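The difference between the current truncation and the rounding listed as a TODO can be seen in a small plain-Lua sketch. Both helper names are hypothetical. The rounding variant relies on string.format, which is only one possible approach: it works on binary floating-point values, so it is an assumption that its behavior is acceptable for every input, and it is not what the module does today.

```lua
-- Truncate the ASCII text of a number to prec decimals (prec >= 1),
-- which is the behavior described in the limitation above.
local function truncate_decimals(text, prec)
    local pos = string.find(text, '.', 1, true) -- plain search for the dot
    if pos == nil then
        return text -- no decimals to cut
    end
    return string.sub(text, 1, pos + prec)
end

-- Possible alternative: let string.format round the least significant digit.
local function round_decimals(value, prec)
    local fmt = string.format('%%.%df', prec) -- e.g. '%.2f'
    return string.format(fmt, value)
end

print(truncate_decimals('12345.126', 2)) -- 12345.12
print(round_decimals(12345.126, 2))      -- 12345.13
```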
-- This module is intended to replace the functionality of Template:Formatnum and related templates.
local p = {}
function p.main(frame)
    local args = frame:getParent().args
    local prec = args.prec or ''
    local sep = args.sep or ''
    local number = args[1] or args.number or ''
    local lang = args[2] or args.lang or ''
    -- validate the language parameter within MediaWiki's caller frame
    if lang == "arabic-indic" then -- only for back-compatibility ("arabic-indic" is not a SupportedLanguage)
        lang = "fa" -- better support than "ks"
    elseif lang == '' or not mw.language.isSupportedLanguage(lang) then
        -- Note that 'SupportedLanguages' are not necessarily 'BuiltinValidCodes', and so they are not necessarily
        -- 'KnownLanguages' (with a language name defined at least in the default localisation of the local wiki).
        -- But they all are ValidLanguageCodes (suitable as Wiki subpages or identifiers: no slash, colon, HTML tags, or entities)
        -- In addition, they do not contain any capital letter in order to be unique in page titles (a restriction that does not exist in BCP47),
        -- but they may violate the standard format of BCP47 language tags for specific needs in MediaWiki.
        -- Empty/unspecified and unsupported languages are treated here in Commons using the user's language,
        -- instead of the local 'ContentLanguage' of the Wiki.
        lang = frame:callParserFunction( "int", "lang" ) -- get user's chosen language
    end
    return p.formatNum(number, lang, prec, sep ~= '')
end
local digit = { -- substitution of decimal digits for languages not supported by mw.language:formatNum() in core Lua libraries for MediaWiki
    ["ml-old"] = { '൦', '൧', '൨', '൩', '൪', '൫', '൬', '൭', '൮', '൯' },
    ["mn"] = { '᠐', '᠑', '᠒', '᠓', '᠔', '᠕', '᠖', '᠗', '᠘', '᠙' },
    ["ta"] = { '௦', '௧', '௨', '௩', '௪', '௫', '௬', '௭', '௮', '௯' },
    ["te"] = { '౦', '౧', '౨', '౩', '౪', '౫', '౬', '౭', '౮', '౯' },
    ["th"] = { '๐', '๑', '๒', '๓', '๔', '๕', '๖', '๗', '๘', '๙' }
}
function p.formatNum(number, lang, prec, compact)
    -- Do not alter the specified value when it is not a valid number, return it as is
    local value = tonumber(number)
    if value == nil then
        return number
    end
    -- Basic ASCII-only formatting (without paddings)
    number = tostring(value)
    -- Check the presence of an exponent (incorrectly managed in mw.language:formatNum() and even forgotten due to an internal bug, e.g. in Hindi)
    local exponent
    local pos = string.find(number, '[Ee]')
    if pos ~= nil then
        exponent = string.sub(number, pos + 1, string.len(number))
        number = string.sub(number, 1, pos - 1)
    else
        exponent = ''
    end
    -- Check the minimum precision requested
    prec = tonumber(prec) -- nil if not specified as a true number
    if prec ~= nil then
        prec = math.floor(prec)
        if prec < 0 then
            prec = nil -- discard an incorrect precision (negative)
        elseif prec > 14 then
            prec = 14 -- maximum precision supported by tostring(number)
        end
    end
    -- Preprocess the minimum precision in the ASCII string
    local dot
    if (prec or 0) > 0 then
        pos = string.find(number, '.', 1, true) -- plain search, no regexp
        if pos ~= nil then
            prec = pos + prec - string.len(number) -- effective number of trailing decimals to add or remove
            dot = '' -- already present
        else
            dot = mw.ustring.sub(mw.language.new(lang):formatNum(0.1), 2, 2) -- must be added
        end
    else
        dot = '' -- don't add dot
        prec = 0 -- don't alter the precision
    end
    if lang ~= nil and mw.language.isKnownLanguageTag(lang) == true then
        -- Convert number to localized digits, decimal separator, and group separators
        local language = mw.getLanguage(lang)
        if compact then
            number = language:formatNum(tonumber(number), { noCommafy = 'y' }) -- caveat: can load localized resources for up to 20 languages
        else
            number = language:formatNum(tonumber(number)) -- caveat: can load localized resources for up to 20 languages
        end
        -- Postprocessing the precision
        if prec > 0 then
            local zero = language:formatNum(0)
            number = number .. dot .. mw.ustring.rep(zero, prec)
        elseif prec < 0 then
            -- TODO: rounding of last decimal; here only truncate decimals in excess
            number = mw.ustring.sub(number, 1, mw.ustring.len(number) + prec)
        end
        -- Append the localized base-10 exponent without grouping separators (there's no reliable way to detect a localized leading symbol 'E')
        if exponent ~= '' then
            number = number .. 'E' .. language:formatNum(tonumber(exponent), { noCommafy = true })
        end
    else -- not localized, ASCII only
        -- Postprocessing the precision (the standard string library suffices here, since the text is ASCII only)
        if prec > 0 then
            number = number .. dot .. string.rep('0', prec)
        elseif prec < 0 then
            -- TODO: rounding of last decimal; here only truncate decimals in excess
            number = string.sub(number, 1, string.len(number) + prec)
        end
        -- Append the base-10 exponent
        if exponent ~= '' then
            number = number .. 'E' .. exponent
        end
    end
    -- Special cases for substitution of ASCII digits (missing support in Lua core libraries for some languages)
    if digit[lang] then
        for i, v in ipairs(digit[lang]) do
            number = mw.ustring.gsub(number, tostring(i - 1), v)
        end
    end
    return number
end
return p