Extension:WikibaseMediaInfo
WikibaseMediaInfo is an extension to Wikibase adding a MediaInfo entity for handling structured data about multimedia files.
The extension hooks into the File Page. It stores supplemental metadata (captions and depicts statements) about the file in a MediaInfo Entity. The user can view, create, edit, and delete this data.
Requirements
- UniversalLanguageSelector
- CirrusSearch
- Wikibase (follow instructions for both client and repo)
- WikibaseCirrusSearch
Installation
- Ensure CirrusSearch, Wikibase (client and repo) and WikibaseCirrusSearch are set up properly.
- Download and move the extracted WikibaseMediaInfo folder to your extensions/ directory.
  Developers and code contributors should install the extension from Git instead, using:
  cd extensions/
  git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/WikibaseMediaInfo
- Only when installing from Git, run Composer to install PHP dependencies, by issuing composer install --no-dev in the extension directory. (See task T173141 for potential complications.)
- Add the following code at the bottom of your LocalSettings.php file:
  wfLoadExtension( 'WikibaseMediaInfo' );
- Run the update script, which will automatically create the necessary database tables that this extension needs.
- Done. Navigate to Special:Version on your wiki to verify that the extension is successfully installed.
- Add configuration.
MediaInfo UI
MediaInfo entities are shown on, and can be edited from, their associated File page and while uploading a file via UploadWizard.
There are separate sections in the UI for editing captions and statements.
An editing interface is always shown for default properties (defaults are defined in config). On live Commons, depicts is the only default property, because users are encouraged to say what is depicted by an image. Statements with other properties can be added by the user at will.
Glossary
MediaInfo Entity
A Wikibase entity that contains structured data about media files. It is stored in a slot on a File page and consists of
- an ID in the form Mxxx, where xxx is the id of the associated wiki page
- any number of captions (one per language)
- any number of statements
(Note: if there is no caption or statement data then the entity is not stored in the database - in this case it is known as a 'virtual entity')
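The entity shape described above can be sketched in a few lines of Python. This is only an illustration of the glossary entry, not the extension's actual class: the name MediaInfoEntity and its fields are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class MediaInfoEntity:
    """Illustrative model of a MediaInfo entity (hypothetical, not the real class)."""
    page_id: int                                     # id of the associated wiki page
    captions: dict = field(default_factory=dict)     # language code -> caption text
    statements: list = field(default_factory=list)   # any number of statements

    @property
    def entity_id(self) -> str:
        # IDs have the form Mxxx, where xxx is the id of the associated page.
        return f"M{self.page_id}"

    @property
    def is_virtual(self) -> bool:
        # With no caption or statement data, the entity is not stored in the database.
        return not self.captions and not self.statements


entity = MediaInfoEntity(page_id=123)
print(entity.entity_id, entity.is_virtual)

# Keying captions by language code enforces "one per language".
entity.captions["en"] = "Eiffel Tower"
print(entity.is_virtual)
```

Keying captions by language code is one simple way to guarantee a single caption per language.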
Caption
A short piece of text describing a media file, plus its language. Used to provide a short description of the file (the same as 'labels' in Wikibase).
Statement
A single fact about a media file consisting of a key-value pair such as Licence=CC-BY-SA or Depicts=Dog.
Keys are always a property. Values can be any wikibase datatype.
Strictly, a wikibase 'statement' means a key-value pair (a 'claim') plus a rank (preferred, normal, or deprecated) and zero or more documentary references.
We don't typically have documentary references for descriptions of files, and all statements have a normal rank by default, so in MediaInfo we prefer to use the term 'statement' instead of 'claim'.
Property
A property is an attribute of a file that can have a value, for example 'depicts' (what an image is a picture of), 'resolution', 'created by', 'license'.
Each property has a unique id in wikibase in the form Pxxx, such as P123.
Item
An item is a concept, topic, or object with an id. For example, on Wikidata the CC0 licence is Q6938433, physics is Q413 and the planet Earth is Q2.
Each item has a unique id in wikibase in the form Qxxx, such as Q123.
Qualifier
A qualifier is a secondary statement that modifies the primary statement. For example an image might have a tree in the foreground and the sea in the background, in which case it could have 2 'depicts' claims associated with it - 'depicts=tree(applies to part=foreground)' and 'depicts=sea(applies to part=background)'.
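To make the relationship between statements, ranks and qualifiers concrete, here is a minimal Python sketch. The Statement class and its describe() method are hypothetical names chosen for this example. It uses human-readable labels rather than P/Q ids, mirroring the 'depicts=tree(applies to part=foreground)' notation used above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

RANKS = ("preferred", "normal", "deprecated")


@dataclass
class Statement:
    """Hypothetical model: a claim (property=value) plus a rank and qualifiers."""
    prop: str
    value: str
    rank: str = "normal"  # statements have a normal rank by default
    qualifiers: List[Tuple[str, str]] = field(default_factory=list)

    def __post_init__(self):
        if self.rank not in RANKS:
            raise ValueError(f"unknown rank: {self.rank}")

    def describe(self) -> str:
        # Render as property=value(qualifier=value, ...), as in the text above.
        text = f"{self.prop}={self.value}"
        if self.qualifiers:
            inner = ", ".join(f"{k}={v}" for k, v in self.qualifiers)
            text += f"({inner})"
        return text


# One image, two 'depicts' statements, each modified by a qualifier.
statements = [
    Statement("depicts", "tree", qualifiers=[("applies to part", "foreground")]),
    Statement("depicts", "sea", qualifiers=[("applies to part", "background")]),
]
for s in statements:
    print(s.describe())
```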
Search
Users can search for files by their MediaInfo captions just as they would search for anything else. For example, if a user uploads a picture of the Eiffel Tower, and enters 'Tour Eiffel' (French) and 'Eiffel Tower' (English) as multilingual file captions, the picture is findable by another user searching for either 'Eiffel Tower' or 'Tour Eiffel'.
Searching for claims/statements
Searching for claims/statements happens via WikibaseCirrusSearch keywords. For details, see Help:WikibaseCirrusSearch.
Search implementation
When the File page is saved, the following MediaInfo data is written to the Elasticsearch index (all examples use Wikidata Property and Item ids):
- Captions data in every language is stored in the opening_text field.
- Claims are stored in the format propertyID=value as array elements in the statement_keywords field, using the wikibase property ID (and item ID, if the value is an item). For example, 'depicts house cat' is stored as P180=Q146.
- Claims with qualifiers are stored in the statement_keywords field along with their qualifiers in the format propertyID=value[qualifierPropertyID=qualifierValue]. For example, the Mona Lisa painting (Wikidata item Q12418) depicts a sky (Q13217555) in the background (Wikidata property P518). If we arrange this data in a Wikibase claim, it would be: 'depicts sky, applies to part background', which would be stored as P180=Q12418[P518=Q13217555].
- Note that claims with qualifiers are also stored without the qualifier, to increase their findability. So, for example, if someone entered the above claim-plus-qualifier, the claim P180=Q12418 is also stored, so that someone can find the file by searching for 'depicts sky' alone, as well as by searching for 'depicts sky, applies to part background'.
- Claims data with qualifiers where the qualifier value is a quantity is stored in the statement_quantity field in the format propertyID=value|quantity, e.g. 'depicts human, quantity 1' is stored as P180=Q5|1.
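The storage formats listed above can be reproduced with a small Python sketch. The function names are hypothetical and this is not the extension's indexing code. It assumes one qualified entry per qualifier and simply reuses the ID strings quoted in the list.

```python
def statement_keywords(property_id, value, qualifiers=()):
    """Build statement_keywords entries for one claim.

    The bare claim always comes first, so the file stays findable without
    the qualifier. Each (qualifier property, qualifier value) pair then adds
    one propertyID=value[qualifierPropertyID=qualifierValue] entry
    (one entry per qualifier is an assumption of this sketch).
    """
    base = f"{property_id}={value}"
    return [base] + [f"{base}[{qp}={qv}]" for qp, qv in qualifiers]


def statement_quantity(property_id, value, quantity):
    """Build a statement_quantity entry in the format propertyID=value|quantity."""
    return f"{property_id}={value}|{quantity}"


# 'depicts house cat'
print(statement_keywords("P180", "Q146"))

# A claim with one qualifier, using the ID strings from the list above.
print(statement_keywords("P180", "Q12418", [("P518", "Q13217555")]))

# 'depicts human, quantity 1'
print(statement_quantity("P180", "Q5", 1))
```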
Note that not all claims are stored. A claim will be indexed in Elasticsearch only if ALL of the following conditions are true:
- The claim has a real value (i.e. its value is not 'no value' or 'unknown value') AND
- We know how to process its value for indexing. More value processors may be added in future, but currently we require the claim's value to be either a Q item ID, a string (alphanumeric), or a quantity (numeric) AND
- The claim's Wikidata property ID is NOT in a configurable list of excluded IDs ($wgWBRepoSettings[ 'searchIndexPropertiesExclude' ]) AND either its property ID is in a configurable list of property IDs that should be indexed ($wgWBRepoSettings[ 'searchIndexProperties' ]) OR its property type is in a configurable list of property types that should be indexed ($wgWBRepoSettings[ 'searchIndexTypes' ])
Note that for a claim's quantities to be stored, the claim must meet all the criteria above AND the property ID for the quantity qualifier must be present in a configurable list of property IDs ($wgWBRepoSettings[ 'searchIndexQualifierPropertiesForQuantity' ]).
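The indexing conditions above amount to a single boolean check, sketched here in Python. The Claim class, the should_index function and the example settings values are all hypothetical. Only the setting key names come from the text, and the real decision is made by the PHP code in Wikibase and this extension.

```python
from dataclasses import dataclass
from typing import Optional

# Value kinds we "know how to process": item ID, string, or quantity.
SUPPORTED_VALUE_KINDS = {"item", "string", "quantity"}


@dataclass
class Claim:
    property_id: str           # e.g. "P180"
    property_type: str         # the property's datatype name (illustrative)
    value_kind: Optional[str]  # None models 'no value' / 'unknown value'


def should_index(claim, settings):
    """Return True only if ALL of the conditions listed above hold."""
    if claim.value_kind is None:
        return False  # no real value
    if claim.value_kind not in SUPPORTED_VALUE_KINDS:
        return False  # no value processor for this kind
    if claim.property_id in settings.get("searchIndexPropertiesExclude", []):
        return False  # explicitly excluded
    return (
        claim.property_id in settings.get("searchIndexProperties", [])
        or claim.property_type in settings.get("searchIndexTypes", [])
    )


# Illustrative settings only; real values live in $wgWBRepoSettings.
settings = {
    "searchIndexProperties": ["P180"],
    "searchIndexTypes": ["string"],
    "searchIndexPropertiesExclude": ["P999"],
}

print(should_index(Claim("P180", "wikibase-item", "item"), settings))
print(should_index(Claim("P999", "string", "string"), settings))
```

The exclusion list is checked before the two inclusion lists, so an excluded property is never indexed even if its type would otherwise qualify.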
MediaSearch search profile
Structured data (captions and statements) is now also included in the default search profile when searching (only) in the NS_FILE namespace.
A search term like "dog" will also match files whose caption (in the user's language) contains "dog", or which have a statement P180=Q144 (depicts=dog).
For a more elaborate (technical) writeup of the MediaSearch search profile, see the Extension:WikibaseMediaInfo/MediaSearch subpage.
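As a rough illustration of the two matching paths described in this section (caption text in the user's language, or a depicts statement), consider the following Python sketch. It is a deliberate simplification: the real profile is an Elasticsearch query with scoring, and both the matches function and the term_to_items lookup are hypothetical stand-ins.

```python
def matches(term, user_language, captions, statement_keywords, term_to_items):
    """Return True if a file matches via its caption or a depicts statement.

    captions:           dict of language code -> caption text
    statement_keywords: list of strings such as "P180=Q144"
    term_to_items:      hypothetical lookup from a search term to item IDs
    """
    # Path 1: the caption in the user's language contains the term.
    caption = captions.get(user_language, "")
    if term.lower() in caption.lower():
        return True
    # Path 2: the file has a depicts (P180) statement for a matching item.
    wanted = {f"P180={item}" for item in term_to_items.get(term.lower(), ())}
    return bool(wanted & set(statement_keywords))


lookup = {"dog": ["Q144"]}  # illustrative mapping only

# Matches through the English caption.
print(matches("dog", "en", {"en": "A dog on a beach"}, [], lookup))

# No caption match, but the file has the statement P180=Q144.
print(matches("dog", "en", {"en": "Beach at sunset"}, ["P180=Q144"], lookup))
```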
Configuration
Extension configuration variables are sets of key-value pairs.
They are documented in more detail in WikibaseMediaInfo/extension.json.
All config variables are added to LocalSettings.php.
The following config options are available for this extension:
Config (in LocalSettings.php)
$wgMediaInfoProperties
Default wikibase properties that will always be shown on the File page/UploadWizard, allowing users to add/edit/delete values for them, regardless of whether they already have a value. On live Commons this is the depicts (P180) property, as we want to encourage users to fill in values for this in particular. The value is an array of key-value pairs connecting a label name to the ID of an existing wikibase property.
['depicts' => 'P180']
$wgMediaInfoHelpUrls
URLs for pages where a user can learn more about particular wikibase properties - if there is a help URL for a particular property then there will be a "learn more" link for that property that leads to the URL from the config.
['P180' => 'https://www.wikidata.org/wiki/Property:P180']
Other Config
$wgUploadWizardConfig['wikibase']['enabled']
Enables MediaInfo data on UploadWizard when set to true.
Development
Tests
PHPUnit tests are located in tests/phpunit.
You can run tests not requiring the MediaWiki framework (located in tests/phpunit/composer) by running composer test.
This command also runs code style checks using PHPCS.
Tests relying on the MediaWiki framework (located in tests/phpunit/mediawiki) must be run using MediaWiki core’s composer phpunit:entrypoint endpoint.
JavaScript tests are located in tests/node-qunit.
You can run tests from a terminal with npm run test:unit.
Node version 6.x should be used.