Extension:WikibaseLexeme/RDF mapping
This is the specification of the RDF mapping of the Wikibase Lexeme data model. It is based on the Wikibase RDF dump format. Unless stated otherwise, the prefixes used here are those defined by the Wikibase RDF dump format. Where relevant, it reuses the lemon model of the W3C Ontolex community group.
Lexeme
Example:
@prefix dct: <http://purl.org/dc/terms/> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
wd:L64723 a wikibase:Lexeme , ontolex:LexicalEntry ;
# lemma
wikibase:lemma "hard"@en ;
rdfs:label "hard"@en ;
# language
dct:language wd:Q1860 ;
# lexical category
wikibase:lexicalCategory wd:Q34698 ;
# statements
wdt:P2 wd:Q3 ;
wdt:P7 "value1" , "value2" ;
p:P2 wds:Q3-4cc1f2d1-490e-c9c7-4560-46c3cce05bb7 ;
p:P7 wds:Q3-24bf3704-4c5d-083a-9b59-1881f82b6b37 ,
wds:Q3-45abf5ca-4ebf-eb52-ca26-811152eb067c ;
# forms
ontolex:lexicalForm wd:L64723-F1 ;
# senses
ontolex:sense wd:L64723-S1 .
Comments:
- Classes
- The lexeme concept of Wikibase aligns well with ontolex:LexicalEntry. A class wikibase:Lexeme is also used for consistency with wikibase:Item and wikibase:Property.
- Lemma
- We use the custom property wikibase:lemma. The closest lemon relation is ontolex:canonicalForm, but its range is ontolex:Form. Using wikibase:lemma instead of the generic rdfs:label used for items (and possibly also schema:name and skos:prefLabel) has two advantages: lexemes do not appear in existing SPARQL queries that use rdfs:label, and lexemes can be queried by lemma with a single triple pattern.
- Language
- We use the Dublin Core language property (dct:language), just like the lemon examples. We do not reuse schema:inLanguage directly because it is already used in the Wikibase sitelink representation, where its range is a BCP 47 language code. It is planned, but not yet implemented, to emit schema:inLanguage as a derived value holding the BCP 47 language code of the language, when one exists.
- Lexical category
- We use our own wikibase:lexicalCategory property in order to avoid a slight abuse of lexinfo:partOfSpeech, from the lexinfo lemon extension, which is restricted to parts of speech.
- Statements
- For consistency and simplicity we use the same schema as for items and properties.
- Forms
- The relation between Lexemes and Forms uses the ontolex:lexicalForm relation. See the Form section for the representation of forms.
- Senses
- The relation between Lexemes and Senses uses the ontolex:sense relation. See the Sense section for the representation of senses.
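As an illustration of the lemma lookup described under Lemma, the following SPARQL query is a minimal sketch. The prefix IRIs for wikibase: and wd: are assumptions taken from a Wikidata-style installation and may differ on other Wikibase instances; the entity IDs are the ones used in the example above.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX wd: <http://www.wikidata.org/entity/>   # assumed entity namespace

SELECT ?lexeme WHERE {
  # This single triple pattern is enough to find lexemes by lemma.
  ?lexeme wikibase:lemma "hard"@en .

  # Optional narrowing by language and lexical category,
  # using the values from the example above.
  ?lexeme dct:language wd:Q1860 ;
          wikibase:lexicalCategory wd:Q34698 .
}
```

Removing the last two patterns leaves the one-pattern lookup; because the lemma is not matched through rdfs:label, queries written for item labels are unaffected.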
Form
Example:
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
wd:L64723-F1 a wikibase:Form , ontolex:Form ;
# representation
ontolex:representation "hard"@en ;
rdfs:label "hard"@en ;
# grammatical features
wikibase:grammaticalFeature wd:Q1234 , wd:Q2345 ;
# statements
wdt:P2 wd:Q3 ;
wdt:P7 "value1" , "value2" ;
p:P2 wds:Q3-4cc1f2d1-490e-c9c7-4560-46c3cce05bb7 ;
p:P7 wds:Q3-24bf3704-4c5d-083a-9b59-1881f82b6b37 ,
wds:Q3-45abf5ca-4ebf-eb52-ca26-811152eb067c .
Comments:
- Classes
- The form concept of Wikibase aligns with ontolex:Form. The additional class wikibase:Form is also used.
- Representation
- We use the ontolex:representation relation from lemon. We do not use its subproperty ontolex:writtenRep, so as not to rule out representations in phonetic variants of languages, even though the lemon specification recommends against using ontolex:representation directly. rdfs:label is also emitted for interoperability reasons.
- Grammatical Features
- We use a custom property wikibase:grammaticalFeature because lemon has no such relation with ontolex:Form as its domain.
- Statements
- For consistency and simplicity we use the same schema as for items and properties.
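The form mapping above can be exercised with a query along the following lines. This is only a sketch: the wikibase: and wd: prefix IRIs are assumed from a Wikidata-style installation, and wd:L64723 is the lexeme from the example.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX wd: <http://www.wikidata.org/entity/>   # assumed entity namespace

SELECT ?form ?representation ?feature WHERE {
  # Lexeme-to-form link
  wd:L64723 ontolex:lexicalForm ?form .
  # Representation of the form (language-tagged literal)
  ?form ontolex:representation ?representation .
  # A form may carry zero or more grammatical features
  OPTIONAL { ?form wikibase:grammaticalFeature ?feature . }
}
```

A form with several grammatical features yields one result row per feature.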
Sense
Example:
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
wd:L64723-S1 a wikibase:Sense , ontolex:LexicalSense ;
# gloss
skos:definition "presenting difficulty"@en ;
rdfs:label "presenting difficulty"@en ;
# statements
wdt:P2 wd:Q3 ;
wdt:P7 "value1" , "value2" ;
p:P2 wds:Q3-4cc1f2d1-490e-c9c7-4560-46c3cce05bb7 ;
p:P7 wds:Q3-24bf3704-4c5d-083a-9b59-1881f82b6b37 ,
wds:Q3-45abf5ca-4ebf-eb52-ca26-811152eb067c .
Comments:
- Classes
- The sense concept of Wikibase aligns with ontolex:LexicalSense. The additional class wikibase:Sense is also used.
- Gloss
- We use skos:definition to provide the gloss, following lemon usage. rdfs:label is also emitted for interoperability reasons, even though a gloss is not really a label.
- Statements
- For consistency and simplicity we use the same schema as for items and properties.
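Glosses can be retrieved through the sense mapping above. The following query is a minimal sketch; the wd: prefix IRI is an assumption taken from a Wikidata-style installation, and wd:L64723 is the lexeme from the example.

```sparql
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wd: <http://www.wikidata.org/entity/>   # assumed entity namespace

SELECT ?sense ?gloss WHERE {
  # Lexeme-to-sense link
  wd:L64723 ontolex:sense ?sense .
  # Gloss of the sense
  ?sense skos:definition ?gloss .
  # Keep only English glosses
  FILTER(LANG(?gloss) = "en")
}
```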
Data node
Example:
wdata:L64723 schema:version "59"^^xsd:integer ;
schema:dateModified "2015-03-18T22:38:36Z"^^xsd:dateTime ;
a schema:Dataset ;
schema:about wd:L64723 .
For each Lexeme a data node should be returned with the URI wdata:L1 if the Lexeme is wd:L1. It should use the same schema as the data nodes of Wikibase items and properties. It could also provide some statistics based on page properties, just like items.
Note: There is no specific data node for forms and senses because the granularity of data nodes is the data container (wiki page). This is not a strong limitation, because the data node of the Lexeme they belong to is easy to retrieve with the property path schema:about/ontolex:lexicalForm or schema:about/ontolex:sense.
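The property path from the note can be used as follows to find the data node covering a given form. This sketch assumes a store loaded with the full mapping, where data nodes are separate from the wd: nodes, and assumes Wikidata-style IRIs for the schema: and wd: prefixes.

```sparql
PREFIX schema: <http://schema.org/>
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX wd: <http://www.wikidata.org/entity/>   # assumed entity namespace

SELECT ?dataNode ?modified WHERE {
  # Data node -> lexeme -> form, followed in one property path
  ?dataNode schema:about/ontolex:lexicalForm wd:L64723-F1 ;
            schema:dateModified ?modified .
}
```

For a sense, the same query applies with schema:about/ontolex:sense and a sense URI such as wd:L64723-S1.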
Wikidata Query Service
Wikidata Query Service does not provide the following features (mostly for performance reasons):
- The wikibase:Lexeme, wikibase:Form and wikibase:Sense classes.
- The rdfs:label relations (more specific equivalents exist for lexemes, forms and senses).
- A separate data node: just as for items and properties, the data node is integrated within the wd: node.
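Taken together, these differences suggest writing Wikidata Query Service queries along the following lines. This is a sketch under two assumptions that the list above implies but does not state: that the ontolex: classes remain available, and that data node properties such as schema:dateModified can be read directly from the wd: node.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX schema: <http://schema.org/>

SELECT ?lexeme ?lemma ?modified WHERE {
  # Use the ontolex class, since wikibase:Lexeme is not provided
  ?lexeme a ontolex:LexicalEntry ;
          # Use wikibase:lemma, since rdfs:label is not provided
          wikibase:lemma ?lemma ;
          # Data node property read from the wd: node itself (assumption)
          schema:dateModified ?modified .
}
LIMIT 10
```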