Manual:Parameters to Special:Export

See also: Manual:DumpBackup.php

Wiki pages can be exported in a special XML format to upload into another MediaWiki.[1] See Help:Export for more details.

Available parameters

Below is the list of available parameters for Special:Export as of version 1.16. Not all of these are available through the Special:Export UI.

Parameter Variable type Description
action String Unused; set to "submit" in the export form.
Page Selection
/ (no parameter) Selects up to one page, e.g. Special:Export/Sandbox.
pages ? A list of page titles, separated by linefeed (%0A) characters. Maximum of 35 pages.
addcat/catname String? These were added later. addcat returns all members of the category named in catname. If $wgExportFromNamespaces is enabled, addns and nsindex do the same for namespaces, which are selected by their numerical index. A maximum of 5000 page titles will be returned.

For example, the following is for all pages in en:Category:Books:

https://en.wikipedia.org/w/index.php?title=Special:Export&addcat&catname=Books&pages=XXXX

addns/nsindex Described together with addcat/catname above.
Sorting
dir[2] String Should be set to "desc" to retrieve revisions in reverse chronological order.

The default, with this parameter omitted, is to retrieve revisions in ascending order of timestamp (oldest to newest).

Limiting Results
offset[2] ? The timestamp at which to start, which is non-inclusive. The timestamp may be in several formats, including the 14-character format usually used by MediaWiki, and an ISO 8601 format like the one that is output by the XML dumps.
limit[2] Integer The maximum number of revisions to return. If you request more than a site-specific maximum (defined in $wgExportMaxHistory: 1000 on Wikimedia projects at present), it will be reduced to this number.

This limit is cumulative across all the pages specified in the pages parameter. For example, if you request a limit of 100, for two pages with 70 revisions each, you will get 70 from one and 30 from the other.[3]

curonly Boolean Include only the current revision (default for GET requests).
history ? Include the full history, overriding dir, limit, and offset.

This does not work for every page. For example, https://en.wikipedia.org/w/index.php?title=Special:Export&pages=US_Open_(tennis)&history=1&action=submit works and returns all revisions, but https://en.wikipedia.org/w/index.php?title=Special:Export&pages=India&history=1&action=submit does not.

Extras
templates ? Includes any transcluded templates on any pages listed for export.
listauthors Boolean Include a list of all contributors' names and user IDs for each page. Functionality is disabled by default; can be enabled by changing $wgExportAllowListContributors.
pagelink-depth Integer Includes any linked pages to the depth specified. Limited to $wgExportMaxLinkDepth (defaults to 0, disabling the feature), or 5 if user does not have permission to change limits.
wpDownload ? Save as file, named with current time stamp. Implemented through content-disposition:attachment HTTP header.
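
As a rough sketch of how the page-selection parameters above combine into a POST body, the following shell functions join page titles with %0A and add action=submit and curonly=1. The function names join_titles and export_body are hypothetical, and the sketch assumes that the titles are already URL-safe (for example, underscores instead of spaces).

```shell
#!/bin/sh
# Hypothetical helpers; assumes titles contain no characters that need URL-encoding.

# Join all arguments with %0A, the URL-encoded linefeed expected by "pages".
join_titles() {
    joined=""
    for title in "$@"; do
        joined="${joined:+$joined%0A}$title"
    done
    printf '%s\n' "$joined"
}

# Build a POST body that requests only the current revision of each page.
export_body() {
    printf 'action=submit&curonly=1&pages=%s\n' "$(join_titles "$@")"
}

# Example (not run here):
#   curl -d "$(export_body Main_Page Sandbox)" \
#       'https://en.wikipedia.org/w/index.php?title=Special:Export' -o export.xml
```

Keeping the joining step in its own function makes it easy to swap curonly=1 for other parameters from the table, such as templates or history.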

URL parameter requests do not work

The dir, offset and limit parameters only work for POST requests. In GET requests made through a URL, they are ignored.

When you enter the URL in a browser, you are submitting via GET; a script can submit via POST instead.

As an example, the following parameter request does not work: it returns all revisions of a page despite the parameter limit=5.

https://en.wikipedia.org/w/index.php?title=Special:Export&pages=XXXX&offset=1&limit=5&action=submit

Retrieving earliest 5 revisions

A POST request is generated by cURL when passing -d "". The following retrieves the earliest 5 revisions from the English Wikipedia main page and its talk page:

curl -d "" 'https://en.wikipedia.org/w/index.php?title=Special:Export&pages=Main_Page%0ATalk:Main_Page&offset=1&limit=5&action=submit'

And here are the next 5 revisions of the main page only:

curl -d "" 'https://en.wikipedia.org/w/index.php?title=Special:Export&pages=Main_Page&offset=2002-01-27T20:25:56Z&limit=5&action=submit'

Here the timestamp from the last revision of the previous query is copied into the offset field of the URL. Because the offset field is non-inclusive, that 5th revision is not displayed again, and instead we get revisions 6-10.[4]
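
The paging step described above can be sketched in shell: take the last <timestamp> of one batch and reuse it as the offset of the next request. The function names last_timestamp and fetch_batch are hypothetical, and the sketch assumes that each <timestamp> element sits on its own line of the export XML and that the title needs no URL-encoding.

```shell
#!/bin/sh
# Print the value of the last <timestamp> element read from standard input.
last_timestamp() {
    sed -n 's:.*<timestamp>\([^<]*\)</timestamp>.*:\1:p' | tail -n 1
}

# Fetch one batch of revisions via POST. $1 = title, $2 = offset, $3 = limit.
fetch_batch() {
    curl -s -d "" "https://en.wikipedia.org/w/index.php?title=Special:Export&pages=$1&offset=$2&limit=$3&action=submit"
}

# Example loop (not run here): walk through a page history 5 revisions at a time,
# stopping when a batch contains no timestamp.
#   offset=1
#   while batch=$(fetch_batch Main_Page "$offset" 5) &&
#         next=$(printf '%s\n' "$batch" | last_timestamp) && [ -n "$next" ]; do
#       printf '%s\n' "$batch"
#       offset=$next
#   done
```

This relies on the default ascending order, so that the last timestamp in a batch is also the newest one; with dir=desc the same approach would walk backwards instead.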

POST request to download

A more explicit example, which also saves the output to a file, would be

curl -d "pages=Main_Page&offset=1&action=submit" 'https://en.wikipedia.org/w/index.php?title=Special:Export' -o "somefilename.xml"

Here the MediaWiki parameters are passed in the -d argument and the base URL follows them. The -o option at the end names the file to save to; without it, the XML just scrolls past on your screen and nothing is saved. At the time of writing, the Wikipedia servers appeared to be under maintenance, so this method returned an error instead of the XML.
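
Because a failed request can leave an error page in the output file instead of an export, it may be worth checking the saved file. A minimal sketch, assuming that the export uses <mediawiki> as its root element and puts each opening <revision> tag on its own line; the function names is_export_xml and count_revisions are hypothetical:

```shell
#!/bin/sh
# Succeed if the file looks like a MediaWiki export (contains a <mediawiki ...> tag).
is_export_xml() {
    grep -q '<mediawiki' "$1"
}

# Print the number of lines that open a <revision> element.
count_revisions() {
    grep -c '<revision>' "$1"
}

# Example (not run here):
#   is_export_xml somefilename.xml && count_revisions somefilename.xml
```

The revision count is also a quick way to see whether a limit parameter was actually honoured.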

If you instead have the list of titles in a file, say title-list, you must pass the list as a parameter to curl and encode the linefeeds correctly (for some reason, --data-urlencode and @ do not work):

curl -d "action=submit&pages=$(cat title-list | hexdump -v -e '/1 "%02x"' | sed 's/\(..\)/%\1/g' )" 'https://en.wikipedia.org/w/index.php?title=Special:Export' -o "somefilename.xml"

If you want to save bandwidth, append the following arguments as well:

--compressed -H 'Accept-Encoding: gzip,deflate'

Stopping the export of your MediaWiki

Please keep in mind that making it difficult for your users to back up their work could discourage them from contributing to your wiki.

If $wgExportAllowHistory is set to false in LocalSettings.php, only the current version can be exported, not the full history.

By default with GET requests, only the current (last) version of each page is returned.

If $wgExportAllowHistory is set to true in LocalSettings.php and the "Include only the current revision, not the full history" box is unchecked, then all versions of each page are returned.

To disable export completely, you need to register a callback function in your LocalSettings.php:

function removeExportSpecial(&$aSpecialPages)
{
	unset($aSpecialPages['Export']);
	return true;
}
$wgHooks['SpecialPage_initList'][] = 'removeExportSpecial';

If you want to define a permission for export, put the following in your LocalSettings.php:

// Override SpecialExport; this works for MW 1.35.
// The parameters of __construct() changed in later versions.
class SpecialExport2 extends SpecialExport {
    public function __construct() {
        parent::__construct();
        $this->mRestriction = 'export';
    }
    public function execute( $par ) {
        $this->checkPermissions();
        parent::execute( $par );
    }
}
function adjustExportSpecial(&$aSpecialPages)
{
	$aSpecialPages['Export'] = SpecialExport2::class;
	return true;
}
$wgHooks['SpecialPage_initList'][] = 'adjustExportSpecial';
$wgGroupPermissions['sysop']['export'] = true; // Add export permission to sysop only

Keep in mind that exporting is still possible if you have the API enabled.
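
To illustrate why, here is a sketch of an equivalent request through the API. It assumes that the wiki exposes api.php and that the export and exportnowrap parameters of action=query are available there; the function name api_export_url is hypothetical, and titles are assumed to need no URL-encoding.

```shell
#!/bin/sh
# Build an API export URL. $1 = URL of api.php; remaining arguments = page titles.
api_export_url() {
    base=$1
    shift
    titles=""
    for title in "$@"; do
        titles="${titles:+$titles%7C}$title"    # %7C is the URL-encoded "|" separator
    done
    printf '%s?action=query&export=1&exportnowrap=1&titles=%s\n' "$base" "$titles"
}

# Example (not run here):
#   curl -s "$(api_export_url 'https://en.wikipedia.org/w/api.php' Main_Page Sandbox)" -o export.xml
```

Removing or restricting Special:Export alone therefore does not stop a request like this one.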

Notes

  1. Provided that this function is enabled on the destination wiki and the user is a sysop there. The export can also be used for analyzing the content. See also Syndication feeds for exporting information other than pages, and Help:Import on importing pages.
  2. These parameters are ignored if either curonly or history is supplied, or if they are passed via a GET request (e.g., a browser address bar). See URL parameter requests do not work for more information.
  3. The order is by page_id; pages with a lower page_id get more revisions. The reason for this is that Special:Export only ever does one database query per HTTP request. If you want to request all the history of several pages with many revisions each, you have to do it one page at a time.
  4. This parameter convention is very similar to the one for UI history pages.
