Extension:PdfHandler
PdfHandler | |
---|---|
Release status | stable |
Implementation | Media |
Description | Allows PDF files to be handled like multipage DjVu files |
Author(s) | Martin Seidel (xarax) |
Compatibility policy | Snapshots releases along with MediaWiki. Master is not backward compatible. |
MediaWiki | >= 1.42 |
Database changes | No |
License | GNU General Public License 2.0 or later |
Translate the PdfHandler extension if it is available at translatewiki.net | |
Vagrant role | pdfhandler |
Issues | Open tasks · Report a bug |
The PdfHandler extension shows uploaded PDF files in a multipage preview layout.[1]
Together with the Proofread Page extension, PDF files can be displayed side-by-side with text. This allows users to transcribe books and other documents.[2]
Usage
- A user can display a PDF file as an image, showing a single page at a time, like so:
[[File:myPdfFile.pdf|page=1|600px]]
The page and size parameters are optional; the default page is page 1. Instead of a size parameter, you can also use the thumb parameter, with or without a caption:[3]
[[File:myPdfFile.pdf|page=1|thumb|My PDF]]
- Because PdfHandler extends ImageHandler, you can use all the arguments that you would for an image, for example: thumb, right/left, caption, border, link, etc.
- If you would like to present a two-page PDF, for example, do the following:
[[File:myPdfFile.pdf|page=1]] [[File:myPdfFile.pdf|page=2]]
- The PdfHandler extension mainly works without user interaction. When you upload a new PDF file, its metadata is stored in the database, and the file can then be shown in a multipage preview layout, as the DjVu handler does. Without this extension, PDFs will not display properly when uploaded.
- Additionally, this extension allows Extension:ProofreadPage to handle PDFs in a side-by-side view for transcribing and proofreading, as is done on Wikisource.
Pre-requisites
This extension requires the following packages to be installed first:
Package | Description | Link |
---|---|---|
ghostscript | Renders the page images. It provides the command gs. | www.ghostscript.com |
imagemagick | Does dynamic resizing and thumbnailing of images. It provides the command convert. | www.imagemagick.org |
xpdf-utils or poppler-utils | Extracts metadata from PDF files. If you see "0 × 0 pixel" on the file description of a PDF, you lack this package. It provides the commands pdfinfo and pdftotext. | www.xpdfreader.com. On Ubuntu and Debian, the "poppler-utils" package can be used instead of "xpdf-utils". |
Type the following in your shell to check whether the above are installed (it should print four paths, one per command):
which gs convert pdfinfo pdftotext
If something is missing, install the related packages. Example for Debian and Ubuntu (poppler-utils can be substituted for xpdf-utils):
sudo apt install ghostscript imagemagick xpdf-utils
If you are unable to install these packages, contact your server administrator.
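The check above can also be written so that it names exactly which commands are missing. This is a minimal sketch, assuming a POSIX shell; the function name missing_tools is made up for this example and is not part of the extension.

```shell
#!/bin/sh
# Print the name of every given command that is not found on the PATH.
# missing_tools is a hypothetical helper, not something PdfHandler ships.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to look up an executable
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The four commands named in this section.
missing=$(missing_tools gs convert pdfinfo pdftotext)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
else
    echo "All four commands were found."
fi
```

Run it as the same user that the web server uses, since that user's PATH may differ from your login shell's.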
Installation
- Make sure that the required software is installed before you continue!
- Download and move the extracted PdfHandler folder to your extensions/ directory.
Developers and code contributors should install the extension from Git instead, using:
cd extensions/
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/PdfHandler
- Add the following code at the bottom of your LocalSettings.php file:
wfLoadExtension( 'PdfHandler' );
- Configure as required (see also the examples provided).
- Done: Navigate to Special:Version on your wiki to verify that the extension is successfully installed.
Configuration
You can (or, depending on the operating system of the server, will have to) set some variables in the "LocalSettings.php" file:
$wgPdfProcessor
(default = "gs") - path to your Ghostscript executable
$wgPdfPostProcessor
(default = "convert") - path to your ImageMagick convert executable
$wgPdfInfo
(default = "pdfinfo") - path to your pdfinfo executable
$wgPdftoText
(default = "pdftotext") - path to your pdftotext executable
$wgPdfOutputExtension
(default = "jpg") - preferred output format[4]
$wgPdfHandlerDpi
(default = "150") - resolution in dpi
- The extension extracts a bitmap image for each page of the PDF, using this resolution (dpi = dots per inch). For example, a PDF page in the European A4 size is 210 mm wide, corresponding to 595 points (at 72 dpi). This yields an image 1240 pixels wide (at 150 dpi). If this parameter is instead set to 300 dpi, the width will be 2480 pixels.
$wgPdfHandlerJpegQuality
(default = "95"; since MW 1.24) - JPEG quality level that the post-processor should use
- Variables below are not specific to this extension
- Enable PDF uploads, if you haven't already:
$wgFileExtensions[] = 'pdf';
- Enable ImageMagick, if you haven't already:
$wgUseImageMagick = true;
$wgMaxShellMemory
- memory limit for gs, convert and pdfinfo. The default value might be too low.
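The A4 arithmetic given for $wgPdfHandlerDpi above can be reproduced with a one-line calculation: the width in pixels is the width in inches (millimetres divided by 25.4) multiplied by the resolution. This is a sketch of the arithmetic only. The function name page_width_px is made up for this example, and rounding to the nearest pixel is an assumption, since the extension's actual rounding is not documented here.

```shell
# Convert a page width in millimetres to pixels at a given resolution.
# 25.4 mm = 1 inch; adding 0.5 before truncating rounds to the nearest pixel.
# page_width_px is a hypothetical helper used only for illustration.
page_width_px() {
    awk -v mm="$1" -v dpi="$2" 'BEGIN { printf "%d\n", mm / 25.4 * dpi + 0.5 }'
}

page_width_px 210 72    # A4 width in points: prints 595
page_width_px 210 150   # A4 width at the default resolution: prints 1240
page_width_px 210 300   # A4 width at 300 dpi: prints 2480
```

Doubling the resolution doubles each dimension, so the rendered image holds four times as many pixels. This is worth keeping in mind when deciding whether the $wgMaxShellMemory default is too low.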
Ubuntu
Note: This is identical to the default settings for this extension.
$wgPdfProcessor = 'gs';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = 'convert'; // if not defined via ImageMagick
$wgPdfInfo = 'pdfinfo';
$wgPdftoText = 'pdftotext';
Debian
$wgPdfProcessor = '/usr/bin/gs';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = '/usr/bin/convert'; // if not defined via ImageMagick
$wgPdfInfo = '/usr/bin/pdfinfo';
$wgPdftoText = '/usr/bin/pdftotext';
Windows
$wgPdfProcessor = 'C:\Program Files\gs\gs8.60\bin\gswin32.exe';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = 'C:\Program Files\ImageMagick-6.6.2-Q16\convert.exe'; // if not defined via ImageMagick
$wgPdfInfo = 'C:\Program Files\xpdf-3.02pl1-win32\pdfinfo.exe';
$wgPdftoText = 'C:\Program Files\xpdf-3.02pl1-win32\pdftotext.exe';
macOS
$wgPdfProcessor = '/usr/local/bin/gs';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
// $wgPdfPostProcessor = '/usr/local/bin/convert'; // if not defined via ImageMagick
$wgPdfInfo = '/usr/local/bin/pdfinfo';
$wgPdftoText = '/usr/local/bin/pdftotext';
Troubleshooting
- General issues
- If PDF files do not display after upload, make sure that MediaWiki can execute the pdfinfo command and that the configuration parameter $wgPdfInfo is set properly. Also check your error log, and make sure that your host hasn't disabled running external commands.
- If PDF files do not show properly after installation (for example, reporting a height and width of 0), you may need to run the maintenance scripts "refreshImageMetadata.php -f" and "rebuildImages.php".
- Example:
php refreshImageMetadata.php --mime application/pdf --force
- Also try to purge the page of the file. See Manual:Purge.
- If PDF files are rendered randomly, check whether the "C.UTF-8" locale is available on your server by running
locale -a
and make sure that the configuration parameter $wgShellLocale is set to this locale.
- If the main preview image of PDF files is broken (image not found by browser), but all other images are working, also check your configuration parameter $wgShellLocale. If it is set to a locale that does not use the point (.) as a decimal separator (e.g. "de-de", which uses a comma), the srcset for the img tags will be broken. MediaWiki strongly recommends using the "C.UTF-8" locale.
- Special solutions for Windows Server running MW 1.31.x
If you are running this extension on a Windows machine with PHP 7, please see Tommeyheyser's workaround description.
If pdfinfo and/or pdftotext hang and prevent the upload of a large PDF, please check the modified extension by SeongMoon.
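The locale check from the general issues above can be scripted so that you do not have to read the output of locale -a by eye. This is a minimal sketch, assuming a POSIX shell; the function name has_locale is made up for this example. It compares names without regard to case or hyphens, because some systems list the locale as "C.utf8" rather than "C.UTF-8".

```shell
# Succeed if the locale named in $1 appears in the list read from stdin.
# Case and hyphens are ignored, so "C.UTF-8" also matches "C.utf8".
# has_locale is a hypothetical helper used only for illustration.
has_locale() {
    wanted=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    # -x matches whole lines only; -F treats the dot as a literal character
    tr 'A-Z' 'a-z' | tr -d '-' | grep -qxF "$wanted"
}

if locale -a 2>/dev/null | has_locale "C.UTF-8"; then
    echo "C.UTF-8 is available"
else
    echo "C.UTF-8 was not found in the output of locale -a"
fi
```

If the locale is reported as missing, it has to be made available on the server before $wgShellLocale can usefully point to it.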
Notes
- ↑ PROBABLY not required any more: with WebStore enabled, the extension automatically generates images from the specified page
- ↑ This allows users to transcribe books and other documents as is commonly done with DjVu files (particularly in Wikisource).
- ↑ This single page option was introduced quite long ago (r25575)
- ↑ This does not preclude rendering to other formats, as the picture is served in a format determined by the extension (suffix) in its src= path, not by $wgPdfOutputExtension. The server-side choice can be overridden with a user script; see example.
See also
- http://wiki.4intra.net/PdfHandler — a fork with much faster thumbnail generation by using pdftocairo instead of ghostscript+imagemagick
- Extension:Proofread Page — may be used in conjunction with PdfHandler
- How to use DjVu with MediaWiki
- Extension PDFEmbed
This extension is being used on one or more Wikimedia projects. This probably means that the extension is stable and works well enough to be used by such high-traffic websites. Look for this extension's name in Wikimedia's CommonSettings.php and InitialiseSettings.php configuration files to see where it's installed. A full list of the extensions installed on a particular wiki can be seen on the wiki's Special:Version page.