Pandoc

Pandoc is a handy tool for converting documents between formats. What's more, I have combined it with GNU EMACS and Markdown to create a writing system that works regardless of target format or platform, can draw on many modules I use (or have even written), and relies on one common set of keystrokes.

At its most basic:

pandoc INPUT_FILE -f INPUT_FORMAT -t OUTPUT_FORMAT -o OUTPUT_FILE
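As a concrete instance of that pattern, here is a small sketch that converts a Markdown file to HTML. The file names notes.md and notes.html are placeholders I made up, and the conversion is skipped if Pandoc isn't installed.

```shell
# Create a small Markdown file to convert (notes.md is a placeholder name).
cat > notes.md <<'EOF'
# Sample Heading

Some *emphasized* text.
EOF

# Convert Markdown (-f) to HTML (-t) and write the result to notes.html (-o).
if command -v pandoc >/dev/null 2>&1; then
  pandoc notes.md -f markdown -t html -o notes.html
fi
```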

Format Code          Description                              Notes
asciidoc             AsciiDoc                                 Output Only
beamer               LaTeX beamer slide show                  Output Only
commonmark           CommonMark Markdown
context              ConTeXt                                  Output Only
creole               Creole 1.0                               Input Only
docbook              DocBook                                  Input Only
docbook or docbook4  DocBook 4                                Output Only
docbook5             DocBook 5                                Output Only
docx                 Word docx
dokuwiki             DokuWiki markup                          Output Only
dzslides             DZSlides HTML5 + JavaScript slide show   Output Only
epub                 EPUB                                     Input Only
epub or epub3        EPUB v3 book                             Output Only
epub2                EPUB v2                                  Output Only
fb2                  FictionBook2 e-book
gfm                  GitHub-Flavored Markdown
haddock              Haddock markup
html                 HTML
html4                XHTML 1.0 Transitional                   Output Only
icml                 InDesign ICML                            Output Only
jats                 JATS XML
json                 JSON version of native AST
latex                LaTeX
man                  groff man                                Output Only
markdown             Pandoc's Markdown
markdown_mmd         MultiMarkdown
markdown_phpextra    PHP Markdown Extra
markdown_strict      original unextended Markdown
mediawiki            MediaWiki markup
ms                   groff ms                                 Output Only
muse                 Muse
native               native Haskell
odt                  OpenOffice text document
opendocument         OpenDocument                             Output Only
opml                 OPML
org                  Emacs Org mode
plain                plain text                               Output Only
pptx                 PowerPoint slide show                    Output Only
revealjs             reveal.js HTML5 + JavaScript slide show  Output Only
rst                  reStructuredText
rtf                  Rich Text Format                         Output Only
s5                   S5 HTML and JavaScript slide show        Output Only
slideous             Slideous HTML and JavaScript slide show  Output Only
slidy                Slidy HTML and JavaScript slide show     Output Only
t2t                  txt2tags                                 Input Only
tei                  TEI Simple                               Output Only
texinfo              GNU Texinfo                              Output Only
textile              Textile
tikiwiki             TikiWiki markup                          Input Only
twiki                TWiki markup                             Input Only
vimwiki              Vimwiki                                  Input Only
zimwiki              ZimWiki markup                           Output Only
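The set of formats depends on the installed Pandoc version, so this list may not match your copy exactly. Assuming a reasonably recent Pandoc, the following sketch asks the installation itself what it supports. The two .txt file names are placeholders.

```shell
if command -v pandoc >/dev/null 2>&1; then
  # Formats accepted by -f
  pandoc --list-input-formats > input-formats.txt
  # Formats accepted by -t
  pandoc --list-output-formats > output-formats.txt
  cat input-formats.txt output-formats.txt
else
  echo "pandoc is not installed"
fi
```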

A PDF can be specified by putting a .pdf extension on the output file name.
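A minimal sketch of PDF output. Pandoc does not render the PDF itself. By default it hands the work to a LaTeX engine (pdflatex), so one has to be installed. Here document.md is a placeholder, and xelatex is only an example of an alternative engine.

```shell
# document.md is a placeholder input file.
printf '# Title\n\nBody text.\n' > document.md

# The .pdf extension on the output name selects PDF output. This only runs
# when both pandoc and the default LaTeX engine are available.
if command -v pandoc >/dev/null 2>&1 && command -v pdflatex >/dev/null 2>&1; then
  pandoc document.md -f markdown -o document.pdf || echo "PDF conversion failed"
fi

# To use a different engine (xelatex here, as an example), name it explicitly:
#   pandoc document.md -f markdown -o document.pdf --pdf-engine=xelatex
```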

Data Directory

Pandoc will look for data files, such as default templates and reference files, in a data directory under the user’s home directory.

For UNIX/OS X, it first checks ~/.local/share/pandoc (strictly, $XDG_DATA_HOME/pandoc, where XDG_DATA_HOME defaults to ~/.local/share), then ~/.pandoc.

In Windows, it’s C:\Users\USERNAME\AppData\Roaming\pandoc (AKA %USERPROFILE%\AppData\Roaming\pandoc).
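If you're not sure which directory applies on a UNIX-like system, a quick check along these lines may help. It is only a sketch: the variable names are mine, and the grep on the --version output assumes a recent Pandoc that reports its data directory there.

```shell
# Candidate locations on UNIX-like systems, in the order Pandoc checks them.
xdg_dir="${XDG_DATA_HOME:-$HOME/.local/share}/pandoc"
legacy_dir="$HOME/.pandoc"

for dir in "$xdg_dir" "$legacy_dir"; do
  if [ -d "$dir" ]; then
    echo "found:   $dir"
  else
    echo "missing: $dir"
  fi
done

# Recent Pandoc versions also report the directory they use.
if command -v pandoc >/dev/null 2>&1; then
  pandoc --version | grep -i 'data directory' || true
fi
```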

Templates and Reference Documents

Pandoc can use a file as a template or reference document. A reference document works essentially the same way as a template, but for Microsoft Word or OpenOffice.

Reference Documents

A reference document defines how the standard styles (such as “Normal” for basic text, “Heading 1,” etc.) should appear, including typeface, color, and so on. This works for OpenOffice, Microsoft Word, and Microsoft PowerPoint.

For the purposes of this guide, I’m going to describe the process for Microsoft Word. OpenOffice docs work essentially the same way, but with the .odt file extension instead of .docx. Likewise, substitute .pptx for PowerPoint.

Creating a Reference Document

To create a reference document:

  1. Grab a copy of the default reference document: pandoc -o custom-reference.docx --print-default-data-file reference.docx
  2. Edit custom-reference.docx in Word. Each text type appears in the body of the document and can be modified directly (i.e. highlight the text and adjust), but I personally prefer to modify the associated style, since that lets me make a template from the document for use directly in Word.

Using a Reference Document

To convert a document using that reference document, use the --reference-doc= option: pandoc document.md -f markdown -t docx -o document.docx --reference-doc=custom-reference.docx

If a file called reference.docx (or reference.odt or reference.pptx) exists in the Pandoc Data Directory, it will be used by default in all cases.
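One way to try this default behavior without touching your real data directory is Pandoc's --data-dir option, which overrides the data directory for a single run. In this sketch, scratch-data and document.md are placeholder names of my own.

```shell
# Placeholder input document.
printf '# Title\n\nBody text.\n' > document.md

# A scratch data directory used only for this example.
mkdir -p scratch-data

if command -v pandoc >/dev/null 2>&1; then
  # Save Pandoc's built-in reference document under the name it looks for.
  pandoc -o scratch-data/reference.docx --print-default-data-file reference.docx

  # No --reference-doc needed: reference.docx is found in the data directory.
  pandoc document.md -f markdown -t docx --data-dir=scratch-data -o document.docx
fi
```

Once the styles in reference.docx are to your liking, copying it into the real data directory makes it the default everywhere.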

Templates

Other formats, such as LaTeX/TeX (which is also used for PDFs) and HTML, use templates when converted with the standalone (-s or --standalone) option. Templates use variables to control what goes into the output, and writing one is fairly involved. The Pandoc User Manual has a great section on templates to consult for more details.

To view the default template, type pandoc -D FORMAT. For instance, for HTML, type pandoc -D html.

To use a template, use the --template option:

pandoc document.md -f markdown -t html --template mytemplate.html -o document.html
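To give a feel for what a template looks like, here is a deliberately tiny HTML one. $title$ and $body$ are standard Pandoc template variables, but the file names and the title value are placeholders, and a real template would more likely start from the output of pandoc -D html.

```shell
# Write a minimal template. The quoted 'EOF' keeps the shell from expanding
# $title$ and $body$, which Pandoc fills in during conversion.
cat > mytemplate.html <<'EOF'
<!DOCTYPE html>
<html>
<head><title>$title$</title></head>
<body>
$body$
</body>
</html>
EOF

# Placeholder input document.
printf '# Heading\n\nBody text.\n' > document.md

if command -v pandoc >/dev/null 2>&1; then
  pandoc document.md -f markdown -t html --template mytemplate.html \
    --metadata title="Example" -o document.html
fi
```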

Metadata

Some formats or commands may require some metadata to be added, such as a title. For this, use the --metadata option.

A YAML file can be used. This would be a file with key/value pairs separated by a colon. The file should begin and end with three hyphens:

---
key: value
key: value
---

The YAML file can be specified with the --metadata-file option.
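A short sketch of both approaches. The file names and the title, author, and date values are invented for illustration.

```shell
# Placeholder input document.
printf 'Body text.\n' > document.md

# Metadata kept in a separate YAML file.
cat > metadata.yaml <<'EOF'
---
title: Example Document
author: A. Writer
date: 2020-01-01
---
EOF

if command -v pandoc >/dev/null 2>&1; then
  # A single value given on the command line.
  pandoc document.md -f markdown -t html -s \
    --metadata title="Example Document" -o one-off.html

  # Several values read from the file.
  pandoc document.md -f markdown -t html -s \
    --metadata-file metadata.yaml -o from-file.html
fi
```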

Using STDIN as a Source

Sometimes, it's handy to simply type text "directly" into a document. Or, more likely, paste some text and have it converted in one fell swoop. This is possible with Pandoc.

STDIN from Windows

Example assumes output to Word. Adjust to match your preferred output format.

  1. Type pandoc con -f markdown -t docx -o filename.docx at the command line.
  2. Type (or paste) your text.
  3. Hit C-z (Ctrl+Z) for end of file, then Enter.

STDIN from UNIX

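This example also assumes output to Word.

  1. Type cat | pandoc -f markdown -t docx -o filename.docx at the command line. The cat is optional, since pandoc reads standard input when no input file is given.
  2. Type (or paste) your text.
  3. Hit C-d (Ctrl+D) for end of file.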
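The same mechanism works without typing anything interactively, which is handy in scripts. A sketch, assuming a POSIX shell. The sample text and output file names are placeholders.

```shell
if command -v pandoc >/dev/null 2>&1; then
  # Pipe text from another command; with no input file, pandoc reads stdin.
  printf '# Title\n\nSome *text*.\n' | pandoc -f markdown -t html -o piped.html

  # Or supply the text inline with a here-document.
  pandoc -f markdown -t docx -o heredoc.docx <<'EOF'
# Title

Some *text*.
EOF
fi
```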

Integrating with EMACS

It's possible to use the STDIN capability to convert a buffer in GNU EMACS, then either replace the text in the buffer, or output to a file.

Output to a File

I'm going to assume we're going from Markdown to HTML for the sake of this example.

  1. Select the region you wish to convert.
  2. Type M-| (which is mapped to shell-command-on-region).
  3. When prompted in the minibuffer, type pandoc -f markdown -t html -o outputfile.html and press Enter.

Replace the Region

Again, I'm going to assume we're going from Markdown to HTML for the sake of this example.

  1. Select the region you wish to convert.
  2. Type M-1 (or any non-zero number), then M-| (which is mapped to shell-command-on-region).
  3. When prompted in the minibuffer, type pandoc -f markdown -t html and press Enter.

The output will also be in your "kill ring" (EMACS clipboard). Depending on how your OS and version of EMACS interact, the output may also be in your system clipboard.


Created by I. Charles Barilleaux
Last Update: 2020-10-13