How to Easily Convert Between Document Formats in Linux

Aaron Peters 08-05-2017

One of the oft-cited issues with switching to Linux is file compatibility. You’ll invariably send files to users of other operating systems, and they won’t look the same when opened in applications like Word. While you can install fonts or try VMs or emulators to get a consistent look, another approach is to do your work in a plain text format, then convert it after you’re done.


One tool you can use to convert between formats is pandoc, an essential tool in any Linux user’s toolbox.

Basic Pandoc Installation and Usage

Installing pandoc on most Linux distributions is a matter of a simple trip to the repositories. On Ubuntu-based systems, the following command installs it for you:

sudo apt-get install pandoc

Once installed, you can start using the command line program to convert files. Pandoc is excellent at handling Markdown and other lightweight markup languages. If you have an .md file lying around, you can convert it to HTML with the following:

pandoc -o myfile.html myfile.md

The -o flag tells pandoc the name of the output file you want. In this case pandoc also infers the output format (HTML) from the filename extension. You can use the -r (for read) and -w (for write) flags to state the input and output formats explicitly. Suppose you’re used to writing in Markdown, but need to post something to a MediaWiki-based page:

pandoc -r markdown -w mediawiki -o myfile.wiki myfile.md
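
The two styles above (letting pandoc infer the format from the file extension, or naming both formats explicitly) can be wrapped in small shell functions. This is only a sketch: the function names convert_auto and convert_explicit are made up for illustration, and it assumes pandoc is on your PATH.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of pandoc.

# convert_auto INPUT OUTPUT
# Lets pandoc infer the formats from the file extensions.
convert_auto() {
    pandoc -o "$2" "$1"
}

# convert_explicit FROM TO INPUT OUTPUT
# States the read (-r) and write (-w) formats explicitly.
convert_explicit() {
    pandoc -r "$1" -w "$2" -o "$4" "$3"
}
```

For example, convert_explicit markdown mediawiki notes.md notes.wiki runs the same kind of command as the MediaWiki conversion above, with notes.md standing in for your own file.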

In its earlier versions, pandoc was focused on “upgrading” files, in the sense that it could convert simpler formats (such as Markdown) to more complex ones (e.g. ODT or Microsoft’s DOCX). But it will now read these more complicated formats as well. This means that if you’re accustomed to a word processor but are tempted by the advantages of a smaller and more portable plain text format, making the switch has become a lot easier.

Given a directory full of Word files, the following command will convert each of them to Markdown:

for file in *.docx; do
 pandoc -r docx -w markdown -o "$file".md "$file"
done

Note that this will leave you with files that keep the original extension ahead of the new one (ending in .docx.md), so you’ll need to run a quick rename command (or better yet, add it to the above as a shell script).
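
One way to fold the rename into the conversion is to strip the .docx suffix before adding .md. The script below is a minimal sketch, assuming the Word files sit in the current directory and end in .docx; md_name and convert_all are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Convert every .docx file in the current directory to Markdown,
# naming each result after the source file without its .docx suffix.

# md_name FILE: hypothetical helper, e.g. report.docx -> report.md
md_name() {
    printf '%s.md\n' "${1%.docx}"
}

convert_all() {
    for file in *.docx; do
        # Skip the unexpanded pattern when no .docx files exist.
        [ -e "$file" ] || continue
        pandoc -r docx -w markdown -o "$(md_name "$file")" "$file"
    done
}

convert_all
```

The ${1%.docx} expansion removes the suffix in the shell itself, so no separate rename pass is needed afterwards.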

Pandoc Command Line Options

Now that you’ve got the basics, we’ll look at some of pandoc’s more advanced command line options.

ODT/DOCX Reference Files

Suppose you’ve converted all your old, bulky word processor files to Markdown. While you’re reveling in the joy of authoring in plain text, at some point you’ll need to share these with someone. And that someone may not be as enlightened as you. You can simply reverse the read and write flags to convert your file back to Word format:

pandoc -r markdown -w docx -o wordfile.docx wordfile.md

But some folks like their Word files with particular fonts, numbered headings, etc. Pandoc’s ODT and DOCX back-ends support template files, called reference files, for just such an occasion. These are ODT or DOCX files you’ve set up with all the styling you need. Pandoc applies these styles when it converts if you pass it the reference file with the --reference-doc flag (pandoc releases before 2.0 used --reference-odt and --reference-docx instead):

pandoc -r markdown -w odt --reference-doc=/home/user/path/to/ref-file.odt -o lowriter.odt lowriter.md

[Image: pandoc odt reference style]

Notice how the fonts configured in the reference file above (Arial Black for Heading 1, etc.) display in the converted file below. You can create as many of these reference files as you need (for example, one per client). Then ignore formatting entirely while you’re writing, and apply the styling in one step as you convert.

[Image: pandoc odt reference output]
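
One way to get a starting point for a reference file is to ask pandoc for its built-in default, then restyle that copy in your word processor. The sketch below assumes pandoc 2.0 or later, where the --print-default-data-file and --reference-doc options are available; the function names and file names are placeholders for illustration.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of pandoc.

# write_default_reference FORMAT OUTPUT
# Saves pandoc's built-in reference file (FORMAT is odt or docx)
# so it can be restyled in a word processor.
write_default_reference() {
    pandoc -o "$2" --print-default-data-file "reference.$1"
}

# convert_with_reference TO REFERENCE INPUT OUTPUT
# Converts a Markdown file and applies the styles from REFERENCE.
convert_with_reference() {
    pandoc -r markdown -w "$1" --reference-doc="$2" -o "$4" "$3"
}
```

With one reference file per client, something like convert_with_reference odt client-a.odt report.md report.odt would apply that client's styling in a single step.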

PDF Rendering Back-Ends

Creating PDFs is also a simple exercise, once you install some necessary packages. A lightweight way to get PDF-writing capability is to install the wkhtmltopdf package, a command line tool that converts HTML to PDF. Pandoc supports it natively: if you set the write flag to HTML but give the output file a .pdf extension, pandoc takes this as your intent to use wkhtmltopdf.

pandoc -r markdown -w html -o nicepub.pdf nicepub.md

Alternatively, you can go for the full-featured option by using a LaTeX typesetting system (TeX Live on current Ubuntu releases, the successor to teTeX). These packages are Suggested Installs for the pandoc package, so you can pull them in by re-installing with the following command:

sudo apt-get install --install-suggests pandoc

Then, sit back while a lot (really, a lot) of packages install. Once they’re complete, you can convert your file to PDF through LaTeX by setting the write flag to latex and giving the output file a .pdf extension:

pandoc -r markdown -w latex -o nicepub-tetex.pdf nicepub.md

While the wkhtmltopdf option requires the install of only one package, you can get some more print-friendly results with TeTex. Namely, serif fonts are used by default, and the pages are automatically numbered.
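
The two PDF routes differ only in the write flag, so they can sit behind a single switch. The function below is a sketch with a made-up name (to_pdf). It assumes the matching back-end is installed, and it writes the LaTeX route as -w latex with a .pdf output name, which pandoc converts through LaTeX.

```shell
#!/bin/sh
# to_pdf ROUTE INPUT OUTPUT
# ROUTE is "html" (rendered by wkhtmltopdf) or "latex" (rendered by LaTeX).
# Hypothetical helper; the name is illustrative and not part of pandoc.
to_pdf() {
    case "$1" in
        html)
            pandoc -r markdown -w html -o "$3" "$2"
            ;;
        latex)
            pandoc -r markdown -w latex -o "$3" "$2"
            ;;
        *)
            echo "usage: to_pdf html|latex INPUT OUTPUT.pdf" >&2
            return 2
            ;;
    esac
}
```

Running to_pdf html nicepub.md nicepub.pdf and to_pdf latex nicepub.md nicepub-tetex.pdf side by side is a quick way to compare the two renderings.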

Ebook Generator

Finally, pandoc can convert your files to ebooks suitable for reading on a phone or e-reader. The epub and epub3 back-ends will give you a properly formatted ebook:

pandoc -r markdown -w epub -o mybook.epub mybook.md
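
A longer book is often split across several Markdown files. Pandoc accepts multiple input files and concatenates them in order, so a small wrapper can build the ebook in one go. This is a sketch: make_epub is a hypothetical name, and the title and chapter file names are placeholders.

```shell
#!/bin/sh
# make_epub TITLE OUTPUT INPUT...
# Builds an EPUB 3 file from one or more Markdown files, setting the
# book title and adding a table of contents.
# Hypothetical helper; the name is illustrative and not part of pandoc.
make_epub() {
    title=$1
    output=$2
    shift 2
    pandoc -r markdown -w epub3 --metadata title="$title" --toc -o "$output" "$@"
}
```

For example, make_epub "My Book" mybook.epub chapter1.md chapter2.md would produce a single ebook from the two chapter files.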

Advanced Tips

The advantages of pandoc go beyond its power as a command line utility… for example, it includes support for an improved version of Markdown, and can easily be integrated with graphical applications.

Pandoc’s Markdown Flavor

In addition to being a conversion tool, pandoc supports a slightly enhanced flavor of Markdown. By using pandoc instead of the standard markdown command, you have some additional features available, including the following:

  • Metadata — Pandoc’s flavor of Markdown allows you to include information in the header of your document such as author, date, email address, etc.
  • Text Decorations — You can apply text decorations such as strikethrough or super/subscript that aren’t supported in standard Markdown through pandoc.
  • Tables — This alone makes pandoc worthwhile compared to “vanilla” Markdown. Using the pipe character to separate table cells, you can create a table that ranges from really ugly to human-readable in plain text as well as rendered format.
  • Fancy Lists — Pandoc allows you to format lists with outline-style levels, e.g. “1.,” then “A.,” then “i.,” etc. You can also specify a starting number for lists, where lists in plain Markdown start from “1.”
  • Code Syntax Highlighting — You can have highlighting applied to your code blocks by telling pandoc what the language is.

The above are only a selection of pandoc Markdown’s features. See the pandoc manual for a full list of the extras this flavor of Markdown provides.
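
To see the features listed above in one place, the sketch below writes a small sample document that uses each of them. The function name write_sample and all of the sample content are invented for illustration; the syntax follows pandoc's Markdown as described in its manual.

```shell
#!/bin/sh
# write_sample FILE
# Writes a short pandoc Markdown document exercising the title block,
# text decorations, a pipe table, fancy lists, and a highlighted code block.
# Hypothetical helper; the name is illustrative and not part of pandoc.
write_sample() {
    cat > "$1" <<'EOF'
% Conversion Notes
% A. Writer

Some ~~struck out~~ text, H~2~O, and 2^10^.

| Format | Flag      |
|--------|-----------|
| HTML   | html      |
| Wiki   | mediawiki |

3. A list that starts at three
4. and carries on

A.  An outline-style item
B.  Another outline-style item

~~~python
print("highlighted as Python")
~~~
EOF
}
```

After write_sample sample.md, converting the file with pandoc -o sample.html sample.md should show how each construct renders.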

Use a GUI With pandoc

While pandoc is effective as a command-line tool, it does have a lot of options. If you’re new to Linux, you may prefer to use pandoc with a GUI. While it doesn’t ship with a graphical interface, you can install PanDocElectron to convert your docs with point-and-click. Download the install script from the app’s website, then run it to install all the necessary packages and the program itself.

Once installed, running the npm start command in the PanDocElectron directory will launch the application. With dropdown lists for formats and the ability to choose the input file with a dialog, this will help you get used to the “ins and outs” of pandoc, as it were.

If you’re comfortable with pandoc’s myriad options and flags but just want a way to easily call it, you can integrate it with your GUI text editor. For example, the Atom editor contains a number of packages that provide the ability to save the current file out to different formats using pandoc (package pandoc-convert):

[Image: pandoc convert atom commands]

Another option is to run pandoc commands using an editor’s built-in functions, such as the build command. Atom’s build-tools package gives you the ability to specify custom commands:

[Image: pandoc convert buildtools config]

Then, you can call the build command on your pandoc-compatible files, just as you would on source code:

[Image: pandoc convert buildtools command]

Pandoc Takes Some of the Stress Out of Switching

With pandoc in your toolkit, you can rest easier knowing you can always get your documents to other people in the format they need. At the same time, you can take advantage of some of the great features of Linux (consider giving one of the terminal-based text editors like vim a try).

Do you often find yourself converting files back and forth between formats? If you’re running into compatibility problems, let us know in the comments, and we’ll see if we can use pandoc to sort you out!

Image Credit: Nirat.pix via
