I’ve occasionally needed to extract text and/or images from a PDF. I’ve found a couple of easy, free ways to do this on macOS.

There’s commercial software such as Adobe Acrobat that will extract images from a PDF, of course, but there’s an easier way: a free application called The Unarchiver that treats a PDF file as if it were a zip file and extracts everything into a folder. Just install the app, then right-click on a PDF file and select Open With > The Unarchiver.

Related pro-tip: if you want to extract all the images from a Keynote presentation, you can simply unzip the presentation using the command-line unzip utility. It’ll expand into a folder that contains all the images and other assets. (Or you can right-click and open it with the Archive Utility app.)
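
Here’s a minimal sketch of that unzip step. The function name extract_keynote_assets and both of its arguments are hypothetical names I made up for illustration, and it assumes the .key file is a single zip archive rather than a package folder.

```shell
# extract_keynote_assets is a hypothetical helper name, not a standard tool.
# Assumes the presentation is a single zip archive, not a package folder.
extract_keynote_assets() {
  presentation="$1"   # e.g. talk.key (placeholder name)
  destination="$2"    # folder that will receive the extracted assets

  # -q keeps unzip quiet; -d chooses the folder to extract into.
  unzip -q "$presentation" -d "$destination"
}

# Usage:
#   extract_keynote_assets talk.key talk_assets
```

Extracting into a named folder keeps the presentation’s assets from spilling into whatever directory you happen to be in.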

Mission accomplished, but you’ll probably end up with a bunch of .tiff files when what you want is compact .jpg or compressed .png files. If you’re a command-line user and you have ImageMagick installed, you can convert them all at once with a Bash parameter expansion like this:

# Convert every .tiff under the current directory to a .jpg next to it.
find . -name '*.tiff' | while IFS= read -r line; do
   convert "$line" "${line%%tiff}jpg"
done
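
If the expansion looks cryptic, here’s a small sketch of what it does on its own, with no ImageMagick needed. The path is a made-up example.

```shell
# Hypothetical path, used only to show what the expansion produces.
line="./slides/photo.tiff"

# ${line%%tiff} removes the trailing "tiff"; appending "jpg" builds the new name.
converted="${line%%tiff}jpg"

echo "$converted"
```

This prints ./slides/photo.jpg. Because the pattern is a literal string with no wildcards, % and %% behave the same way here.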

That’ll do the trick for the images. For the text, you can just open the PDF in the Mac’s default PDF viewer, the Preview app. Use Cmd-A to select all of the text and other content, Cmd-C to copy it, and then simply paste it into any plaintext destination. If you don’t have a favorite text editor such as Atom or Sublime Text, you can use the Mac’s default TextEdit app. Just use Format > Make Plain Text to set it to plaintext mode.
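
If you’d rather skip the editor entirely, the terminal can be the plaintext destination. This is just a sketch: save_clipboard is a hypothetical helper name of my own, and it relies on the pbpaste command that ships with macOS.

```shell
# save_clipboard is a hypothetical helper name; pbpaste ships with macOS
# and writes the current clipboard contents to standard output.
save_clipboard() {
  pbpaste > "$1"
}

# Usage, after copying from Preview with Cmd-C:
#   save_clipboard extracted.txt
```

Since the shell only ever sees plain text from the clipboard, there’s no formatting to strip afterward.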
