How can I split a PDF file into single pages quickly (i.e. from the Terminal command line)?

25

12

I have a PDF file 6 pages long that I want to split into 1.pdf, 2.pdf, 3.pdf, etc...

Surprisingly, Preview does not seem to be able to do this (unless I am missing something).

I would love to be able to do this simple task from the command line, but at this point I will take anything that gets the job done (without downloading sketchy software).

FYI http://users.skynet.be/tools/ does not work as advertised.

user391339

Posted 2014-10-16T21:46:19.757

Reputation: 593

2

A good command-line solution is given in this SE answer. You can install Ghostscript using Homebrew.

– fideli – 2014-10-16T21:52:32.250

Answers

22

Open the PDF in Preview and select Thumbnails in the View menu. Cmd-click the pages that you want, then drag and drop them onto the desktop.

eleethesontai

Posted 2014-10-16T21:46:19.757

Reputation: 336

1

This worked well. Took me about 30 seconds to do this after flailing about for around 30 minutes. Some people are using this technique in conjunction with Automator but I haven't tried it yet. – user391339 – 2014-10-17T00:38:34.803

42

This can be achieved with pdfseparate, which is part of poppler. You can install poppler with Homebrew by running brew install poppler. To split the PDF document.pdf into single pages 1.pdf, 2.pdf, etc., use:

pdfseparate document.pdf %d.pdf
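If you want to call this from a script rather than typing it by hand, here is a minimal Python sketch. The helper names pdfseparate_command and split_pdf are made up for illustration; the only thing taken from poppler is the pdfseparate command itself with its -f (first page) and -l (last page) options.

```python
import shutil
import subprocess


def pdfseparate_command(pdf_path, pattern="%d.pdf", first=None, last=None):
    """Build the argument list for poppler's pdfseparate (hypothetical helper)."""
    cmd = ["pdfseparate"]
    if first is not None:
        cmd += ["-f", str(first)]  # first page to extract
    if last is not None:
        cmd += ["-l", str(last)]   # last page to extract
    # pdfseparate replaces %d in the pattern with the page number
    return cmd + [pdf_path, pattern]


def split_pdf(pdf_path, pattern="%d.pdf"):
    """Run pdfseparate, failing early if it is not installed."""
    if shutil.which("pdfseparate") is None:
        raise RuntimeError("pdfseparate not found; try: brew install poppler")
    # An argument list (no shell) keeps file names with spaces intact
    subprocess.run(pdfseparate_command(pdf_path, pattern), check=True)
```

Calling split_pdf("document.pdf") should then behave like the command above.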

ttq

Posted 2014-10-16T21:46:19.757

Reputation: 531

1

Just installed poppler a day ago to be able to convert PDF documents to SVG with pdf2svg. Didn't notice that poppler comes with the pdfseparate command. Since the accepted answer above (dragging and dropping all PDF pages from Preview to the desktop) requires me to "click around", and since I like terminal solutions that work automagically with a single command line, pdfseparate is exactly what I need. Thanks a lot for that hint!

– Arvid – 2015-12-18T08:47:19.390

Interestingly, pdfseparate produces PDFs whose total size is much larger than the size of the original PDF. I had a 400-page document of 1.9 MB. After splitting, I got something around 60 MB. – Konstantin – 2017-07-19T06:59:14.673

@ttq thank you. – arilwan – 2020-02-25T12:46:54.097

5

If you're interested in doing this from the command line, you can use Benjamin Han's splitPDF Python script. For instance:

splitPDF.py in.pdf 3 5

would split the file in.pdf into 3 files, splitting at pages 3 and 5.
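To make "splitting at pages 3 and 5" concrete, here is a small sketch of the page ranges that would result. It is not taken from splitPDF.py; it assumes that each split point is the first page of a new chunk, and the 10-page document is a made-up example. The function name chunk_ranges is hypothetical.

```python
def chunk_ranges(split_points, page_count):
    """Return inclusive 1-based (first, last) page ranges for each chunk.

    Assumption: each split point is the first page of a new chunk.
    """
    # Page 1 always starts the first chunk; ignore out-of-range split points
    starts = [1] + sorted(p for p in set(split_points) if 1 < p <= page_count)
    # Each chunk ends just before the next one starts; the last ends at page_count
    ends = [s - 1 for s in starts[1:]] + [page_count]
    return list(zip(starts, ends))


print(chunk_ranges([3, 5], 10))       # [(1, 2), (3, 4), (5, 10)]
print(chunk_ranges(range(2, 7), 6))   # [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]
```

Under that assumption, passing every page number from 2 up to the page count gives one chunk per page.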

Jean-Philippe Pellet

Posted 2014-10-16T21:46:19.757

Reputation: 151

This is good, and a bit more flexible in what you can output than pdfseparate above. Though it is mainly for splitting a PDF into chunks of pages, if you did want to split out each page, you could easily use seq to produce a range of numbers in your command. Thanks! – dgig – 2016-04-22T18:20:56.487

1

Something like python splitPDF.py MyPDF.pdf $(seq -s ' ' 1 10 411) worked for me – dgig – 2016-04-22T18:29:27.617

1

Works great. I confirm this works directly on macOS 10.13.3 – MichaelCodes – 2018-03-28T21:52:13.417

1

For another alternative, see this answer. It uses the ImageMagick command-line tools:

convert x.pdf -quality 100 -density 300x300 x-%04d.pdf

However, you have to be careful with the quality.
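The reason, as far as I understand it, is that ImageMagick renders each PDF page to pixels, so the -density value (in dots per inch) decides how large and how sharp the resulting page image is. A rough sketch of the arithmetic, assuming a US Letter page of 612 x 792 points (the function name raster_size is just for illustration):

```python
def raster_size(width_pt, height_pt, dpi):
    """Pixel size of a page rendered at the given density (1 pt = 1/72 inch)."""
    return round(width_pt / 72 * dpi), round(height_pt / 72 * dpi)


print(raster_size(612, 792, 300))  # (2550, 3300)
print(raster_size(612, 792, 72))   # (612, 792)
```

A higher density gives sharper pages but larger files, and text in the output is no longer selectable either way.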

pheon

Posted 2014-10-16T21:46:19.757

Reputation: 171

1

If you want to extract a range of pages, you can use the following script, which relies on poppler's pdfseparate and pdfunite. Save it as pdfextract.py somewhere on your system's PATH (e.g. /usr/local/bin), make it executable with chmod 744 pdfextract.py, and call it like this:

pdfextract.py --file-in /path/to/large/pdf --file-out /path/to/new/pdf --start <start> --stop <stop>

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Extract a page range from a PDF using poppler's pdfseparate and pdfunite."""

import argparse
import os
import subprocess as sp
import tempfile


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--file-in', required=True, type=str, dest='file_in')
    parser.add_argument('--file-out', required=True, type=str, dest='file_out')
    parser.add_argument('--start', required=True, type=int, dest='start')
    parser.add_argument('--stop', required=True, type=int, dest='stop')

    args = parser.parse_args()
    if not os.path.isfile(args.file_in):
        parser.error('input file not found: {}'.format(args.file_in))
    if os.path.exists(args.file_out):
        parser.error('output file already exists: {}'.format(args.file_out))
    if not 1 <= args.start <= args.stop:
        parser.error('page range must satisfy 1 <= start <= stop')

    # work in a private temporary directory that is removed automatically,
    # instead of deleting matching files from /tmp by hand
    with tempfile.TemporaryDirectory(prefix='pdfseparate-') as tmp:
        # write one single-page PDF per page in the requested range;
        # argument lists (no shell) keep paths with spaces intact
        pattern = os.path.join(tmp, 'pdfseparate-%d.pdf')
        sp.check_call(['pdfseparate', '-f', str(args.start), '-l', str(args.stop),
                       args.file_in, pattern])

        # merge the single pages, in order, into the output file
        pages = [os.path.join(tmp, 'pdfseparate-{:d}.pdf'.format(i))
                 for i in range(args.start, args.stop + 1)]
        sp.check_call(['pdfunite'] + pages + [args.file_out])


if __name__ == "__main__":
    main()

Konstantin

Posted 2014-10-16T21:46:19.757

Reputation: 111