Manipulating pdf files with Linux command-line tools

A few useful tips for manipulating pdf files. They are handy if you just want to change a few files without installing fancy GUI software, or if you want to run batch jobs.

  • Merge multiple pdf files into one pdf file

This can be done with the pdfunite tool (package: poppler-utils). The last argument is the output file.

pdfunite inputfileX.pdf inputfileY.pdf inputfileZ.pdf Outputfile.pdf

Merging can also be done with pdftk (package: pdftk)

pdftk inputfileX.pdf inputfileY.pdf inputfileZ.pdf output Outputfile.pdf
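For batch jobs, the merge can be wrapped in a small shell function. What follows is only a sketch: merge_dir is a hypothetical helper name (not part of poppler-utils), and it assumes that lexical file-name order is the order you want in the merged file.

```shell
# Sketch: merge every pdf in a directory into one file with pdfunite.
# merge_dir is a hypothetical helper name, not a poppler-utils command.
# The body runs in a subshell so its variables do not leak.
merge_dir() (
    dir=$1
    out=$2
    set --                      # reuse "$@" as the list of input files
    # Globs expand in lexical order, so 01.pdf comes before 02.pdf.
    for f in "$dir"/*.pdf; do
        if [ -e "$f" ]; then
            set -- "$@" "$f"
        fi
    done
    if [ "$#" -eq 0 ]; then
        echo "no pdf files found in $dir" >&2
        return 1
    fi
    # Keep the output file outside $dir, otherwise a later run
    # would pick up the old result as an input.
    pdfunite "$@" "$out"
)
```

Possible usage: merge_dir ./scans /tmp/all_scans.pdf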
  • Split pdf file into pages

This splits the input pdf and stores each page in a separate file named pg_0001.pdf, pg_0002.pdf, etc.

pdftk inputfileX.pdf burst

This splits the input pdf and stores each page in a separate file named page_0001.pdf, page_0002.pdf, etc. The output argument is a printf-style pattern.

pdftk inputfileX.pdf burst output page_%04d.pdf

This extracts only specific pages from the input pdf and stores them in a single new file.

pdftk inputfileX.pdf cat 1 2 3 4 9 10 output Outputfile.pdf
or
pdftk inputfileX.pdf cat 1-4 9-10 output Outputfile.pdf
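The same cat operation works well in a loop. As a rough sketch, the function below saves the first page of every pdf in a directory; first_pages is a hypothetical helper name, and the choice of page 1 is just an example.

```shell
# Sketch: save the first page of every pdf in a directory.
# first_pages is a hypothetical helper name, not a pdftk feature.
# The body runs in a subshell so its variables do not leak.
first_pages() (
    src=$1
    dest=$2
    mkdir -p "$dest"
    for f in "$src"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no pdf files.
        [ -e "$f" ] || continue
        # Use a dest different from src so inputs are never overwritten.
        pdftk "$f" cat 1 output "$dest/$(basename "$f")"
    done
)
```

Possible usage: first_pages ./reports ./covers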
  • Rotate pdf file

Use pdftk. From the pdftk man page: Each option sets the page rotation as follows (in degrees): north: 0, east: 90, south: 180, west: 270, left: -90, right: +90, down: +180. left, right, and down make relative adjustments to a page’s rotation.

This sets the rotation of page one to 90 degrees and of page two to 270 degrees, and leaves all other pages unchanged.

pdftk inputfileX.pdf rotate 1east 2west output Outputfile.pdf

This extracts specific pages, rotates them, and stores them in reverse order (4-3-2-1) in a single new file.

pdftk inputfileX.pdf cat 4east 3south 2west 1left output Outputfile.pdf

This rotates the entire input document 90 degrees clockwise.

pdftk inputfileX.pdf cat 1-endright output Outputfile.pdf
or
pdftk inputfileX.pdf rotate 1-endright output Outputfile.pdf
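If you rotate whole documents often, a tiny wrapper can catch typos in the direction keyword before pdftk runs. This is a sketch only: rotate_all is a hypothetical helper name, and the accepted keywords are the ones listed in the man page excerpt above.

```shell
# Sketch: rotate every page of a pdf in the given direction.
# rotate_all is a hypothetical helper name, not a pdftk feature.
# Arguments: input file, output file, direction keyword.
rotate_all() {
    case $3 in
        north|east|south|west|left|right|down) ;;
        *)
            echo "unknown direction: $3" >&2
            return 2
            ;;
    esac
    # Builds a page range such as 1-endright from the keyword.
    pdftk "$1" cat "1-end$3" output "$2"
}
```

Possible usage: rotate_all inputfileX.pdf Outputfile.pdf right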
  • Embed file to pdf

Use pdftk. pdftk lets you embed/attach a file to an existing pdf, and you can also choose the page on which the embedded file is shown. The file is “shown” as a special bullet on the side of the page, which allows you to extract/download the attachment.

pdftk original_pdf_file.pdf attach_files attachment.doc to_page 1 output new_pdf_with_attachment.pdf
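The to_page part is optional; without it, pdftk attaches the file to the document as a whole instead of to a page. A minimal sketch that covers both cases follows; attach is a hypothetical helper name and the argument order is my own choice.

```shell
# Sketch: attach a file to a pdf, optionally to a specific page.
# attach is a hypothetical helper name, not a pdftk feature.
# Arguments: input pdf, output pdf, file to attach, optional page number.
attach() {
    if [ -n "${4:-}" ]; then
        # Page-level attachment, as in the command above.
        pdftk "$1" attach_files "$3" to_page "$4" output "$2"
    else
        # Document-level attachment.
        pdftk "$1" attach_files "$3" output "$2"
    fi
}
```

Possible usage: attach original_pdf_file.pdf new_pdf_with_attachment.pdf attachment.doc 1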
  • List and extract embedded files from pdf

To list embedded files, use pdfdetach (package: poppler-utils)

pdfdetach -list new_pdf_with_attachment.pdf  
1 embedded files 
1: attachment.doc
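In a script you may only need the number of attachments. The sketch below assumes the first line of the listing always has the form shown above ("N embedded files"); count_attachments is a hypothetical helper name, and the parsing may need adjusting for other pdfdetach versions.

```shell
# Sketch: print how many files are embedded in a pdf.
# count_attachments is a hypothetical helper name.
# Assumes the first output line looks like "1 embedded files".
count_attachments() {
    pdfdetach -list "$1" | awk 'NR == 1 { print $1 }'
}
```

Possible usage: count_attachments new_pdf_with_attachment.pdf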

To extract embedded files, you can use pdftk or pdfdetach

# extract all files
pdftk new_pdf_with_attachment.pdf unpack_files
pdfdetach -saveall new_pdf_with_attachment.pdf

# extract the first embedded file and save it as attachment.doc
pdfdetach -save 1 -o attachment.doc new_pdf_with_attachment.pdf
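By default the extracted files land in the current directory. pdftk also accepts an output directory for unpack_files, which keeps things tidy in batch jobs. A minimal sketch, with unpack_to_dir as a hypothetical helper name:

```shell
# Sketch: extract all attachments of a pdf into a dedicated directory.
# unpack_to_dir is a hypothetical helper name, not a pdftk feature.
# Arguments: input pdf, target directory (created if missing).
unpack_to_dir() {
    mkdir -p "$2"
    pdftk "$1" unpack_files output "$2"
}
```

Possible usage: unpack_to_dir new_pdf_with_attachment.pdf ./attachments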
