Effortless PDF Management: Merge & Split PDFs Using Python

PDF files are widely used for documents, reports, e-books, and business files, but managing them manually can be time-consuming. Whether you need to combine multiple PDFs into one or extract specific pages from a large document, Python provides a simple and efficient solution.

In this post, we’ll look at how to merge and split PDFs in Python using PyPDF2, along with practical uses for each technique.

Why Merge & Split PDFs?

📌 Merging PDFs:
✔ Combine multiple documents into one cohesive file.
✔ Organize reports, scanned pages, and research papers.
✔ Simplify file sharing by reducing the number of attachments.

📌 Splitting PDFs:
✔ Extract specific sections from a large document.
✔ Remove unwanted pages before sharing.
✔ Split large PDFs into smaller, manageable parts.

These techniques are useful for students, professionals, and businesses that deal with digital documents.

Installing Required Library

To work with PDFs in Python, install the PyPDF2 library using:


pip install pypdf2

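This library allows you to read, merge, and split PDFs with just a few lines of code. The examples below use the PdfMerger, PdfReader, and PdfWriter class names found in PyPDF2 3.x. Note that PyPDF2 3.0.x is the final release under that name, and development continues in the pypdf package.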
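Before running the scripts, it can help to confirm that the install worked in the interpreter you are actually using. The following is a minimal sketch using only the standard library; the helper name is_installed is made up for this illustration.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    # find_spec returns None when no module with that name is available
    return importlib.util.find_spec(module_name) is not None

if __name__ == "__main__":
    # The import name is "PyPDF2", even though pip accepts "pypdf2"
    if is_installed("PyPDF2"):
        print("PyPDF2 is installed and ready to use.")
    else:
        print("PyPDF2 is missing. Run: pip install pypdf2")
```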

Merging PDFs in Python

Let’s create a script to merge multiple PDF files into one document.


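from PyPDF2 import PdfMerger

def merge_pdfs(pdf_list, output_filename):
    """Merge the PDFs in pdf_list, in order, into output_filename."""
    merger = PdfMerger()

    # Append every page of each input file, in list order
    for pdf in pdf_list:
        merger.append(pdf)

    # Write the combined document, then release the open file handles
    merger.write(output_filename)
    merger.close()
    print(f"Merged PDFs saved as {output_filename}")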

# Example usage
if __name__ == "__main__":
    pdf_files = ["file1.pdf", "file2.pdf", "file3.pdf"]
    merge_pdfs(pdf_files, "merged_output.pdf")

🔹 How It Works
✔ PdfMerger() – Creates a PDF merger instance.
✔ .append(pdf) – Adds each PDF file to the merger, in list order.
✔ .write(output_filename) – Saves the merged PDF.
✔ .close() – Releases the files the merger opened.

📌 Use Case: Easily combine multiple invoices, reports, or scanned pages into a single file.
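For a use case like a folder full of invoices, typing every file name by hand gets tedious. Here is a rough sketch, assuming PyPDF2 3.x, that merges every PDF in a folder in file-name order. The function names collect_pdfs and merge_folder are invented for this example and are not part of PyPDF2.

```python
from pathlib import Path

def collect_pdfs(folder):
    """Return the PDF files directly inside `folder`, sorted by file name."""
    return sorted(
        (p for p in Path(folder).iterdir()
         if p.is_file() and p.suffix.lower() == ".pdf"),
        key=lambda p: p.name.lower(),
    )

def merge_folder(folder, output_filename):
    """Merge every PDF in `folder` and return how many files were merged."""
    # Imported here so collect_pdfs() works even without PyPDF2 installed
    from PyPDF2 import PdfMerger

    pdf_paths = collect_pdfs(folder)
    if not pdf_paths:
        raise ValueError(f"No PDF files found in {folder}")

    merger = PdfMerger()
    try:
        for path in pdf_paths:
            merger.append(str(path))
        merger.write(output_filename)
    finally:
        # Close the merger even if one of the input files fails to load
        merger.close()
    return len(pdf_paths)
```

A call such as merge_folder("invoices", "all_invoices.pdf") would merge the files in a hypothetical invoices folder. Saving the output outside the input folder is the safer choice, since otherwise a second run would pick up the merged file as well.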

Splitting PDFs in Python

Now, let’s create a script to split a PDF file into separate pages.


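from PyPDF2 import PdfReader, PdfWriter

def split_pdf(pdf_file):
    """Save every page of pdf_file as its own single-page PDF."""
    reader = PdfReader(pdf_file)

    for page_num in range(len(reader.pages)):
        # A fresh writer per page keeps each output file to one page
        writer = PdfWriter()
        writer.add_page(reader.pages[page_num])

        # page_num is zero-based, so add 1 to match what a PDF viewer shows
        output_filename = f"split_page_{page_num + 1}.pdf"
        with open(output_filename, "wb") as output_pdf:
            writer.write(output_pdf)
        print(f"Page {page_num + 1} saved as {output_filename}")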

# Example usage
if __name__ == "__main__":
    split_pdf("document.pdf")

🔹 How It Works
✔ PdfReader(pdf_file) – Reads the input PDF.
✔ .add_page(reader.pages[page_num]) – Adds the current page to a fresh PdfWriter, so each output holds exactly one page.
✔ .write(output_pdf) – Writes that page to its own PDF file.

📌 Use Case: Split a contract, user manual, or academic paper into single-page files so you can keep or share only the pages you need, without editing the original document.
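If you only want a handful of pages rather than one file per page, the same reader and writer classes can copy a chosen selection into a single output file. The sketch below assumes PyPDF2 3.x. The helper names parse_page_spec and extract_pages, and the "1-3,5" page format, are conventions made up for this example.

```python
def parse_page_spec(spec, total_pages):
    """Turn a string such as "1-3,5" into zero-based page indexes.

    Page numbers in `spec` are 1-based, the way a PDF viewer shows them.
    """
    indexes = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if not 1 <= start <= end <= total_pages:
            raise ValueError(f"Invalid page range: {part!r}")
        # Convert the inclusive 1-based range to zero-based indexes
        indexes.extend(range(start - 1, end))
    return indexes

def extract_pages(pdf_file, spec, output_filename):
    """Copy the pages named in `spec` from pdf_file into output_filename."""
    # Imported here so parse_page_spec() works without PyPDF2 installed
    from PyPDF2 import PdfReader, PdfWriter

    reader = PdfReader(pdf_file)
    writer = PdfWriter()
    for index in parse_page_spec(spec, len(reader.pages)):
        writer.add_page(reader.pages[index])

    with open(output_filename, "wb") as output_pdf:
        writer.write(output_pdf)
```

For example, extract_pages("document.pdf", "1-3,5", "selected_pages.pdf") would save pages 1, 2, 3, and 5 of document.pdf into one new file.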

Additional Features to Explore

✅ Add Watermarks: Overlay a watermark on each PDF page.
✅ Rotate Pages: Adjust page orientation within a PDF.
✅ Extract Text: Retrieve text content from PDF documents.

PyPDF2 supports all three, so you can add them to the same scripts as your automation workflow grows.
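As a starting point, here is a rough sketch of two of these features, page rotation and text extraction, assuming PyPDF2 3.x (where pages offer rotate() and extract_text()). The wrapper functions normalize_rotation, rotate_pdf, and extract_text are names chosen for this example only.

```python
def normalize_rotation(angle):
    """Map an angle to 0, 90, 180, or 270 degrees clockwise."""
    if angle % 90 != 0:
        raise ValueError("Rotation must be a multiple of 90 degrees")
    return angle % 360

def rotate_pdf(pdf_file, output_filename, angle=90):
    """Rotate every page of pdf_file clockwise and save the result."""
    from PyPDF2 import PdfReader, PdfWriter

    clockwise = normalize_rotation(angle)
    reader = PdfReader(pdf_file)
    writer = PdfWriter()
    for page in reader.pages:
        # rotate() turns the page clockwise in steps of 90 degrees
        page.rotate(clockwise)
        writer.add_page(page)

    with open(output_filename, "wb") as output_pdf:
        writer.write(output_pdf)

def extract_text(pdf_file):
    """Return the text of every page, separated by newlines."""
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_file)
    # Treat a page with no extractable text as an empty string
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

Keep in mind that text extraction only works when the PDF contains real text. Scanned pages are images, so they usually come back empty unless you run OCR first.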

Conclusion

Python makes PDF merging and splitting simple with PyPDF2. Whether you’re a student organizing notes, a professional handling reports, or a business owner managing invoices, automating these tasks can save time and effort.

Try these scripts and simplify your PDF document management today! 🚀

Have Questions or Need Help?

Drop your questions in the comments below! 😊
