Many times, we face the challenge of being unable to upload a PDF document to government portals or online platforms due to its large size. I have encountered this issue numerous times on government sites and banking portals. As a tech enthusiast, I decided to create a handy script that resizes PDFs to a size acceptable for these online platforms. Whether you’re dealing with large technical manuals, financial reports, or any other bulky PDFs, the ability to trim and compress these files efficiently can save time and storage space. In this tech blog post, I’ll share how to resize or trim PDFs using Python.
Why PDF Trimming?
PDFs are ubiquitous in the tech industry, serving as the standard for sharing and archiving documents. However, they can quickly become unwieldy, especially when they contain unnecessary pages or uncompressed images. The trimming and compression of PDFs can:
- Reduce file sizes for easier sharing and storage.
- Improve document management by removing irrelevant content.
- Streamline workflows by automating repetitive tasks.
Getting Started
To script PDF trimming, we’ll leverage Python and the PyPDF2 library. This powerful library enables us to manipulate PDF files programmatically, making it an ideal tool for our needs.
Step 1: Setting Up Your Environment
First, ensure you have Python installed on your system. You can download it from python.org. Once installed, you’ll need to install the PyPDF2 library:
pip install PyPDF2
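Before moving on, it can help to confirm that the package is visible to the interpreter you plan to run the script with. Here is a minimal sketch that uses only the standard library. The helper name is_installed is my own and is not part of PyPDF2.

```python
import importlib.util


def is_installed(module_name):
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "PyPDF2" is the import name used by the script in this post.
    if is_installed("PyPDF2"):
        print("PyPDF2 is available.")
    else:
        print("PyPDF2 not found. Run: pip install PyPDF2")
```

If the check fails even after installing, the pip command may belong to a different Python installation than the one running the script.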
Step 2: Writing the Python Script
Here’s a script that trims a PDF by keeping only the pages you specify. It takes the input file and the page numbers as command-line arguments, so the same script works for different documents.
import PyPDF2
import os
from pathlib import Path
import argparse


def trim_pdf(input_pdf_path, pages_to_keep):
    """
    Trims the PDF by keeping only the specified pages.
    :param input_pdf_path: Path to the input PDF file.
    :param pages_to_keep: List of page numbers to keep (0-indexed).
    """
    # Build the output path next to the input: <name>_compressed.pdf
    input_path = Path(input_pdf_path)
    output_pdf_path = input_path.parent / f"{input_path.stem}_compressed{input_path.suffix}"

    with open(input_pdf_path, 'rb') as input_pdf_file:
        pdf_reader = PyPDF2.PdfReader(input_pdf_file)
        pdf_writer = PyPDF2.PdfWriter()

        # Copy only the requested pages, in the order given
        for page_num in pages_to_keep:
            pdf_writer.add_page(pdf_reader.pages[page_num])

        # Write while the input file is still open
        with open(output_pdf_path, 'wb') as output_pdf_file:
            pdf_writer.write(output_pdf_file)

    print(f"Trimmed PDF saved as {output_pdf_path}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Trim a PDF by keeping only specified pages.")
    parser.add_argument("input_pdf", type=str, help="Path to the input PDF file.")
    parser.add_argument(
        "pages_to_keep",
        type=int,
        nargs='+',
        help="List of page numbers to keep (0-indexed)."
    )
    args = parser.parse_args()
    trim_pdf(args.input_pdf, args.pages_to_keep)
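The script passes the page numbers straight to the reader, so a page number beyond the end of the document causes an error instead of a trimmed file. One way to guard against that is a small filter that runs before the copy loop. This is only a sketch: the function name filter_valid_pages and the choice to skip bad page numbers rather than abort are my assumptions, not part of the script above.

```python
def filter_valid_pages(pages_to_keep, total_pages):
    """Split requested 0-indexed page numbers into valid and out-of-range lists."""
    valid, invalid = [], []
    for page_num in pages_to_keep:
        if 0 <= page_num < total_pages:
            valid.append(page_num)
        else:
            invalid.append(page_num)
    return valid, invalid


if __name__ == "__main__":
    # Hypothetical 5-page document. Inside trim_pdf, total_pages
    # would come from len(pdf_reader.pages).
    valid, invalid = filter_valid_pages([0, 2, 4, 7], total_pages=5)
    print("Keeping:", valid)
    print("Skipping:", invalid)
```

Inside trim_pdf, you could call this right after creating pdf_reader and loop over the valid list only, printing a warning for anything skipped.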
Step 3: Understanding the Script
- Importing Modules: The script imports PyPDF2 for PDF manipulation, pathlib for file path operations, and argparse for command-line argument parsing. It also imports os, which this version of the script does not use.
- Defining the trim_pdf Function: This function takes the input PDF path and a list of pages to keep. It reads the input PDF, creates a new PDF with the specified pages, and saves the trimmed PDF with a “_compressed” suffix. The file gets smaller because pages are dropped; the pages that remain are copied as they are.
- Command-Line Interface: The script uses argparse to handle command-line arguments, making it flexible to use with different PDF files and pages.
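To see how the output file name is derived, here is the path logic from trim_pdf pulled out on its own. The function name compressed_name and the example paths are placeholders I chose for illustration.

```python
from pathlib import Path


def compressed_name(input_pdf_path):
    """Build the output path the way trim_pdf does: <stem>_compressed<suffix>."""
    input_path = Path(input_pdf_path)
    return input_path.parent / f"{input_path.stem}_compressed{input_path.suffix}"


if __name__ == "__main__":
    # Example paths are placeholders, not real files.
    for example in ["input.pdf", "reports/2024/statement.pdf"]:
        print(example, "->", compressed_name(example))
```

Because the output sits in the same folder as the input and only the stem changes, the original file is never overwritten.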
Step 4: Running the Script
To run the script, open your terminal and use the following command:
python nextstruggle_trim_pdf.py path/to/your/input.pdf 0 2 4
This command keeps the first, third, and fifth pages of input.pdf (indices 0, 2, and 4, because page numbers are 0-indexed) and saves the trimmed PDF as input_compressed.pdf in the same directory.
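Because the script counts pages from 0, the numbers you type are one less than the page numbers a PDF viewer shows. If you would rather think in viewer page numbers, a small conversion like the one below could sit in front of trim_pdf. This is a sketch, and the helper to_zero_indexed is a hypothetical addition rather than part of the script.

```python
def to_zero_indexed(page_numbers):
    """Convert 1-based page numbers (as shown in a PDF viewer) to 0-based indices."""
    if any(n < 1 for n in page_numbers):
        raise ValueError("Page numbers must start at 1.")
    return [n - 1 for n in page_numbers]


if __name__ == "__main__":
    # Viewer pages 1, 3 and 5 become the indices passed on the command line.
    print(to_zero_indexed([1, 3, 5]))  # [0, 2, 4]
```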
My Tech Advice: Automating PDF trimming with Python is a powerful way to enhance your productivity and manage your documents more efficiently. By following the steps outlined in this tech guide, you can create a versatile tool that simplifies the process of trimming and compressing PDF files. Embrace the power of automation and transform the way you handle PDFs in your tech workflow. Feel free to experiment with the script, customize it to fit your needs, and share your experiences in the comments below. Happy coding!
#AskDushyant
#Python #Automation #PDF #Coding #Scripting #Programming #CodeSnippet
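Note: The above script has been tested with Python 3. It uses the PdfReader and PdfWriter classes and the add_page method. Older PyPDF2 releases name these PdfFileReader, PdfFileWriter, and addPage, so you may need to adjust the calls to match your installed version.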