Automating boring stuff with Python — Real life edition

Sulove Bista
3 min read · Jun 7, 2021

This weekend, a relative of mine called and asked me for a favor. He runs a small school in my hometown and had recently organized a virtual essay-writing competition and a drawing competition there. He wanted to hand out certificates of participation to all the participants but, because of COVID-19, he couldn’t go out and outsource the task to a vendor. He had 531 PDF files, one certificate per participant, which had taken him a whole day to prepare. Since the certificates were to be shared via Facebook (ikr), a lot of students and their parents didn’t know how to download a PDF file and view the certificate. For this reason, he wanted all the certificates in an image format (JPEG or PNG).

Enough context; let’s move to the fun part. I absolutely didn’t want to waste my entire afternoon converting PDFs into images by hand. That’s when I thought I would automate the process using Python. With a few Google searches, I found a handy Python library: PyMuPDF.

It was simple, easy to get started with and, most importantly, capable of doing what I wanted. I installed the library with pip install PyMuPDF. I went through the official documentation and tried a couple of things on my own. I then googled some more and found a wonderful Stack Overflow answer that suited my case perfectly.

Here’s what my code looked like:

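import fitz  # PyMuPDF is imported under the name "fitz"
import os

# Work inside the folder that holds the certificates
os.chdir('./Certificates')
files = os.listdir('.')

for file in files:
    # Skip sub-folders and anything that is not a PDF
    if file.endswith('.pdf'):
        filename = os.path.splitext(file)[0]
        doc = fitz.open(file)
        # Render the first page; newer PyMuPDF releases name these
        # calls load_page(), get_pixmap() and pix.save()
        page = doc.loadPage(0)
        pix = page.getPixmap()
        output = "{}.png".format(filename)
        pix.writePNG(output)
        # Close the document before deleting the original PDF
        doc.close()
        os.remove(file)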

The first line, import fitz, imports PyMuPDF (fitz is the name the library is imported under). I then change my current working directory to the folder called Certificates, which held a bunch of folders and PDF files. I was only interested in the PDF files, since they contained the certificates of participation; that’s why I check whether the file name ends with .pdf. For every PDF in the Certificates folder, I open it, load its first page, and write that page out as a PNG file with the same file name as before. Then I delete the original PDF file, as I no longer need it.
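
A slightly more defensive variant of the same steps might look like the sketch below. It is not the script that was actually run: the function names (plan_conversions, render_first_page, convert_folder), the zoom parameter, and the choice to keep the original PDFs by default are all assumptions made for illustration. It uses pathlib instead of os.chdir, matches the .pdf extension in any letter case, and calls PyMuPDF through its newer snake_case names.

```python
from pathlib import Path
from typing import Callable, List, Tuple


def plan_conversions(folder: Path) -> List[Tuple[Path, Path]]:
    """Pair each PDF directly inside `folder` with its target PNG path."""
    pairs = []
    for entry in sorted(folder.iterdir()):
        # Ignore sub-folders; accept ".pdf" in any letter case.
        if entry.is_file() and entry.suffix.lower() == ".pdf":
            pairs.append((entry, entry.with_suffix(".png")))
    return pairs


def render_first_page(pdf_path: Path, png_path: Path, zoom: float = 2.0) -> None:
    """Render page 0 of a PDF to a PNG with PyMuPDF."""
    import fitz  # imported lazily so plan_conversions works without PyMuPDF

    doc = fitz.open(str(pdf_path))
    try:
        page = doc.load_page(0)
        # zoom=2.0 doubles the default resolution (an assumed value).
        pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
        pix.save(str(png_path))
    finally:
        # Always release the file handle, even if rendering fails.
        doc.close()


def convert_folder(
    folder: Path,
    render: Callable[[Path, Path], None] = render_first_page,
    delete_original: bool = False,
) -> List[Path]:
    """Convert every PDF in `folder`; return the PNG paths that were written."""
    written = []
    for pdf_path, png_path in plan_conversions(folder):
        render(pdf_path, png_path)
        if delete_original:
            pdf_path.unlink()  # only reached after a successful render
        written.append(png_path)
    return written
```

Calling convert_folder(Path("Certificates")) would leave the PDFs in place; passing delete_original=True mirrors the original script. Keeping the render step as a parameter also makes it easy to dry-run the file selection before touching 531 files.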

With just 5 minutes of Google searching and another 5 minutes of writing a Python script, I was able to convert 531 PDF files into 531 image files. I felt truly satisfied, as I had completed an otherwise boring task within 10 minutes using Python. I believe a lot of people reading this can relate. This was just a small example of automation using Python, and it was just my use case, but I hope you got some motivation from reading this article, especially those of you who are just starting your programming journey. Thanks for reading. Happy coding!

Sulove Bista

Senior Software Engineer | Python Developer | Avid Learner | Blockchain Enthusiast