Replace Text in Multiple Word Documents with Python
Introduction
Are you tired of manually replacing text in multiple Word documents? In this guide, I’ll show you how to use Python to quickly and efficiently replace text across various Word documents. For this example, I will replace the year 2022 with 2023 in my documents.
Prerequisite
Before we dive in, note that this solution works only on Windows: it uses the pywin32 package to automate Microsoft Word, so Word must be installed on your machine. I tried the python-docx package first, but it deleted images and broke the formatting in my documents.
Install pywin32
To get started, open your terminal or command prompt and execute the following command:
pip install pywin32
Import the dependencies
Once installed, let’s create a new Python file and import the necessary dependencies. We will use the pathlib module to handle file paths, and win32com.client from the pywin32 package to interact with Word.
from pathlib import Path
import win32com.client
Path settings
Next, we need to set up the paths for our input and output folders. The input folder contains the Word documents we want to modify, and the output folder is where we will save the modified documents.
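The path setup might look like the sketch below. The folder names Input and Output, and their location next to the script, are assumptions for illustration, so adjust them to your own layout.

```python
from pathlib import Path

# Assumption: both folders sit next to this script. When the code runs
# without a script file (for example in a REPL), fall back to the
# current working directory.
current_dir = Path(__file__).parent if "__file__" in globals() else Path.cwd()

input_dir = current_dir / "Input"    # hypothetical folder holding the original documents
output_dir = current_dir / "Output"  # hypothetical folder for the modified copies

# Create the output folder up front so saving does not fail later.
output_dir.mkdir(parents=True, exist_ok=True)
```

Saving into a separate output folder leaves the original documents untouched, which makes it easy to rerun the script if a replacement goes wrong.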
Find and replace settings
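Now, let’s define the text we want to find and what we want to replace it with. I will also set up two variables that hold Word constants: wdReplaceAll (2) replaces every match instead of only the first one, and wdFindContinue (1) lets the search continue when it reaches the beginning or end of the search range.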
search_term = "2022"
replace_term = "2023"
wd_replace = 2 # wdReplaceAll
wd_find_wrap = 1 # wdFindContinue
Open Microsoft Word
We are now ready to open Microsoft Word in the background. Setting Visible to False runs Word without showing its window, and setting DisplayAlerts to False suppresses its dialog prompts.
word = win32com.client.Dispatch('Word.Application')
word.Visible = False
word.DisplayAlerts = False
Find and replace text
Next, we will iterate over each Word document in the input folder, perform the find and replace operation, and then save the modified document in the output folder.
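One way to write this loop is sketched below. The function name replace_in_folder is hypothetical, and searching doc.Content (the main body of the document) rather than the current selection is my own choice here, not necessarily what your script needs. The word argument is assumed to be the Word.Application object created earlier.

```python
from pathlib import Path

WD_REPLACE_ALL = 2    # wdReplaceAll
WD_FIND_CONTINUE = 1  # wdFindContinue


def replace_in_folder(word, input_dir, output_dir, search_term, replace_term):
    """Find and replace in every .docx in input_dir, saving copies to output_dir.

    `word` is an already started Word.Application COM object.
    Returns the list of saved output paths.
    """
    saved = []
    for doc_path in sorted(Path(input_dir).glob("*.docx")):
        if doc_path.name.startswith("~$"):
            continue  # skip the lock files Word creates for open documents
        # Word resolves relative paths against its own working directory,
        # so pass absolute paths to Open and SaveAs.
        doc = word.Documents.Open(str(doc_path.resolve()))
        try:
            doc.Content.Find.Execute(
                FindText=search_term,
                ReplaceWith=replace_term,
                Replace=WD_REPLACE_ALL,
                Wrap=WD_FIND_CONTINUE,
                Forward=True,
                MatchCase=False,
            )
            out_path = Path(output_dir).resolve() / doc_path.name
            doc.SaveAs(str(out_path))
            saved.append(out_path)
        finally:
            doc.Close(False)  # the copy was already written by SaveAs
    return saved
```

It would be called as replace_in_folder(word, input_dir, output_dir, search_term, replace_term). Note that this sketch only covers the main text of each document. Headers, footers, and text boxes are stored separately and need their own pass.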
Find and replace text in shapes
One challenge I encountered was that some text was within text boxes (shapes) in the documents. To handle this, I adapted a VBA solution to Python to search and replace text within shapes.
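A minimal sketch of the shape handling is shown below. The function name replace_in_shapes is hypothetical, and running Word's own Find on each text frame is one possible approach, not necessarily the one from the original VBA solution. Grouped shapes and shapes that cannot hold text may need extra handling that is not shown here.

```python
WD_REPLACE_ALL = 2  # wdReplaceAll
WD_FIND_STOP = 0    # wdFindStop: do not search past the end of the range


def replace_in_shapes(doc, search_term, replace_term):
    """Replace text inside the text frames of a document's floating shapes.

    `doc` is an open Word Document COM object.
    Returns the number of shapes that contained text and were searched.
    """
    searched = 0
    # COM collections are 1-indexed, so count from 1 to Shapes.Count.
    for i in range(1, doc.Shapes.Count + 1):
        text_frame = doc.Shapes.Item(i).TextFrame
        if not text_frame.HasText:
            continue  # nothing to search in this shape
        text_frame.TextRange.Find.Execute(
            FindText=search_term,
            ReplaceWith=replace_term,
            Replace=WD_REPLACE_ALL,
            Wrap=WD_FIND_STOP,
            Forward=True,
        )
        searched += 1
    return searched
```

Using Find on the text range, instead of overwriting the text as one plain string, is intended to keep the character formatting inside the text box intact.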
Outro
After iterating through all the files and making the necessary changes, don’t forget to close the Word application. With this script, you can efficiently manage text replacements in multiple Word documents without losing formatting or images.
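Because Word runs hidden, a script that crashes halfway can leave a Word process running in the background. One way to guard against that is a small context manager that always calls Quit. The name quit_when_done is hypothetical, and this is a sketch, not a required part of the script.

```python
from contextlib import contextmanager


@contextmanager
def quit_when_done(word):
    """Yield the Word.Application object and quit Word afterwards,
    even if the code inside the with block raises an exception."""
    try:
        yield word
    finally:
        word.Quit()


# Usage on Windows (assumes pywin32 and Microsoft Word are installed):
#
#   import win32com.client
#
#   with quit_when_done(win32com.client.Dispatch("Word.Application")) as word:
#       word.Visible = False
#       word.DisplayAlerts = False
#       ...  # open, modify, and save the documents here
```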