Combine All Excel Files in a Folder into One Workbook
Introduction
Managing data across multiple Excel files can be a daunting task, especially when you need to consolidate information from several worksheets into a single workbook. In this guide, I will take you through the process of using Python to achieve this efficiently.
Understanding the Setup
In my folder, I have multiple Excel files, each containing different data. One of these files even has several worksheets. The goal here is to copy all worksheets from all Excel files into one workbook using Python.
Getting Started with Python
First things first, I will create a blank Python file and import the necessary libraries. I will use pathlib for handling file paths and xlwings for interacting with Excel files.
Pathlib is a standard Python module, so no additional installation is required. If you don’t have xlwings installed yet, you can do so by running the following command:
pip install xlwings
Specifying the Source Directory
Next, I will specify the source directory that contains the Excel files. In my case, the folder is called ‘Files’ and is located in the same directory as the Python script. If you’re copying and pasting the script, this is the only change you need to make.
Storing File Paths
After specifying the source directory, I will store the Excel file paths in a Python list. This allows us to easily iterate through all the files later on.
Creating a New Excel Workbook
Now, I will create a new Excel workbook using xw.Book()
. This new workbook will serve as the destination for all the data from the Excel files.
Iterating Over Each Excel File
To merge the data, I will iterate over each Excel file using a for-loop. Inside this loop, I will open each Excel file and then use another for-loop to go through each worksheet in the opened workbook.
By doing so, I can copy each worksheet and paste it into our combined workbook right after the first worksheet. After copying all the necessary worksheets, I will close the Excel file.
Cleaning Up the Combined Workbook
Next, I will perform some cleanup by deleting the empty worksheet in our combined Excel file before saving it. For the file name, I will use 'all worksheets.xlsx'
.
Handling Excel Instances
This part can be a bit tricky. If we run our script and no previous Excel file was open, we create a new instance of Excel. Just closing the workbook does not quit the Excel instance, which may remain in memory.
To avoid this, I will check how many workbooks are already open. If there is only one (our new combined workbook), I can quit the Excel instance. Otherwise, I will simply close the workbook.
Preventing Overwrites
Every time you run this script, it will overwrite the existing workbook. To prevent this, I will append the current date and time as a unique value to the workbook’s name. I will import the time module and store the current local time in a variable.
Finalizing the Script
Once I format the timestamp, I can concatenate it with our Excel file name. After that, I can rerun the script, and our new Excel file name will reflect the current date and time.
Conclusion
And that’s all there is to it! By following these steps, you can effectively combine all worksheets from multiple Excel files into one workbook using Python. If you have any questions, feel free to ask in the comments.
Thanks for reading!