Excel SUMIFS, COUNTIFS & AVERAGEIFS in PYTHON (Pandas Library)
Introduction
In this post, I’ll dive into how to perform common Excel calculations such as Sum, Average, Count, and their conditional counterparts, using the Pandas library in Python. I’ll walk through each step, ensuring you can replicate these essential functions in your own projects.
Getting Started with Pandas
First, let’s set the stage. I have a dataset consisting of dummy sales records that I typically work with in Excel. To replicate my Excel workbook calculations in Python, I’ll start by importing the necessary libraries and loading the dataset.
Once the dataset is loaded into a variable called df, I can begin my calculations.
Basic Calculations: Sum, Average, and Count
To calculate the total sales, I simply specify the sales column and use the sum() method:
Similarly, I can compute the average and count by using the methods mean() and count(), respectively.
Conditional Calculations: SumIf, AverageIf, and CountIf
Next, let’s explore how to perform conditional calculations in Pandas. The query() method allows me to filter the dataset based on specific criteria.
For example, if I want to calculate the sum of sales specifically for Germany, I would write:
Likewise, the calculations for average and count follow a similar pattern. Just replace the method at the end with mean() or count().
Advanced Conditional Calculations: SumIfs, AverageIfs, and CountIfs
Now, let’s step it up a notch with SumIfs, AverageIfs, and CountIfs. This time, I can filter the dataset using multiple criteria. For instance, if I want the sum of sales for Germany in the Midmarket Segment, I can combine conditions using the ampersand:
Additionally, I can use variables within the query function by prefixing them with the @ symbol. In my example, @country will be replaced with Germany, and @segment with Midmarket.
Grouping Data: Unique Values and Pivot Tables
If you want to get the sum of all unique values in a column, the groupby() function comes in handy. This is quite similar to creating a Pivot Table in Excel.
Moreover, I can group sales by multiple columns, such as both country and segment, to get a more nuanced view of the data.
Conclusion
In this post, I covered how to perform various Excel calculations using the Pandas library in Python. From basic sums to advanced conditional calculations, you now have the tools to analyze your data effectively.