Calculate Quartiles in Python
Quartiles are values that divide a dataset into four equal parts, each containing 25% of the data. Quartiles are useful for understanding the spread and distribution of a dataset.
Three quartiles are commonly used: Q1 (first quartile), Q2 (second quartile, the median), and Q3 (third quartile). They are the values below which 25%, 50%, and 75% of the data fall, respectively.
In Python, quartiles can be calculated using the quantile() function from the NumPy and pandas packages.
The general syntax of quantile()
looks like this:
# calculate quartiles using NumPy
import numpy as np
np.quantile(x, [0.25, 0.5, 0.75])
# calculate quartiles using pandas
import pandas as pd
df['col_name'].quantile([0.25, 0.5, 0.75])
Here, x is the dataset in array format, and the second argument is the list of probabilities (values between 0 and 1) for the quantiles to compute.
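As a minimal sketch of the syntax above, the snippet below passes either a single probability or a list of probabilities to np.quantile(). The five-value list is hypothetical sample data chosen only for illustration.

```python
import numpy as np

# hypothetical sample data, used only to illustrate the call signature
x = [1, 2, 3, 4, 5]

# a single probability returns a single value (0.5 gives the median)
median = np.quantile(x, 0.5)
print(median)  # 3.0

# a list of probabilities returns an array with one value per probability
quartiles = np.quantile(x, [0.25, 0.5, 0.75])
print(quartiles)  # [2. 3. 4.]
```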
The following examples show how to use the quantile() function from NumPy and pandas to calculate quartiles.
Example 1: Calculate quartiles using quantile() from NumPy
Suppose you have the following dataset:
x = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
Calculate the quartiles using the quantile() function from NumPy:
# import package
import numpy as np
# calculate quartiles
np.quantile(x, [0.25, 0.5, 0.75])
# output
array([18.75, 32.5 , 46.25])
The Q1, Q2, and Q3 quartile values are 18.75, 32.5, and 46.25, respectively.
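The values above are not members of the dataset because NumPy, by default, interpolates linearly between the two nearest sorted values. The sketch below reproduces that calculation by hand, assuming the default linear method; quartile_by_interpolation is a hypothetical helper written for illustration, not a NumPy function.

```python
import math
import numpy as np

x = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]

def quartile_by_interpolation(data, q):
    """Hypothetical helper: linear interpolation between neighboring sorted values."""
    s = sorted(data)
    pos = q * (len(s) - 1)           # fractional position in the sorted data
    lo = math.floor(pos)             # index of the value just below the position
    hi = min(lo + 1, len(s) - 1)     # index of the value just above (clamped)
    frac = pos - lo                  # how far the position lies between lo and hi
    return s[lo] + frac * (s[hi] - s[lo])

manual = [quartile_by_interpolation(x, q) for q in (0.25, 0.5, 0.75)]
print(manual)  # [18.75, 32.5, 46.25]

# compare with NumPy's result
print(np.allclose(manual, np.quantile(x, [0.25, 0.5, 0.75])))  # True
```

For Q1, the position is 0.25 * 11 = 2.75, which lies between the sorted values 15 and 20, giving 15 + 0.75 * 5 = 18.75.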
Example 2: Calculate quartiles using quantile() from pandas
Suppose you have the following pandas DataFrame:
# import package
import pandas as pd
# create a sample pandas DataFrame
df = pd.DataFrame({'col1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L'],
'col2': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]})
# view first few rows
df.head(2)
col1 col2
0 A 5
1 B 10
# calculate quartiles for the numeric column
df['col2'].quantile([0.25, 0.5, 0.75])
# output
0.25 18.75
0.50 32.50
0.75 46.25
Name: col2, dtype: float64
The output shows that the Q1, Q2, and Q3 quartile values are 18.75, 32.5, and 46.25, respectively.
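The Series returned by quantile() is indexed by the requested probabilities, so individual quartiles can be looked up by label. As a rough sketch of how the quartiles describe spread, the snippet below computes the interquartile range (Q3 minus Q1) and checks that 25% of the values fall below Q1. The variable names are illustrative assumptions.

```python
import pandas as pd

df = pd.DataFrame({'col2': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]})

quartiles = df['col2'].quantile([0.25, 0.5, 0.75])

# look up individual quartiles by their probability label
q1 = quartiles.loc[0.25]
q3 = quartiles.loc[0.75]

# interquartile range: the width of the middle 50% of the data
iqr = q3 - q1
print(iqr)  # 27.5

# share of values strictly below Q1
share_below_q1 = (df['col2'] < q1).mean()
print(share_below_q1)  # 0.25
```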
This work is licensed under a Creative Commons Attribution 4.0 International License