🔁 Pandas Pivoting – Reshape Your Data for Better Analysis
🧲 Introduction – What Is Pivoting in Pandas?
Pivoting in Pandas allows you to reshape data from a long format to a wide format. It’s especially useful for transforming repeated observations into structured columns. You can pivot tables using .pivot() for simple reshaping, or .pivot_table() for advanced aggregation.
🎯 In this guide, you’ll learn:
- Use
.pivot()to reshape DataFrames - Use
.pivot_table()to aggregate while pivoting - Handle multiple index/column values
- Fill missing values and customize aggregation logic
📥 1. Create a Sample DataFrame
import pandas as pd
df = pd.DataFrame({
'Department': ['HR', 'HR', 'IT', 'IT'],
'Year': [2023, 2024, 2023, 2024],
'Budget': [50000, 52000, 60000, 63000]
})
👉 Original DataFrame (long format):
Department Year Budget
0 HR 2023 50000
1 HR 2024 52000
2 IT 2023 60000
3 IT 2024 63000
🔄 2. Pivot with .pivot()
df.pivot(index='Department', columns='Year', values='Budget')
👉 Output:
Year 2023 2024
Department
HR 50000 52000
IT 60000 63000
✔️ Transforms 'Year' values into new columns with corresponding 'Budget'.
⚠️ 3. .pivot() Requires Unique Index/Column Pairs
If the combination of index and columns is not unique, .pivot() will raise:
ValueError: Index contains duplicate entries, cannot reshape
🛠️ Use .pivot_table() to aggregate in such cases.
🧮 4. Pivot with Aggregation Using .pivot_table()
df2 = pd.DataFrame({
'Department': ['HR', 'HR', 'IT', 'IT', 'IT'],
'Year': [2023, 2023, 2023, 2024, 2024],
'Budget': [50000, 52000, 60000, 63000, 61000]
})
df2.pivot_table(index='Department', columns='Year', values='Budget', aggfunc='mean')
👉 Output:
Year 2023 2024
Department
HR 51000 NaN
IT 60000 62000
✔️ Takes the average where multiple values exist.
🧼 5. Fill Missing Values with fill_value
df2.pivot_table(index='Department', columns='Year', values='Budget', aggfunc='mean', fill_value=0)
✔️ Replaces NaN with 0 or any custom value.
🧠 6. Pivot with Multiple Index/Column Levels
df3 = pd.DataFrame({
'Region': ['East', 'East', 'West', 'West'],
'Department': ['HR', 'IT', 'HR', 'IT'],
'Year': [2023, 2023, 2024, 2024],
'Budget': [50000, 60000, 52000, 63000]
})
df3.pivot_table(index=['Region', 'Department'], columns='Year', values='Budget')
✔️ Produces a MultiIndex pivot table with hierarchical rows.
📉 7. Reverse Pivot with .melt()
df_pivot = df.pivot(index='Department', columns='Year', values='Budget').reset_index()
pd.melt(df_pivot, id_vars='Department', var_name='Year', value_name='Budget')
✔️ Converts wide format back to long format.
📌 Summary – Key Takeaways
Pivoting helps restructure your data to analyze, compare, and visualize it better. Use .pivot() when values are unique, and .pivot_table() when you need aggregation.
🔍 Key Takeaways:
- Use
.pivot()for simple reshaping (requires unique keys) - Use
.pivot_table()for grouped reshaping with aggregation - Fill missing data with
fill_value - Use
.melt()to reverse pivoted data - Multi-indexes can be used in both rows and columns
⚙️ Real-world relevance: Common in business reports, Excel-style dashboards, time-based summaries, and categorical comparisons.
❓ FAQs – Pivoting in Pandas
❓ When should I use .pivot() vs .pivot_table()?
- Use
.pivot()when index/column pairs are unique - Use
.pivot_table()for aggregation + reshaping
❓ Can I pivot on multiple columns?
Yes:
pivot_table(index=['A', 'B'], columns='C', values='D')
❓ What if I get duplicates when pivoting?
Switch to .pivot_table() and specify aggfunc (e.g., 'mean', 'sum').
❓ How do I flatten a MultiIndex column after pivoting?
Use:
df.columns = ['_'.join(map(str, col)) for col in df.columns]
❓ Can I reverse a pivoted table?
✅ Yes, using .melt().
Share Now :
