🔄 Pandas Melting – Convert Wide Format to Long Format for Tidy Analysis
🧲 Introduction – What Is Melting in Pandas?
Melting is the process of transforming a wide-format DataFrame into a long-format (tidy) structure where each row represents a single observation. Pandas provides .melt()
to help you reshape data—ideal for time series, plotting, aggregation, and machine learning preprocessing.
🎯 In this guide, you’ll learn:
- How to use
.melt()
to reshape your DataFrame - Control identifier and value columns
- Customize column names
- Handle multiple variable/value pairs
📥 1. Sample Wide-Format DataFrame
import pandas as pd
df = pd.DataFrame({
'Employee': ['Alice', 'Bob'],
'2022_Sales': [25000, 30000],
'2023_Sales': [27000, 32000]
})
👉 Output:
Employee 2022_Sales 2023_Sales
0 Alice 25000 27000
1 Bob 30000 32000
🔁 2. Melt the DataFrame
df_melted = pd.melt(df, id_vars='Employee', var_name='Year', value_name='Sales')
👉 Output:
Employee Year Sales
0 Alice 2022_Sales 25000
1 Bob 2022_Sales 30000
2 Alice 2023_Sales 27000
3 Bob 2023_Sales 32000
✔️ Converts column headers into rows, aligning each value with its identifier (Employee
).
✂️ 3. Clean Melted Data (Optional: Extract Year)
df_melted['Year'] = df_melted['Year'].str.extract(r'(\d{4})')
👉 Output:
Employee Year Sales
0 Alice 2022 25000
1 Bob 2022 30000
2 Alice 2023 27000
3 Bob 2023 32000
✔️ Extracts clean year values from original column headers.
🧠 4. Melt with Multiple ID Variables
df2 = pd.DataFrame({
'Region': ['East', 'West'],
'Employee': ['Alice', 'Bob'],
'Q1': [10000, 12000],
'Q2': [11000, 12500]
})
pd.melt(df2, id_vars=['Region', 'Employee'], var_name='Quarter', value_name='Revenue')
✔️ Use multiple identifier columns to preserve grouping context.
🧪 5. Melt with Multiple Value Variables (advanced)
If working with more than one group of related value columns, first reshape one group, then merge or join as needed.
🔁 6. Reverse .melt()
with .pivot()
df_melted.pivot(index='Employee', columns='Year', values='Sales').reset_index()
✔️ Converts long format back to wide format.
📌 Summary – Key Takeaways
.melt()
is perfect for transforming wide-format tables into a tidy structure. It’s commonly used for longitudinal data, visualization, and grouped analysis.
🔍 Key Takeaways:
- Use
pd.melt()
to convert columns into rows - Set
id_vars
for identifier columns, andvalue_name
,var_name
to rename - Combine with
.str.extract()
for column name cleanup - Melt multiple value groups with careful merging
- Use
.pivot()
to reverse the melt
⚙️ Real-world relevance: Used in reporting dashboards, time-series analysis, ML preprocessing, and data normalization.
❓ FAQs – Melting in Pandas
❓ When should I use .melt()
?
Use it when your data is in wide format and you need a long, tidy format for grouping, plotting, or modeling.
❓ Can I melt multiple sets of columns?
Not directly—you must melt them separately and merge manually.
❓ What’s the difference between melt()
and stack()
?
melt()
is more flexible and readable, especially for flat DataFramesstack()
is ideal for MultiIndex structures
❓ Does melt()
modify the original DataFrame?
No, it returns a new DataFrame unless inplace=True
is set.
❓ How do I melt all columns except a few?
Use:
pd.melt(df, id_vars=[col for col in df.columns if col != 'X'])
Share Now :