🔁 Pandas Iteration Over Data – Loop Through Rows & Columns Effectively
🧲 Introduction – Why Iterate in Pandas?
While Pandas is designed for vectorized operations (not loops), there are situations where row-by-row or column-wise iteration is necessary—for custom logic, debugging, or exporting row-wise data. Pandas provides several ways to iterate over data safely and efficiently when needed.
🎯 In this guide, you’ll learn:
- Different ways to iterate through DataFrame rows and columns
- When to use .iterrows(), .itertuples(), and .apply()
- Best practices and performance tips
- Common use cases and pitfalls
📥 1. Sample DataFrame for Demonstration
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})
🔁 2. Iterate Over Rows Using .iterrows()
for index, row in df.iterrows():
    print(f"{row['Name']} is {row['Age']} years old with a score of {row['Score']}")
✔️ Yields each row as an (index, Series) pair.
⚠️ Slower than vectorized operations. Not recommended for large DataFrames.
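One side effect of receiving each row as a Series is that a row has a single dtype, so values from differently typed columns can be upcast. The sketch below uses a small hypothetical frame (df_mixed, not part of the sample data) to illustrate this:

```python
import pandas as pd

# Hypothetical frame mixing an integer column with a float column
df_mixed = pd.DataFrame({'Age': [25, 30], 'Height': [1.65, 1.80]})

for index, row in df_mixed.iterrows():
    # Each row is one Series, so Age is upcast to float to share a dtype with Height
    print(index, row['Age'], row['Height'])

# Keep the first row to compare its dtype with the original column's dtype
first_row = next(df_mixed.iterrows())[1]
print(df_mixed['Age'].dtype.kind, first_row.dtype.kind)
```

If the original dtypes matter, .itertuples() is usually the safer choice because it reads values column by column.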
📦 3. Iterate Over Rows Using .itertuples()
for row in df.itertuples():
    print(f"{row.Name} scored {row.Score} at age {row.Age}")
✔️ Yields each row as a named tuple. Faster and more memory-efficient than .iterrows().
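The index and name parameters of .itertuples() control the shape of each tuple. A short sketch on the sample data, showing the default named tuple and a plain-tuple variant that can be handy for exporting rows:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# Default: a named tuple whose first field, Index, holds the row label
first = next(df.itertuples())
print(first.Index, first.Name)

# index=False drops the index; name=None yields plain tuples
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows)
```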
🧠 4. Iterate with .apply() – Preferred for Logic on Columns
df['Status'] = df.apply(lambda row: 'Pass' if row['Score'] >= 90 else 'Fail', axis=1)
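✔️ Applies a function row-wise (axis=1) or column-wise (axis=0). More concise than an explicit loop, but with axis=1 the function is still called once per row in Python, so it is not truly vectorized. Prefer built-in vectorized operations when they can express the logic.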
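For a simple two-way condition like the one above, a fully vectorized form is possible. This sketch swaps in NumPy's np.where as one option (an alternative to .apply(), not the only approach):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# Evaluate the condition on the whole column at once instead of row by row
df['Status'] = np.where(df['Score'] >= 90, 'Pass', 'Fail')
print(df)
```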
🔄 5. Iterate Over Columns
for col in df.columns:
    print(f"Column '{col}' has values:\n{df[col].tolist()}")
✔️ Loops through each column name and its corresponding data.
🔬 6. Iterate with .items() – Column Iteration (Series Format)
for col_name, col_data in df.items():
    print(f"{col_name} → {list(col_data)}")
✔️ Similar to dict.items() – returns the column name and a Series object for each column.
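Because each item is a full Series, per-column logic can use Series methods directly. A minimal sketch that collects the mean of every numeric column (numeric_means is a hypothetical name chosen for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

numeric_means = {}
for col_name, col_data in df.items():
    # Skip text columns such as Name; only numeric columns have a meaningful mean
    if pd.api.types.is_numeric_dtype(col_data):
        numeric_means[col_name] = col_data.mean()

print(numeric_means)
```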
🚫 7. Avoid .iloc in Loops When Possible
for i in range(len(df)):
    print(df.iloc[i]['Name'])  # Not efficient
❌ Slower than itertuples() or apply(), especially with large datasets.
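When only one or two columns are needed, iterating over the columns themselves avoids building a row object on every step. A sketch of that alternative, using Python's built-in zip:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# A single column can be looped over directly
for name in df['Name']:
    print(name)

# Several columns can be walked in parallel with zip
pairs = list(zip(df['Name'], df['Score']))
print(pairs)
```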
🧮 8. Modify Rows During Iteration (Copy Required)
df_copy = df.copy()
for i, row in df_copy.iterrows():
    if row['Score'] < 90:
        df_copy.at[i, 'Status'] = 'Improve'
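✔️ Use .at[] for efficient single-value updates. Assigning to row inside the loop would not reliably change df_copy, because iterrows() can yield copies rather than views.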
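If the update can be written as a condition on a column, a boolean mask with .loc avoids the loop entirely. A minimal sketch, assuming the Status column from section 4 already exists:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85],
    'Status': ['Fail', 'Pass', 'Fail']  # as produced in section 4
})

df_copy = df.copy()

# Select every row with Score below 90 and overwrite Status in one step
df_copy.loc[df_copy['Score'] < 90, 'Status'] = 'Improve'
print(df_copy)
```

The original df is left unchanged because the assignment targets the copy.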
🧪 9. Conditional Iteration Example
for row in df.itertuples():
    if row.Score >= 90:
        print(f"{row.Name} is an A-grade student")
✔️ Use logic during iteration for filtering or reporting.
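The same report can also be produced by filtering first and looping only over the matching rows. A sketch of that variant (top_names is a hypothetical name):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# Boolean mask keeps only the rows with Score of 90 or above
top_names = df.loc[df['Score'] >= 90, 'Name']

for name in top_names:
    print(f"{name} is an A-grade student")
```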
📌 Summary – Key Takeaways
Iteration in Pandas should be used with care. Avoid explicit loops where possible by using vectorized operations, with .apply() as a fallback. When a loop is necessary, use .itertuples() for speed and .iterrows() for flexibility.
🔍 Key Takeaways:
- .iterrows() → flexible but slow
- .itertuples() → fast and memory-efficient
- .apply() → best for row-wise logic
- .items() → for column-wise iteration
- Avoid .iloc in loops: inefficient
⚙️ Real-world relevance: Useful in custom reporting, row-level validations, exporting, and conditional modifications.
❓ FAQs – Iterating in Pandas
❓ Which is faster – iterrows() or itertuples()?
✅ itertuples() is significantly faster and uses less memory.
❓ Can I modify a DataFrame during iteration?
Yes, but do not assign to the row object yielded by iterrows(), since such changes may not reach the DataFrame. Instead, write through .at[] or .loc[] on a copy.
❓ Should I always avoid loops in Pandas?
Use vectorized operations where possible. But if logic is complex or not vectorizable, loops are fine.
❓ How do I apply conditional logic to each row?
Use:
df.apply(lambda row: ..., axis=1)
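When the logic has more than two branches, a named function can be easier to read than a lambda. A sketch with a hypothetical grade function and illustrative thresholds (neither comes from the sample above):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

def grade(row):
    # Hypothetical thresholds, chosen only for illustration
    if row['Score'] >= 90:
        return 'A'
    if row['Score'] >= 85:
        return 'B'
    return 'C'

# axis=1 passes each row to grade() as a Series
df['Grade'] = df.apply(grade, axis=1)
print(df)
```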
❓ Can I iterate over DataFrame columns only?
Yes:
for col in df.columns: ...