Pandas Iteration Over Data – Loop Through Rows & Columns Effectively
Introduction – Why Iterate in Pandas?
While Pandas is designed for vectorized operations (not loops), there are situations where row-by-row or column-wise iteration is necessary—for custom logic, debugging, or exporting row-wise data. Pandas provides several ways to iterate over data safely and efficiently when needed.
In this guide, you’ll learn:
- Different ways to iterate through DataFrame rows and columns
- When to use .iterrows(), .itertuples(), and .apply()
- Best practices and performance tips
- Common use cases and pitfalls
1. Sample DataFrame for Demonstration
import pandas as pd
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})
2. Iterate Over Rows Using .iterrows()
for index, row in df.iterrows():
    print(f"{row['Name']} is {row['Age']} years old with a score of {row['Score']}")
✔️ Yields each row as a (index, Series) pair.
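Slower than vectorized operations, and because each row is returned as a single Series, dtypes are not guaranteed to be preserved across the row. Not recommended for large DataFrames.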
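One side effect of receiving each row as a Series is that values from different columns get coerced to a common dtype. The sketch below illustrates this with a hypothetical two-column frame (the names mixed, count, and ratio are made up for the example and are not part of the guide's sample data):

```python
import pandas as pd

# Hypothetical frame: one integer column and one float column.
mixed = pd.DataFrame({'count': [1, 2], 'ratio': [0.5, 1.5]})

# iterrows() packs each row into one Series, so pandas has to pick
# a single dtype that can hold every value in that row.
_, first_row = next(mixed.iterrows())

column_dtype = mixed['count'].dtype   # integer dtype of the original column
row_dtype = first_row.dtype           # float dtype of the row Series

print(column_dtype, row_dtype)
```

If exact dtypes matter for the row-level logic, .itertuples() is usually the safer choice because it reads values column by column.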
3. Iterate Over Rows Using .itertuples()
for row in df.itertuples():
    print(f"{row.Name} scored {row.Score} at age {row.Age}")
✔️ Yields each row as a named tuple. Faster and more memory-efficient than .iterrows().
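The shape of each tuple can be adjusted with the index and name parameters of .itertuples(). A minimal sketch, reusing the sample DataFrame from section 1 (the variable names first and plain_rows are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# Default behaviour: the index is exposed as the first field, Index.
first = next(df.itertuples())
print(first.Index, first.Name)

# index=False drops the index field; name=None yields plain tuples
# instead of named tuples.
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows[0])
```

Plain tuples are handy when rows are handed straight to something that expects simple sequences, such as a CSV writer or a database insert.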
4. Iterate with .apply() – Preferred for Logic on Columns
df['Status'] = df.apply(lambda row: 'Pass' if row['Score'] >= 90 else 'Fail', axis=1)
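✔️ Applies a function row-wise (axis=1) or column-wise (axis=0). More concise than an explicit loop, but with axis=1 the function is still called once per row in Python, so it is not truly vectorized; prefer column-level operations when the logic allows it.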
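For a simple threshold like this one, the same Pass/Fail rule can also be written as a column-level expression. The sketch below is one possible equivalent, assuming NumPy is available; it is an alternative to the .apply() call, not a replacement the guide prescribes:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# The comparison is evaluated on the whole Score column at once;
# np.where then picks 'Pass' or 'Fail' for each element.
df['Status'] = np.where(df['Score'] >= 90, 'Pass', 'Fail')

print(df[['Name', 'Status']])
```

On small frames the difference is negligible, but this form tends to scale better because no Python function is called per row.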
5. Iterate Over Columns
for col in df.columns:
    print(f"Column '{col}' has values:\n{df[col].tolist()}")
✔️ Loops through each column name and its corresponding data.
6. Iterate with .items() – Column Iteration (Series Format)
for col_name, col_data in df.items():
    print(f"{col_name} → {list(col_data)}")
✔️ Similar to dict.items() – returns column name and Series object for each column.
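Because each item is a full Series, column-level methods can be called directly inside the loop. A small sketch, assuming we want the mean of every numeric column (the numeric_means dictionary is a name introduced here for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

numeric_means = {}
for col_name, col_data in df.items():
    # Skip text columns such as Name; only numeric columns have a mean.
    if pd.api.types.is_numeric_dtype(col_data):
        numeric_means[col_name] = col_data.mean()

print(numeric_means)
```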
7. Avoid .iloc in Loops When Possible
for i in range(len(df)):
    print(df.iloc[i]['Name'])  # Not efficient
Slower than itertuples() or apply(), especially with large datasets.
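When only one column is needed, iterating over that column directly avoids building a row object on every pass. A minimal sketch of that alternative, using the same sample data (the names list is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

names = []
# Iterating a Series yields its values one by one, with no
# positional lookup into the DataFrame on each step.
for name in df['Name']:
    names.append(name)
    print(name)
```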
8. Modify Rows During Iteration (Copy Required)
df_copy = df.copy()
for i, row in df_copy.iterrows():
    if row['Score'] < 90:
        df_copy.at[i, 'Status'] = 'Improve'
✔️ Use .at[] for efficient single-value updates.
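When the condition depends only on column values, the same update can usually be expressed without a loop by passing a boolean mask to .loc[]. A sketch under the assumption that every row starts with the status 'Pass' (that default is chosen here for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

df_copy = df.copy()
df_copy['Status'] = 'Pass'  # assumed starting value for every row

# The mask selects all rows with Score < 90 in one step;
# .loc then assigns the new label to just those rows.
df_copy.loc[df_copy['Score'] < 90, 'Status'] = 'Improve'

print(df_copy)
```

The original df is left untouched because all changes are made on df_copy.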
9. Conditional Iteration Example
for row in df.itertuples():
    if row.Score >= 90:
        print(f"{row.Name} is an A-grade student")
✔️ Use logic during iteration for filtering or reporting.
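If the loop only filters rows, the filtering step can be done up front with a boolean mask so that the loop visits matching rows only. A minimal sketch of that variant (top_names and messages are names introduced for this example):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# Select the Name values of rows whose Score is at least 90.
top_names = df.loc[df['Score'] >= 90, 'Name']

# Only the rows that passed the filter are iterated.
messages = [f"{name} is an A-grade student" for name in top_names]
for message in messages:
    print(message)
```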
Summary – Key Takeaways
Iteration in Pandas should be used with care. It’s best to avoid explicit loops when possible by using vectorized operations or .apply(). When necessary, use .itertuples() for speed and .iterrows() for flexibility.
Key Takeaways:
- .iterrows() → flexible but slow
- .itertuples() → fast and memory-efficient
- .apply() → best for row-wise logic
- .items() → for column-wise iteration
- Avoid .iloc in loops: inefficient
Real-world relevance: Useful in custom reporting, row-level validations, exporting, and conditional modifications.
FAQs – Iterating in Pandas
Which is faster – iterrows() or itertuples()?
itertuples() is significantly faster and uses less memory.
Can I modify a DataFrame during iteration?
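Yes, but do not assign to the row object that iterrows() yields; it is typically a copy, so the change never reaches the DataFrame. Instead, write back with .at[] or .loc[], ideally on a copy of the DataFrame.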
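The difference is easy to see in a small experiment. The sketch below uses the guide's sample data; the 5-point bonus is a made-up rule for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [88, 92, 85]
})

# Assigning to the yielded row only changes a temporary Series.
for i, row in df.iterrows():
    row['Score'] = 0

scores_after_row_edit = df['Score'].tolist()  # original scores, unchanged

# Writing through .at[] on a copy updates the stored values.
df_copy = df.copy()
for i, row in df_copy.iterrows():
    df_copy.at[i, 'Score'] = row['Score'] + 5  # hypothetical bonus

scores_after_at = df_copy['Score'].tolist()

print(scores_after_row_edit, scores_after_at)
```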
Should I always avoid loops in Pandas?
Use vectorized operations where possible. But if logic is complex or not vectorizable, loops are fine.
How do I apply conditional logic to each row?
Use:
df.apply(lambda row: ..., axis=1)
Can I iterate over DataFrame columns only?
Yes:
for col in df.columns: ...
