4️⃣ 🧹 Pandas Data Cleaning & Preprocessing

🚫 Pandas Cleaning Empty Cells – Handle Missing Data for Clean Analysis


🧲 Introduction – Why Handle Empty Cells?

Empty cells (or missing values) are common in real-world datasets due to incomplete entries, manual data entry errors, or system failures. If left unhandled, they can cause errors in analysis, misleading results, or even break machine learning models. Pandas makes it simple to detect, remove, or fill these cells efficiently.

🎯 In this guide, you’ll learn:

  • How to detect missing/empty cells (NaN)
  • How to drop or fill empty values
  • How to customize fill strategies per column
  • How each technique behaves, through short examples

🔍 1. Detect Empty Cells

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', np.nan, 'David'],
    'Age': [25, np.nan, 22, 28],
    'Score': [85, 90, np.nan, 88]
})

print(df.isnull())

✔️ .isnull() returns a DataFrame of True/False values indicating where data is missing (NaN).


🧮 2. Count Empty Cells Per Column

print(df.isnull().sum())

✔️ Sums up the number of NaN values in each column. Helps identify which columns are affected most.
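📝 Raw counts are easier to judge next to the share of rows they represent. The sketch below is a minimal illustration on the same sample data; the variable names (missing_counts, missing_share, total_missing) are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', np.nan, 'David'],
    'Age': [25, np.nan, 22, 28],
    'Score': [85, 90, np.nan, 88],
})

# Number of missing cells in each column
missing_counts = df.isnull().sum()

# Percentage of rows missing in each column (mean of True/False flags * 100)
missing_share = df.isnull().mean() * 100

# Total number of missing cells in the whole DataFrame
total_missing = int(df.isnull().sum().sum())

print(missing_counts)
print(missing_share)
print(total_missing)
```

With this sample, each column has one gap out of four rows, so every column reports 25% missing.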


🧹 3. Drop Rows with Empty Cells

df_clean = df.dropna()

✔️ Removes any row that has at least one missing cell.

📝 If only specific columns matter:

df_clean = df.dropna(subset=['Name', 'Score'])

✔️ Removes a row if either 'Name' or 'Score' is missing. Missing values in other columns (such as 'Age') are ignored.
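📝 To see the difference between the two calls, it may help to compare which rows survive each one. This is a small sketch on the sample data; the reset_index step at the end is an optional extra, not something the steps above require.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', np.nan, 'David'],
    'Age': [25, np.nan, 22, 28],
    'Score': [85, 90, np.nan, 88],
})

# Drop every row that has a gap in any column
df_all = df.dropna()
print(df_all.index.tolist())      # [0, 3]

# Drop rows only when Name or Score is missing; Bob's missing Age is tolerated
df_subset = df.dropna(subset=['Name', 'Score'])
print(df_subset.index.tolist())   # [0, 1, 3]

# dropna keeps the original row labels; renumber them if a clean 0..n-1 index is wanted
df_renumbered = df_subset.reset_index(drop=True)
print(df_renumbered.index.tolist())   # [0, 1, 2]
```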


✂️ 4. Drop Columns with Empty Cells

df_drop_col = df.dropna(axis=1)

✔️ Removes columns that contain any missing value.
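📝 Dropping columns can be drastic: in the sample data every column has one gap, so all of them would be removed. The sketch below adds a hypothetical 'ID' column (not part of the original example) to show what survives, and uses the thresh parameter of dropna() as one possible softer rule.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],   # hypothetical complete column, added for illustration
    'Name': ['Alice', 'Bob', np.nan, 'David'],
    'Age': [25, np.nan, 22, 28],
    'Score': [85, 90, np.nan, 88],
})

# Any column with at least one NaN is removed; only the complete column survives
df_drop_col = df.dropna(axis=1)
print(list(df_drop_col.columns))   # ['ID']

# Without the ID column, every column has a gap, so only the index is left
df_no_id = df.drop(columns='ID').dropna(axis=1)
print(df_no_id.shape)              # (4, 0)

# Softer rule: keep columns that have at least 3 non-missing values
df_thresh = df.dropna(axis=1, thresh=3)
print(list(df_thresh.columns))     # ['ID', 'Name', 'Age', 'Score']
```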


🩹 5. Fill Empty Cells with Static Values

df_filled = df.fillna(0)

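✔️ Replaces every NaN in the DataFrame with 0. This also applies to text columns such as Name, so a per-column fill is often a better fit for mixed data.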


🔁 6. Fill Empty Cells with Forward Fill / Backward Fill

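df_ffill = df.ffill()

✔️ Forward fill – fills each empty cell with the last valid value above it in the same column. A missing value in the first row stays empty because there is nothing to copy from.

df_bfill = df.bfill()

✔️ Backward fill – fills each empty cell with the next valid value below it. A missing value in the last row stays empty.

📝 Older code often uses df.fillna(method='ffill') or df.fillna(method='bfill'). The method argument is deprecated in recent pandas releases (2.1 and later) in favor of .ffill() and .bfill().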
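📝 The sketch below shows what forward and backward filling produce on the sample data, using the dedicated DataFrame.ffill() and DataFrame.bfill() methods. The small edges Series is a hypothetical example, added only to show that gaps at the very start or end can remain.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', np.nan, 'David'],
    'Age': [25, np.nan, 22, 28],
    'Score': [85, 90, np.nan, 88],
})

# Forward fill: each gap takes the value from the row above
df_ffill = df.ffill()
print(df_ffill['Name'].tolist())   # ['Alice', 'Bob', 'Bob', 'David']
print(df_ffill['Age'].tolist())    # [25.0, 25.0, 22.0, 28.0]

# Backward fill: each gap takes the value from the row below
df_bfill = df.bfill()
print(df_bfill['Name'].tolist())   # ['Alice', 'Bob', 'David', 'David']
print(df_bfill['Age'].tolist())    # [25.0, 22.0, 22.0, 28.0]

# Hypothetical series with gaps at both ends
edges = pd.Series([np.nan, 1.0, np.nan])
print(edges.ffill().isnull().tolist())   # [True, False, False]: leading gap remains
print(edges.bfill().isnull().tolist())   # [False, False, True]: trailing gap remains
```

Forward and backward fills make the most sense when row order carries meaning, for example in time-ordered records.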


🧠 7. Fill Empty Cells Column-wise with Meaningful Defaults

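df['Age'] = df['Age'].fillna(df['Age'].mean())

✔️ Fills missing values in the Age column with the column’s mean. Assigning the result back is more reliable than calling fillna(..., inplace=True) on a selected column, which newer pandas versions warn about and which may not update the original DataFrame.

df['Name'] = df['Name'].fillna('Unknown')

✔️ Fills missing names with a default string like 'Unknown'.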


📌 Summary – Key Takeaways

  • Detect with .isnull(), count with .isnull().sum()
  • Remove empty rows/columns using .dropna()
  • Fill missing values using .fillna() with:
    • Static defaults
    • Mean/median/mode
    • Forward/backward fills
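
📝 As a rough sketch of the mean/median/mode options in one place: the example below fills a numeric column with its mean, another with its median, and a text column with its most frequent value. The 'City' column and its values are hypothetical, added because a mode fill needs repeated values to be meaningful.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'City': ['Paris', 'Paris', np.nan, 'Rome'],   # hypothetical column for the mode example
    'Age': [25, np.nan, 22, 28],
    'Score': [85, 90, np.nan, 88],
})

# mode() returns a Series (there can be ties), so take its first entry
most_common_city = df['City'].mode().iloc[0]

# One fill value per column, computed from the non-missing data
filled = df.fillna({
    'Age': df['Age'].mean(),         # (25 + 22 + 28) / 3 = 25.0
    'Score': df['Score'].median(),   # median of 85, 88, 90 = 88.0
    'City': most_common_city,        # 'Paris'
})

print(filled)
```

The median is usually less sensitive to outliers than the mean, which is one reason to prefer it for skewed numeric columns.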

⚙️ Real-world relevance: Cleaning empty cells is essential for data reliability, model input sanity, and accurate visualization.


❓ FAQs – Cleaning Empty Cells in Pandas

❓ What’s the difference between NaN, None, and empty cells?
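✅ Pandas treats both NaN (Not a Number) and None as missing, so .isnull() flags them alike, and blank cells in a CSV file are read in as NaN. An empty string ('') or a string of spaces is an ordinary value, not a missing one; convert it to NaN first if it should count as empty.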
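
📝 A quick way to check how each kind of "empty" value is treated is to run .isnull() on a small series. The sketch below uses a made-up series and one common idiom (a regular-expression replace) for turning blank strings into NaN; other approaches are possible.

```python
import numpy as np
import pandas as pd

# Made-up values: a name, None, NaN, an empty string, and a whitespace-only string
s = pd.Series(['Alice', None, np.nan, '', '  '])

# None and NaN count as missing; '' and '  ' do not
flags = s.isnull().tolist()
print(flags)           # [False, True, True, False, False]

# Convert empty or whitespace-only strings to NaN so they are detected too
cleaned = s.replace(r'^\s*$', np.nan, regex=True)
cleaned_flags = cleaned.isnull().tolist()
print(cleaned_flags)   # [False, True, True, True, True]

# In a numeric series, None is stored as NaN and the column becomes float
nums = pd.Series([1, None, 3])
print(nums.dtype)      # float64
```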


❓ Can I drop rows only if all columns are empty?

df.dropna(how='all')

✔️ Removes a row only if every value in it is missing.


❓ How do I fill different columns with different strategies?

df.fillna({
  'Age': df['Age'].median(),
  'Name': 'Unknown',
  'Score': 0
}, inplace=True)

✔️ Fills each column with a custom value or strategy.


❓ Does fillna() modify the original DataFrame?
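❌ No. By default fillna() returns a new DataFrame and leaves the original untouched, so either assign the result back (df = df.fillna(0)) or pass inplace=True.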
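
📝 This behavior is easy to verify. The sketch below uses a one-column frame built for illustration and counts the gaps before and after the call.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [25, np.nan, 22, 28]})

# fillna() returns a new DataFrame; df itself is not changed
filled = df.fillna(0)

original_missing = int(df['Age'].isnull().sum())
filled_missing = int(filled['Age'].isnull().sum())
print(original_missing)   # 1: the original still has its gap
print(filled_missing)     # 0: only the returned copy is filled

# To keep the result under the same name, assign it back
df = df.fillna(0)
after_reassign = int(df['Age'].isnull().sum())
print(after_reassign)     # 0
```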

