🔃 Pandas Sorting and Reindexing – Organize Your Data Efficiently
🧲 Introduction – Why Sort and Reindex?
Sorting and reindexing are key steps in data cleaning, presentation, and efficient lookup. Pandas provides powerful tools to sort DataFrames by index or values, and to reindex rows or columns, helping you manage missing values and ensure consistent structure.
🎯 In this guide, you’ll learn:
- How to sort by index and by column values
- How to reindex with new labels or reshuffle the structure
- How to handle missing labels during reindexing
- How to reset or set a custom index
📥 1. Create a Sample DataFrame
import pandas as pd
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])
print(df)
👉 Output:
Name Score
d Alice 88
b Bob 95
a Charlie 70
c David 82
🔢 2. Sort by Index
df_sorted_index = df.sort_index()
✔️ Sorts the DataFrame by its row index in ascending order by default, which is alphabetical for string labels.
👉 Output:
Name Score
a Charlie 70
b Bob 95
c David 82
d Alice 88
Descending Order
df.sort_index(ascending=False)
✔️ Sorts the index in descending order.
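To see the descending sort on the sample data, here is a small sketch that rebuilds the DataFrame from step 1 and compares the index order before and after. The variable name `df_desc` is only an illustration.

```python
import pandas as pd

# Same sample DataFrame as in step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])

# Descending sort returns a new DataFrame
df_desc = df.sort_index(ascending=False)

print(df_desc.index.tolist())  # ['d', 'c', 'b', 'a']
print(df.index.tolist())       # ['d', 'b', 'a', 'c'] (original order is kept)
```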
🔠 3. Sort by Column Values
df_sorted_score = df.sort_values(by='Score')
✔️ Sorts the DataFrame by the Score column in ascending order.
👉 Output:
Name Score
a Charlie 70
c David 82
d Alice 88
b Bob 95
Multiple Columns Sorting
df.sort_values(by=['Score', 'Name'], ascending=[True, False])
✔️ Sorts by Score (ascending), then by Name (descending) for tied scores.
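The sample DataFrame has no tied scores, so the second sort key never comes into play there. The sketch below uses a hypothetical DataFrame, `tied`, with duplicate scores to show the tie-breaking behavior.

```python
import pandas as pd

# Hypothetical data with tied scores (not the sample DataFrame from step 1)
tied = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 70, 70, 88]
})

# Score ascending; within equal scores, Name descending
result = tied.sort_values(by=['Score', 'Name'], ascending=[True, False])

print(result['Name'].tolist())   # ['Charlie', 'Bob', 'David', 'Alice']
print(result['Score'].tolist())  # [70, 70, 88, 88]
```

The `ascending` list must have one entry per column named in `by`.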
🔁 4. Reindex Rows with a New Order
df_reindexed = df.reindex(['a', 'b', 'c', 'd'])
✔️ Reorders rows by the specified index list.
👉 Output:
Name Score
a Charlie 70
b Bob 95
c David 82
d Alice 88
🧱 5. Reindex with Missing Labels
df.reindex(['a', 'b', 'x'])
✔️ Label 'x' does not exist in the original index, so its row is filled with NaN. The Score column becomes float because NaN is a floating-point value.
👉 Output:
Name Score
a Charlie 70.0
b Bob 95.0
x NaN NaN
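One way to find out which requested labels were missing is to compare the new index with the original one. This is a minimal sketch; the names `partial` and `missing` are illustrative.

```python
import pandas as pd

# Same sample DataFrame as in step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])

partial = df.reindex(['a', 'b', 'x'])

# Labels that were requested but are absent from the original index
missing = partial.index.difference(df.index)
print(list(missing))  # ['x']

# Rows in which every column is NaN
print(partial.isna().all(axis=1))

# The integer column was upcast so it can hold NaN
print(partial['Score'].dtype)  # float64
```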
🧹 6. Fill Missing Data When Reindexing
df.reindex(['a', 'b', 'x'], fill_value='Unknown')
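✔️ Fills the rows of missing labels with a default value instead of NaN. The same value is used for every column, so in row 'x' both Name and Score become 'Unknown'.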
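Because `fill_value` is a single value applied to all columns, a text default also lands in the numeric Score column. If you want a different default per column, one possible approach is to reindex first and then call `fillna()` with a dictionary. The defaults chosen here ('Unknown' and 0) are only assumptions for illustration.

```python
import pandas as pd

# Same sample DataFrame as in step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])

# One default for every column: Score now mixes numbers and text
filled = df.reindex(['a', 'b', 'x'], fill_value='Unknown')
print(filled)

# Per-column defaults: reindex first, then fill each column separately
per_column = df.reindex(['a', 'b', 'x']).fillna({'Name': 'Unknown', 'Score': 0})
print(per_column)
```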
🔧 7. Reset and Set Index
Reset Index
df_reset = df.reset_index()
✔️ Moves current index to a column and resets to default integer index.
Set a Column as Index
df_set = df.reset_index().set_index('Name')
✔️ Uses the 'Name' column as the new index.
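Note that chaining `reset_index()` before `set_index('Name')` keeps the old labels as a regular column named `index`. The sketch below compares that with calling `set_index()` directly and with `reset_index(drop=True)`, which discards the old labels. The variable names are illustrative.

```python
import pandas as pd

# Same sample DataFrame as in step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])

# Old labels survive as a column called 'index'
kept = df.reset_index().set_index('Name')
print(kept.columns.tolist())     # ['index', 'Score']

# Old labels are replaced by Name and not kept
direct = df.set_index('Name')
print(direct.columns.tolist())   # ['Score']

# Old labels are thrown away and replaced by 0, 1, 2, 3
dropped = df.reset_index(drop=True)
print(dropped.index.tolist())    # [0, 1, 2, 3]
```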
🧮 8. Reindex Columns
df.reindex(columns=['Score', 'Name'])
✔️ Changes the column order of the DataFrame.
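Reindexing columns follows the same rules as reindexing rows: a column label that does not exist yet is created and filled with NaN. In this sketch, 'Grade' is a hypothetical column name that is not part of the sample data.

```python
import pandas as pd

# Same sample DataFrame as in step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])

# Reorder existing columns
reordered = df.reindex(columns=['Score', 'Name'])
print(reordered.columns.tolist())  # ['Score', 'Name']

# Request a column that does not exist ('Grade' is hypothetical)
with_extra = df.reindex(columns=['Name', 'Score', 'Grade'])
print(with_extra)
```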
📌 Summary – Key Takeaways
Sorting and reindexing help structure your DataFrame for clean reports, comparisons, and computation. Pandas offers .sort_index(), .sort_values(), and .reindex() with great flexibility for reshaping and managing tabular data.
🔍 Key Takeaways:
- Sort by index with .sort_index(), or by column with .sort_values()
- Reindex with .reindex() to reorder or align datasets
- Handle missing index/column labels gracefully
- Reset or set index with .reset_index() and .set_index()
⚙️ Real-world relevance: Used in data transformation, report formatting, merging datasets, and time series alignment.
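As an illustration of time series alignment, the sketch below reindexes a Series onto a complete daily date range. The readings and dates are made up for the example. The `method='ffill'` argument of `reindex()` carries the last known value forward and requires a sorted index.

```python
import pandas as pd

# Hypothetical readings with one day (2024-01-03) missing
readings = pd.Series(
    [10.0, 12.0, 15.0],
    index=pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04'])
)

# Complete daily range covering the same period
full_range = pd.date_range('2024-01-01', '2024-01-04', freq='D')

# The missing day appears as NaN
aligned = readings.reindex(full_range)
print(aligned)

# The missing day takes the previous day's value
forward_filled = readings.reindex(full_range, method='ffill')
print(forward_filled)
```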
❓ FAQs – Sorting and Reindexing in Pandas
❓ What’s the difference between sorting and reindexing?
- Sorting changes the order based on existing data.
- Reindexing aligns or reshapes using explicit label lists.
❓ What happens if I reindex with labels that don’t exist?
Pandas fills those rows or columns with NaN unless fill_value is specified.
❓ How do I sort in descending order?
Use:
df.sort_values(by='Score', ascending=False)
❓ Can I reindex both rows and columns at once?
Yes:
df.reindex(index=['a', 'b'], columns=['Score', 'Name'])
❓ Does sorting or reindexing modify the DataFrame?
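❌ No. By default these methods return a new DataFrame and leave the original unchanged. .sort_index(), .sort_values(), .reset_index(), and .set_index() accept inplace=True, but .reindex() has no inplace option, so assign its result back to a variable.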
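A quick way to confirm this is to compare the index order before and after a call. The sketch below also shows the usual pattern of assigning the result back to the same name; the variable names are illustrative.

```python
import pandas as pd

# Same sample DataFrame as in step 1
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Score': [88, 95, 70, 82]
}, index=['d', 'b', 'a', 'c'])

# Sorting returns a new object; df keeps its original order
sorted_copy = df.sort_values(by='Score')
order_after_sort = df.index.tolist()
print(order_after_sort)            # ['d', 'b', 'a', 'c']
print(sorted_copy.index.tolist())  # ['a', 'c', 'd', 'b']

# To keep a result under the same name, assign it back
df = df.reindex(['a', 'b', 'c', 'd'])
print(df.index.tolist())           # ['a', 'b', 'c', 'd']
```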