
🧾 Pandas DataFrames – The Foundation of Tabular Data in Python

🧲 Introduction – What Is a Pandas DataFrame?

A Pandas DataFrame is a two-dimensional, tabular data structure with labeled rows and columns, much like a spreadsheet or SQL table. It is one of the most widely used data structures in the Python ecosystem for data analysis, manipulation, and visualization.

🎯 In this guide, you’ll learn:

How to create DataFrames from different data sources
How to access and modify data inside DataFrames
How to apply key operations like filtering, slicing, and aggregating
Real-world examples of DataFrame usage

🛠️ 1. Create a DataFrame from a Dictionary

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9]
}

df = pd.DataFrame(data)
print(df)

👉 Output:

      Name  Age  Score
0    Alice   25   85.5
1      Bob   30   90.3
2  Charlie   35   78.9

✅ Each dictionary key becomes a column label, and each list supplies that column's values, one per row. All lists must have the same length.

📂 2. Create a DataFrame from a List of Lists

data = [['Alice', 25], ['Bob', 30], ['Charlie', 35]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)

✅ Use this when your source data is already organized row by row. Each inner list becomes one row, and columns= supplies the column labels.

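📥 3. Create a DataFrame from CSV/Excel

df = pd.read_csv('data.csv')      # From a CSV file
df = pd.read_excel('data.xlsx')   # From an Excel file (needs an Excel engine such as openpyxl installed)

✅ Pandas also reads JSON, SQL query results, the clipboard, and HTML tables through pd.read_json(), pd.read_sql(), pd.read_clipboard(), and pd.read_html().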
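To try read_csv() without a file on disk, the sketch below feeds it an in-memory text buffer. The CSV content is made up for illustration and stands in for 'data.csv'.

```python
import io
import pandas as pd

# Hypothetical CSV content, standing in for the file 'data.csv'
csv_text = "Name,Age,Score\nAlice,25,85.5\nBob,30,90.3\nCharlie,35,78.9\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (3, 3)
print(df.dtypes)  # Age is inferred as an integer column, Score as a float column
```

The first line of the CSV is used as the header row by default, and column types are inferred from the values.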

🔍 4. Inspecting DataFrames

Method / Attribute	Description
`df.head()`	First 5 rows (pass a number for more or fewer)
`df.tail()`	Last 5 rows
`df.shape`	Tuple of (rows, columns)
`df.info()`	Column names, non-null counts, dtypes, and memory usage
`df.describe()`	Statistical summary of numeric columns
`df.columns`	Column labels
`df.index`	Row labels (the index)
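A quick sketch of these inspection tools on the three-row example from section 1. The variable name summary is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

print(df.shape)          # (3, 3)
print(list(df.columns))  # ['Name', 'Age', 'Score']
print(df.head(2))        # first two rows only

# describe() summarizes the numeric columns (Age and Score here)
summary = df.describe()
print(summary.loc['mean'])
```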

🎯 5. Accessing Data

Access Column(s)

print(df['Name'])           # Single column
print(df[['Name', 'Score']]) # Multiple columns

Access Row(s)

print(df.loc[0])   # By label
print(df.iloc[1])  # By position

✅ Use .loc[] for label-based and .iloc[] for position-based access.
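With the default 0, 1, 2 index, labels and positions coincide, so .loc[0] and .iloc[0] return the same row. The difference shows up once the index is something else. The sketch below uses a made-up index of 10, 20, 30 to illustrate.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=[10, 20, 30],  # custom row labels, chosen for illustration
)

by_label = df.loc[10]     # the row whose index label is 10
by_position = df.iloc[0]  # the first row, whatever its label

print(by_label['Name'])     # Alice
print(by_position['Name'])  # Alice

# There is no row labeled 0, so label-based lookup fails
try:
    df.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False
print(label_zero_found)     # False
```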

🔄 6. Modifying Data

df['Passed'] = df['Score'] > 80    # Add new column
df.at[1, 'Age'] = 32               # Modify specific cell

✅ DataFrames are mutable: you can add columns and overwrite individual cells in place.
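Beyond single cells, a common pattern is to update every row that matches a condition by combining .loc[] with a boolean mask. The Grade column and the threshold of 90 below are assumptions made up for this sketch.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

df['Passed'] = df['Score'] > 80   # new boolean column
df.at[1, 'Age'] = 32              # overwrite one cell (row label 1, column 'Age')

# Hypothetical grading rule: start everyone at 'B', then promote high scores
df['Grade'] = 'B'
df.loc[df['Score'] >= 90, 'Grade'] = 'A'

print(df)
```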

✂️ 7. Filtering and Slicing

print(df[df['Age'] > 28])         # Filter rows with a boolean condition
print(df.iloc[0:2])               # Slice by position (end is exclusive)
print(df.loc[0:1, ['Name']])      # Slice by label (end is inclusive) and select a column
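Two details are easy to miss here: conditions are combined with & or | (each wrapped in parentheses), and .iloc[] slices exclude the end position while .loc[] slices include the end label. A small sketch on the section 1 data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

# Combine conditions with & (and) or | (or); parentheses are required
older_and_high = df[(df['Age'] > 28) & (df['Score'] > 80)]
print(list(older_and_high['Name']))  # ['Bob']

# Same slice bounds, different row counts
print(len(df.iloc[0:1]))  # 1  (position 1 is excluded)
print(len(df.loc[0:1]))   # 2  (label 1 is included)
```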

🔁 8. Aggregation and Summary Stats

print(df['Age'].mean())           # Average age
print(df['Score'].sum())          # Total score
print(df[['Age', 'Score']].max()) # Max values

✅ Both Series and DataFrames support built-in aggregations such as mean(), sum(), max(), min(), and count().
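When several statistics are needed at once, agg() accepts a list of aggregation names and returns them in one table. A minimal sketch on the section 1 data, with stats as an illustrative variable name:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

# One row per statistic, one column per numeric field
stats = df[['Age', 'Score']].agg(['mean', 'min', 'max'])
print(stats)

print(stats.loc['mean', 'Age'])   # 30.0
print(stats.loc['max', 'Score'])  # 90.3
```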

📋 9. Common DataFrame Operations

Operation	Syntax Example
Rename columns	`df.rename(columns={'Age':'Years'})`
Drop column	`df.drop('Score', axis=1)`
Drop row	`df.drop(0, axis=0)`
Sort by column	`df.sort_values(by='Score')`
Reset index	`df.reset_index(drop=True)`
Set new index	`df.set_index('Name')`
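By default, each of the operations in this table returns a new DataFrame and leaves the original untouched, so the result has to be assigned to a name to be kept. The sketch below shows this, and chains two of the operations. The names renamed and ranked are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

renamed = df.rename(columns={'Age': 'Years'})
print('Age' in df.columns)         # True  (original is unchanged)
print('Years' in renamed.columns)  # True

# Sort from highest to lowest score, then renumber the rows 0, 1, 2
ranked = df.sort_values(by='Score', ascending=False).reset_index(drop=True)
print(list(ranked['Name']))        # ['Bob', 'Alice', 'Charlie']
```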

📌 Summary – Recap & Next Steps

Pandas DataFrames offer a rich, intuitive interface for 2D structured data, letting you manipulate, filter, and analyze information efficiently and expressively.

🔍 Key Takeaways:

DataFrames are 2D tables with labeled rows and columns
Easily created from dictionaries, lists, or files
Use .loc[] and .iloc[] for flexible data access
Perform filtering, aggregation, and transformation in a few lines

⚙️ Real-world relevance: DataFrames are used in everything from business analytics and machine learning to ETL pipelines and reporting dashboards.

❓ FAQs – Pandas DataFrames

❓ What is the difference between Series and DataFrame?
✅ A Series is a 1D labeled array; a DataFrame is a 2D table in which each column is a Series sharing the same row index.
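The distinction shows up in how a column is selected: single brackets give a Series, double brackets give a one-column DataFrame. A short sketch:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

col = df['Name']    # single brackets: a Series
sub = df[['Name']]  # double brackets: a DataFrame with one column

print(type(col).__name__, col.ndim)  # Series 1
print(type(sub).__name__, sub.ndim)  # DataFrame 2
print(sub.shape)                     # (3, 1)
```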

❓ Can a DataFrame contain different data types?
✅ Yes. Each column has its own data type (dtype), so one DataFrame can hold integers, floats, strings, and booleans side by side.

❓ How do I change the column order in a DataFrame?
✅ Select the columns in the order you want:

df = df[['Score', 'Name', 'Age']]

❓ How do I export a DataFrame to CSV?
✅ Use to_csv(); pass index=False to leave the row index out of the file:

df.to_csv('output.csv', index=False)
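When to_csv() is called without a path, it returns the CSV text as a string instead of writing a file, which is a convenient way to preview the export. A minimal sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# No path given, so the CSV content comes back as a string
csv_text = df.to_csv(index=False)

for line in csv_text.splitlines():
    print(line)
# Name,Age
# Alice,25
# Bob,30
```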

❓ Can I merge or join DataFrames?
✅ Yes. Use pd.merge() to join on shared columns, df.join() to join on the index, and pd.concat() to stack DataFrames along rows or columns.
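A minimal sketch of two of these, using small made-up tables: pd.merge() matches rows on a shared key column, while pd.concat() appends rows from one DataFrame to another.

```python
import pandas as pd

# Hypothetical tables sharing a 'Name' column
people = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
scores = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Score': [85.5, 90.3]})

# Join side by side on the shared key (inner join by default)
merged = pd.merge(people, scores, on='Name')
print(list(merged.columns))  # ['Name', 'Age', 'Score']

# Stack rows; ignore_index=True renumbers the result 0, 1, 2
more_people = pd.DataFrame({'Name': ['Charlie'], 'Age': [35]})
stacked = pd.concat([people, more_people], ignore_index=True)
print(len(stacked))          # 3
```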
