🧾 Pandas DataFrames – The Foundation of Tabular Data in Python
🧲 Introduction – What Is a Pandas DataFrame?
A Pandas DataFrame is a two-dimensional, tabular data structure that resembles a spreadsheet or SQL table. It’s the core data structure of the pandas library and is widely used in Python for data analysis, manipulation, and visualization.
🎯 In this guide, you’ll learn:
- How to create DataFrames from different data sources
- How to access and modify data inside DataFrames
- Key operations like filtering, slicing, and aggregating
- Real-world examples of DataFrame usage
🛠️ 1. Create a DataFrame from a Dictionary
import pandas as pd
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9]
}
df = pd.DataFrame(data)
print(df)
👉 Output:
      Name  Age  Score
0    Alice   25   85.5
1      Bob   30   90.3
2  Charlie   35   78.9
✅ Each key becomes a column label, and each list supplies that column's values, one per row.
📂 2. Create a DataFrame from a List of Lists
data = [['Alice', 25], ['Bob', 30], ['Charlie', 35]]
df = pd.DataFrame(data, columns=['Name', 'Age'])
print(df)
✅ Use this when your source data is structured like rows in a table.
📥 3. Create DataFrame from CSV/Excel
df = pd.read_csv('data.csv')      # From CSV
df = pd.read_excel('data.xlsx')   # From Excel (.xlsx needs the openpyxl package installed)
✅ Pandas supports many formats including JSON, SQL, clipboard, and HTML.
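Because `data.csv` above is only a placeholder file name, a self-contained sketch can parse CSV text from an in-memory buffer instead. The sample rows and the `csv_text` name are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk
csv_text = "Name,Age,Score\nAlice,25,85.5\nBob,30,90.3\n"

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))
print(df)
print(df.dtypes)  # column types are inferred while parsing
```

The same call works with a real path such as `pd.read_csv('data.csv')` once the file exists.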
🔍 4. Inspecting DataFrames
| Method / Attribute | Description |
|---|---|
| df.head() | First 5 rows (default) |
| df.tail() | Last 5 rows (default) |
| df.shape | Tuple of (rows, columns) | 
| df.info() | Summary of structure | 
| df.describe() | Statistical summary of numeric columns | 
| df.columns | Column labels | 
| df.index | Row index info | 
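As a quick illustration of the inspection tools above, here is a minimal sketch that reuses the three-row sample data from section 1. The variable names `first_two` and `summary` are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

print(df.shape)          # (rows, columns) as a tuple
print(list(df.columns))  # column labels as a plain list

first_two = df.head(2)   # head() and tail() accept a row count
print(first_two)

summary = df.describe()  # covers the numeric columns only by default
print(summary)
```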
🎯 5. Accessing Data
Access Column(s)
print(df['Name'])           # Single column
print(df[['Name', 'Score']]) # Multiple columns
Access Row(s)
print(df.loc[0])   # By label
print(df.iloc[1])  # By position
✅ Use .loc[] for label-based and .iloc[] for position-based access.
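With the default 0, 1, 2 index, .loc[] and .iloc[] often look interchangeable. The difference shows up once the index holds other labels. A small sketch, assuming hypothetical string labels for the rows:

```python
import pandas as pd

df = pd.DataFrame(
    {'Age': [25, 30, 35]},
    index=['alice', 'bob', 'charlie'],  # hypothetical string labels
)

by_label = df.loc['bob', 'Age']  # look up by index label and column label
by_position = df.iloc[1, 0]      # second row, first column, by position
print(by_label, by_position)     # both refer to the same cell
```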
🔄 6. Modifying Data
df['Passed'] = df['Score'] > 80    # Add new column
df.at[1, 'Age'] = 32               # Modify specific cell
✅ DataFrames are mutable: you can change both structure and content in place.
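Beyond single cells, .loc[] can also update every row that matches a condition. The sketch below assumes the sample data from section 1 and an arbitrary replacement score of 82.0.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85.5, 90.3, 78.9],
})

# Boolean column derived from Score at this moment
df['Passed'] = df['Score'] > 80

# Update the Score cell of every row where Name is 'Charlie'
df.loc[df['Name'] == 'Charlie', 'Score'] = 82.0
print(df)
```

Note that the derived column is a snapshot: 'Passed' is not recalculated when 'Score' changes later, so it would need to be assigned again.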
✂️ 7. Filtering and Slicing
print(df[df['Age'] > 28])         # Filter
print(df.iloc[0:2])               # Slice by position
print(df.loc[0:1, ['Name']])      # Slice by label and column (end label is included)
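Two details of filtering and slicing are easy to miss: combined conditions need parentheses, and the two slicing styles treat the end point differently. A short sketch using the sample data from section 1:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

# Combine conditions with & (and) or | (or); wrap each one in parentheses
older_high = df[(df['Age'] > 28) & (df['Score'] > 80)]
print(older_high)

# .iloc slices exclude the end position; .loc slices include the end label
by_position = df.iloc[0:1]
by_label = df.loc[0:1]
print(len(by_position), len(by_label))
```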
🔁 8. Aggregation and Summary Stats
print(df['Age'].mean())           # Average age
print(df['Score'].sum())          # Total score
print(df[['Age', 'Score']].max()) # Max values
✅ Built-in aggregations include mean(), sum(), max(), min(), and count().
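When several statistics are needed at once, agg() can compute them in a single call. This is one possible approach, sketched on the numeric columns of the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

# agg() applies each named aggregation to every column;
# the result has one row per function name
summary = df.agg(['mean', 'min', 'max'])
print(summary)
```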
📋 9. Common DataFrame Operations
| Operation | Syntax Example | 
|---|---|
| Rename columns | df.rename(columns={'Age':'Years'}) | 
| Drop column | df.drop('Score', axis=1) | 
| Drop row | df.drop(0, axis=0) | 
| Sort by column | df.sort_values(by='Score') | 
| Reset index | df.reset_index(drop=True) | 
| Set new index | df.set_index('Name') | 
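By default, the operations in this table return a new DataFrame and leave the original unchanged, so the result has to be assigned to a variable. A minimal sketch that chains three of them on the sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85.5, 90.3, 78.9],
})

# Each call returns a new DataFrame, so the calls can be chained
result = (
    df.rename(columns={'Age': 'Years'})
      .sort_values(by='Score', ascending=False)
      .reset_index(drop=True)
)
print(result)
print(list(df.columns))  # the original still has the 'Age' column
```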
📌 Summary – Recap & Next Steps
Pandas DataFrames offer a rich, intuitive interface for 2D structured data, letting you manipulate, filter, and analyze information efficiently and expressively.
🔍 Key Takeaways:
- DataFrames are 2D tables with labeled rows and columns
- Easily created from dictionaries, lists, or files
- Use .loc[] and .iloc[] for flexible data access
- Perform filtering, aggregation, and transformation in a few lines
⚙️ Real-world relevance: DataFrames are used in everything from business analytics and machine learning to ETL pipelines and reporting dashboards.
❓ FAQs – Pandas DataFrames
❓ What is the difference between Series and DataFrame?
✅ A Series is a 1D labeled array, like a single column; a DataFrame is a 2D table whose columns are each a Series.
❓ Can a DataFrame contain different data types?
✅ Yes. Each column can have a different data type.
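The dtypes attribute lists the type of each column. A quick check on a two-row sample:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Score': [85.5, 90.3],
})

print(df.dtypes)  # one dtype per column: text, integer, and float here
```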
❓ How do I change the column order in a DataFrame?
Use:
df = df[['Score', 'Name', 'Age']]
❓ How do I export a DataFrame to CSV?
Use:
df.to_csv('output.csv', index=False)
❓ Can I merge or join DataFrames?
✅ Yes. Use pd.merge(), df.join(), or pd.concat() for combining.
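As a rough sketch of two of these functions, the example below joins two small tables on a shared column and then stacks extra rows. The `scores` and `more_people` tables are hypothetical and exist only for this illustration.

```python
import pandas as pd

people = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
scores = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Score': [85.5, 90.3]})

# merge() matches rows on a shared key column, similar to a SQL join
merged = pd.merge(people, scores, on='Name')
print(merged)

# concat() stacks frames vertically; ignore_index=True renumbers the rows
more_people = pd.DataFrame({'Name': ['Charlie'], 'Age': [35]})
stacked = pd.concat([people, more_people], ignore_index=True)
print(stacked)
```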
