🧾 R Data Frames – Tabular Data Handling in R Programming
🧲 Introduction – What Are Data Frames in R?
A data frame in R is a two-dimensional, tabular data structure, similar to a spreadsheet or SQL table. Each column can have a different data type (numeric, character, logical, etc.), but all columns must have the same length.
Data frames are central to data analysis in R. Most real-world data (from CSVs, databases, APIs) is loaded into R as a data frame.
🎯 In this guide, you’ll learn:
- How to create, access, and modify data frames
- How to filter columns, subset rows, and transform data
- Which built-in functions summarize and inspect a data frame
- How to handle missing values and combine multiple data frames
🛠️ Creating a Data Frame
students <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(22, 25, 23),
  Passed = c(TRUE, TRUE, FALSE)
)
✅ Data types:
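- Name: character (in R 4.0.0 and later; earlier versions convert strings to factors unless stringsAsFactors = FALSE is set)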
- Age: numeric
- Passed: logical
Use str(students) to inspect structure.
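As a minimal sketch of inspecting structure programmatically, the snippet below rebuilds the example data frame and checks each column's class with sapply(). Setting stringsAsFactors = FALSE explicitly is an addition here, made so that the result is the same on R versions before 4.0.0.

```r
# Rebuild the example data frame so this snippet runs on its own.
# stringsAsFactors = FALSE keeps Name as character on older R versions too.
students <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(22, 25, 23),
  Passed = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Compact overview: one line per column with its type and first values.
str(students)

# Class of each column as a named character vector.
col_classes <- sapply(students, class)
print(col_classes)

# Dimensions: number of rows first, then number of columns.
print(dim(students))
```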
🔍 Accessing Data Frame Elements
📌 By Column Name
students$Name         # Returns column as vector
students[["Age"]]     # Also returns column
🔢 By Index
students[1, ]         # First row
students[, 2]         # Second column
students[2, 1]        # Row 2, Col 1
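One detail worth illustrating is what type each indexing form returns. The sketch below, which rebuilds the students example, shows that selecting a single column with [ , j] drops to a plain vector by default, while drop = FALSE or single-bracket name indexing keeps a data frame.

```r
students <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(22, 25, 23),
  Passed = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# A single column selected with [ , j] is simplified to a vector.
age_vector <- students[, 2]

# drop = FALSE keeps the result as a one-column data frame.
age_frame <- students[, 2, drop = FALSE]

# Single-bracket indexing by name also returns a data frame.
age_frame_by_name <- students["Age"]

# Selecting several columns always returns a data frame.
two_cols <- students[, c("Name", "Passed")]

# A single row is still a data frame, because its columns differ in type.
first_row <- students[1, ]
```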
🔎 Subsetting with Conditions
students[students$Passed, ]   # Only those who passed (Passed is already logical)
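Conditions can also be combined and paired with a column selection. The following is a small sketch on the same students data; the age threshold of 21 is an arbitrary value chosen for illustration.

```r
students <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(22, 25, 23),
  Passed = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Combine conditions with & (and) or | (or); both sides are evaluated per row.
passed_over_21 <- students[students$Passed & students$Age > 21, ]

# Filter rows and pick one column in the same call; the result is a vector.
names_passed <- students[students$Passed, "Name"]

# subset() expresses the same filter without repeating `students$`.
same_with_subset <- subset(students, Passed & Age > 21, select = c(Name, Age))
```

subset() is convenient for interactive work; bracket indexing is more explicit and is often preferred inside functions.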
✏️ Modifying Data Frames
🔄 Add Column
students$Score <- c(90, 85, 70)
🧹 Remove Column
students$Score <- NULL
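Beyond adding and removing whole columns, a common transformation is deriving a column from an existing one or updating individual cells. The sketch below assumes the Score column from the example above; the Grade column and its cutoff of 80 are hypothetical and used only for illustration.

```r
students <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(22, 25, 23),
  Passed = c(TRUE, TRUE, FALSE),
  Score = c(90, 85, 70),
  stringsAsFactors = FALSE
)

# Derived column: ifelse() is vectorised, so it evaluates every row at once.
# The "A"/"B" labels and the cutoff of 80 are made up for this example.
students$Grade <- ifelse(students$Score >= 80, "A", "B")
grades <- students$Grade

# Update a single cell selected by a condition on another column.
students$Age[students$Name == "Bob"] <- 26

# Remove several columns at once by keeping everything else.
students <- students[, !(names(students) %in% c("Score", "Grade"))]
```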
🔁 Add Row (with rbind())
new_row <- data.frame(Name = "David", Age = 24, Passed = TRUE)
students <- rbind(students, new_row)
🔁 Combining Data Frames
📍 Row Bind
df1 <- data.frame(A = 1:2, B = c("x", "y"))
df2 <- data.frame(A = 3:4, B = c("z", "w"))
rbind(df1, df2)
📍 Column Bind
df3 <- data.frame(C = c(TRUE, FALSE))
cbind(df1, df3)
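Two behaviours are worth keeping in mind when binding. For data frames, rbind() matches columns by name rather than by position, and it fails when the names differ. The sketch below demonstrates both, using a hypothetical data frame `bad` with a mismatched column name.

```r
df1 <- data.frame(A = 1:2, B = c("x", "y"), stringsAsFactors = FALSE)

# Same columns as df1 but in a different order: rbind() aligns them by name.
df2_reordered <- data.frame(B = c("z", "w"), A = 3:4, stringsAsFactors = FALSE)
stacked <- rbind(df1, df2_reordered)
print(stacked)

# cbind() places columns side by side; here both inputs have two rows.
df3 <- data.frame(C = c(TRUE, FALSE))
wide <- cbind(df1, df3)
print(wide)

# A data frame whose column names differ cannot be row-bound.
# tryCatch() turns the error into a short message instead of stopping.
bad <- data.frame(A = 5L, Z = "q", stringsAsFactors = FALSE)
rbind_result <- tryCatch(
  rbind(df1, bad),
  error = function(e) "names do not match"
)
print(rbind_result)
```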
🧠 Useful Data Frame Functions
| Function | Description | 
|---|---|
| str() | Structure of data frame | 
| summary() | Summary statistics | 
| nrow()/ncol() | Number of rows / columns | 
| names() | Column names | 
| rownames() | Row labels | 
| head()/tail() | Preview top or bottom rows | 
| subset() | Subset rows using condition | 
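The functions in the table can be tried on the students example. This is a short sketch that rebuilds the data frame and stores a few results so they can be checked.

```r
students <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(22, 25, 23),
  Passed = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

n_rows <- nrow(students)        # number of rows
n_cols <- ncol(students)        # number of columns
col_names <- names(students)    # column names
row_labels <- rownames(students) # default row labels are "1", "2", "3"

top_two <- head(students, 2)    # first two rows
last_one <- tail(students, 1)   # last row

# summary() on a numeric column gives min, quartiles, median, mean, and max.
age_summary <- summary(students$Age)
print(age_summary)

# summary() on the whole data frame summarises every column by type.
print(summary(students))
```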
⚠️ Handling Missing Values (NA)
df <- data.frame(x = c(1, NA, 3))
is.na(df)            # Identify missing
na.omit(df)          # Remove rows with NA
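A few more base R tools are often used alongside is.na() and na.omit(). The sketch below adds a second column, y, to the example (an assumption made so that the missing values fall in different rows), then counts, filters, and fills them. Filling with 0 is only one possible choice and depends on the analysis.

```r
df <- data.frame(
  x = c(1, NA, 3),
  y = c("a", "b", NA),
  stringsAsFactors = FALSE
)

# Count missing values per column.
na_per_column <- colSums(is.na(df))

# Keep only rows with no missing value in any column.
complete_rows <- df[complete.cases(df), ]

# na.omit() gives the same rows and records which ones were dropped.
omitted <- na.omit(df)

# Many summary functions return NA unless told to skip missing values.
mean_x <- mean(df$x, na.rm = TRUE)

# Replace missing values in one column with a fixed value.
df_filled <- df
df_filled$x[is.na(df_filled$x)] <- 0
```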
📊 Converting Between Data Types
| Convert From | To | Function | 
|---|---|---|
| Matrix | Data Frame | as.data.frame() | 
| List | Data Frame | as.data.frame() | 
| Data Frame | Matrix | as.matrix() | 
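The conversions in the table can be sketched as follows. All object names here are made up for the example. Note that converting a mixed-type data frame to a matrix coerces every value to a single type, typically character.

```r
# Matrix -> data frame: matrix column names become data frame column names.
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
df_from_matrix <- as.data.frame(m)

# List -> data frame: each list element becomes a column,
# so all elements need the same length.
lst <- list(id = 1:3, label = c("x", "y", "z"))
df_from_list <- as.data.frame(lst, stringsAsFactors = FALSE)

# Data frame -> matrix: an all-numeric data frame gives a numeric matrix.
m_numeric <- as.matrix(df_from_matrix)

# A data frame with a character column is coerced to a character matrix.
mixed <- data.frame(n = c(1.5, 2.5), s = c("u", "v"), stringsAsFactors = FALSE)
m_mixed <- as.matrix(mixed)
```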
📌 Summary – Recap & Next Steps
Data frames are the most widely used structure for tabular data in R. Mastering their creation, manipulation, and filtering is key to data science workflows.
🔍 Key Takeaways:
- Create with data.frame() using named columns
- Access using $, [row, col], or logical conditions
- Add rows and columns with rbind()/cbind() or $ assignment; remove a column by assigning NULL
- Use summary(), str(), and head() for inspection
- Handle missing data with is.na() and na.omit()
⚙️ Real-World Relevance:
Data frames appear in nearly every R project, from importing Excel/CSV data and transforming datasets to modeling and exporting reports.
❓ FAQs – R Data Frames
❓ What is the difference between a data frame and a matrix in R?
✅ A matrix holds only one data type; a data frame allows mixed types in different columns.
❓ How can I filter rows in a data frame?
✅ Use logical conditions:
df[df$Age > 25, ]
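One caveat: if the column used in the condition contains NA, bracket filtering returns an extra row of NAs for each missing value. The sketch below uses a hypothetical df with one missing Age to show the effect and two ways around it.

```r
# Hypothetical data with one missing Age.
df <- data.frame(
  Name = c("Ann", "Ben", "Cy"),
  Age = c(30, NA, 20),
  stringsAsFactors = FALSE
)

# NA > 25 is NA, and an NA index produces a row filled with NA.
with_na_row <- df[df$Age > 25, ]

# which() keeps only the positions where the condition is TRUE.
filtered_which <- df[which(df$Age > 25), ]

# subset() also treats NA in the condition as FALSE.
filtered_subset <- subset(df, Age > 25)
```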
❓ How do I add a new column to a data frame?
✅ Assign a vector directly:
df$NewCol <- c(1, 2, 3)   # Vector length should match nrow(df)
❓ How do I remove rows with missing values?
✅ Use:
na.omit(df)
❓ How can I preview the top 5 rows?
✅ Use:
head(df, 5)