🗂️ R Data Structures – Vectors, Lists, Matrices, and More
🧲 Introduction – What Are Data Structures in R?
In R, data structures are essential tools for organizing, storing, and manipulating data. Whether you’re working with a list of names, a grid of values, or tabular data, R provides a variety of built-in data structures tailored to different types of analysis.
Each structure has specific characteristics for dimensionality, homogeneity, and indexing, making them useful for various statistical and data science tasks.
🎯 In this guide, you’ll learn:
- The key data structures in R
- How to create and access elements in each structure
- Differences between structures like vectors, lists, matrices, and data frames
- Real-world examples of when to use each
🧩 Overview of Core Data Structures in R
Data Structure | Dimensionality | Homogeneous | Description |
---|---|---|---|
Vector | 1D | ✅ Yes | Single type (numeric, character, etc.) |
List | 1D | ❌ No | Collection of mixed types |
Matrix | 2D | ✅ Yes | Rows × Columns, single type |
Array | nD | ✅ Yes | Multidimensional, single type |
Data Frame | 2D | ❌ No | Table of columns (like a spreadsheet); each column holds a single type, but columns can differ |
Factor | 1D | ✅ Categorical | Encoded categorical values |
🧮 1. Vector – Most Basic Data Structure
num_vec <- c(1, 2, 3, 4)
char_vec <- c("apple", "banana")
✅ All elements must be of the same type.
Access:
num_vec[2] # 2
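The same-type rule has a practical consequence: when types are mixed inside c(), R silently coerces everything to the most flexible type. A minimal sketch of that behavior, along with a few common indexing forms (the name mixed is illustrative and not part of the original example):

```r
# Mixing types in c() coerces every element to one common type
mixed <- c(1, "a", TRUE)
class(mixed)            # "character"

num_vec <- c(1, 2, 3, 4)
num_vec[c(1, 3)]        # elements 1 and 3: 1 3
num_vec[-1]             # everything except the first: 2 3 4
num_vec[num_vec > 2]    # logical filter: 3 4
```

Note that R indexes from 1, not 0, and a negative index drops elements rather than counting from the end.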
🧳 2. List – A Collection of Mixed Objects
my_list <- list(name = "Alice", age = 25, passed = TRUE)
✅ Elements can be of different types and lengths.
Access:
my_list$name # "Alice"
my_list[[2]] # 25
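Single and double brackets behave differently on lists, which is a common source of confusion. The sketch below reuses my_list; the names sub and val are illustrative only:

```r
my_list <- list(name = "Alice", age = 25, passed = TRUE)

sub <- my_list["age"]     # single brackets: a list of length 1
val <- my_list[["age"]]   # double brackets: the element itself

class(sub)   # "list"
class(val)   # "numeric"
val + 1      # 26; arithmetic works on the extracted element, not on sub
```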
🧱 3. Matrix – 2D Homogeneous Structure
m <- matrix(1:6, nrow = 2, ncol = 3)
✅ All elements must be of the same type.
Access:
m[1, 2] # Row 1, Col 2
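By default, matrix() fills values column by column, which determines what m[1, 2] returns. A short sketch of whole-row and whole-column access, plus the byrow argument for row-wise filling (m_byrow is an illustrative name):

```r
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column

m[1, 2]   # 3
m[1, ]    # entire first row: 1 3 5
m[, 3]    # entire third column: 5 6
dim(m)    # 2 3

# byrow = TRUE fills across rows instead
m_byrow <- matrix(1:6, nrow = 2, byrow = TRUE)
m_byrow[1, 2]   # 2
```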
🧊 4. Array – Multidimensional Extension of Matrix
arr <- array(1:12, dim = c(2, 3, 2))
✅ Useful for storing 3D or higher-dimensional data, such as values indexed by row, column, and layer.
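Access works as it does for matrices, with one index per dimension. A minimal sketch using the arr defined above:

```r
arr <- array(1:12, dim = c(2, 3, 2))   # 2 rows, 3 columns, 2 slices

dim(arr)       # 2 3 2
arr[1, 2, 2]   # row 1, column 2, slice 2: 9
arr[, , 1]     # the first slice, returned as a 2 x 3 matrix
```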
📊 5. Data Frame – Table of Heterogeneous Columns
df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85))
✅ Each column can be a different type, like a spreadsheet.
Access:
df$Score # [1] 90 85
df[1, "Name"] # "Tom"
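Beyond single-cell access, data frames are usually filtered by a condition on a column and extended with new columns. The sketch below is one way to do this in base R; the names high and Passed, and the cutoff values, are made up for illustration:

```r
# stringsAsFactors is set explicitly so the result is the same on R versions before 4.0
df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85),
                 stringsAsFactors = FALSE)

nrow(df)            # 2
sapply(df, class)   # Name: "character", Score: "numeric"

# Keep only rows where Score is above 86 (note the trailing comma)
high <- df[df$Score > 86, ]
high$Name           # "Tom"

# Add a logical column derived from an existing one
df$Passed <- df$Score >= 88
```

The comma in df[condition, ] matters: the condition selects rows, and the empty slot after the comma keeps all columns.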
🎯 6. Factor – Encoded Categorical Data
grades <- factor(c("A", "B", "A", "C", "B"))
✅ Stored internally as integer codes, with the category labels kept as levels. Useful for modeling.
Access:
levels(grades) # "A" "B" "C"
table(grades) # Count occurrences
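To see the integer encoding directly, the factor can be converted with as.integer(). A small sketch using the same grades vector (counts is an illustrative name):

```r
grades <- factor(c("A", "B", "A", "C", "B"))

as.integer(grades)   # underlying codes: 1 2 1 3 2
nlevels(grades)      # 3

counts <- table(grades)
as.vector(counts)    # 2 2 1 (counts for A, B, C)
```

Levels are sorted alphabetically by default, so "A" maps to 1, "B" to 2, and "C" to 3.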
🔄 Convert Between Data Structures
From | To | Function |
---|---|---|
Vector | Matrix | matrix() |
Matrix | Data Frame | as.data.frame() |
List | Vector | unlist() |
Factor | Character | as.character() |
Data Frame | Matrix | as.matrix() |
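The conversions in the table can be chained. The following sketch walks through each one; all variable names (v, back, flat, f) are illustrative:

```r
v <- 1:6
m <- matrix(v, nrow = 2)        # vector -> matrix
df <- as.data.frame(m)          # matrix -> data frame
names(df)                       # default column names: "V1" "V2" "V3"
back <- as.matrix(df)           # data frame -> matrix

flat <- unlist(list(a = 1, b = 2, c = 3))   # list -> named numeric vector

f <- factor(c("10", "20", "10"))
as.character(f)                 # "10" "20" "10"
as.numeric(as.character(f))     # 10 20 10 (the labels, not the integer codes)
```

Two caveats: as.matrix() on a data frame with mixed column types produces a character matrix, and calling as.numeric() directly on a factor returns its integer codes rather than the labels, which is why the conversion goes through as.character() first.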
📌 Summary – Recap & Next Steps
R offers flexible data structures for storing everything from raw numbers to labeled observations. Each has its role, from simple vectors to multidimensional arrays and data frames.
🔍 Key Takeaways:
- Use vectors for 1D homogeneous data
- Use lists for storing mixed types and nested data
- Use matrices for 2D data of a single type (typically numeric or logical)
- Use arrays for higher-dimensional data of a single type
- Use data frames for row-column datasets like spreadsheets
- Use factors for modeling categorical variables
⚙️ Real-World Relevance:
Understanding data structures is vital for tasks like building models, wrangling datasets, and creating visualizations in R-based data science workflows.
❓ FAQs – Data Structures in R
❓ What is the difference between a vector and a list?
✅ Vectors are homogeneous (every element has the same type); lists are heterogeneous (elements can have different types).
❓ When should I use a matrix vs a data frame?
✅ Use a matrix when every value has the same type (typically numeric); use a data frame for labeled tabular data with mixed column types.
❓ Can a list hold another list in R?
✅ Yes! Lists can be nested:
nested_list <- list(a = list(x = 1, y = 2))
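Elements of a nested list are reached by chaining $ or [[ ]]. A brief sketch; the added element z is hypothetical:

```r
nested_list <- list(a = list(x = 1, y = 2))

nested_list$a$x             # 1
nested_list[["a"]][["y"]]   # 2

# Assigning to a new name adds an element to the inner list
nested_list$a$z <- 3
names(nested_list$a)        # "x" "y" "z"
```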
❓ How do I check the structure of an object?
✅ Use str():
str(df)
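Alongside str(), the functions class() and typeof() report what an object is and how it is stored. A minimal sketch, assuming the df from the data frame section:

```r
df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85))

class(df)      # "data.frame"
typeof(df)     # "list"; a data frame is stored as a list of columns
is.list(df)    # TRUE
length(df)     # 2, the number of columns
```

This is why list-style access such as df$Score and df[["Score"]] works on data frames.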
❓ What are factors used for?
✅ Factors are used for representing categorical data in modeling and visualizations.