R Data Structures – Vectors, Lists, Matrices, and More
Introduction – What Are Data Structures in R?
In R, data structures are essential tools for organizing, storing, and manipulating data. Whether you’re working with a list of names, a grid of values, or tabular data, R provides a variety of built-in data structures tailored to different types of analysis.
Each structure differs in dimensionality, homogeneity, and indexing, which makes it suited to different statistical and data science tasks.
In this guide, you’ll learn:
- The key data structures in R
- How to create and access elements in each structure
- Differences between structures like vectors, lists, matrices, and data frames
- Real-world examples of when to use each
Overview of Core Data Structures in R
| Data Structure | Dimensionality | Homogeneous | Description |
|---|---|---|---|
| Vector | 1D | Yes | Single type (numeric, character, etc.) |
| List | 1D | No | Collection of mixed types |
| Matrix | 2D | Yes | Rows × Columns, single type |
| Array | nD | Yes | Multidimensional, single type |
| Data Frame | 2D | No | Table of columns (like a spreadsheet) |
| Factor | 1D | Yes (integer codes) | Categorical values stored as integer codes with labels |
1. Vector – Most Basic Data Structure
num_vec <- c(1, 2, 3, 4)
char_vec <- c("apple", "banana")
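All elements must be of the same type. If you mix types, R silently coerces them to a common type; for example, c(1, "a") becomes a character vector.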
Access:
num_vec[2] # 2
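The short sketch below illustrates the same-type rule and a few common indexing forms. The variable mixed is introduced here only for illustration.

```r
# Mixing types triggers implicit coercion to the most flexible type.
mixed <- c(1, "a", TRUE)       # 'mixed' is an illustrative name
class(mixed)                   # "character"

num_vec <- c(1, 2, 3, 4)
picked   <- num_vec[c(1, 3)]   # elements 1 and 3: 1 3
dropped  <- num_vec[-1]        # everything except the first: 2 3 4
filtered <- num_vec[num_vec > 2]  # logical filter: 3 4
doubled  <- num_vec * 2        # vectorized arithmetic: 2 4 6 8
```

Note that R indexes from 1, and a negative index removes elements rather than counting from the end.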
2. List – A Collection of Mixed Objects
my_list <- list(name = "Alice", age = 25, passed = TRUE)
Elements can be of different types and lengths.
Access:
my_list$name # "Alice"
my_list[[2]] # 25
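A frequent source of confusion is the difference between single and double brackets on a list. The sketch below shows both; the city field is a hypothetical addition used only to show assignment.

```r
my_list <- list(name = "Alice", age = 25, passed = TRUE)

sub <- my_list["age"]      # single brackets return a list of length 1
val <- my_list[["age"]]    # double brackets return the element itself

class(sub)                 # "list"
class(val)                 # "numeric"

# Add a new element by name (hypothetical field for illustration)
my_list$city <- "Paris"
names(my_list)             # "name" "age" "passed" "city"
```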
3. Matrix – 2D Homogeneous Structure
m <- matrix(1:6, nrow = 2, ncol = 3)
All elements must be of the same type.
Access:
m[1, 2] # Row 1, Col 2
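It may help to see how matrix() lays out its values. By default R fills column by column; the byrow argument changes that. The name m_byrow below is illustrative.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
m
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

m[1, 2]    # 3
m[1, ]     # entire first row: 1 3 5
m[, 3]     # entire third column: 5 6
dim(m)     # 2 3

# Fill row by row instead (illustrative variable name)
m_byrow <- matrix(1:6, nrow = 2, byrow = TRUE)
m_byrow[1, 2]   # 2
```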
4. Array – Multidimensional Extension of Matrix
arr <- array(1:12, dim = c(2, 3, 2))
Useful for storing data with three or more dimensions. Each element is addressed by one index per dimension.
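As a minimal sketch of array access, the example below indexes a single element, extracts one 2D slice, and summarizes each slice with apply(). The names slice1 and slice_sums are illustrative.

```r
arr <- array(1:12, dim = c(2, 3, 2))

arr[1, 2, 2]                 # row 1, column 2, slice 2: 9

slice1 <- arr[, , 1]         # the first 2 x 3 slice, returned as a matrix
dim(slice1)                  # 2 3

# Sum over each slice along the third dimension
slice_sums <- apply(arr, 3, sum)
slice_sums                   # 21 57
```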
5. Data Frame – Table of Heterogeneous Columns
df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85))
Each column can be a different type, like a spreadsheet.
Access:
df$Score # [1] 90 85
df[1, "Name"] # "Tom"
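Beyond single-cell access, data frames are commonly filtered by row and extended with new columns. The sketch below assumes the same two-row table; the Passed column, the thresholds, and the name high are hypothetical. stringsAsFactors = FALSE is set explicitly so the Name column is character in every R version.

```r
df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85),
                 stringsAsFactors = FALSE)

# Each column keeps its own type
col_types <- sapply(df, class)
col_types                      # Name: "character", Score: "numeric"

# Filter rows with a logical condition (note the trailing comma)
high <- df[df$Score > 86, ]
high$Name                      # "Tom"

# Add a derived column (hypothetical pass mark of 88)
df$Passed <- df$Score >= 88
ncol(df)                       # 3
```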
6. Factor – Encoded Categorical Data
grades <- factor(c("A", "B", "A", "C", "B"))
Stored internally as integer codes, each mapped to a label (level). Useful for modeling.
Access:
levels(grades) # "A" "B" "C"
table(grades) # Count occurrences
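To make the integer-code representation concrete, the sketch below exposes the codes behind grades and builds an ordered factor. The sizes vector and its level order are hypothetical examples, not part of the original data.

```r
grades <- factor(c("A", "B", "A", "C", "B"))

codes <- as.integer(grades)    # underlying codes: 1 2 1 3 2
nlevels(grades)                # 3
table(grades)[["A"]]           # 2

# Ordered factor with an explicit level order (hypothetical data)
sizes <- factor(c("low", "high", "medium", "low"),
                levels = c("low", "medium", "high"),
                ordered = TRUE)
below_high <- sizes < "high"   # TRUE FALSE TRUE TRUE
```

Setting levels explicitly matters because the default order is alphabetical, which would otherwise place "high" before "low".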
Convert Between Data Structures
| From | To | Function |
|---|---|---|
| Vector | Matrix | matrix() |
| Matrix | Data Frame | as.data.frame() |
| List | Vector | unlist() |
| Factor | Character | as.character() |
| Data Frame | Matrix | as.matrix() |
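The conversions in the table can be sketched as follows. All variable names are illustrative. Two behaviors are worth watching for: converting a factor of number-like labels needs as.character() first, and as.matrix() on a mixed-type data frame yields a character matrix.

```r
v <- 1:6
m <- matrix(v, nrow = 2)             # vector -> matrix (column-wise fill)
df_from_m <- as.data.frame(m)        # matrix -> data frame
names(df_from_m)                     # "V1" "V2" "V3"

flat <- unlist(list(a = 1, b = 2))   # list -> named numeric vector

f <- factor(c("10", "20", "10"))
as.character(f)                      # "10" "20" "10"
as.numeric(as.character(f))          # 10 20 10 (the labels as numbers)
as.numeric(f)                        # 1 2 1 (the codes, usually not what you want)

mixed_df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85))
mm <- as.matrix(mixed_df)            # data frame -> matrix
typeof(mm)                           # "character": Score was coerced to text
```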
Summary – Recap & Next Steps
R offers flexible data structures for storing everything from raw numbers to labeled observations. Each has its role, from simple vectors to complex arrays and data frames.
Key Takeaways:
- Use vectors for 1D homogeneous data
- Use lists for storing mixed types and nested data
- Use matrices for 2D data of a single type (usually numeric or logical)
- Use arrays for higher-dimensional data of a single type
- Use data frames for row-column datasets like spreadsheets
- Use factors for modeling categorical variables
Real-World Relevance:
Understanding data structures is vital for tasks like building models, wrangling datasets, and creating visualizations in R-based data science workflows.
FAQs – Data Structures in R
What is the difference between a vector and a list?
Atomic vectors are homogeneous (all elements share one type); lists are heterogeneous (elements can be of mixed types).
When should I use a matrix vs a data frame?
Use a matrix for data of a single type (typically numeric); use a data frame for labeled tabular data with mixed column types.
Can a list hold another list in R?
Yes! Lists can be nested:
nested_list <- list(a = list(x = 1, y = 2))
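Nested elements can be reached by chaining the usual accessors. A minimal sketch; the z element is a hypothetical addition.

```r
nested_list <- list(a = list(x = 1, y = 2))

nested_list$a$x              # 1
nested_list[["a"]][["y"]]    # 2

# Add a new element to the inner list (hypothetical field)
nested_list$a$z <- 3
names(nested_list$a)         # "x" "y" "z"
```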
How do I check the structure of an object?
Use str():
str(df)
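Alongside str(), a few other base functions report what kind of object you have. The sketch below uses a small illustrative data frame; note that a data frame is a list of columns under the hood.

```r
df <- data.frame(Name = c("Tom", "Lily"), Score = c(90, 85))

class(df)                    # "data.frame"
typeof(df)                   # "list"
is.list(df)                  # TRUE

typeof(1:3)                  # "integer"
typeof(c(1, 2))              # "double"
class(c(1, 2))               # "numeric"
is.matrix(matrix(1:4, 2))    # TRUE
```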
What are factors used for?
Factors are used for representing categorical data in modeling and visualizations.
