Statistical Analysis with R

📈 R Linear & Multiple Regression – Predict and Analyze Relationships with Code Examples


🧲 Introduction – Modeling Relationships in R

Regression analysis is a cornerstone of statistical modeling. It helps you understand and predict the relationship between variables.

  • Use simple linear regression for one predictor variable
  • Use multiple linear regression for two or more predictors

R provides the lm() function, making it easy to fit, summarize, and visualize regression models.

🎯 In this guide, you’ll learn:

  • How to perform linear and multiple regression in R
  • How to interpret regression output: coefficients, R-squared, residuals
  • How to plot regression lines with plot(), abline(), and ggplot2
  • How to use built-in datasets like mtcars for demonstrations

🔹 1. Simple Linear Regression in R

✅ Example: Predict MPG based on Weight

data(mtcars)
model <- lm(mpg ~ wt, data = mtcars)
summary(model)

🔍 Output Highlights:

Coefficients:
(Intercept)        wt  
   37.285       -5.344  

Multiple R-squared: 0.753

🔍 Interpretation:

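  • Intercept (37.285): Expected MPG when wt = 0; no car weighs nothing, so read it as the line's anchor point rather than a realistic prediction
  • Slope (-5.344): Each 1-unit increase in wt (1,000 lb in mtcars) is associated with a drop of about 5.3 MPG
  • R-squared (0.753): The model explains 75.3% of the variation in MPG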
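
To see where these numbers come from, the sketch below recomputes the slope, intercept, and R-squared by hand and compares them with the lm() fit. The names fit, slope_manual, intercept_manual, and r2_manual are illustrative and not part of the original example.

```r
# Refit the simple model so this block runs on its own
fit <- lm(mpg ~ wt, data = mtcars)

# Least-squares slope = cov(x, y) / var(x)
slope_manual <- cov(mtcars$wt, mtcars$mpg) / var(mtcars$wt)

# The fitted line passes through the point of means
intercept_manual <- mean(mtcars$mpg) - slope_manual * mean(mtcars$wt)

# With a single predictor, R-squared equals the squared correlation
r2_manual <- cor(mtcars$wt, mtcars$mpg)^2

# Compare the hand-computed values with what lm() reports
rbind(
  lm     = c(coef(fit), r2 = summary(fit)$r.squared),
  manual = c(intercept_manual, slope_manual, r2_manual)
)
```

Both rows should agree, which confirms that lm() is doing ordinary least squares here.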

📊 2. Visualize Linear Regression Line

plot(mtcars$wt, mtcars$mpg, main = "MPG vs Weight")
abline(model, col = "blue", lwd = 2)

✅ With ggplot2:

library(ggplot2)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", col = "red") +
  labs(title = "Linear Regression: MPG ~ Weight")

🔹 3. Multiple Linear Regression

✅ Example: Predict MPG using multiple predictors

multi_model <- lm(mpg ~ wt + hp + cyl, data = mtcars)
summary(multi_model)

🔍 Output Highlights:

Coefficients:
(Intercept)        wt         hp        cyl  
   38.752     -3.167     -0.018     -0.942  

Multiple R-squared: 0.843

🔍 Interpretation:

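  • More variables → potentially better fit, but R² never decreases when predictors are added, so check adjusted R² as well
  • wt, hp, and cyl all have negative coefficients: each is associated with lower MPG when the other two are held constant
  • R² = 0.843: The model explains 84.3% of MPG variance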
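
Because R² can only go up as predictors are added, it helps to look at adjusted R² alongside it. A minimal sketch, refitting both models under the illustrative names simple_fit and multi_fit so the block runs on its own:

```r
simple_fit <- lm(mpg ~ wt, data = mtcars)
multi_fit  <- lm(mpg ~ wt + hp + cyl, data = mtcars)

# Adjusted R-squared penalizes each extra predictor, so it is the
# fairer number when models have different numbers of terms
fit_stats <- data.frame(
  model  = c("mpg ~ wt", "mpg ~ wt + hp + cyl"),
  r2     = c(summary(simple_fit)$r.squared,
             summary(multi_fit)$r.squared),
  adj_r2 = c(summary(simple_fit)$adj.r.squared,
             summary(multi_fit)$adj.r.squared)
)
fit_stats
```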

🧠 4. Compare Models (ANOVA)

anova(model, multi_model)

🔍 Use case:

  • Compare nested models (e.g., simple vs multiple regression)
  • See if adding variables significantly improves the model
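
The comparison can also be read programmatically. The sketch below pulls the p-value of the F test out of the anova() table; the 5% cutoff and the names simple_fit, multi_fit, cmp, and p_value are assumptions made for illustration.

```r
simple_fit <- lm(mpg ~ wt, data = mtcars)
multi_fit  <- lm(mpg ~ wt + hp + cyl, data = mtcars)

# The smaller model goes first; both must be fit to the same data
cmp <- anova(simple_fit, multi_fit)
print(cmp)

# Row 2 holds the F test for the added terms (hp and cyl);
# row 1 has no test, so its entry is NA
p_value <- cmp[["Pr(>F)"]][2]

if (p_value < 0.05) {
  message("Adding hp and cyl improves the fit at the 5% level")
} else {
  message("No significant improvement at the 5% level")
}
```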

🧮 5. Predict Values Using Model

newdata <- data.frame(wt = 3)
predict(model, newdata)

✅ For multiple regression:

predict(multi_model, data.frame(wt = 3, hp = 150, cyl = 6))
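
A point prediction says nothing about its uncertainty. As a sketch, predict() can also return interval estimates through its interval argument; the names fit, new_car, ci, and pi_new are illustrative, and the 95% level is an assumed choice.

```r
fit <- lm(mpg ~ wt, data = mtcars)
new_car <- data.frame(wt = 3)  # hypothetical car weighing 3,000 lb

# Confidence interval: uncertainty in the mean MPG of all cars at this weight
ci <- predict(fit, new_car, interval = "confidence", level = 0.95)

# Prediction interval: uncertainty for one individual new car, so it is wider
pi_new <- predict(fit, new_car, interval = "prediction", level = 0.95)

# Each result is a matrix with columns fit, lwr, upr
ci
pi_new
```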

📋 6. Check Residuals and Diagnostics

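par(mfrow = c(2, 2))  # 2x2 grid so all four diagnostic plots share one page
plot(model)
par(mfrow = c(1, 1))  # restore the default single-plot layout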

🔍 Produces:

  • Residuals vs Fitted
  • Normal Q-Q
  • Scale-Location
  • Residuals vs Leverage

These help you validate assumptions of linear regression.
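
The plots can be complemented with a few numeric checks. A minimal sketch, assuming the common 4/n rule of thumb for Cook's distance (a convention, not a strict threshold) and using illustrative names fit, cooks, and influential:

```r
fit <- lm(mpg ~ wt, data = mtcars)

# Shapiro-Wilk test on the residuals (complements the Normal Q-Q plot);
# a small p-value suggests the residuals are not normally distributed
shapiro.test(residuals(fit))

# Cook's distance measures how much each observation influences the fit
cooks <- cooks.distance(fit)

# Flag observations above the 4/n rule-of-thumb cutoff
influential <- cooks[cooks > 4 / nrow(mtcars)]
sort(influential, decreasing = TRUE)
```

Flagged cars are worth inspecting, not automatically removing.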


📐 Regression Model Summary Table

Term          Description
Intercept     Value of Y when all predictors = 0
Coefficient   Impact of 1-unit change in X on Y
R-squared     % of variance explained by model
p-value       Significance of each predictor
Residuals     Differences between predicted and actual Y

📌 Summary – Recap & Next Steps

Regression in R is a powerful and easy-to-use tool for modeling relationships. Whether it’s a single predictor or multiple, lm() gives you everything needed to fit, evaluate, and visualize regression models.

🔍 Key Takeaways:

  • Use lm() for linear and multiple regression
  • Interpret slope, intercept, and R² carefully
  • Visualize with abline() or geom_smooth()
  • Validate with residual plots and ANOVA
  • Predict outcomes using predict()

⚙️ Real-World Relevance:
Used in finance, marketing, engineering, machine learning, econometrics, and healthcare analytics for prediction, insight generation, and, with a suitable study design, causal analysis. Regression on its own shows association, not causation.


❓ FAQs – Linear & Multiple Regression in R

❓ How do I perform linear regression in R?
✅ Use:

lm(response ~ predictor, data = dataset)

❓ What does R-squared mean?
✅ It measures the proportion of variance in the response variable explained by the predictors.

❓ How to predict new values using regression?
✅ Use predict() with a new data frame:

predict(model, newdata)

❓ How to check if a model is statistically significant?
✅ Look at the p-values of coefficients and overall F-statistic in summary().
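
As a sketch, both kinds of p-value can be pulled out of the summary object. The overall p-value is printed by summary() but not stored, so it is recomputed here with pf(); the names fit, s, f, and overall_p are illustrative.

```r
fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)
s <- summary(fit)

# p-values for the individual coefficients
s$coefficients[, "Pr(>|t|)"]

# F statistic with its degrees of freedom: value, numdf, dendf
f <- s$fstatistic

# Upper-tail probability of the F distribution gives the overall p-value
overall_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
overall_p
```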

❓ How to plot regression line in R?
✅ Use:

abline(model)  # base R  
geom_smooth(method = "lm")  # ggplot2
