Statistical Analysis with R

R Logistic & Poisson Regression – Model Binary and Count Data Easily


Introduction – Generalized Linear Models in R

Not all data fits into linear regression. When you’re predicting:

  • Binary outcomes (yes/no, 0/1): use Logistic Regression
  • Count outcomes (number of events): use Poisson Regression

Both are generalized linear models (GLMs), and both can be fitted in R with the glm() function by choosing the matching family.

In this guide, you’ll learn how to:

  • Fit logistic and Poisson regression models in R
  • Call glm() with the proper family (binomial, poisson)
  • Interpret coefficients, odds ratios, rate ratios, and model fit
  • Apply the models to small example datasets and make predictions

1. Logistic Regression in R (Binary Classification)

Example: Predicting a binary outcome

We’ll use a small hand-entered dataset (nothing is drawn at random, so no seed is needed):

df <- data.frame(
  age = c(25, 30, 35, 40, 45, 50, 55),
  income = c(40, 45, 50, 60, 65, 70, 80),
  purchase = c(0, 0, 0, 1, 1, 1, 1)
)

model_log <- glm(purchase ~ age + income, family = binomial, data = df)
summary(model_log)

Key Outputs:

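  • Estimate: Coefficients on the log-odds (logit) scale
  • z value & Pr(>|z|): Wald test statistic and its p-value for each coefficient
  • Null deviance vs Residual deviance: The null deviance belongs to the intercept-only model; a large drop to the residual deviance indicates the predictors improve the fit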
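The deviance comparison can be checked directly. The following is a minimal sketch on simulated data that are not perfectly separated; the sample size, variable names, and true coefficients are assumptions chosen for illustration and are not part of the example above.

```r
# Simulated data (assumed values): purchase probability rises with age
set.seed(42)
n   <- 200
age <- runif(n, min = 20, max = 60)
p   <- plogis(-4 + 0.1 * age)   # true purchase probability for each person
sim <- data.frame(age = age, purchase = rbinom(n, size = 1, prob = p))

fit_sim <- glm(purchase ~ age, family = binomial, data = sim)

# Drop in deviance from the intercept-only model to the fitted model
dev_drop <- fit_sim$null.deviance - fit_sim$deviance

# Likelihood-ratio (chi-squared) test of the same comparison
lrt <- anova(fit_sim, test = "Chisq")
print(dev_drop)
print(lrt)
```

The Deviance entry for age in the anova() table is the same drop in deviance, tested against a chi-squared distribution with one degree of freedom.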

Convert Log-Odds to Odds Ratio

exp(coef(model_log))

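Output (illustrative values only; in the toy data above every customer aged 40 or older purchased and nobody younger did, so the classes are perfectly separated, R warns that fitted probabilities are numerically 0 or 1, and the actual estimates are extreme and unstable):

(Intercept)         age      income  
  0.0012         1.42        1.12

Interpretation (for values like those shown):

  • Holding income constant, a one-unit increase in age multiplies the odds of purchase by 1.42
  • Odds ratios above 1 for age and income mean that higher values go with higher odds of purchase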
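Point estimates of odds ratios are easier to judge with intervals around them. The sketch below uses R's built-in mtcars data rather than the purchase example (an assumption made so the fit is not perfectly separated) and computes 95% Wald intervals with confint.default().

```r
# Built-in mtcars data: am is 1 for manual, 0 for automatic transmission
fit_am <- glm(am ~ wt, family = binomial, data = mtcars)

# Odds ratios with 95% Wald intervals: the intervals are computed on the
# log-odds scale and then exponentiated
or_table <- exp(cbind(OR = coef(fit_am), confint.default(fit_am)))
print(round(or_table, 3))
```

An odds ratio below 1 for wt means heavier cars have lower odds of a manual transmission. An interval that excludes 1 points the same way as a significant Wald test.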

Predict Probabilities

new <- data.frame(age = 38, income = 58)
predict(model_log, newdata = new, type = "response")  # Gives probability
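To see what type = "response" does, it can help to compare it with the default link scale. This sketch again assumes the built-in mtcars data; the new weights and the 0.5 cutoff are hypothetical choices for illustration.

```r
fit_am   <- glm(am ~ wt, family = binomial, data = mtcars)
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))   # hypothetical weights (1000 lbs)

log_odds <- predict(fit_am, newdata = new_cars, type = "link")      # log-odds
prob     <- predict(fit_am, newdata = new_cars, type = "response")  # probability

# The response scale is the inverse logit of the link scale
print(all.equal(plogis(log_odds), prob))

# Optional hard classification at an assumed 0.5 cutoff
pred_class <- ifelse(prob > 0.5, 1, 0)
print(pred_class)
```

The cutoff is a modelling decision, not something glm() chooses; a different threshold may suit cases where false positives and false negatives have different costs.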

2. Poisson Regression in R (Count Modeling)

Example Count Data

df2 <- data.frame(
  hours = c(1, 2, 3, 4, 5, 6, 7),
  events = c(1, 2, 3, 6, 9, 10, 13)
)

model_pois <- glm(events ~ hours, family = poisson(link = "log"), data = df2)
summary(model_pois)

Explanation:

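  • family = poisson: Models the response as a non-negative count whose variance equals its mean
  • link = "log": The default link for Poisson, so family = poisson alone fits the same model; the log of the expected count is linear in the predictors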
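A quick way to confirm that the log link is the default is to fit the model both ways and compare. The sketch below re-creates the count data from this section so it runs on its own.

```r
df2 <- data.frame(
  hours  = c(1, 2, 3, 4, 5, 6, 7),
  events = c(1, 2, 3, 6, 9, 10, 13)
)

fit_default  <- glm(events ~ hours, family = poisson, data = df2)
fit_explicit <- glm(events ~ hours, family = poisson(link = "log"), data = df2)

# The family object records which link was used
print(family(fit_default)$link)

# Both calls give the same coefficients
print(all.equal(coef(fit_default), coef(fit_explicit)))
```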

Exponentiate Coefficients (Rate Ratios)

exp(coef(model_pois))

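Output (approximate):

(Intercept)      hours  
   1.14           1.44

Interpretation:

  • Each additional hour multiplies the expected event count by about 1.44, roughly a 44% increase per hour
  • The exponentiated intercept is the expected count at hours = 0, which lies outside the observed range
  • The model is linear on the log scale, so expected counts grow multiplicatively rather than additively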
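The rate-ratio reading follows from the log link: predictions one hour apart always differ by the factor exp(coefficient). A small sketch that checks this on the section's count data (the hours 3 and 4 are arbitrary picks):

```r
df2 <- data.frame(
  hours  = c(1, 2, 3, 4, 5, 6, 7),
  events = c(1, 2, 3, 6, 9, 10, 13)
)
fit_pois <- glm(events ~ hours, family = poisson, data = df2)

# Expected counts at two consecutive hours
mu <- predict(fit_pois, newdata = data.frame(hours = c(3, 4)), type = "response")

ratio_from_predictions <- unname(mu[2] / mu[1])
ratio_from_coefficient <- unname(exp(coef(fit_pois)["hours"]))

# The two ratios agree up to floating-point error
print(c(ratio_from_predictions, ratio_from_coefficient))
```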

Predict Counts

new2 <- data.frame(hours = 5)
predict(model_pois, newdata = new2, type = "response")
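A predicted count is more informative with an interval. One common approach, sketched here under the assumption that a normal approximation on the log scale is acceptable, is to build the interval on the link scale with se.fit = TRUE and then exponentiate.

```r
df2 <- data.frame(
  hours  = c(1, 2, 3, 4, 5, 6, 7),
  events = c(1, 2, 3, 6, 9, 10, 13)
)
fit_pois <- glm(events ~ hours, family = poisson, data = df2)
new2     <- data.frame(hours = 5)

# Prediction and standard error on the log (link) scale
pr <- predict(fit_pois, newdata = new2, type = "link", se.fit = TRUE)
z  <- qnorm(0.975)   # about 1.96 for a 95% interval

est   <- exp(pr$fit)
lower <- exp(pr$fit - z * pr$se.fit)
upper <- exp(pr$fit + z * pr$se.fit)
print(c(lower = unname(lower), estimate = unname(est), upper = unname(upper)))
```

This is an interval for the expected count, not a prediction interval for a single new observation, which would be wider. With only seven observations the normal approximation is rough.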

Model Comparison: Logistic vs Poisson

Feature                 Logistic Regression    Poisson Regression
Response Variable       Binary (0 or 1)        Count (0, 1, 2, …)
Family (GLM)            binomial               poisson
Output Interpretation   Odds ratio             Event rate / rate ratio
Use Cases               Classification         Frequency modeling

Summary – Recap & Next Steps

Both Logistic and Poisson Regression are essential for modeling non-continuous data. R’s glm() makes it easy to define the appropriate model based on the outcome type.

Key Takeaways:

  • Use glm(..., family = binomial) for binary outcomes
  • Use glm(..., family = poisson) for count data
  • Use exp(coef(...)) to read coefficients as odds ratios (logistic) or rate ratios (Poisson)
  • Use predict(..., type = "response") to get predictions as probabilities or expected counts

Real-World Relevance:
These models are widely used in healthcare (disease diagnosis), marketing (click prediction), insurance (claim count), and survey analysis.


FAQs – Logistic & Poisson Regression in R

What is the main difference between lm() and glm()?
lm() fits ordinary least squares, which assumes a continuous response with normally distributed errors. glm() also supports other response distributions, such as binomial and Poisson, through its family argument.
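The relationship can be seen by fitting the same model both ways: a glm() with the gaussian family reproduces the lm() coefficients. The sketch uses the built-in mtcars data, which is an assumption for illustration.

```r
fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(mpg ~ wt, family = gaussian, data = mtcars)

# Same coefficients from both functions
print(coef(fit_lm))
print(coef(fit_glm))
print(all.equal(coef(fit_lm), coef(fit_glm)))
```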

When should I use logistic regression?
When your dependent variable is binary (e.g., success/failure, 0/1).

How do I interpret coefficients in logistic regression?
Coefficients are in log-odds. Use exp() to convert to odds ratios.

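How do I detect overdispersion in Poisson regression?
Compare the residual deviance with the residual degrees of freedom. If the deviance is much larger (a ratio well above 1), the counts vary more than a Poisson model allows; consider a quasi-Poisson or negative binomial model.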
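A minimal sketch of that check, using simulated counts that are deliberately overdispersed. The negative binomial generator, its parameters, and the variable names are assumptions for illustration.

```r
# Simulated overdispersed counts (assumed parameters)
set.seed(1)
n <- 200
x <- runif(n, min = 0, max = 5)
y <- rnbinom(n, size = 1, mu = exp(0.5 + 0.3 * x))
sim_counts <- data.frame(x = x, y = y)

fit_p <- glm(y ~ x, family = poisson, data = sim_counts)

# Pearson dispersion statistic: values well above 1 suggest overdispersion
dispersion <- sum(residuals(fit_p, type = "pearson")^2) / df.residual(fit_p)
print(dispersion)

# Quasi-Poisson keeps the same coefficients but scales the standard errors
fit_q <- glm(y ~ x, family = quasipoisson, data = sim_counts)
print(summary(fit_q)$dispersion)
```

The quasi-Poisson fit reports the same dispersion estimate and widens the standard errors accordingly. A negative binomial model, for example via MASS::glm.nb(), is an alternative when a full likelihood is needed.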

Can I use multiple predictors in logistic or Poisson regression?
Yes. Just extend the formula:

glm(y ~ x1 + x2 + x3, family = binomial, data = ...)
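As a concrete instance of that pattern, here is a sketch with two predictors on the built-in mtcars data (chosen for illustration; it is not part of the examples above).

```r
# Transmission type (am) modelled from horsepower and weight
fit_multi <- glm(am ~ hp + wt, family = binomial, data = mtcars)

# One odds ratio per predictor, each adjusted for the other
print(round(exp(coef(fit_multi)), 3))
```

Each odds ratio now describes the effect of one predictor with the others held constant.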
