R Calculate and interpret odds ratio in logistic regression

0 votes

I am having trouble interpreting the results of a logistic regression. My outcome variable is Decision and is binary (0 or 1, not take or take a product, respectively).
My predictor variable is Thoughts and is continuous, can be positive or negative, and is rounded up to the 2nd decimal point.
I want to know how the probability of taking the product changes as Thoughts changes.

The logistic regression equation is:

glm(Decision ~ Thoughts, family = binomial, data = data)

According to this model, Thoughts has a significant impact on probability of Decision (b = .72, p = .02). To determine the odds ratio of Decision as a function of Thoughts:

exp(coef(results))

Odds ratio = 2.07.

Questions:

  1. How do I interpret the odds ratio?

    1. Does an odds ratio of 2.07 imply that a .01 increase (or decrease) in Thoughts affects the odds of taking (or not taking) the product by 0.07 OR
    2. Does it imply that as Thoughts increases (decreases) by .01, the odds of taking (not taking) the product increase (decrease) by approximately 2 units?
  2. How do I convert odds ratio of Thoughts to an estimated probability of Decision?
    Or can I only estimate the probability of Decision at a certain Thoughts score (i.e. calculate the estimated probability of taking the product when Thoughts == 1)?

Mar 26, 2022 in Machine Learning by Dev
• 6,000 points
17,186 views

1 answer to this question.

0 votes

The coefficients reported by a logistic regression in R are on the logit (log-odds) scale. Exponentiating a coefficient converts it to an odds ratio, as you did above. To convert a logit to a probability, use exp(logit)/(1+exp(logit)). There are a few things to keep in mind about this process.
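
As a quick check of those two conversions, here is a minimal sketch in base R. The logit value 0.5 is an arbitrary number chosen for illustration (it does not come from the question), and plogis() is base R's built-in inverse logit.

```r
# Arbitrary logit (log-odds) value, chosen only for illustration
logit <- 0.5

# Logit -> odds: exponentiate
odds <- exp(logit)

# Logit -> probability: inverse logit written out by hand
prob_manual <- exp(logit) / (1 + exp(logit))

# Same conversion using base R's plogis()
prob_builtin <- plogis(logit)

# Odds and probability are linked by p = odds / (1 + odds)
prob_from_odds <- odds / (1 + odds)

print(c(odds = odds, prob_manual = prob_manual, prob_builtin = prob_builtin))
```

All three routes to the probability should agree up to floating-point error.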

To begin, I'll use some reproducible data.

library('MASS')
data("menarche")
m<-glm(cbind(Menarche, Total-Menarche) ~ Age, family=binomial, data=menarche)
summary(m)

The output is:

Call:
glm(formula = cbind(Menarche, Total - Menarche) ~ Age, family = binomial, 
    data = menarche)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.0363  -0.9953  -0.4900   0.7780   1.3675  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -21.22639    0.77068  -27.54   <2e-16 ***
Age           1.63197    0.05895   27.68   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 3693.884  on 24  degrees of freedom
Residual deviance:   26.703  on 23  degrees of freedom
AIC: 114.76

Number of Fisher Scoring iterations: 4

As in your case, the coefficients displayed are on the logit scale. If we plot the data together with the model's fit, we see the sigmoidal curve that is characteristic of a logistic model fitted to binomial data.

#predict gives the predicted value in terms of logits
plot.dat <- data.frame(prob = menarche$Menarche/menarche$Total,
                       age = menarche$Age,
                       fit = predict(m, menarche))
#convert those logit values to probabilities
plot.dat$fit_prob <- exp(plot.dat$fit)/(1+exp(plot.dat$fit))

library(ggplot2)
ggplot(plot.dat, aes(x=age, y=prob)) + 
  geom_point() +
  geom_line(aes(x=age, y=fit_prob))

[Plot: observed proportion reaching menarche (points) and fitted probability (line) against age]

It's worth noting that the rate of change in probability isn't constant; the curve rises slowly at first, accelerates in the middle, and levels off at the end. The difference in probability between ages 10 and 12 is much smaller than the difference between ages 12 and 14. This means that it is difficult to summarize the relationship between age and probability with a single number unless the probabilities are transformed (for example, to odds or logits).
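
To make the non-constant change concrete, the sketch below plugs the coefficients printed in the summary output above into the inverse logit at ages 10, 12 and 14. It uses only base R; the variable names are my own.

```r
# Coefficients copied from the summary(m) output above
intercept <- -21.22639
slope     <- 1.63197

ages <- c(10, 12, 14)

# Linear predictor (logit) and fitted probability at each age
logits <- intercept + slope * ages
probs  <- plogis(logits)

# Change in probability over each two-year step (10 -> 12, 12 -> 14)
prob_steps <- diff(probs)

# Change in logit over the same steps
logit_steps <- diff(logits)

print(round(probs, 3))
print(round(prob_steps, 3))
print(logit_steps)
```

The two logit steps are identical (twice the slope), while the two probability steps differ, which is why the effect is summarized on the logit or odds scale.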

To answer your specific questions:

How do I interpret the odds ratio?

The exponentiated intercept is the odds of a "success" (with your data, the odds of taking the product) when x = 0 (i.e. Thoughts equal to zero); strictly speaking it is an odds, not an odds ratio. The exponentiated coefficient is the odds ratio: the factor by which those odds are multiplied when x increases by one whole unit (i.e. from Thoughts = 0 to Thoughts = 1). Using the menarche data:

exp(coef(m))

 (Intercept)          Age 
6.046358e-10 5.113931e+00 

We can deduce that the odds of menarche occurring at age 0 are about 0.0000000006 (6.05e-10). Or, to put it another way, nearly impossible. Exponentiating the Age coefficient gives the expected multiplicative change in the odds of menarche for each one-unit increase in age. In this case it is a little over a quintupling. An odds ratio of 1 indicates no change, an odds ratio of 2 indicates a doubling, and so on.

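Your odds ratio of 2.07 means that increasing Thoughts by one unit multiplies the odds of taking the product by a factor of 2.07. Neither reading in your question is quite right: the ratio refers to a one-unit change in Thoughts, not a .01 change, and it acts multiplicatively on the odds rather than adding units. For a .01 increase, the odds are multiplied by 2.07^0.01, which is about 1.007.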
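
If you want the odds ratio for a change other than one unit, a small helper along these lines may be useful. It assumes only the odds ratio of 2.07 reported in the question; the function name or_for_delta is hypothetical.

```r
# Odds ratio for a one-unit increase in Thoughts, as reported in the question
or_one_unit <- 2.07

# Recover the coefficient on the log-odds scale
b <- log(or_one_unit)

# Odds ratio for a change of `delta` units is exp(b * delta)
# (or_for_delta is a hypothetical helper, not part of any package)
or_for_delta <- function(delta) exp(b * delta)

print(or_for_delta(0.01))  # multiplicative change in odds for a 0.01 increase
print(or_for_delta(1))     # recovers the one-unit odds ratio
print(or_for_delta(-1))    # a one-unit decrease divides the odds by 2.07
```

Because the model is linear on the log-odds scale, the odds ratio for any increment is the one-unit odds ratio raised to the power of that increment.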

How do I convert the odds ratio of Thoughts to an estimated probability of Decision?

Because the change in probability is not constant over the range of x values, as shown in the plot above, you have to do this for selected values of Thoughts. To get the estimated probability at a particular value of Thoughts, compute:

exp(intercept + coef*THOUGHT_Value)/(1 + exp(intercept + coef*THOUGHT_Value))
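
In practice you can let predict() do this for you. The sketch below is illustrative only: the data are simulated (the true intercept -0.5 and slope 0.7 are assumptions of mine, not your values), and the names dat, new_dat, p_manual and p_predict are hypothetical.

```r
set.seed(1)  # simulated data for illustration only; not the asker's data

# Hypothetical continuous predictor and binary outcome
n <- 200
Thoughts <- round(rnorm(n), 2)
Decision <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.7 * Thoughts))
dat <- data.frame(Decision, Thoughts)

results <- glm(Decision ~ Thoughts, family = binomial, data = dat)

# Values of Thoughts at which to estimate the probability
new_dat <- data.frame(Thoughts = c(-1, 0, 1))

# Manual calculation from the fitted coefficients
b <- coef(results)
lin_pred <- b[1] + b[2] * new_dat$Thoughts
p_manual <- exp(lin_pred) / (1 + exp(lin_pred))

# Same thing via predict(); type = "response" returns probabilities
p_predict <- predict(results, newdata = new_dat, type = "response")

print(cbind(new_dat, p_manual, p_predict))
```

The two probability columns should match; with your own model you would replace dat with your data and choose the Thoughts values you care about.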

Hope this helps.


answered Apr 4, 2022 by Nandini
• 5,480 points
