The first, third, and fourth methods of visualizing the results of a logistic regression are all borrowed from the code for Chapter 5 of Gelman and Hill's Data Analysis Using Regression and Multilevel/Hierarchical Models. The code below uses some functions from the arm package, which accompanies the book (for example, invlogit):
library(arm)
Plot the probability of Direction="Up" across a range of Lag1 values, for three different values of Volume (the other Lag variables are set to 0, but you can change them if you want):
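# Function to jitter a two-level class variable coded as 1/2
# (e.g. as.numeric() of a factor): level 1 is spread over [0, jitt],
# level 2 over [1-jitt, 1], so overlapping points become visible.
jitter.binary <- function(a, jitt=.05){
  ifelse(a-1==0, runif(length(a), 0, jitt), runif(length(a), 1-jitt, 1))
}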
# Sequence of Lag1 values for plotting
x = seq(-5,5.7,length.out=100)
# Plot jittered Direction vs. Lag 1. This shows the actual distribution of the data.
with(Smarket, plot(Lag1, jitter.binary(as.numeric(Direction)), pch=16,cex=0.7,
ylab="Pr(Up)", xlab="Lag 1"))
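# Add model prediction curves. These show the probability of Direction="Up" vs. Lag 1
# for three different fixed values of Volume. The cbind() columns must line up with
# coef(glm.fit); the order assumed here is intercept, Lag1, Lag2 to Lag5, Volume.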
curve(expr=invlogit(cbind(1, x,0,0,0,0,1.48) %*% coef(glm.fit)),
from=-5, to=5.7, lwd=.5, add=TRUE)
curve(expr=invlogit(cbind(1, x, 0,0,0,0, 0.36) %*% coef(glm.fit)),
from=-5, to=5.7, lwd=.5, add=TRUE, col="red", lty=2)
curve(expr=invlogit(cbind(1, x, 0,0,0,0, 3.15) %*% coef(glm.fit)),
from=-5, to=5.7, lwd=.5, add=TRUE, col="blue", lty=2)
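The matrix products above only give the right answer when the cbind() columns are in the same order as the model coefficients. One way to cross-check this is to compute the same probabilities with predict() and a newdata grid, which matches columns by name. The sketch below is only an illustration under stated assumptions: it uses simulated data rather than Smarket (the data frame d, its coefficients, and the model m are made up), and it uses base R's plogis(), which computes the same inverse logit as arm's invlogit().

```r
# Simulated stand-in data; names and true coefficients are illustrative only.
set.seed(1)
n <- 500
d <- data.frame(Lag1 = rnorm(n), Lag2 = rnorm(n), Volume = runif(n, 0.4, 3))
p_true <- plogis(0.1 - 0.5 * d$Lag1 + 0.3 * d$Volume)
d$Direction <- factor(ifelse(runif(n) < p_true, "Up", "Down"))

# Levels are c("Down", "Up"), so glm() models Pr(Direction == "Up").
m <- glm(Direction ~ Lag1 + Lag2 + Volume, data = d, family = binomial)

# Grid: vary Lag1, hold Lag2 at 0 and Volume at one fixed value.
x <- seq(-3, 3, length.out = 100)
grid <- data.frame(Lag1 = x, Lag2 = 0, Volume = 1.5)

# Approach 1: design matrix times coefficients, then inverse logit.
p_manual <- as.vector(plogis(cbind(1, x, 0, 1.5) %*% coef(m)))

# Approach 2: let predict() build the design matrix from named columns.
p_predict <- as.vector(predict(m, newdata = grid, type = "response"))

# The two approaches should agree up to floating-point error.
all.equal(p_manual, p_predict)
```

With predict(), adding or reordering predictors in the model does not silently break the plot, which is the main reason to prefer it once the model has more than a couple of terms.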
Plot Lag1 vs. Volume, color the points by their Direction value, and add the decision boundary. Because the decision boundary for your actual regression is a five-dimensional hyperplane, I've fit a new regression with just these two predictors. (For models with more predictors, you can still draw the decision boundary in two dimensions by taking 2D slices through the multidimensional predictor space.)
# New regression model
fit2 = glm(Direction ~ Lag1 + Volume , data = Smarket, family=binomial)
# Probability of Direction="Up" for this model
Smarket$Pred2 = predict(fit2, type="response")
# Set Prediction to "Up" for probability > 0.5; "Down" otherwise.
Smarket$PredCat2 = cut(Smarket$Pred2, c(0,0.5,1), include.lowest=TRUE, labels=c("Down","Up"))
# Graph Lag1 vs. Volume with coloring and point-style based on value
# of Direction
with(Smarket, plot(Lag1, Volume, pch=ifelse(Direction=="Down", 3, 1),
col=ifelse(Direction=="Down", "red", "blue"), cex=0.6))
# Add the decision boundary
curve(expr= -(cbind(1, x) %*% coef(fit2)[1:2])/coef(fit2)[3],
from=-5,to=5.7, add=TRUE)
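The boundary expression comes from setting the linear predictor to zero: Pr(Up) = 0.5 exactly when b0 + b1*Lag1 + b2*Volume = 0, so Volume = -(b0 + b1*Lag1)/b2. A quick sanity check is to confirm that points on that line get a predicted probability of 0.5. The sketch below does this on simulated data (the data frame d and the models m2 and m3 are made up for illustration, not the Smarket fits). It also shows one possible way to take a 2D slice of a larger model: hold the extra predictor at a fixed value (lag2_fixed, an arbitrary choice) and fold its contribution into the intercept.

```r
# Simulated stand-in data; names and true coefficients are illustrative only.
set.seed(2)
n <- 500
d <- data.frame(Lag1 = rnorm(n), Lag2 = rnorm(n), Volume = runif(n, 0.4, 3))
p_true <- plogis(1 - 0.8 * d$Lag1 + 0.4 * d$Lag2 - 0.6 * d$Volume)
d$Direction <- factor(ifelse(runif(n) < p_true, "Up", "Down"))

lag1 <- seq(-2, 2, length.out = 50)

# Two-predictor model: boundary is Volume = -(b0 + b1*Lag1) / b2.
m2 <- glm(Direction ~ Lag1 + Volume, data = d, family = binomial)
b <- unname(coef(m2))
vol_boundary <- -(b[1] + b[2] * lag1) / b[3]
p_on_line <- predict(m2, newdata = data.frame(Lag1 = lag1, Volume = vol_boundary),
                     type = "response")

# Three-predictor model: slice at a fixed Lag2 value, which shifts the intercept.
m3 <- glm(Direction ~ Lag1 + Lag2 + Volume, data = d, family = binomial)
b3 <- coef(m3)
lag2_fixed <- 0.5
vol_slice <- -(b3[["(Intercept)"]] + b3[["Lag1"]] * lag1 +
               b3[["Lag2"]] * lag2_fixed) / b3[["Volume"]]
p_slice <- predict(m3, newdata = data.frame(Lag1 = lag1, Lag2 = lag2_fixed,
                                            Volume = vol_slice),
                   type = "response")

# Both sets of points lie on a decision boundary, so both should be 0.5.
all.equal(as.vector(p_on_line), rep(0.5, 50))
all.equal(as.vector(p_slice), rep(0.5, 50))
```

Each choice of the fixed value gives a different, parallel boundary line in the Lag1/Volume plane, so a slice only describes observations near that value of the held-out predictor.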
Hope this helps.