interact_plot.Rd
interact_plot plots regression lines at user-specified levels of a
moderator variable to explore interactions. The plotting is done with
ggplot2 rather than base graphics, which some similar functions use.
interact_plot(
  model,
  pred,
  modx,
  modx.values = NULL,
  mod2 = NULL,
  mod2.values = NULL,
  centered = "all",
  data = NULL,
  at = NULL,
  plot.points = FALSE,
  interval = FALSE,
  int.type = c("confidence", "prediction"),
  int.width = 0.95,
  outcome.scale = "response",
  linearity.check = FALSE,
  facet.modx = FALSE,
  robust = FALSE,
  cluster = NULL,
  vcov = NULL,
  set.offset = 1,
  x.label = NULL,
  y.label = NULL,
  pred.labels = NULL,
  modx.labels = NULL,
  mod2.labels = NULL,
  main.title = NULL,
  legend.main = NULL,
  colors = NULL,
  line.thickness = 1,
  vary.lty = TRUE,
  point.size = 1.5,
  point.shape = FALSE,
  jitter = 0,
  rug = FALSE,
  rug.sides = "b",
  partial.residuals = FALSE,
  point.alpha = 0.6,
  color.class = NULL,
  ...
)
model  A regression model. The function is tested with 

pred  The name of the predictor variable involved
in the interaction. This can be a bare name or string. Note that it
is evaluated using 
modx  The name of the moderator variable involved
in the interaction. This can be a bare name or string. The same

modx.values  For which values of the moderator should lines be plotted? There are two basic options:
Default is If the moderator is a factor variable and 
mod2  Optional. The name of the second moderator
variable involved in the interaction. This can be a bare name or string.
The same 
mod2.values  For which values of the second moderator should the plot
be facetted? That is, there will be a separate plot for each level of this
moderator. Defaults are the same as 
centered  A vector of quoted variable names that are to be
mean-centered. If 
data  Optional, default is NULL. You may provide the data used to
fit the model. This can be a better way to get mean values for centering
and can be crucial for models with variable transformations in the formula
(e.g., 
at  If you want to manually set the values of other variables in the model, do so by providing a named list where the names are the variables and the list values are vectors of the values. This can be useful especially when you are exploring interactions or other conditional predictions. 
plot.points  Logical. If 
interval  Logical. If 
int.type  Type of interval to plot. Options are "confidence" or "prediction". Default is confidence interval. 
int.width  How large should the interval be, relative to the standard error? The default, .95, corresponds to roughly 1.96 standard errors and a .05 alpha level for values outside the range. In other words, for a confidence interval, .95 is analogous to a 95% confidence interval. 
outcome.scale  For nonlinear models (i.e., GLMs), should the outcome
variable be plotted on the link scale (e.g., log odds for logit models) or
the original scale (e.g., predicted probabilities for logit models)? The
default is 
linearity.check  For two-way interactions only. If 
facet.modx  Create separate panels for each level of the moderator?
Default is FALSE, except when 
robust  Should robust standard errors be used to find confidence
intervals for supported models? Default is FALSE, but you should specify
the type of sandwich standard errors if you'd like to use them (i.e.,

cluster  For clustered standard errors, provide the column name of the cluster variable in the input data frame (as a string). Alternately, provide a vector of clusters. 
vcov  Optional. You may supply the variance-covariance matrix of the coefficients yourself. This is useful if you are using some method for robust standard error calculation not supported by the sandwich package. 
set.offset  For models with an offset (e.g., Poisson models), sets an offset for the predicted values. All predicted values will have the same offset. By default, this is set to 1, which makes the predicted values a proportion. See details for more about offset support. 
x.label  A character object specifying the desired x-axis label. If

y.label  A character object specifying the desired y-axis label. If

pred.labels  A character vector of 2 labels for the predictor if it is
a 2-level factor or a continuous variable with only 2 values. If

modx.labels  A character vector of labels for each level of the
moderator values, provided in the same order as the 
mod2.labels  A character vector of labels for each level of the 2nd
moderator values, provided in the same order as the 
main.title  A character object that will be used as an overall title
for the plot. If 
legend.main  A character object that will be used as the title that
appears above the legend. If 
colors  See jtools_colors for details on the types of arguments
accepted. Default is "CUD Bright" for factor
moderators, "Blues" for +/- SD and user-specified 
line.thickness  How thick should the plotted lines be? Default is 1. 
vary.lty  Should the resulting plot have different line types for each
line in addition to colors? Defaults to 
point.size  What size should be used for observed data when

point.shape  For plotted points, either of observed data or predicted values with the "point" or "line" geoms, should the shape of the points vary by the values of the factor? This is especially useful if you aim for black-and-white printing or colorblind-friendly output. 
jitter  How much should 
rug  Show a rug plot in the margins? This uses 
rug.sides  On which sides should rug plots appear? Default is "b", meaning bottom. "t" and/or "b" show the distribution of the predictor while "l" and/or "r" show the distribution of the response. "bl" is a good option to show both the predictor and response. 
partial.residuals  Instead of plotting the observed data, you may plot
the partial residuals (controlling for the effects of variables besides

point.alpha  What should the 
color.class  Deprecated. Now known as 
...  extra arguments passed to 
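As a rough illustration of the int.width description above, the following base R sketch converts an interval width into the corresponding multiplier of the standard error under a normal approximation. The variable names are illustrative, and the use of the normal distribution is an assumption here; the critical value the function actually applies may differ by model type.

```r
# Convert an interval width (e.g., .95) into a two-sided normal critical value.
int_width <- 0.95
alpha <- 1 - int_width        # .05 alpha level for a 95% interval
z <- qnorm(1 - alpha / 2)     # upper-tail quantile of the standard normal
round(z, 2)                   # 1.96

# A narrower 80% interval (int.width = .8) uses a smaller multiplier.
z80 <- qnorm(1 - (1 - 0.8) / 2)
```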
The function returns a ggplot object, which can be treated
like a user-created plot and expanded upon as such.
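Because the return value is a ggplot object, ordinary ggplot2 layers can be added to it. The sketch below assumes that interact_plot is available from the interactions package and that ggplot2 is installed; the model and the title text are illustrative only.

```r
# Assumption: the 'interactions' and 'ggplot2' packages are installed.
if (requireNamespace("interactions", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  states <- as.data.frame(state.x77)
  fit <- lm(Income ~ Murder * Illiteracy, data = states)
  p <- interactions::interact_plot(fit, pred = Murder, modx = Illiteracy)

  # Add a title and swap the theme, as with any user-created ggplot.
  p2 <- p +
    ggplot2::labs(title = "Income by murder rate and illiteracy") +
    ggplot2::theme_minimal()
}
```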
This function provides a means for plotting conditional effects for the purpose of exploring interactions in regression models.
The function is designed for two- and three-way interactions. For additional terms, the effects package may be better suited to the task.
This function supports nonlinear and generalized linear models and by
default will plot them on their original scale
(outcome.scale = "response"). To plot them on the linear scale,
use "link" for outcome.scale.
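To make the difference between the two scales concrete, here is a minimal base R sketch with a logistic regression fit to simulated data. It uses predict() directly rather than interact_plot, and the data and variable names are hypothetical.

```r
set.seed(1)
n <- 200
x <- rnorm(n)
m <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 * x + 0.3 * m + 0.4 * x * m))
fit <- glm(y ~ x * m, family = binomial)

# Link scale: predictions are log odds and are unbounded.
eta <- predict(fit, type = "link")
# Response scale: predictions are probabilities between 0 and 1.
prob <- predict(fit, type = "response")

# The inverse logit maps the link scale onto the response scale.
all.equal(unname(plogis(eta)), unname(prob))
```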
While mixed effects models from lme4 are supported, only the fixed
effects are plotted. lme4 does not provide confidence intervals,
so they are not supported with this function either.
Note: to use transformed predictors, e.g., log(variable),
put its name in quotes or backticks in the argument.
Details on how observed data are split in multi-pane plots:
If you set plot.points = TRUE and request a multi-pane (facetted) plot
either with a second moderator, linearity.check = TRUE, or
facet.modx = TRUE, the observed
data are split into as many groups as there are panes and plotted
separately. If the moderator is a factor, then the way this happens will
be very intuitive since it's obvious which values go in which pane. The
rest of this section will address the case of continuous moderators.
My recommendation is that you use modx.values = "terciles" or
mod2.values = "terciles" when you want to plot observed data on
multi-pane plots. When you do, the data are split into three approximately
equal-sized groups with the lowest third, middle third, and highest third
of the data split accordingly. You can replicate this procedure using
Hmisc::cut2() with g = 3 from the Hmisc package. Sometimes, the
groups will not be equal in size because the number of observations is
not divisible by 3 and/or there are multiple observations with the same
value at one of the cut points.
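A tercile split of this kind can be approximated in base R as sketched below. This is not the function's internal code, and Hmisc::cut2() may treat ties and boundaries differently; the moderator here is simulated.

```r
set.seed(42)
moderator <- rnorm(100)   # hypothetical continuous moderator

# Cut points at the 1/3 and 2/3 quantiles, bounded by the observed range.
breaks <- quantile(moderator, probs = c(0, 1/3, 2/3, 1))
groups <- cut(moderator, breaks = breaks, include.lowest = TRUE,
              labels = c("Lower tercile", "Middle tercile", "Upper tercile"))

# Group sizes are only approximately equal because 100 is not divisible by 3.
table(groups)
```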
Otherwise, a more ad hoc procedure is used to split the data. Quantiles
are found for each mod2.values or modx.values value. These are not the
quantiles used to split the data, however, since we want the plotted lines
to represent the slope at a typical value in the group. The next step,
then, is to take the mean of each pair of neighboring quantiles and use
these as the cut points.
For example, if the mod2.values are at the 25th, 50th, and 75th
percentiles of the distribution of the moderator, the data will be split
at the 37.5th and 62.5th percentiles. When the variable is
normally distributed, this will correspond fairly closely to using
terciles.
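The midpoint rule can be sketched in base R as follows. This assumes, based on the 37.5th/62.5th percentile example, that the averaging is done on percentile ranks; the function's actual implementation may differ, and arbitrary user-supplied values would first need to be converted to percentile ranks. The moderator is simulated.

```r
set.seed(42)
moderator <- rnorm(500)   # hypothetical continuous moderator

# Values at which lines are drawn: the 25th, 50th, and 75th percentiles.
line_probs <- c(0.25, 0.50, 0.75)
line_values <- quantile(moderator, probs = line_probs)

# Average each pair of neighboring percentile ranks to get the cut points.
cut_probs <- (head(line_probs, -1) + tail(line_probs, -1)) / 2
cut_probs   # 0.375 0.625

# Split the observed data at the corresponding quantiles.
breaks <- c(-Inf, quantile(moderator, probs = cut_probs), Inf)
groups <- cut(moderator, breaks = breaks)
table(groups)
```

Each plotted value then falls inside its own group rather than on a boundary between groups.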
Info about offsets:
Offsets are partially supported by this function with important
limitations. First of all, only a single offset per model is supported.
Second, it is best in general to specify offsets with the offset argument
of the model fitting function rather than in the formula. You are much
more likely to have success if you provide the data used to fit the model
with the data argument.
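The two ways of specifying an offset can be compared with a small base R sketch. The data are simulated and the variable names (exposure, fit_arg, fit_formula) are hypothetical; the point is only that the offset argument yields the same fitted model as the formula version.

```r
set.seed(1)
dat <- data.frame(x = rnorm(100), exposure = runif(100, 1, 10))
dat$y <- rpois(100, lambda = dat$exposure * exp(0.2 + 0.5 * dat$x))

# Recommended: offset supplied through the model-fitting function's argument.
fit_arg <- glm(y ~ x, family = poisson, offset = log(exposure), data = dat)

# Alternative: offset written into the formula.
fit_formula <- glm(y ~ x + offset(log(exposure)), family = poisson, data = dat)

# Both specifications estimate the same coefficients.
all.equal(coef(fit_arg), coef(fit_formula))
```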
Bauer, D. J., & Curran, P. J. (2005). Probing interactions in fixed and multilevel regression: Inferential and graphical techniques. Multivariate Behavioral Research, 40(3), 373-400. http://dx.doi.org/10.1207/s15327906mbr4003_5
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analyses for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Hainmueller, J., Mummolo, J., & Xu, Y. (2016). How much should we trust estimates from multiplicative interaction models? Simple tools to improve empirical practice. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.2739221
plotSlopes from rockchalk performs a similar function, but
with R's base graphics; this function is meant, in part, to emulate
its features.
sim_slopes performs a simple slopes analysis with a similar
argument syntax to this function.
# Using a fitted lm model
states <- as.data.frame(state.x77)
states$HSGrad <- states$`HS Grad`
fit <- lm(Income ~ HSGrad + Murder * Illiteracy, data = states)
interact_plot(model = fit, pred = Murder, modx = Illiteracy)

# Using interval feature
fit <- lm(accel ~ mag * dist, data = attenu)
interact_plot(fit, pred = mag, modx = dist, interval = TRUE,
              int.type = "confidence", int.width = .8)
#> Warning: 16.5667658200318 is outside the observed range of dist

# Using second moderator
fit <- lm(Income ~ HSGrad * Murder * Illiteracy, data = states)
interact_plot(model = fit, pred = Murder, modx = Illiteracy, mod2 = HSGrad)

# With svyglm
if (requireNamespace("survey")) {
  library(survey)
  data(api)
  dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                      data = apistrat, fpc = ~fpc)
  regmodel <- svyglm(api00 ~ ell * meals, design = dstrat)
  interact_plot(regmodel, pred = ell, modx = meals)
}

# With lme4
if (FALSE) {
  library(lme4)
  data(VerbAgg)
  mv <- glmer(r2 ~ Anger * mode + (1 | item), data = VerbAgg,
              family = binomial, control = glmerControl("bobyqa"))
  interact_plot(mv, pred = Anger, modx = mode)
}