cat_plot.Rd
cat_plot is a complementary function to interact_plot() that is designed
for plotting interactions when both predictor and moderator(s) are
categorical (or, in R terms, factors).
cat_plot(
  model, pred, modx = NULL, mod2 = NULL, data = NULL,
  geom = c("point", "line", "bar"),
  pred.values = NULL, modx.values = NULL, mod2.values = NULL,
  interval = TRUE, plot.points = FALSE, point.shape = FALSE,
  vary.lty = FALSE, centered = "all",
  int.type = c("confidence", "prediction"), int.width = 0.95,
  line.thickness = 1.1, point.size = 1.5, pred.point.size = 3.5,
  jitter = 0.1, geom.alpha = NULL, dodge.width = NULL,
  errorbar.width = NULL, interval.geom = c("errorbar", "linerange"),
  outcome.scale = "response", robust = FALSE, cluster = NULL, vcov = NULL,
  pred.labels = NULL, modx.labels = NULL, mod2.labels = NULL,
  set.offset = 1, x.label = NULL, y.label = NULL, main.title = NULL,
  legend.main = NULL, colors = "CUD Bright", partial.residuals = FALSE,
  point.alpha = 0.6, color.class = NULL, ...
)
model  A regression model. The function is tested with 

pred  A categorical predictor variable that will appear on the x-axis.
Note that it is evaluated using 
modx  A categorical moderator variable. 
mod2  For three-way interactions, the second categorical moderator. 
data  Optional, default is NULL. You may provide the data used to
fit the model. This can be a better way to get mean values for centering
and can be crucial for models with variable transformations in the formula
(e.g., 
geom  What type of plot should this be? There are several options here since the best way to visualize categorical interactions varies by context. Here are the options:

pred.values  Which values of the predictor should be included in the plot? By default, all levels are included. 
modx.values  For which values of the moderator should lines be plotted?
Default is If the moderator is a factor variable and 
mod2.values  For which values of the second moderator should the plot
be faceted? That is, there will be a separate plot for each level of this
moderator. Defaults are the same as 
interval  Logical. If 
plot.points  Logical. If 
point.shape  For plotted points (either of observed data or of predicted values with the "point" or "line" geoms), should the shape of the points vary by the values of the factor? This is especially useful if you plan to print in black and white or want the plot to be colorblind-friendly. 
vary.lty  Should the resulting plot use a different line type for each
line in addition to colors? Defaults to 
centered  A vector of quoted variable names that are to be
mean-centered. If 
int.type  Type of interval to plot. Options are "confidence" or "prediction". Default is confidence interval. 
int.width  How large should the interval be, relative to the standard error? The default, .95, corresponds to roughly 1.96 standard errors and a .05 alpha level for values outside the range. In other words, for a confidence interval, .95 is analogous to a 95% confidence interval. 
line.thickness  How thick should the plotted lines be? Default is 1.1. 
point.size  What size should be used for observed data when

pred.point.size  If TRUE and 
jitter  How much should 
geom.alpha  What should the alpha aesthetic be for the plotted
lines/bars? Default is NULL, which means it is set depending on the value
of 
dodge.width  What should the 
errorbar.width  How wide should the error bars be? Default is NULL,
meaning it is set depending on the value 
interval.geom  For categorical by categorical interactions.
One of "errorbar" or "linerange". If the former,

outcome.scale  For nonlinear models (i.e., GLMs), should the outcome
variable be plotted on the link scale (e.g., log odds for logit models) or
the original scale (e.g., predicted probabilities for logit models)? The
default is 
robust  Should robust standard errors be used to find confidence
intervals for supported models? Default is FALSE, but you should specify
the type of sandwich standard errors if you'd like to use them (i.e.,

cluster  For clustered standard errors, provide the column name of the cluster variable in the input data frame (as a string). Alternately, provide a vector of clusters. 
vcov  Optional. You may supply the variance-covariance matrix of the coefficients yourself. This is useful if you are using some method for robust standard error calculation not supported by the sandwich package. 
pred.labels  A character vector of equal length to the number of
factor levels of the predictor (or number specified in 
modx.labels  A character vector of labels for each level of the
moderator values, provided in the same order as the 
mod2.labels  A character vector of labels for each level of the 2nd
moderator values, provided in the same order as the 
set.offset  For models with an offset (e.g., Poisson models), sets an offset for the predicted values. All predicted values will have the same offset. By default, this is set to 1, which makes the predicted values a proportion. See details for more about offset support. 
x.label  A character object specifying the desired x-axis label. If

y.label  A character object specifying the desired y-axis label. If

main.title  A character object that will be used as an overall title
for the plot. If 
legend.main  A character object that will be used as the title that
appears above the legend. If 
colors  Any palette argument accepted by

partial.residuals  Instead of plotting the observed data, you may plot
the partial residuals (controlling for the effects of variables besides

point.alpha  What should the 
color.class  Deprecated. Now known as 
...  extra arguments passed to 
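The int.width description above relates an interval width of .95 to roughly 1.96 standard errors. As a quick check of that arithmetic, the sketch below computes the two-sided critical value under a normal approximation. This is an illustration only; it is an assumption that the normal distribution is the relevant reference here, and the function itself may use a different one (for example, a t distribution).

```r
# Two-sided critical value for a given interval width,
# assuming a normal approximation (an assumption, see above).
int.width <- 0.95
alpha <- 1 - int.width        # probability left outside the interval
z <- qnorm(1 - alpha / 2)     # standard errors on either side of the estimate
z                             # approximately 1.96
```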
The function returns a ggplot object, which can be treated
like a user-created plot and expanded upon as such.
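Because the return value is an ordinary ggplot object, standard ggplot2 layers can be added to it. The following is a minimal sketch, assuming cat_plot() is loaded from the interactions package (an assumption; this page does not name the package) and reusing the diamonds model from the examples. The title and axis label text are illustrative.

```r
library(ggplot2)
library(interactions)  # assumption: the package that provides cat_plot()

fit <- lm(price ~ cut * color, data = diamonds)

# Store the plot instead of printing it
p <- cat_plot(fit, pred = color, modx = cut, interval = TRUE)

# Extend it like any other ggplot object
p2 <- p +
  theme_minimal() +
  labs(title = "Predicted price by color and cut", y = "Predicted price")

print(p2)
```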
This function provides a means for plotting conditional effects
for the purpose of exploring interactions in the context of regression.
You must have the ggplot2 package installed to benefit from these
plotting functions.
The function is designed for two- and three-way interactions. For
additional terms, the effects package may be better suited to the task.
This function supports nonlinear and generalized linear models and by
default will plot them on their original scale
(outcome.scale = "response").
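To illustrate the two outcome scales, the sketch below fits a logistic regression and plots it once on the response scale (predicted probabilities) and once on the link scale (log odds). It assumes cat_plot() comes from the interactions package; the binary outcome `expensive` is a hypothetical variable constructed here purely for illustration.

```r
library(ggplot2)
library(interactions)  # assumption: the package that provides cat_plot()

# Hypothetical binary outcome: is the diamond priced above the median?
d <- as.data.frame(diamonds)
d$expensive <- as.integer(d$price > median(d$price))

fit_glm <- glm(expensive ~ cut * color, data = d, family = binomial)

# Default: predicted probabilities (outcome.scale = "response")
p_resp <- cat_plot(fit_glm, pred = color, modx = cut)

# Link scale: log odds
p_link <- cat_plot(fit_glm, pred = color, modx = cut, outcome.scale = "link")
```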
While mixed effects models from lme4 are supported, only the fixed
effects are plotted. lme4 does not provide confidence intervals,
so they are not supported with this function either.
Note: to use transformed predictors, e.g., log(variable), provide only
the variable name to pred, modx, or mod2 and supply the original data
separately to the data argument.
Info about offsets:
Offsets are partially supported by this function with important
limitations. First of all, only a single offset per model is supported.
Second, it is best in general to specify offsets with the offset argument
of the model fitting function rather than in the formula. You are much
more likely to have success if you provide the data used to fit the model
with the data argument.
library(ggplot2)
fit <- lm(price ~ cut * color, data = diamonds)
cat_plot(fit, pred = color, modx = cut, interval = TRUE)

# 3-way interaction

## Will first create a couple dichotomous factors to ensure full rank
mpg2 <- mpg
mpg2$auto <- "auto"
mpg2$auto[mpg2$trans %in% c("manual(m5)", "manual(m6)")] <- "manual"
mpg2$auto <- factor(mpg2$auto)
mpg2$fwd <- "2wd"
mpg2$fwd[mpg2$drv == "4"] <- "4wd"
mpg2$fwd <- factor(mpg2$fwd)
## Drop the few cars with 5 cylinders (rest are 4, 6, or 8)
mpg2 <- mpg2[mpg2$cyl != "5", ]
mpg2$cyl <- factor(mpg2$cyl)
## Fit the model
fit3 <- lm(cty ~ cyl * fwd * auto, data = mpg2)

# The line geom looks good for an ordered factor predictor
cat_plot(fit3, pred = cyl, modx = fwd, mod2 = auto, geom = "line",
         interval = TRUE)