mpred: Generic Predictive Margins

Many dependent variables of interest in the study of political behaviour and opinion formation are categorical. Statistical models for such dependent variables generally pose a challenge for interpretation (apart from the fact that estimation is usually more difficult than in models for numeric dependent variables). Because of this difficulty of interpretation one may want to resort to graphics in which predicted values of the dependent variable are plotted against values of one or more independent variables of interest. The problem with such plots based on models for categorical responses is that the pattern of dependence between the dependent variable and the independent variable of interest is usually non-linear and typically also depends on the values of other independent variables that may not be of interest. E.g. in a logistic regression of a binary dependent variable Y on independent variables X_1 and X_2,

\Pr(Y=1|X_1=x_1,X_2=x_2)=\frac{\exp(\beta_0+\beta_1x_1+\beta_2x_2)}{1+\exp(\beta_0+\beta_1x_1+\beta_2x_2)}

the change in the probability of a positive outcome associated with a unit change in X_1,

\Pr(Y=1|X_1=x_1,X_2=x_2)-\Pr(Y=1|X_1=x_1+1,X_2=x_2)

is not constant (as it would be in a linear regression), but depends on the particular value of x_1. Furthermore, it also depends on the particular value of x_2. This variation in the unit change, when identified with the “unit effect” of X_1 on Y, has led various authors to claim that the presence of interaction terms (e.g. x_1x_2) in a logistic regression model (or another model for categorical dependent variables) is neither a necessary nor a sufficient condition for the existence of interaction effects (citation coming soon). A way out of the ensuing complications is to focus instead on the expectation or the average of this unit change:

\sum_z\Pr(Y=1|X_1=x_1,X_2=z)f(z)-\sum_z\Pr(Y=1|X_1=x_1+1,X_2=z)f(z)

where f(z) is either the density function or the probability mass function of X_2, or the empirical distribution of X_2. If the empirical distribution is used, then this difference is the difference between what are also called the predictive margins of Y for X_1=x_1 and X_1=x_1+1. A predictive margin for X_1=x_1 from the logistic regression model under discussion is defined as

\frac{1}{n}\sum_{i=1}^{n}\Pr(Y=1|X_1=x_1,X_2=z_i)

where n is the sample size and z_1,\ldots,z_n are the sample values of X_2. There is an apparent relation between a predictive margin and the “do-operator” in the terminology of Judea Pearl, which is defined in the present context (if X_2 is discrete) as

\Pr(Y=1|do(X_1=x_1)) := \sum_z\Pr(Y=1|X_1=x_1,X_2=z)\Pr(X_2=z)

where the sum is over the range of X_2 so that \sum_z\Pr(X_2=z)=1. A predictive margin could be interpreted as an estimate of \Pr(Y=1|do(X_1=x_1)).
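
To make these quantities concrete, the following R sketch fits a logistic regression with glm() on simulated data and computes both the unit change at different covariate values and predictive margins by averaging over the sample values of X_2. The data and all object names are purely illustrative, and the computations are done “by hand” rather than with the mpred package.

set.seed(42)
n <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.0*x1 + 0.8*x2))
dat <- data.frame(y = y, x1 = x1, x2 = x2)

fit <- glm(y ~ x1 + x2, family = binomial, data = dat)

# predicted probability of a positive outcome at given values of x1 and x2
p <- function(x1, x2)
  predict(fit, newdata = data.frame(x1 = x1, x2 = x2), type = "response")

# the difference in Pr(Y=1) between X_1 = x_1 and X_1 = x_1 + 1 is not constant:
p(0, 0) - p(1, 0)
p(1, 0) - p(2, 0)   # a different value of x_1 gives a different unit change
p(0, 1) - p(1, 1)   # and so does a different value of x_2

# predictive margin for X_1 = x_1: average the predicted probability
# over the sample values of X_2
pred_margin <- function(x1)
  mean(predict(fit, newdata = data.frame(x1 = x1, x2 = dat$x2),
               type = "response"))

pred_margin(0) - pred_margin(1)   # average unit change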

This package provides a generic function to compute predictive margins from fitted models with given covariate settings. All classes of model objects can be used that have a predict() method accepting a newdata= argument. The package is not (yet) available from CRAN, but only from GitHub.
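
The basic mechanism behind such a generic function can be sketched as follows: the focal variable is set to a fixed value in a copy of the data, all other variables are kept at their observed values, and the predictions obtained via predict() with a suitable newdata= argument are averaged. The helper function below is only an illustration of this idea, not the actual interface of mpred.

# illustrative sketch only, not the actual interface of the mpred package
pred_margin_generic <- function(object, data, setting, ...) {
  nd <- data
  nd[names(setting)] <- setting   # fix the focal covariate(s) at the given values
  mean(predict(object, newdata = nd, ...))
}

# reusing 'fit' and 'dat' from the example above
pred_margin_generic(fit, dat, setting = list(x1 = 1), type = "response")

Averaging over the observed rows of the data corresponds to using the empirical distribution of the remaining covariates, as in the definition of the predictive margin above.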

If you have the devtools package installed, you can install mpred with:

library(devtools)
install_github("melff/mpred/pkg")
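
After installation, the package can be loaded in the usual way:

library(mpred)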