mclogit: Multinomial Logit Models, with or without Random Effects or Overdispersion¶
The package ‘mclogit’ allows the estimation of the two main varieties of multinomial logit models: baseline-category logit models and conditional logit models. It is published on CRAN. Development occurs on GitHub, where both releases and the development tree can be found.
Baseline-category logit models¶
Multinomial baseline-category logit models are a generalisation of logistic
regression that makes it possible to model not only binary or dichotomous
responses but also polychotomous responses. In addition, they can be used to
model responses in the form of counts that have a pre-determined sum. These
models are described in Agresti (2002). Estimating these models is also
supported by the function
multinom() in the R package “nnet” (Venables and Ripley 2002).
In the package “mclogit”, the function to estimate these models is called
mblogit() (see the relevant manual page), which uses
the infrastructure for estimating conditional logit models, exploiting the fact
that baseline-category logit models can be re-expressed as conditional logit
models.
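Baseline-category logit models are constructed as follows. Suppose a categorical dependent variable or response with categories $j=1,\ldots,q$ is observed for individuals $i=1,\ldots,n$. Let $\pi_{ij}$ denote the probability that the value of the dependent variable for individual $i$ is equal to $j$, then the baseline-category logit model takes the form:
$$
\pi_{ij} =
\begin{cases}
\dfrac{\exp(\alpha_{j0}+\alpha_{j1}x_{1i}+\cdots+\alpha_{jr}x_{ri})}{1+\sum_{s=2}^{q}\exp(\alpha_{s0}+\alpha_{s1}x_{1i}+\cdots+\alpha_{sr}x_{ri})} & \text{for } j>1 \\[2ex]
\dfrac{1}{1+\sum_{s=2}^{q}\exp(\alpha_{s0}+\alpha_{s1}x_{1i}+\cdots+\alpha_{sr}x_{ri})} & \text{for } j=1
\end{cases}
$$
where the first category ($j=1$) is the baseline category.
Equivalently, the model can be expressed in terms of log-odds, relative to the baseline category:
$$
\ln\frac{\pi_{ij}}{\pi_{i1}} = \alpha_{j0}+\alpha_{j1}x_{1i}+\cdots+\alpha_{jr}x_{ri}
$$
Here the relevant parameters of the model are the coefficients $\alpha_{jk}$ which describe how the values of independent variables (numbered $k=1,\ldots,r$) affect the relative chances of the response taking a value $j$ versus taking the value $1$. Note that there is one coefficient for each independent variable and each response other than the baseline category.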
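As an illustration, the category probabilities of a baseline-category logit model can be computed in a few lines of Python. This is only a sketch under stated assumptions: the function name, the coefficient values, and the covariate values are made up for the example and are not part of “mclogit”. Each non-baseline category has its own intercept and slopes, and the linear predictor of the baseline category is fixed at zero.

```python
import math

def baseline_logit_probs(x, coefs):
    """Category probabilities for one individual (hypothetical helper).

    x     : covariate values [x_1, ..., x_r]
    coefs : one row per non-baseline category,
            each row being [intercept, slope_1, ..., slope_r]
    Returns [pi_1, ..., pi_q], where category 1 is the baseline.
    """
    # The linear predictor of the baseline category is fixed at 0.
    etas = [0.0]
    for row in coefs:
        etas.append(row[0] + sum(a * v for a, v in zip(row[1:], x)))
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(etas)
    weights = [math.exp(e - m) for e in etas]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up example: q = 3 categories, r = 2 covariates.
coefs = [[0.5, 1.0, -0.5],    # category 2 versus category 1
         [-1.0, 0.25, 0.75]]  # category 3 versus category 1
x = [1.2, 0.4]

probs = baseline_logit_probs(x, coefs)
# Log-odds of categories 2 and 3 relative to the baseline category.
log_odds = [math.log(p / probs[0]) for p in probs[1:]]
print(probs, log_odds)
```

The log-odds recovered from the probabilities coincide with the linear predictors of the non-baseline categories, which is the equivalence between the two ways of writing the model.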
Conditional logit models¶
Conditional logit models are motivated by a variety of considerations, notably as a way to model binary panel data or responses in case-control studies. The variant supported by the package “mclogit” is motivated by the analysis of discrete choices and goes back to McFadden (1974). Here, a series of individuals $i=1,\ldots,n$ is observed to have made a choice (represented by a number $j$) from a choice set $\mathcal{S}_i$, the set of alternatives at the individual’s disposal. Each alternative $j$ in the choice set can be described by the values $x_{1ij},\ldots,x_{rij}$ of $r$ attribute variables (where the variables are enumerated as $h=1,\ldots,r$). (Note that in contrast to the baseline-category logit model, these values vary between choice alternatives.) Conditional logit models then posit that individual $i$ chooses alternative $j$ from his or her choice set $\mathcal{S}_i$ with probability
$$
\pi_{ij} = \frac{\exp(\alpha_{1}x_{1ij}+\cdots+\alpha_{r}x_{rij})}{\sum_{s\in\mathcal{S}_i}\exp(\alpha_{1}x_{1is}+\cdots+\alpha_{r}x_{ris})}
$$
It is worth noting that the conditional logit model does not require that all individuals face the same choice sets, only that the alternatives in the choice sets can be distinguished from one another by the attribute variables.
The similarities and differences between these models and baseline-category logit models become apparent if one looks at the log-odds relative to the first alternative in the choice set:
$$
\ln\frac{\pi_{ij}}{\pi_{i1}} = \alpha_{1}(x_{1ij}-x_{1i1})+\cdots+\alpha_{r}(x_{rij}-x_{ri1})
$$
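The following Python sketch illustrates the conditional logit probabilities for two individuals who face different choice sets. The function name, the attribute values, and the coefficients are hypothetical and chosen only for illustration. There is a single coefficient per attribute, shared by all alternatives.

```python
import math

def conditional_logit_probs(choice_set, coefs):
    """Choice probabilities within one choice set (hypothetical helper).

    choice_set : dict mapping an alternative's label to its attribute values
    coefs      : one coefficient per attribute, shared by all alternatives
    """
    scores = {alt: sum(a * v for a, v in zip(coefs, attrs))
              for alt, attrs in choice_set.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores.values())
    weights = {alt: math.exp(s - m) for alt, s in scores.items()}
    total = sum(weights.values())
    return {alt: w / total for alt, w in weights.items()}

coefs = [-0.8, 0.5]  # made-up coefficients for two attributes

# Two individuals with different choice sets; attributes of A and B agree.
set_a = {"A": [1.0, 2.0], "B": [2.0, 3.0], "C": [0.5, 0.0]}
set_b = {"A": [1.0, 2.0], "B": [2.0, 3.0]}

p_a = conditional_logit_probs(set_a, coefs)
p_b = conditional_logit_probs(set_b, coefs)

# Log-odds of B versus A depend only on the attribute differences.
log_odds_a = math.log(p_a["B"] / p_a["A"])
log_odds_b = math.log(p_b["B"] / p_b["A"])
print(p_a, p_b, log_odds_a, log_odds_b)
```

In both choice sets the log-odds of B versus A equal the coefficient-weighted differences of the attribute values, regardless of which other alternatives are available.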
Conditional logit models appear more parsimonious than baseline-category logit
models in so far as they have only one coefficient for each independent
variable. In the “mclogit” package, these models can be estimated using the
function mclogit() (see the relevant manual page).
My interest in conditional logit models derives from my research into the influence of parties’ political positions on the patterns of voting. Here, the political positions are the attributes of the alternatives and the choice sets are the sets of parties that run candidates in a country at various points in time. For an application of conditional logit models, see my doctoral thesis (Elff 2006).
Random effects in baseline logit models and conditional logit models¶
The “mclogit” package allows for the presence of random effects in baseline-category logit and conditional logit models. In baseline-category logit models, the random effects may represent (unobserved) characteristics that are common to the individuals in clusters, such as regional units or electoral districts or the like. In conditional logit models, random effects may represent attributes that are shared across several choice occasions within the same context of choice. That is, if one analyses voting behaviour across countries then a random effect specific to the Labour party may represent unobserved attributes of this party in terms of which it differs from (or is more like) the Social Democratic Party of Germany (SPD). My original motivation for working on conditional logit models with random effects was to make it possible to assess the impact of parties’ political positions on the patterns of voting behaviour in various European countries. The results of this research are published in an article in Electoral Studies (Elff 2009).
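In its earliest incarnation, the package supported only a very simple random-intercept extension of conditional logit models (or “mixed conditional logit models”, hence the name of the package). These models can be written as
$$
\pi_{ij} = \frac{\exp(\eta_{ij})}{\sum_{s\in\mathcal{S}_i}\exp(\eta_{is})}
\quad\text{with}\quad
\eta_{ij} = \sum_{h}\alpha_{h}x_{hij} + \sum_{k}z_{ik}u_{jk}
$$
where $x_{hij}$ represents values of independent variables, $\alpha_h$ are coefficients, $z_{ik}$ are dummy variables (that are equal to $1$ if $i$ is in cluster $k$ and equal to $0$ otherwise), and $u_{jk}$ are random effects with a normal distribution with expectation $0$ and variance parameter $\sigma^2$.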
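A small Python sketch may help to see what the random intercepts do. It is an illustration only: the helper function, the cluster names, the attribute values, and the value of the standard deviation are assumptions made for the example, and the random effects are simply simulated, not estimated. Because the dummy variables pick out the cluster an individual belongs to, the random part of the linear predictor reduces to one intercept per alternative and cluster.

```python
import math
import random

def choice_probs(attrs, coefs, u=None):
    """Choice probabilities for one individual (hypothetical helper).

    attrs : dict mapping an alternative to its attribute values
    coefs : one coefficient per attribute, shared by all alternatives
    u     : dict mapping an alternative to the random intercept of the
            individual's cluster (None means no random effects)
    """
    eta = {}
    for alt, x in attrs.items():
        eta[alt] = sum(a * v for a, v in zip(coefs, x))
        if u is not None:
            eta[alt] += u[alt]  # random intercept of this cluster
    m = max(eta.values())
    w = {alt: math.exp(e - m) for alt, e in eta.items()}
    total = sum(w.values())
    return {alt: v / total for alt, v in w.items()}

rng = random.Random(42)       # fixed seed so the sketch is reproducible
sigma = 0.7                   # assumed standard deviation of the random effects
alternatives = ["A", "B", "C"]
clusters = ["north", "south"]  # made-up cluster labels

# One normally distributed random intercept per alternative and cluster.
u = {k: {alt: rng.gauss(0.0, sigma) for alt in alternatives} for k in clusters}

attrs = {"A": [1.0, 2.0], "B": [2.0, 3.0], "C": [0.5, 0.0]}
coefs = [-0.8, 0.5]

p_fixed = choice_probs(attrs, coefs)              # no random effects
p_north = choice_probs(attrs, coefs, u["north"])  # individual in cluster "north"
p_south = choice_probs(attrs, coefs, u["south"])  # individual in cluster "south"
print(p_fixed, p_north, p_south)
```

Two individuals facing identical attribute values thus get different choice probabilities if they belong to different clusters, and the log-odds between two alternatives are shifted by the difference of the corresponding random intercepts.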
Later releases also added support for baseline-category logit models (initially only without random effects). In order to support random effects in baseline-category logit models, the package had to be further modified to allow for conditional logit models with random slopes (this is so because baseline-category logit models can be expressed as a particular type of conditional logit model). (The relations between these various model variants will be discussed on a dedicated page as soon as I find the time to write it.)
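The re-expression of a baseline-category logit model as a conditional logit model can be sketched in Python as follows. This shows one common construction, not necessarily the one used inside the package: for each response category, the individual’s covariates (plus a constant) are placed into a block of “attributes” that belongs to that category, while the baseline category gets zeros throughout. All function names and numbers are made up for the example.

```python
import math

def softmax(etas):
    """Turn linear predictors into probabilities (numerically stable)."""
    m = max(etas)
    w = [math.exp(e - m) for e in etas]
    total = sum(w)
    return [v / total for v in w]

def baseline_probs(x, coefs):
    """Baseline-category logit: one coefficient row per non-baseline category."""
    etas = [0.0] + [row[0] + sum(a * v for a, v in zip(row[1:], x))
                    for row in coefs]
    return softmax(etas)

def as_conditional_design(x, q):
    """Attribute rows, one per category, for the equivalent conditional logit.

    Category j > 1 gets [1, x_1, ..., x_r] in its own block and zeros
    elsewhere; the baseline category gets zeros everywhere.
    """
    block = [1.0] + list(x)
    width = (q - 1) * len(block)
    rows = []
    for j in range(q):
        row = [0.0] * width
        if j > 0:
            start = (j - 1) * len(block)
            row[start:start + len(block)] = block
        rows.append(row)
    return rows

def conditional_probs(design, alpha):
    """Conditional logit: one shared coefficient per attribute column."""
    return softmax([sum(a * v for a, v in zip(alpha, row)) for row in design])

coefs = [[0.5, 1.0, -0.5], [-1.0, 0.25, 0.75]]  # made-up coefficients
x = [1.2, 0.4]

p_baseline = baseline_probs(x, coefs)
# Stack the coefficient rows into one long vector for the conditional model.
alpha = [a for row in coefs for a in row]
p_conditional = conditional_probs(as_conditional_design(x, q=3), alpha)
print(p_baseline, p_conditional)
```

Both computations give the same probabilities. In this construction the covariates enter as attribute columns of the conditional logit model, so a random intercept or slope in the baseline-category model corresponds to a random coefficient on such a column, which is why random slopes are needed on the conditional logit side.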
It should be noted that estimating the parameters of random-effects multinomial logit models (whether of the baseline-category logit variety or the conditional logit variety) involves the considerable challenges already known from the “generalized linear mixed models” literature. The main challenge is that the likelihood function involves analytically intractable integrals (i.e. there is no way to “solve” or eliminate the integrals from the formula of the likelihood function). This means that one has to use either computationally intensive methods for the computation of such integrals or certain approximations (most notably the Laplace approximation technique and its variants), which may lead to biases in certain situations. The “mclogit” package only supports approximate likelihood-based inference. For most of the package’s history, only the PQL technique, based on a (first-order) Laplace approximation, was supported; as of release 0.8, “mclogit” also supports the MQL technique, which is based on a (first-order) Solomon-Cox approximation. The ideas behind the PQL and MQL techniques are described e.g. in Breslow and Clayton (1993). A dedicated page will describe these techniques as soon as I find the time for this.
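To give an impression of what such an integral and its Laplace approximation look like, here is a deliberately simplified Python sketch. It is not the PQL or MQL algorithm and not code from “mclogit”: it considers a single cluster in a binary logit model with a random intercept, computes the cluster’s likelihood contribution once by brute-force numerical integration and once by a plain Laplace approximation, and compares the two. The data (7 successes in 10 trials), the fixed intercept, and the standard deviation of the random effect are made-up values, and the constant binomial coefficient is left out.

```python
import math

def log_integrand(u, y, n, b, sigma):
    """Log of (binomial kernel) x (normal density of the random intercept u)."""
    eta = b + u
    log_p = -math.log1p(math.exp(-eta))   # log of success probability
    log_q = -math.log1p(math.exp(eta))    # log of failure probability
    log_density = -0.5 * math.log(2 * math.pi * sigma ** 2) - u * u / (2 * sigma ** 2)
    return y * log_p + (n - y) * log_q + log_density

def marginal_by_quadrature(y, n, b, sigma, lo=-10.0, hi=10.0, steps=4000):
    """Integrate the random intercept out with the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(log_integrand(lo + i * h, y, n, b, sigma))
    return total * h

def marginal_by_laplace(y, n, b, sigma, iterations=50):
    """Laplace approximation: Gaussian integral around the integrand's mode."""
    u = 0.0
    for _ in range(iterations):
        # Newton steps towards the mode of the log-integrand.
        p = 1.0 / (1.0 + math.exp(-(b + u)))
        gradient = y - n * p - u / sigma ** 2
        curvature = -n * p * (1.0 - p) - 1.0 / sigma ** 2
        u -= gradient / curvature
    p = 1.0 / (1.0 + math.exp(-(b + u)))
    curvature = -n * p * (1.0 - p) - 1.0 / sigma ** 2
    return math.exp(log_integrand(u, y, n, b, sigma)) * math.sqrt(2 * math.pi / -curvature)

# Made-up data for one cluster: 7 successes in 10 trials.
exact = marginal_by_quadrature(y=7, n=10, b=0.0, sigma=1.0)
approx = marginal_by_laplace(y=7, n=10, b=0.0, sigma=1.0)
relative_error = abs(approx - exact) / exact
print(exact, approx, relative_error)
```

With these values the two results are close but not identical. The remaining discrepancy is the kind of approximation error that, accumulated over many small clusters, can lead to the biases mentioned above.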
Agresti, Alan. 2002. Categorical Data Analysis. New York: Wiley.
Breslow, Norman E. and David G. Clayton. 1993. "Approximate Inference in Generalized Linear Mixed Models". Journal of the American Statistical Association 88(421): 9-25.
Elff, Martin. 2006. Politische Ideologien, soziale Konflikte und Wahlverhalten: Die Bedeutung politischer Angebote der Parteien für den Zusammenhang zwischen sozialen Merkmalen und Parteipräferenzen in zehn westeuropäischen Demokratien. Baden-Baden: Nomos.
Elff, Martin. 2009. "Social Divisions, Party Positions, and Electoral Behaviour". Electoral Studies 28(2): 297-308.
McFadden, Daniel. 1974. "Conditional Logit Analysis of Qualitative Choice Behavior". Pp. 105-142 in Frontiers in Econometrics, ed. by Paul Zarembka. New York: Academic Press.
Venables, W. N. and B. D. Ripley. 2002. Modern Applied Statistics with S. New York: Springer.