The package ‘mclogit’ allows the estimation of the two main varieties of
multinomial logit models: baseline-category logit models and conditional logit
models. It is published on CRAN. Development occurs on GitHub, where both releases and the development tree can be found.
Multinomial baseline-category logit models are a generalisation of logistic
regression that makes it possible to model not only binary or dichotomous responses, but
also polychotomous responses. In addition, they make it possible to model responses in the
form of counts that have a pre-determined sum. These models are described in
Agresti (2002). Estimating these models is also
supported by the function multinom() in the R package “nnet” (Venables and Ripley 2002).
In the package “mclogit”, the function to estimate these models is called
mblogit() (see the relevant manual page). It uses
the infrastructure for estimating conditional logit models, exploiting the fact
that baseline-category logit models can be re-expressed as conditional logit
models.
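Baseline-category logit models are constructed as follows. Suppose a categorical
dependent variable or response with categories $j=1,\ldots,q$ is observed for
individuals $i=1,\ldots,n$. Let $\pi_{ij}$ denote the probability that the value of
the dependent variable for individual $i$ is equal to $j$, then the
baseline-category logit model takes the form:

$$
\pi_{ij} =
\begin{cases}
\dfrac{\exp(\alpha_j + \beta_{j1}x_{1i} + \cdots + \beta_{jr}x_{ri})}{1 + \sum_{s>1}\exp(\alpha_s + \beta_{s1}x_{1i} + \cdots + \beta_{sr}x_{ri})} & \text{for } j>1 \\[2ex]
\dfrac{1}{1 + \sum_{s>1}\exp(\alpha_s + \beta_{s1}x_{1i} + \cdots + \beta_{sr}x_{ri})} & \text{for } j=1
\end{cases}
$$

where the first category ($j=1$) is the baseline category.

Equivalently, the model can be expressed in terms of log-odds, relative to the
baseline category:

$$
\ln\frac{\pi_{ij}}{\pi_{i1}} = \alpha_j + \beta_{j1}x_{1i} + \cdots + \beta_{jr}x_{ri}
$$

Here the relevant parameters of the model are the coefficients $\beta_{jk}$,
which describe how the values of independent variables (numbered $k=1,\ldots,r$) affect the relative
chances of the response taking a value $j$ versus taking the value $1$.
Note that there is one coefficient for each independent variable and each
response category other than the baseline category.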
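The following is a minimal sketch in Python (not part of the package, which is written for R) of how baseline-category probabilities follow from intercepts and coefficients, and of the fact that the log-odds relative to the baseline category equal the linear predictors. The function name and all numerical values are hypothetical and chosen only for illustration.

```python
import math

def baseline_logit_probs(intercepts, coefs, x):
    """Probabilities for categories 1..q of a baseline-category logit model.

    intercepts[m] and coefs[m] belong to category m + 2 (the first
    category is the baseline and has a linear predictor fixed at zero).
    x holds the covariate values of one individual.
    """
    etas = [0.0]  # baseline category
    for a, b in zip(intercepts, coefs):
        etas.append(a + sum(bk * xk for bk, xk in zip(b, x)))
    denom = sum(math.exp(e) for e in etas)  # equals 1 + sum over j > 1
    return [math.exp(e) / denom for e in etas]

# Hypothetical values: three categories, two covariates.
intercepts = [0.5, -1.0]
coefs = [[0.8, -0.2], [0.1, 0.4]]
x = [1.0, 2.0]

probs = baseline_logit_probs(intercepts, coefs, x)
log_odds = [math.log(p / probs[0]) for p in probs]

print("probabilities:", probs)
print("log-odds relative to baseline:", log_odds)
```

Each non-baseline category carries its own intercept and its own coefficient per covariate, which is why the number of parameters grows with the number of categories.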
Conditional logit models are motivated by a variety of considerations, notably
as a way to model binary panel data or responses in case-control studies. The
variant supported by the package “mclogit” is motivated by the analysis of
discrete choices and goes back to McFadden (1974). Here, a
series of individuals $i=1,\ldots,n$ is observed to have made a choice
(represented by a number $j$) from a choice set $S_i$, the set of
alternatives at the individual’s disposal. Each alternative $j$ in the choice set
can be described by the values $x_{1ij},\ldots,x_{rij}$ of $r$
attribute variables (where the variables are enumerated as $h=1,\ldots,r$).
(Note that in contrast to the baseline-category logit
model, these values vary between choice alternatives.) Conditional logit
models then posit that individual $i$ chooses alternative $j$ from his or her
choice set $S_i$ with probability

$$
\pi_{ij} = \frac{\exp(\beta_1 x_{1ij} + \cdots + \beta_r x_{rij})}{\sum_{s\in S_i}\exp(\beta_1 x_{1is} + \cdots + \beta_r x_{ris})}
$$

It is worth noting that the conditional logit model does not require that all
individuals face the same choice sets; it requires only that the alternatives in the choice
sets can be distinguished from one another by the attribute variables.

The similarities and differences between these models and baseline-category logit models
become obvious if one looks at the log-odds relative to the first alternative
in the choice set:

$$
\ln\frac{\pi_{ij}}{\pi_{i1}} = \beta_1(x_{1ij}-x_{1i1}) + \cdots + \beta_r(x_{rij}-x_{ri1})
$$
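As an illustration of the conditional logit model described above, the following Python sketch computes choice probabilities for two individuals who face different choice sets. It also checks that the log-odds between two alternatives equal the coefficient-weighted differences of their attribute values. The alternative labels, attribute values, and coefficients are hypothetical.

```python
import math

def conditional_logit_probs(coefs, choice_set):
    """Choice probabilities for one individual.

    choice_set maps each available alternative to its attribute values;
    coefs holds one coefficient per attribute variable.
    """
    scores = {
        alt: math.exp(sum(b * v for b, v in zip(coefs, attrs)))
        for alt, attrs in choice_set.items()
    }
    total = sum(scores.values())
    return {alt: s / total for alt, s in scores.items()}

# Hypothetical coefficients for two attribute variables.
coefs = [-0.5, 1.2]

# Two individuals with different choice sets (the second lacks "train").
set_a = {"bus": [2.0, 0.3], "car": [1.0, 0.9], "train": [1.5, 0.6]}
set_b = {"bus": [2.0, 0.3], "car": [1.0, 0.9]}

probs_a = conditional_logit_probs(coefs, set_a)
probs_b = conditional_logit_probs(coefs, set_b)

# Log-odds of "car" relative to the first alternative, "bus".
log_odds_car = math.log(probs_a["car"] / probs_a["bus"])
attr_diff = sum(
    b * (xc - xb) for b, xc, xb in zip(coefs, set_a["car"], set_a["bus"])
)

print(probs_a, probs_b)
print(log_odds_car, attr_diff)
```

Only one coefficient per attribute variable is needed, because the variation that identifies it comes from differences between alternatives rather than between individuals.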
Conditional logit models appear more parsimonious than baseline-category logit
models in so far as they have only one coefficient for each independent
variable.[1] In the “mclogit” package, these models can be estimated using the
function mclogit() (see the relevant manual page).
My interest in conditional logit models derives from my research into the
influence of parties’ political positions on the patterns of voting. Here, the
political positions are the attributes of the alternatives and the choice sets
are the sets of parties that run candidates in a country at various points in
time. For the application of the conditional logit models, see my doctoral
thesis (Elff 2006).
[1] It is nevertheless possible to re-express baseline-category logit
models as conditional logit models, as is shown on this page.
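One common way to carry out such a re-expression is sketched below in Python. It assumes that each response category is treated as an "alternative" whose attributes are a category dummy and that dummy's products with the individual's covariates. This is an illustration of the general idea with hypothetical values; whether the package constructs its design matrix in exactly this way is not stated here.

```python
import math

def baseline_probs(intercepts, coefs, x):
    """Baseline-category logit probabilities (first category is baseline)."""
    etas = [0.0] + [
        a + sum(b * v for b, v in zip(bs, x)) for a, bs in zip(intercepts, coefs)
    ]
    denom = sum(math.exp(e) for e in etas)
    return [math.exp(e) / denom for e in etas]

def expand_to_alternatives(x, q):
    """One attribute vector per category: dummies and dummy-by-covariate terms.

    The vector for category j has the block [1, x_1, ..., x_r] in the slot
    belonging to j and zeros elsewhere; the baseline gets all zeros.
    """
    block = [1.0] + list(x)
    rows = []
    for j in range(q):  # j == 0 is the baseline category
        row = []
        for s in range(1, q):
            row.extend(block if s == j else [0.0] * len(block))
        rows.append(row)
    return rows

def conditional_probs(stacked_coefs, rows):
    """Conditional logit probabilities with a single coefficient vector."""
    scores = [
        math.exp(sum(c * v for c, v in zip(stacked_coefs, row))) for row in rows
    ]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical values: three categories, two covariates.
intercepts = [0.5, -1.0]
coefs = [[0.8, -0.2], [0.1, 0.4]]
x = [1.0, 2.0]

# Stack intercepts and coefficients category by category.
stacked = []
for a, bs in zip(intercepts, coefs):
    stacked.extend([a] + bs)

p_base = baseline_probs(intercepts, coefs, x)
p_cond = conditional_probs(stacked, expand_to_alternatives(x, q=3))
print(p_base)
print(p_cond)
```

Both computations yield the same probabilities, which shows that the baseline-category model is a conditional logit model with a particular set of attribute variables.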
The “mclogit” package allows for the presence of random effects in
baseline-category logit and conditional logit models. In baseline-category logit
models, the random effects may represent (unobserved) characteristics that are
common to the individuals in clusters, such as regional units or electoral
districts. In conditional logit models, random effects may
represent attributes that are shared across several choice occasions within the same
context of choice. That is, if one analyses voting behaviour across countries,
then a random effect specific to the Labour party may represent unobserved
attributes of this party in terms of which it differs from (or is more like) the
Social Democratic Party of Germany (SPD). My original motivation for working on
conditional logit models with random effects was to make it possible to assess
the impact of parties’ political positions on the patterns of voting behaviour
in various European countries. The results of this research are published in an
article in Electoral Studies (Elff 2009).
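In its earliest incarnation, the package supported only a very simple random-intercept
extension of conditional logit models (or “mixed conditional logit models”,
hence the name of the package). These models can be written as

$$
\pi_{ij} = \frac{\exp(\eta_{ij})}{\sum_{s\in S_i}\exp(\eta_{is})}
\quad\text{with}\quad
\eta_{ij} = \sum_h \beta_h x_{hij} + \sum_k z_{ik} u_{jk}
$$

where $\pi_{ij}$ is the probability that individual $i$ chooses alternative $j$
from choice set $S_i$, $x_{hij}$ represents values of independent variables,
$\beta_h$ are coefficients, $z_{ik}$ are dummy variables (that are equal to
$1$ if $i$ is in cluster $k$ and equal to $0$ otherwise), and $u_{jk}$ are
random effects with a normal distribution with expectation $0$ and variance
$\sigma^2$.

Later releases also added support for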
baseline-category logit models (initially only without random effects). In order
to support random effects in baseline-category logit models, the package had to
be further modified to allow for conditional logit models with random slopes
(this is so because baseline-category logit models can be expressed as a
particular type of conditional logit model). (The relations between these
various model variants will be discussed on a dedicated page as soon as I find
the time to write it.)
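To make the random-intercept extension more concrete, the following Python sketch draws normally distributed random intercepts for each alternative within each cluster and adds them to the fixed part of the linear predictor. It is a simulation of the model structure only, not of the package's estimation procedure. The cluster labels, alternative labels, standard deviation, and coefficient are hypothetical.

```python
import math
import random

def mixed_conditional_logit_probs(coefs, attributes, intercepts):
    """Choice probabilities with an alternative-specific random intercept.

    attributes maps each alternative to its attribute values; intercepts
    maps each alternative to the random effect of the individual's cluster.
    """
    eta = {
        alt: sum(b * v for b, v in zip(coefs, attrs)) + intercepts[alt]
        for alt, attrs in attributes.items()
    }
    total = sum(math.exp(e) for e in eta.values())
    return {alt: math.exp(e) / total for alt, e in eta.items()}

rng = random.Random(42)   # fixed seed so that the draws are reproducible
sigma = 0.7               # hypothetical standard deviation of the random effects
alternatives = ["A", "B", "C"]
clusters = ["k1", "k2"]

# One random intercept per alternative and cluster, expectation 0.
u = {k: {j: rng.gauss(0.0, sigma) for j in alternatives} for k in clusters}

coefs = [1.0]
attributes = {"A": [0.2], "B": [0.5], "C": [-0.1]}

# Identical attributes, different clusters: the probabilities differ only
# because of the cluster-specific random intercepts.
probs = {
    k: mixed_conditional_logit_probs(coefs, attributes, u[k]) for k in clusters
}
for k in clusters:
    print(k, probs[k])
```

Because each individual belongs to exactly one cluster, the sum over the cluster dummies picks out a single random effect per alternative.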
It should be noted that estimating the parameters of random effects multinomial
logit models (whether of the baseline-category logit variety or the conditional
logit variety) involves the considerable challenges already known from the
“generalized linear mixed models” literature. The main challenge is that the
likelihood function involves analytically intractable integrals (i.e. there is
no way to “solve” or eliminate the integrals from the formula of the
likelihood function). This means that one has to use either computationally intensive methods
for the computation of such integrals or certain approximations
(most notably the Laplace approximation technique and its variants), which may lead to
biases in certain situations. The “mclogit” package only supports approximate
likelihood-based inference. For most of the package’s history, only the PQL technique, based on a (first-order)
Laplace approximation, was supported; since release 0.8, “mclogit” also supports the
MQL technique, which is based on a (first-order) Solomon-Cox approximation.
The ideas behind the PQL and MQL techniques are described e.g. in
Breslow and Clayton (1993). A dedicated page will
describe these techniques as soon as I find the time for this.
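The trade-off between numerical integration and approximation can be illustrated with a small Python sketch. It considers the likelihood contribution of a single cluster in a random-intercept binary logit model (the two-alternative special case), computes the integral over the random effect by brute-force quadrature, and compares the result with a Laplace approximation. This is not the PQL or MQL algorithm used by the package; the data, fixed linear predictors, and variance are hypothetical.

```python
import math

def log_integrand(u, y, eta_fixed, sigma):
    """Log of: conditional likelihood given u, times normal density of u."""
    loglik = 0.0
    for yi, eta in zip(y, eta_fixed):
        lin = eta + u
        loglik += yi * lin - math.log1p(math.exp(lin))
    log_density = -0.5 * (u / sigma) ** 2 - 0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return loglik + log_density

def curvature(u, eta_fixed, sigma):
    """Second derivative of the log integrand with respect to u."""
    p = [1.0 / (1.0 + math.exp(-(eta + u))) for eta in eta_fixed]
    return -sum(pi * (1.0 - pi) for pi in p) - 1.0 / sigma ** 2

def marginal_by_quadrature(y, eta_fixed, sigma, lo=-10.0, hi=10.0, n=20001):
    """Trapezoidal rule on a fine grid (the computationally intensive route)."""
    step = (hi - lo) / (n - 1)
    vals = [math.exp(log_integrand(lo + i * step, y, eta_fixed, sigma)) for i in range(n)]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def marginal_by_laplace(y, eta_fixed, sigma, iterations=25):
    """Laplace approximation: Gaussian integral around the mode of the integrand."""
    u = 0.0
    for _ in range(iterations):  # Newton steps towards the mode
        p = [1.0 / (1.0 + math.exp(-(eta + u))) for eta in eta_fixed]
        gradient = sum(yi - pi for yi, pi in zip(y, p)) - u / sigma ** 2
        u -= gradient / curvature(u, eta_fixed, sigma)
    height = math.exp(log_integrand(u, y, eta_fixed, sigma))
    return height * math.sqrt(2.0 * math.pi / -curvature(u, eta_fixed, sigma))

# Hypothetical cluster with six binary responses.
y = [1, 1, 0, 1, 1, 0]
eta_fixed = [0.2, -0.4, 0.1, 0.8, -0.3, 0.5]
sigma = 1.0

exact = marginal_by_quadrature(y, eta_fixed, sigma)
approx = marginal_by_laplace(y, eta_fixed, sigma)
rel_error = abs(approx - exact) / exact
print(exact, approx, rel_error)
```

The approximation needs only a few Newton steps per cluster instead of thousands of function evaluations, at the price of a small error that depends on how far the integrand is from a Gaussian shape.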
Agresti, Alan. 2002. Categorical Data Analysis. New York: Wiley.
Breslow, Norman E. and David G. Clayton. 1993. "Approximate Inference in Generalized Linear Mixed Models". Journal of the American Statistical Association 88(421): 9-25.
Elff, Martin. 2006. Politische Ideologien, soziale Konflikte und Wahlverhalten: Die Bedeutung politischer Angebote der Parteien für den Zusammenhang zwischen sozialen Merkmalen und Parteipräferenzen in zehn westeuropäischen Demokratien. Baden-Baden: Nomos.
Elff, Martin. 2009. "Social Divisions, Party Positions, and Electoral Behaviour". Electoral Studies 28(2): 297-308.
McFadden, Daniel. 1974. "Conditional Logit Analysis of Qualitative Choice Behaviour". 105-142 in Frontiers in Econometrics, ed. by Paul Zarembka. New York: Academic Press.
Venables, W. N. and B. D. Ripley. 2002. Modern Applied Statistics with S. New York: Springer.