=====================================================================================
mclogit: Multinomial Logit Models, with or without Random Effects or Overdispersion
=====================================================================================
.. bibsource:: mclogit.bib
.. bibsource:: ../research/publications.bib
.. toctree::
:hidden:
mclogit on CRAN
.. division:: left-aligned a-no-ul
|Travis build status|
|Current release on GitHub|
|CRAN|
|Total downloads from RStudio CRAN mirror|
|Monthly downloads from RStudio CRAN mirror|
The package 'mclogit' allows the estimation of the two main varieties of
multinomial logit models: baseline-category logit models and conditional logit
models. It is published on `CRAN
`__. Development occurs on `GitHub
`__, where both `releases
`__ and the `development tree
`__ can be found.
Baseline-category logit models
==============================
Multinomial baseline-category logit models are a generalisation of logistic
regression that makes it possible to model not only binary or dichotomous
responses, but also polychotomous responses. In addition, they make it possible
to model responses in the form of counts that have a pre-determined sum. These models are described in
:citet:`agresti:categorical.data.analysis.2002`. Estimating these models is also
supported by the function ``multinom()`` in the *R* package "nnet" :cite:`MASS`.
In the package "mclogit", the function to estimate these models is called
``mblogit()`` (see the relevant `manual page `__), which uses
the infrastructure for estimating conditional logit models, exploiting the fact
that baseline-category logit models can be re-expressed as conditional logit
models.
Baseline-category logit models are constructed as follows. Suppose a categorical
dependent variable or response with categories $j=1,\ldots,q$ is observed for
individuals $i=1,\ldots,n$. If $\pi_{ij}$ denotes the probability that the value of
the dependent variable for individual $i$ is equal to $j$, then the
baseline-category logit model takes the form:
.. math::

   \pi_{ij} =
   \begin{cases}
   \dfrac{\exp(\alpha_{j0}+\alpha_{j1}x_{1i}+\cdots+\alpha_{jr}x_{ri})}
   {1+\sum_{k>1}\exp(\alpha_{k0}+\alpha_{k1}x_{1i}+\cdots+\alpha_{kr}x_{ri})}
   & \text{for } j>1\\[20pt]
   \dfrac{1}
   {1+\sum_{k>1}\exp(\alpha_{k0}+\alpha_{k1}x_{1i}+\cdots+\alpha_{kr}x_{ri})}
   & \text{for } j=1
   \end{cases}

where the first category ($j=1$) is the baseline category.
Equivalently, the model can be expressed in terms of log-odds relative to the
baseline category:
.. math::

   \ln\frac{\pi_{ij}}{\pi_{i1}}
   =
   \alpha_{j0}+\alpha_{j1}x_{1i}+\cdots+\alpha_{jr}x_{ri}.

Here the relevant parameters of the model are the coefficients $\alpha_{jk}$
which describe how the values of independent variables (numbered $k=1,\ldots,r$) affect the relative
chances of the response taking a value $j$ versus taking the value $1$.
Note that there is one coefficient for each independent variable and *each
response* other than the baseline category.
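
The two formulas above can be checked numerically. The following is a minimal sketch in plain Python, independent of the R package: the function name ``baseline_logit_probs`` and all coefficient and covariate values are hypothetical and chosen only for illustration. It computes the probabilities for one individual and confirms that the log-odds relative to the baseline category reproduce the linear predictors.

```python
import math

def baseline_logit_probs(alpha, x):
    """Return [pi_1, ..., pi_q] for one individual.

    alpha: list of q-1 coefficient lists [alpha_j0, alpha_j1, ..., alpha_jr]
           for the non-baseline categories j = 2, ..., q.
    x:     list [x_1, ..., x_r] of covariate values for the individual.
    """
    # Linear predictors for categories 2..q; the baseline has predictor 0.
    eta = [a[0] + sum(a_k * x_k for a_k, x_k in zip(a[1:], x)) for a in alpha]
    denom = 1.0 + sum(math.exp(e) for e in eta)
    return [1.0 / denom] + [math.exp(e) / denom for e in eta]

# Hypothetical coefficients: q = 3 categories, r = 2 covariates.
alpha = [[0.5, 1.0, -0.5],   # category 2 versus category 1
         [-1.0, 0.3, 0.8]]   # category 3 versus category 1
x = [0.2, 1.5]

pi = baseline_logit_probs(alpha, x)

# Log-odds of categories 2 and 3 relative to the baseline category 1.
log_odds = [math.log(p / pi[0]) for p in pi[1:]]
print(pi, log_odds)
```

With $q=3$ categories and $r=2$ covariates there are $(q-1)(r+1)=6$ coefficients, which is the count stated in the note above (one per covariate and non-baseline category, plus the intercepts).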
Conditional logit models
========================
Conditional logit models are motivated by a variety of considerations, notably
as a way to model binary panel data or responses in case-control studies. The
variant supported by the package "mclogit" is motivated by the analysis of
discrete choices and goes back to :citet:`mcfadden:conditional.logit`. Here, a
series of individuals $i=1,\ldots,n$ is observed to have made a choice
(represented by a number $j$) from a choice set $\mathcal{S}_i$, the set of
alternatives at the individual's disposal. Each alternative $j$ in the choice set
can be described by the values $x_{1ij},\ldots,x_{rij}$ of $r$
attribute variables (where the variables are enumerated as $h=1,\ldots,r$).
(Note that in contrast to the baseline-category logit
model, these values vary between choice alternatives.) Conditional logit
models then posit that individual $i$ chooses alternative $j$ from his or her
choice set $\mathcal{S}_i$ with probability
.. math::

   \pi_{ij} = \frac{\exp(\alpha_1x_{1ij}+\cdots+\alpha_rx_{rij})}
   {\sum_{k\in\mathcal{S}_i}\exp(\alpha_1x_{1ik}+\cdots+\alpha_rx_{rik})}.

It is worth noting that the conditional logit model does not require that all
individuals face the same choice set. It only requires that the alternatives in
each choice set can be distinguished from one another by the attribute variables.
The similarities and differences between these models and baseline-category
logit models become obvious if one looks at the log-odds relative to the first
alternative in the choice set:
.. math::

   \ln\frac{\pi_{ij}}{\pi_{i1}}
   =
   \alpha_{1}(x_{1ij}-x_{1i1})+\cdots+\alpha_{r}(x_{rij}-x_{ri1}).

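
As a numerical illustration of the conditional logit formulas, here is a minimal sketch in plain Python, not tied to the R package. The function name ``conditional_logit_probs``, the alternative labels and all numbers are hypothetical. It shows two individuals with choice sets of different size and checks that the log-odds between two alternatives depend only on the differences in their attributes.

```python
import math

def conditional_logit_probs(alpha, choice_set):
    """Map each alternative in a choice set to its choice probability.

    alpha:      list [alpha_1, ..., alpha_r] of attribute coefficients.
    choice_set: dict mapping alternative label -> [x_1ij, ..., x_rij].
    """
    scores = {j: math.exp(sum(a * v for a, v in zip(alpha, attrs)))
              for j, attrs in choice_set.items()}
    total = sum(scores.values())
    return {j: s / total for j, s in scores.items()}

alpha = [-0.8, 0.4]  # hypothetical coefficients for r = 2 attributes

# Two individuals facing choice sets of different size.
set_1 = {"A": [1.0, 2.0], "B": [0.5, 1.5], "C": [2.0, 3.0]}
set_2 = {"A": [1.2, 2.0], "C": [0.3, 0.5]}

pi_1 = conditional_logit_probs(alpha, set_1)
pi_2 = conditional_logit_probs(alpha, set_2)

# Log-odds of B versus A for individual 1, computed in two ways:
# from the probabilities and from the attribute differences.
lhs = math.log(pi_1["B"] / pi_1["A"])
rhs = sum(a * (xb - xa) for a, xb, xa in zip(alpha, set_1["B"], set_1["A"]))
print(pi_1, pi_2, lhs, rhs)
```

Note that the same coefficient vector is used for every alternative; only the attribute values differ between alternatives and individuals.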
Conditional logit models appear more parsimonious than baseline-category logit
models in so far as they have only one coefficient for each independent
variable. [1]_ In the "mclogit" package, these models can be estimated using the
function ``mclogit()`` (see the relevant `manual page `__).
My interest in conditional logit models derives from my research into the
influence of parties' political positions on the patterns of voting. Here, the
political positions are the attributes of the alternatives and the choice sets
are the sets of parties that run candidates in a country at various points in
time. For the application of the conditional logit models, see my doctoral
thesis :cite:`elff:politische.ideologien`.
.. [1] It is nevertheless possible to re-express baseline-category logit
models as conditional logit models, as is shown on `this page `__.
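
The re-expression mentioned in the footnote can be sketched numerically. The construction below is one standard way of doing it and is an assumption on my part, not a description of the package internals: each response category is treated as an alternative whose attribute vector contains the individual's covariates (plus a constant) in the block belonging to that category and zeros elsewhere, while the baseline category gets an all-zero vector. All function names and numbers are hypothetical.

```python
import math

def baseline_probs(alpha, x):
    """Baseline-category logit probabilities; category 1 is the baseline."""
    eta = [0.0] + [a[0] + sum(c * v for c, v in zip(a[1:], x)) for a in alpha]
    total = sum(math.exp(e) for e in eta)
    return [math.exp(e) / total for e in eta]

def as_conditional_attributes(x, q):
    """Attribute vectors for categories 1..q built from covariates x."""
    block = [1.0] + list(x)            # constant plus covariates
    width = len(block)
    rows = []
    for j in range(q):
        row = [0.0] * (width * (q - 1))
        if j > 0:                      # the baseline (j == 0) stays all zero
            row[(j - 1) * width:j * width] = block
        rows.append(row)
    return rows

def conditional_probs(coef, rows):
    """Conditional logit probabilities for one choice set."""
    scores = [math.exp(sum(c * v for c, v in zip(coef, row))) for row in rows]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical coefficients: q = 3 categories, r = 2 covariates.
alpha = [[0.5, 1.0, -0.5], [-1.0, 0.3, 0.8]]
x = [0.2, 1.5]

# Stack the category-specific coefficients into one long vector.
stacked = [c for a in alpha for c in a]

p_base = baseline_probs(alpha, x)
p_cond = conditional_probs(stacked, as_conditional_attributes(x, q=3))
print(p_base, p_cond)
```

Both computations give the same probabilities, which illustrates why the apparent parsimony of conditional logit models depends on how the attribute variables are defined.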
Random effects in baseline logit models and conditional logit models
====================================================================
The "mclogit" package allows for the presence of random effects in
baseline-category logit and conditional logit models. In baseline-category logit
models, the random effects may represent (unobserved) characteristics that are
common to the individuals in clusters, such as regional units or electoral
districts. In conditional logit models, random effects may
represent attributes that are shared across several choice occasions within the
same context of choice. For example, if one analyses voting behaviour across
countries, then a random effect specific to the Labour party may represent unobserved
attributes of this party in terms of which it differs from (or is more like) the
Social Democratic Party of Germany (SPD). My original motivation for working on
conditional logit models with random effects was to make it possible to assess
the impact of parties' political positions on the patterns of voting behaviour
in various European countries. The results of this research are published in an
article in *Electoral Studies* :cite:`elff:divisions.positions.voting`.
In its earliest incarnation, the package supported only a very simple random-intercept
extension of conditional logit models (or "mixed conditional logit models",
hence the name of the package). These models can be written as
.. math::

   \pi_{ij} = \frac{\exp(\eta_{ij})}{\sum_{k\in\mathcal{S}_i}\exp(\eta_{ik})}

with
.. math::

   \eta_{ij}=\sum_h\alpha_hx_{hij}+\sum_kz_{ik}b_{jk}

where :math:`x_{hij}` represents values of independent variables,
:math:`\alpha_h` are coefficients, $z_{ik}$ are dummy variables (that are equal to
$1$ if $i$ is in cluster $k$ and equal to $0$ otherwise), and :math:`b_{jk}` are
random effects with a normal distribution with expectation $0$ and variance
parameter $\sigma^2$.
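
To make the role of the random intercepts concrete, the following minimal sketch in plain Python simulates one random intercept per alternative and cluster and plugs it into the linear predictor above. It only evaluates the model for given random effects and does not estimate anything. The names ``choice_probs``, ``k1`` and ``k2`` and all numerical values are hypothetical.

```python
import math
import random

random.seed(1)

alpha = [-0.8, 0.4]          # hypothetical fixed coefficients
sigma = 0.7                  # hypothetical random-effect standard deviation
alternatives = ["A", "B", "C"]
clusters = ["k1", "k2"]

# One random intercept b_jk per alternative j and cluster k, drawn from N(0, sigma^2).
b = {(j, k): random.gauss(0.0, sigma) for j in alternatives for k in clusters}

def choice_probs(attrs, cluster):
    """Choice probabilities for an individual in the given cluster.

    attrs maps alternative -> attribute list. Picking b[(j, cluster)] is
    the same as the sum over k of z_ik * b_jk with dummy variables z_ik.
    """
    eta = {j: sum(a * v for a, v in zip(alpha, x)) + b[(j, cluster)]
           for j, x in attrs.items()}
    total = sum(math.exp(e) for e in eta.values())
    return {j: math.exp(e) / total for j, e in eta.items()}

attrs = {"A": [1.0, 2.0], "B": [0.5, 1.5], "C": [2.0, 3.0]}
p_k1 = choice_probs(attrs, "k1")
p_k2 = choice_probs(attrs, "k2")
print(p_k1, p_k2)
```

Two individuals with identical attribute values but different clusters thus obtain different choice probabilities, because the cluster-specific intercepts shift the linear predictors of the alternatives.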
Later releases also added support for
baseline-category logit models (initially only without random effects). In order
to support random effects in baseline-category logit models, the package had to
be further modified to allow for conditional logit models with random slopes
(this is so because baseline-category logit models can be expressed as a
particular type of conditional logit model). (The relations between these
various model variants will be discussed on a dedicated page as soon as I find
the time to write it.)
It should be noted that estimating the parameters of random effects multinomial
logit models (whether of the baseline-category logit variety or the conditional
logit variety) involves the considerable challenges already known from the
"generalized linear mixed models" literature. The main challenge is that the
likelihood function involves analytically intractable integrals (i.e. there is
no way to "solve" or eliminate the integrals from the formula of the
likelihood function). This means that either computationally intensive methods
for the computation of such integrals have to be used or certain approximations
(most notably the Laplace approximation technique and its variants), which may lead to
biases in certain situations. The "mclogit" package only supports approximate
likelihood-based inference. For most of the package's history, only the PQL technique based on a (first-order)
Laplace approximation was supported; since release 0.8, "mclogit" also supports the
MQL technique, which is based on a (first-order) Solomon-Cox approximation.
The ideas behind the PQL and MQL techniques are described e.g. in
:citet:`breslow.clayton:approximate.inference.glmm`. A dedicated page will
describe these techniques as soon as I find the time for this.
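
The nature of the integral and of a first-order Laplace approximation can be illustrated on a deliberately small case. The sketch below is plain Python and is not the PQL or MQL algorithm of the package. It assumes a single cluster with a binary logit response, a known fixed part of the linear predictor and a known random-intercept variance, and all values are hypothetical. It compares the marginal likelihood contribution obtained by brute-force numerical integration with its Laplace approximation.

```python
import math

y = [1, 0, 1, 1, 0]   # hypothetical binary responses in one cluster
eta_fixed = 0.0       # fixed part of the linear predictor (assumed known)
sigma = 1.0           # random-intercept standard deviation (assumed known)

def log_integrand(b):
    """Log of the conditional likelihood given b times the N(0, sigma^2) density."""
    p = 1.0 / (1.0 + math.exp(-(eta_fixed + b)))
    loglik = sum(math.log(p) if yi else math.log(1.0 - p) for yi in y)
    logdens = -0.5 * math.log(2.0 * math.pi * sigma**2) - b**2 / (2.0 * sigma**2)
    return loglik + logdens

def derivatives(b):
    """First and second derivative of log_integrand with respect to b."""
    p = 1.0 / (1.0 + math.exp(-(eta_fixed + b)))
    grad = sum(yi - p for yi in y) - b / sigma**2
    hess = -len(y) * p * (1.0 - p) - 1.0 / sigma**2
    return grad, hess

# Brute-force reference: trapezoidal rule on a wide grid.
lo, hi, n = -10.0, 10.0, 4000
step = (hi - lo) / n
vals = [math.exp(log_integrand(lo + i * step)) for i in range(n + 1)]
quadrature = step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

# Laplace approximation: find the mode of the log-integrand by Newton's
# method, then integrate its second-order Taylor expansion around the mode.
b_hat = 0.0
for _ in range(50):
    grad, hess = derivatives(b_hat)
    b_hat -= grad / hess
_, hess = derivatives(b_hat)
laplace = math.exp(log_integrand(b_hat)) * math.sqrt(2.0 * math.pi / -hess)

rel_error = abs(laplace - quadrature) / quadrature
print(quadrature, laplace, rel_error)
```

In this one-dimensional case brute-force integration is cheap. With many clusters and several random effects per cluster the integrals become high-dimensional, which is why approximations of this kind are attractive despite the bias they may introduce.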
Documentation of the Package
============================
.. toctree::
:titlesonly:
:maxdepth: 1
mclogit/manual-pages
mclogit/manual-index
mclogit/technical
References
==========
.. bibliography::
:citations:
.. |Travis build status| image:: https://travis-ci.org/melff/mclogit.svg?branch=master
:target: https://travis-ci.org/melff/mclogit
.. |Current release on GitHub| image:: https://img.shields.io/github/release/melff/mclogit.svg
:target: https://github.com/melff/mclogit/releases/
.. |CRAN| image:: https://www.r-pkg.org/badges/version/mclogit
:target: https://cran.r-project.org/package=mclogit
.. |Total downloads from RStudio CRAN mirror| image:: https://cranlogs.r-pkg.org/badges/grand-total/mclogit
:target: https://cran.r-project.org/web/packages/mclogit/index.html
.. |Monthly downloads from RStudio CRAN mirror| image:: https://cranlogs.r-pkg.org/badges/mclogit
:target: https://cran.r-project.org/web/packages/mclogit/index.html