The IWLS algorithm used to fit conditional logit models
The package “mclogit” fits conditional logit models using a maximum likelihood
estimator. It does this by maximizing the log-likelihood function using an
iterative weighted least-squares (IWLS) algorithm, which follows the
algorithm used by the glm.fit() function from the “stats” package of R.
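If $\pi_{ij}$ is the probability that individual $i$ chooses alternative $j$ from his/her choice set $\mathcal{S}_i$, where

$$\pi_{ij} = \frac{\exp(\eta_{ij})}{\sum_{k\in\mathcal{S}_i}\exp(\eta_{ik})}$$

and if $y_{ij}$ is the dummy variable which equals 1 if individual $i$ chooses alternative $j$ and equals 0 otherwise, the log-likelihood function (given that the choices are independent and identically distributed given $\pi_{ij}$) can be written as

$$\ell = \sum_{i,j} y_{ij}\ln\pi_{ij} = \sum_{i,j} y_{ij}\eta_{ij} - \sum_i \ln\Bigl(\sum_{j\in\mathcal{S}_i}\exp(\eta_{ij})\Bigr)$$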
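If the data are aggregated in terms of counts such that $n_{ij}$ is the number of individuals with the same choice set and the same choice probabilities $\pi_{ij}$ that have chosen alternative $j$, the log-likelihood is (given that the choices are independent and identically distributed given $\pi_{ij}$)

$$\ell = \sum_{i,j} n_{ij}\ln\pi_{ij} = \sum_{i,j} n_{ij}\eta_{ij} - \sum_i n_{i+}\ln\Bigl(\sum_{j\in\mathcal{S}_i}\exp(\eta_{ij})\Bigr)$$

where $n_{i+} = \sum_{j\in\mathcal{S}_i} n_{ij}$.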
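The two ways of writing the count-based log-likelihood can be checked numerically. The following is a minimal sketch in plain Python, not code from the mclogit package; the function names and the small data set are hypothetical and chosen only for illustration. The individual-level case corresponds to counts that sum to 1 within each choice set.

```python
import math

def choice_probs(eta):
    """Conditional logit probabilities for one choice set."""
    m = max(eta)  # subtracting the maximum avoids overflow in exp()
    expd = [math.exp(e - m) for e in eta]
    total = sum(expd)
    return [e / total for e in expd]

def loglik_counts(eta_sets, n_sets):
    """First form: sum over i, j of n_ij * ln(pi_ij)."""
    ll = 0.0
    for eta, n in zip(eta_sets, n_sets):
        pi = choice_probs(eta)
        ll += sum(n_ij * math.log(p) for n_ij, p in zip(n, pi))
    return ll

def loglik_counts_alt(eta_sets, n_sets):
    """Second form: sum n_ij * eta_ij - sum_i n_i+ * ln(sum_j exp(eta_ij))."""
    ll = 0.0
    for eta, n in zip(eta_sets, n_sets):
        n_plus = sum(n)
        ll += sum(n_ij * e for n_ij, e in zip(n, eta))
        ll -= n_plus * math.log(sum(math.exp(e) for e in eta))
    return ll

# Hypothetical data: two choice sets with three alternatives each.
eta_sets = [[0.0, 1.0, -1.0], [0.5, 0.5, 0.0]]
n_sets = [[12, 20, 8], [15, 5, 10]]

ll_a = loglik_counts(eta_sets, n_sets)
ll_b = loglik_counts_alt(eta_sets, n_sets)
print(ll_a, ll_b)  # the two forms agree up to rounding error
```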
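If

$$\eta_{ij} = \alpha_1 x_{1ij} + \cdots + \alpha_r x_{rij} = \boldsymbol{x}_{ij}'\boldsymbol{\alpha}$$

then the gradient of the log-likelihood with respect to the coefficient vector $\boldsymbol{\alpha}$ is

$$\frac{\partial\ell}{\partial\boldsymbol{\alpha}} = \sum_{i,j}\frac{\partial\eta_{ij}}{\partial\boldsymbol{\alpha}}\frac{\partial\ell}{\partial\eta_{ij}} = \sum_{i,j}\boldsymbol{x}_{ij}\,(n_{ij} - n_{i+}\pi_{ij}) = \sum_{i,j}\boldsymbol{x}_{ij}\,n_{i+}\,(y_{ij} - \pi_{ij}) = \boldsymbol{X}'\boldsymbol{N}(\boldsymbol{y} - \boldsymbol{\pi})$$

and the Hessian is

$$\frac{\partial^2\ell}{\partial\boldsymbol{\alpha}\,\partial\boldsymbol{\alpha}'} = -\sum_i n_{i+}\sum_{j,k\in\mathcal{S}_i}\boldsymbol{x}_{ij}\,\pi_{ij}(\delta_{jk} - \pi_{ik})\,\boldsymbol{x}_{ik}' = -\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}$$

Here $y_{ij} = n_{ij}/n_{i+}$ and $\delta_{jk}$ equals 1 if $j = k$ and 0 otherwise, while $\boldsymbol{N}$ is a diagonal matrix with diagonal elements $n_{i+}$. The matrix $\boldsymbol{W}$ collects the weights $n_{i+}\pi_{ij}(\delta_{jk} - \pi_{ik})$, which are non-zero only within a choice set.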
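As an illustration, the gradient and the matrix $\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}$ (the negative Hessian) can be accumulated choice set by choice set and the gradient compared with a finite-difference approximation. This is a sketch under the assumption that each block of $\boldsymbol{W}$ equals $n_{i+}$ times the covariance matrix of a single multinomial draw; the function names and data are hypothetical and not taken from mclogit.

```python
import math

def choice_probs(eta):
    m = max(eta)  # stabilise exp()
    expd = [math.exp(e - m) for e in eta]
    total = sum(expd)
    return [e / total for e in expd]

def linear_predictor(alpha, X):
    return [sum(a * x for a, x in zip(alpha, row)) for row in X]

def loglik(alpha, X_sets, n_sets):
    ll = 0.0
    for X, n in zip(X_sets, n_sets):
        pi = choice_probs(linear_predictor(alpha, X))
        ll += sum(n_ij * math.log(p) for n_ij, p in zip(n, pi))
    return ll

def gradient_and_information(alpha, X_sets, n_sets):
    """Return the gradient and X'WX, summed over choice sets."""
    r = len(alpha)
    grad = [0.0] * r
    info = [[0.0] * r for _ in range(r)]
    for X, n in zip(X_sets, n_sets):
        n_plus = sum(n)
        pi = choice_probs(linear_predictor(alpha, X))
        # Probability-weighted mean of the covariates in this choice set.
        xbar = [sum(p * row[a] for row, p in zip(X, pi)) for a in range(r)]
        for row, n_ij, p in zip(X, n, pi):
            for a in range(r):
                grad[a] += row[a] * (n_ij - n_plus * p)
                for b in range(r):
                    info[a][b] += n_plus * p * (row[a] - xbar[a]) * (row[b] - xbar[b])
    return grad, info

# Hypothetical data: three choice sets, three alternatives, two covariates.
X_sets = [
    [[1.0, 0.5], [0.0, 1.5], [0.0, 0.0]],
    [[1.0, 2.0], [0.0, 0.5], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 2.0], [0.0, 0.5]],
]
n_sets = [[12, 20, 8], [15, 5, 10], [9, 14, 7]]
alpha = [0.2, -0.1]

grad, info = gradient_and_information(alpha, X_sets, n_sets)

# Central finite differences of the log-likelihood as an independent check.
h = 1e-6
num_grad = []
for a in range(len(alpha)):
    up = alpha[:]
    up[a] += h
    down = alpha[:]
    down[a] -= h
    num_grad.append((loglik(up, X_sets, n_sets) - loglik(down, X_sets, n_sets)) / (2 * h))

print(grad, num_grad)
```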
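Newton-Raphson iterations then take the form

$$\boldsymbol{\alpha}^{(s+1)} = \boldsymbol{\alpha}^{(s)} - \left(\frac{\partial^2\ell}{\partial\boldsymbol{\alpha}\,\partial\boldsymbol{\alpha}'}\right)^{-1}\frac{\partial\ell}{\partial\boldsymbol{\alpha}} = \boldsymbol{\alpha}^{(s)} + \left(\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}\right)^{-1}\boldsymbol{X}'\boldsymbol{N}(\boldsymbol{y} - \boldsymbol{\pi})$$

where $\boldsymbol{\pi}$ and $\boldsymbol{W}$ are evaluated at $\boldsymbol{\alpha} = \boldsymbol{\alpha}^{(s)}$.
Multiplying by $\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}$ gives

$$\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}\,\boldsymbol{\alpha}^{(s+1)} = \boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}\,\boldsymbol{\alpha}^{(s)} + \boldsymbol{X}'\boldsymbol{N}(\boldsymbol{y} - \boldsymbol{\pi}) = \boldsymbol{X}'\boldsymbol{W}\left(\boldsymbol{X}\boldsymbol{\alpha}^{(s)} + \boldsymbol{W}^{-}\boldsymbol{N}(\boldsymbol{y} - \boldsymbol{\pi})\right) = \boldsymbol{X}'\boldsymbol{W}\boldsymbol{y}^*$$

where $\boldsymbol{W}^{-}$ is a generalized inverse of $\boldsymbol{W}$ and $\boldsymbol{y}^*$ is a “working response vector” with elements

$$y_{ij}^* = \boldsymbol{x}_{ij}'\boldsymbol{\alpha}^{(s)} + \frac{y_{ij} - \pi_{ij}}{\pi_{ij}}$$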
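The form of the working response elements can be verified with a short calculation. The sketch below assumes the usual block structure of the conditional logit Hessian, with one block $\boldsymbol{W}_i$ per choice set; the symbols $\boldsymbol{D}_i$ and $\boldsymbol{1}$ (a vector of ones) are notation introduced here for illustration only. It uses $\boldsymbol{1}'\boldsymbol{D}_i = \boldsymbol{\pi}_i'$, $\boldsymbol{1}'\boldsymbol{\pi}_i = 1$, and $\boldsymbol{1}'(\boldsymbol{y}_i - \boldsymbol{\pi}_i) = 0$.

```latex
\begin{aligned}
\boldsymbol{W}_i &= n_{i+}\left(\boldsymbol{D}_i - \boldsymbol{\pi}_i\boldsymbol{\pi}_i'\right),
\qquad \boldsymbol{D}_i = \operatorname{diag}(\boldsymbol{\pi}_i),
\qquad \boldsymbol{W}_i^{-} = n_{i+}^{-1}\boldsymbol{D}_i^{-1} \\
\boldsymbol{W}_i\boldsymbol{W}_i^{-}\boldsymbol{W}_i
 &= n_{i+}\left(\boldsymbol{I} - \boldsymbol{\pi}_i\boldsymbol{1}'\right)
    \left(\boldsymbol{D}_i - \boldsymbol{\pi}_i\boldsymbol{\pi}_i'\right)
  = n_{i+}\left(\boldsymbol{D}_i - \boldsymbol{\pi}_i\boldsymbol{\pi}_i'\right)
  = \boldsymbol{W}_i \\
\boldsymbol{W}_i^{-}\,n_{i+}\left(\boldsymbol{y}_i - \boldsymbol{\pi}_i\right)
 &= \boldsymbol{D}_i^{-1}\left(\boldsymbol{y}_i - \boldsymbol{\pi}_i\right) \\
\boldsymbol{W}_i\boldsymbol{D}_i^{-1}\left(\boldsymbol{y}_i - \boldsymbol{\pi}_i\right)
 &= n_{i+}\left(\boldsymbol{I} - \boldsymbol{\pi}_i\boldsymbol{1}'\right)\left(\boldsymbol{y}_i - \boldsymbol{\pi}_i\right)
  = n_{i+}\left(\boldsymbol{y}_i - \boldsymbol{\pi}_i\right)
\end{aligned}
```

The second line shows that $n_{i+}^{-1}\boldsymbol{D}_i^{-1}$ is one valid generalized inverse of $\boldsymbol{W}_i$. The third line gives the $j$-th element $(y_{ij} - \pi_{ij})/\pi_{ij}$ of the adjustment term, and the last line confirms that premultiplying by $\boldsymbol{W}_i$ recovers the gradient contribution $n_{i+}(\boldsymbol{y}_i - \boldsymbol{\pi}_i)$.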
The IWLS algorithm thus involves the following steps:
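1. Create some suitable starting values for $\boldsymbol{\pi}$, $\boldsymbol{W}$, and $\boldsymbol{y}^*$.
2. Construct the “working dependent variable” $\boldsymbol{y}^*$.
3. Solve the equation

   $$\boldsymbol{X}'\boldsymbol{W}\boldsymbol{X}\,\boldsymbol{\alpha} = \boldsymbol{X}'\boldsymbol{W}\boldsymbol{y}^*$$

   for $\boldsymbol{\alpha}$.
4. Compute updated $\boldsymbol{\eta}$, $\boldsymbol{\pi}$, $\boldsymbol{W}$, and $\boldsymbol{y}^*$.
5. Compute the updated value for the log-likelihood or the deviance

   $$d = 2\sum_{i,j} n_{ij}\ln\frac{y_{ij}}{\pi_{ij}}$$

6. If the decrease of the deviance (or the increase of the log-likelihood) is smaller than a given tolerance criterion (typically $\Delta d \leq 10^{-7}$), stop the algorithm and declare it as converged. Otherwise go back to step 2 with the updated value of $\boldsymbol{\alpha}$.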
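The steps above can be sketched in plain Python. This is an illustrative sketch, not the implementation used by mclogit: the names `iwls` and `solve`, the data, the default tolerance, and the choice to start from a coefficient vector of zeros are all assumptions made for this example. The weight matrix is never formed explicitly; each choice set contributes probability-weighted, centred cross-products, which is equivalent to using a block of the form $n_{i+}(\operatorname{diag}(\boldsymbol{\pi}_i) - \boldsymbol{\pi}_i\boldsymbol{\pi}_i')$.

```python
import math

def choice_probs(eta):
    m = max(eta)  # stabilise exp()
    expd = [math.exp(e - m) for e in eta]
    total = sum(expd)
    return [e / total for e in expd]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, k):
            f = M[i][c] / M[c][c]
            for j in range(c, k + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * k
    for c in reversed(range(k)):
        x[c] = (M[c][k] - sum(M[c][j] * x[j] for j in range(c + 1, k))) / M[c][c]
    return x

def deviance(n_sets, pi_sets):
    """d = 2 * sum n_ij * ln(y_ij / pi_ij), with 0 * ln(0) taken as 0."""
    d = 0.0
    for n, pi in zip(n_sets, pi_sets):
        n_plus = sum(n)
        for n_ij, p in zip(n, pi):
            if n_ij > 0:
                d += 2.0 * n_ij * math.log(n_ij / n_plus / p)
    return d

def iwls(X_sets, n_sets, tol=1e-7, max_iter=25):
    r = len(X_sets[0][0])
    alpha = [0.0] * r  # assumption: start from alpha = 0 (uniform probabilities)
    dev = float("inf")
    for it in range(max_iter):
        XtWX = [[0.0] * r for _ in range(r)]
        XtWy = [0.0] * r
        for X, n in zip(X_sets, n_sets):
            n_plus = sum(n)
            eta = [sum(a * x for a, x in zip(alpha, row)) for row in X]
            pi = choice_probs(eta)
            # Step 2: working dependent variable y*.
            ystar = [e + (n_ij / n_plus - p) / p for e, n_ij, p in zip(eta, n, pi)]
            xbar = [sum(p * row[a] for row, p in zip(X, pi)) for a in range(r)]
            ybar = sum(p * ys for p, ys in zip(pi, ystar))
            for row, p, ys in zip(X, pi, ystar):
                for a in range(r):
                    XtWy[a] += n_plus * p * (row[a] - xbar[a]) * (ys - ybar)
                    for b in range(r):
                        XtWX[a][b] += n_plus * p * (row[a] - xbar[a]) * (row[b] - xbar[b])
        alpha = solve(XtWX, XtWy)  # Step 3: weighted least-squares update
        # Steps 4 and 5: updated probabilities and deviance.
        pi_sets = [choice_probs([sum(a * x for a, x in zip(alpha, row)) for row in X])
                   for X in X_sets]
        new_dev = deviance(n_sets, pi_sets)
        if abs(dev - new_dev) < tol:  # Step 6: convergence check
            return alpha, new_dev, it + 1
        dev = new_dev
    return alpha, dev, max_iter

# Hypothetical data: three choice sets, three alternatives, two covariates.
X_sets = [
    [[1.0, 0.5], [0.0, 1.5], [0.0, 0.0]],
    [[1.0, 2.0], [0.0, 0.5], [0.0, 1.0]],
    [[1.0, 1.0], [0.0, 2.0], [0.0, 0.5]],
]
n_sets = [[12, 20, 8], [15, 5, 10], [9, 14, 7]]

alpha_hat, dev, n_iter = iwls(X_sets, n_sets)
print(alpha_hat, dev, n_iter)
```

Unlike glm.fit(), this sketch has no step-halving or other safeguards, so it is only suitable for well-behaved data in which every alternative is chosen at least once.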
The starting values for the algorithm used by the mclogit package are constructed as follows:

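1. Set

   $$\eta_{ij}^{(0)} = \ln\bigl(n_{ij} + \tfrac12\bigr) - \frac{1}{q_i}\sum_{k\in\mathcal{S}_i}\ln\bigl(n_{ik} + \tfrac12\bigr)$$

   (where $q_i$ is the size of the choice set $\mathcal{S}_i$).
2. Compute the starting values of the choice probabilities $\pi_{ij}^{(0)}$ according to the equation at the beginning of the page.
3. Compute initial values of the working dependent variable according to

   $$y_{ij}^{*(0)} = \eta_{ij}^{(0)} + \frac{y_{ij} - \pi_{ij}^{(0)}}{\pi_{ij}^{(0)}}$$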
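A small sketch of this construction for a single choice set follows. It assumes that the starting linear predictors are the logarithms of the counts plus one half, centred within the choice set, and that $y_{ij} = n_{ij}/n_{i+}$; the function name `start_values` and the counts are hypothetical. Under this assumption the starting probabilities reduce to $(n_{ij} + \tfrac12)/(n_{i+} + q_i/2)$, so they stay strictly positive even for alternatives that were never chosen.

```python
import math

def start_values(n):
    """Starting eta, pi, and y* for one choice set with counts n."""
    q = len(n)  # size of the choice set
    logs = [math.log(n_ij + 0.5) for n_ij in n]
    mean_log = sum(logs) / q
    eta0 = [l - mean_log for l in logs]  # centred within the choice set

    # Choice probabilities from the conditional logit formula.
    expd = [math.exp(e) for e in eta0]
    total = sum(expd)
    pi0 = [e / total for e in expd]

    # Working dependent variable based on observed proportions.
    n_plus = sum(n)
    y = [n_ij / n_plus for n_ij in n]
    ystar0 = [e + (y_ij - p) / p for e, y_ij, p in zip(eta0, y, pi0)]
    return eta0, pi0, ystar0

# Hypothetical counts, including an alternative that was never chosen.
eta0, pi0, ystar0 = start_values([12, 20, 0])
print(eta0, pi0, ystar0)
```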