Purpose
Multinomial logit model is used to estimate probability of each categorical outcome from multiple choices. Basic idea is same to binary logit model; set a hidden factor z for each probability and build regression equations on them. Its likelihood is given by a function involving probabilities. Finally, maximizing sum of logarithm of likelihood leads to estimates of coefficients.
Note
Most important feature of multinomial logit model is to set a “base category.” It is needed because multinomial logit estimates probabilities of shift from base to other categories.
In short, it estimates relative probability of outcomes to base outcome.
Calculation
Let’s say here is a case where there can be k outcomes. First, set a latent factor z for each outcome and builid regression equations on them.
Probability of each outcome is given by a function below;
where z_{0} = 0, so therefore exp(z_{0}) = 1.
Then, the likelihood is computed as product of all probabilities; L(θ; y)=Πp_{i}
Now, maximizing sum of logarithm of likelihood leads to coefficients estimation.
Example
Here is an example of medicine choices from 3 brands by 735 patients. Brand A, B, and C are denoted here as 1, 2, and 3. Independent variables are gender and age.
Now, hidden variable z for each outcome should be created.
Computing z for 1(z_{0}), 2(z_{1}), and 3(z_{2}). Note β_{1} has 2 coefficients for dummy variable (female). So, equation is z_{i} = α_{i} + β_{1(0)}(10)(female) + β_{1(1)}(female) + β_{2}(age)
Next, logit of each z (estimated probability) is calculated as below. Look into the form of equations.
As mentioned above, again, the equation is p_{i} = exp(z_{i})/Σexp(z_{i}) where z_{1} = 0 thus exp(z_{1}) = 1.
Note: By the way, as you can see, z_{0} here is not used anywhere else. So, actually it isn’t needed.
And product of all probabilities is likelihood. The goal is to maximize sum of its logarithm.
And the predicted category will be naturally the choice of highest probability.
Finally, Excel solver will find coefficients to maximize sum of loglikelihood.
Now, it’s done. See the picture below;
Results. See coefficients of z_{1} are unchanged, because z_{1} is not involving calculation of estimated probabilities.
“Goodness” in the picture above is the proportion of correct prediction; this model predicted 315 patients’ choices correctly out of 735 patients.
Appendix: z_{0} = 0, thus exp(z_{0}) = 1
Here is a short note about why z_{0} = 0.

As mentioned earlier, multinomial estimates probability of shift from “base category” to others. So, probability controller z for base category should be both mean and median of all its possible values.

As mentioned in the post on binary logit model, z ~ (∞, ∞). Thus, z_{0} = 0.

In other words, multinomial logit model estimates shift probabilities by estimating shift of z from 0.
Dear Len,
It would be of interes if you upload the Excel file used to sustain your explications. Can you?
Thank you.
Best regards,
aspi
Dear Len, Is it possible for you to share the Excel for the multinomial logit model? Best, Praveen