Multinomial Logistic Regression

Nominal Target Variable

It is easy to generalize the binary logistic regression model to the ordinal logistic regression model. However, we need to change the underlying model and math slightly to extend to nominal target variables.

In binary logistic regression we are calculating the probability that an observation has an event. In ordinal logistic regression we are calculating the probability that an observation has at most that event in an ordered list of outcomes. In nominal (or multinomial) logistic regression we are calculating the probability that an observation has a specific event in an unordered list of events.

We will be using the alligator data set to model the association between various factors and alligator food choices - fish, invertebrate, birds, reptiles, and other. The variables in the data set are the following:

Variable   Description
size       small (< 2.3 meters) or large (> 2.3 meters)
lake       lake where captured (George, Hancock, Oklawaha, Trafford)
gender     male or female
count      number of observations with these characteristics

Generalized Logit Model

Instead of modeling the typical logit from binary or ordinal logistic regression, we will model the generalized logits. These generalized logits are built for target variables with \(m\) categories, one of which serves as the reference category. The first logit compares the probability of the first category to the probability of the reference category. The second logit compares the probability of the second category to the probability of the reference category. This continues until the last logit, which compares the \((m-1)\)th category to the reference category, for a total of \(m-1\) logits. In essence, for a target variable with \(m\) categories, we are building \(m-1\) logistic regressions. However, unlike ordinal logistic regression, where the models have different intercepts but share the same slope parameters, in nominal logistic regression each model has both its own intercept and its own slope parameters.

The logits are also not traditional logits. Instead of the natural log of the odds, they are natural logs of relative risks - ratios of two probabilities that do not sum to 1.
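To make the description above concrete, the generalized logits can be written out as follows. This is a sketch under assumed notation that does not appear in the text: the reference category is taken to be category \(m\), there are \(k\) predictors \(x_1, \ldots, x_k\), and \(\beta_{0,j}, \ldots, \beta_{k,j}\) denote the intercept and slopes of the \(j\)th logit.

\[
\log\left(\frac{P(Y = j \mid x)}{P(Y = m \mid x)}\right) = \beta_{0,j} + \beta_{1,j} x_1 + \cdots + \beta_{k,j} x_k, \qquad j = 1, \ldots, m-1
\]

Every coefficient carries the subscript \(j\), which reflects that each of the \(m-1\) logits has its own intercept and its own slopes. The ratio inside the log compares two category probabilities that together need not sum to 1, which is why it is not an odds in the usual sense.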

Let’s see how to build these generalized logit models in each of our software packages!

Interpretation

Similar to binary and ordinal logistic regression models, we exponentiate the coefficients in our nominal logistic regression model to make them interpretable. However, the interpretation changes since these are no longer odds ratios but relative risk ratios. Let’s use the coefficient for the size variable in the birds model. It would be incorrect to say that large alligators are 2.076 times as likely to eat birds as small alligators. The correct interpretation is that the predicted relative probability of eating birds rather than fish is 2.076 times as high for large alligators as for small alligators. Notice how our interpretation is always relative to the reference level (here, fish). Sometimes these are called conditional interpretations.
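The distinction can be checked with a little arithmetic. The sketch below is illustrative only: the coefficient is back-computed from the 2.076 quoted above rather than taken from fitted output, and the predicted probabilities for small alligators are hypothetical numbers chosen for the example.

```python
import math

def relative_risk_ratio(coefficient):
    """Exponentiate a generalized-logit coefficient."""
    return math.exp(coefficient)

# Assumed coefficient for size (large vs. small) in the birds-vs-fish logit,
# back-computed from the 2.076 quoted in the text (not real model output).
beta_size_birds = math.log(2.076)
rrr = relative_risk_ratio(beta_size_birds)

# Hypothetical predicted probabilities for small alligators.
p_birds_small = 0.05
p_fish_small = 0.50

# The ratio P(birds) / P(fish) is the quantity that the coefficient scales.
ratio_small = p_birds_small / p_fish_small
ratio_large = ratio_small * rrr

print(f"relative risk ratio: {rrr:.3f}")
print(f"P(birds)/P(fish), small alligators: {ratio_small:.4f}")
print(f"P(birds)/P(fish), large alligators: {ratio_large:.4f}")
```

The multiplier applies to the ratio of the birds probability to the fish probability, not to the birds probability itself. How much the birds probability changes also depends on what happens to the fish probability and to every other category.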

Let’s see how to get these from each of our software packages!

Predictions & Diagnostics

Nominal logistic regression has a lot of similarities to binary logistic regression:

  • Multicollinearity still exists
  • Non-convergence problems still exist
  • AIC and BIC metrics can still be calculated
  • Generalized \(R^2\) remains the same
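As a rough illustration of the AIC and BIC point, the sketch below uses the standard definitions together with the fact that a generalized logit model estimates one intercept and one set of slopes per non-reference category. The function names, the log-likelihood, and the sample size are all hypothetical. The count of five design columns assumes a main-effects model with size, lake (three dummy variables), and gender.

```python
import math

def n_parameters(m_categories, k_design_columns):
    """Parameters in a generalized logit model: (m - 1) intercepts and slope sets."""
    return (m_categories - 1) * (k_design_columns + 1)

def aic(log_likelihood, n_params):
    """Akaike information criterion."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Five food categories; five design columns (assumed main-effects model).
p = n_parameters(m_categories=5, k_design_columns=5)

# Hypothetical values, for illustration only (not fitted results).
hypothetical_loglik = -250.0
hypothetical_n = 200

print("parameters:", p)
print("AIC:", aic(hypothetical_loglik, p))
print("BIC:", round(bic(hypothetical_loglik, p, hypothetical_n), 2))
```

Because the parameter count grows with the number of target categories, the penalty terms in AIC and BIC grow faster here than in a binary model with the same predictors.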

There are some inherent differences though between binary and nominal logistic regression. A lot of the diagnostics cannot be calculated for multinomial logistic regression. ROC curves and residuals cannot typically be calculated because there is actually more than one logistic regression occurring.

Predicted probabilities, however, are produced for every category of the target variable, and for each observation they sum to 1 across the categories. Let’s see how we can get predicted probabilities from each of our software packages!
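Independent of any particular package, the step from generalized logits to predicted probabilities can be sketched in a few lines. The function name and the linear predictor values below are hypothetical. Fish is treated as the reference category, consistent with the interpretation example earlier.

```python
import math

def predict_probabilities(linear_predictors, reference="fish"):
    """Convert generalized logits into a probability for every category.

    linear_predictors maps each non-reference category to its fitted
    linear predictor (intercept plus slopes times predictor values).
    """
    exp_terms = {cat: math.exp(eta) for cat, eta in linear_predictors.items()}
    denominator = 1.0 + sum(exp_terms.values())
    probabilities = {cat: term / denominator for cat, term in exp_terms.items()}
    # The reference category has an implicit linear predictor of 0, so exp(0) = 1.
    probabilities[reference] = 1.0 / denominator
    return probabilities

# Hypothetical linear predictors for one alligator (illustrative values only).
example_logits = {"invertebrate": -0.4, "birds": -2.1, "reptiles": -1.8, "other": -1.2}
example_probs = predict_probabilities(example_logits)

for category, prob in example_probs.items():
    print(f"{category:>12}: {prob:.3f}")
print("total:", round(sum(example_probs.values()), 6))
```

Each observation receives one probability per category, and a common way to turn these into a single predicted class is to pick the category with the largest probability.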