
Analyzing Ordinal Data in SAS using the Multinomial Distribution

Last Updated on March 24, 2022 by Editorial Team

Author(s): Dr. Marc Jacobs


This post is an extension of an earlier introductory post I made on using Generalized Linear Mixed Models in SAS.

Below, I will use a dataset containing the diarrhea scores of pigs to show how to analyze ordinal data. Before doing so, keep in mind that diarrhea scores are assessed subjectively: raters score diarrhea on a scale from 0 to 3 in steps of 1 (0, 1, 2, 3). A score of nine means there were no feces available to assess, and it is not immediately clear whether that implies the absence of diarrhea or simply the absence of a sample. That ambiguity can also have an effect on the analysis itself.

Score distribution of a swine population by five raters. It is not difficult to spot that the results of a trial depend not just on the 'actual' score but also on the person who rated the feces.
Including or excluding a particular score can have an influence, since it affects the denominator and thus the probability estimate.

Ordinal data can be analyzed in multiple ways, of which we show the ordinal and the multinomial approach in this post. The type of analysis you use depends on your assumptions, the specific question you would like to answer, and what the model is able to accommodate.

The type of analysis should be based on the specific question you would like to answer, the availability of the data, and the capabilities of the models.
Model-wise, it definitely matters whether you model the data across time or per time point. Ordinal and multinomial models in SAS cannot accommodate repeated-measures analyses across time points.

Diarrhea is a discrete outcome and as such should be treated as a proportion, not as a continuous distribution with a true mean. Renaming the levels of diarrhea from {0, 1, 2, 3, 9} to {A, B, C, D, Z} would remove the confusion (a small sketch of such a recode follows after the list below). Diarrhea can be measured in multiple ways, and the choice should depend on:

  1. the specific question you want answered
  2. the availability of data
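As a small illustration of the renaming idea above (a hedged sketch: the dataset name pig_scores and the variables fc_score and fc_label are assumptions, not the author's original code), a simple data step can map the numeric scores to the letter labels:

    /* Illustrative recode: label the scores so they cannot be mistaken for a
       continuous outcome; 9 means no feces were available to score. */
    data pig_scores_recode;
        set pig_scores;
        length fc_label $1;
        select (fc_score);
            when (0) fc_label = 'A';
            when (1) fc_label = 'B';
            when (2) fc_label = 'C';
            when (3) fc_label = 'D';
            when (9) fc_label = 'Z';   /* later treated as missing in the analysis */
            otherwise fc_label = ' ';
        end;
    run;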

Let's do some actual analysis and use a dataset that looks like this to start with.

The dataset in which, for each pig, we have the diarrhea score per day, up to 42 days.

The multinomial approach is perhaps the easiest to begin with. Remember, ordinal and multinomial models cannot handle repeated data, so only total periods may be used (e.g., days 0–42, 0–14, etc.). To use ordinal/multinomial models, some specific data management is therefore required:

  1. Per block, treatment, and fecal consistency score, the frequency needs to be calculated and put into the model.
  2. For fecal consistency, category '9' (FC = 9) is removed, as these observations are often deemed missing values (see the earlier discussion points).

Below you see the PROC SQL step used to transform the data into the structure needed for both multinomial and ordinal modeling. The trick is to get frequencies per category, per treatment, per block.
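A minimal sketch of such a PROC SQL step (the dataset and variable names are assumptions, not the article's exact code):

    /* Count observations per block, treatment, and fecal consistency score,
       dropping the '9' (no feces) category beforehand. */
    proc sql;
        create table fc_freq as
        select block,
               treatment,
               fc_score,
               count(*) as count
        from   pig_scores
        where  fc_score ne 9
        group by block, treatment, fc_score;
    quit;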

Which will make the dataset look like this.
And it immediately shows the distribution of scores across treatments.

Below, you see the full code used to model the multinomial. The distribution is set to multinomial, and this is also the case if you model using the ordinal approach. The key is therefore the link function, set here to glogit, which will be set to clogit for the ordinal approach. Glogit stands for generalized logit and means that we estimate the proportions for each category separately. Treatment four is set as the reference treatment. For the estimates, you need the bycat option to get estimates per category of the fecal consistency score.
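Since the original code is shown as an image, here is a hedged sketch of what such a PROC GLIMMIX call could look like (dataset, variable names, the number of treatments, and the contrast coefficients are all assumptions, not the author's exact code):

    /* Generalized-logit (multinomial) model on the aggregated frequencies. */
    proc glimmix data=fc_freq;
        class block treatment;
        freq count;                                /* one row per block x treatment x score */
        model fc_score(ref='0') = treatment
              / dist=multinomial link=glogit solution oddsratio;
        random intercept / subject=block;          /* block entered as a random effect */
        /* bycategory (bycat) repeats the contrast for every non-reference score;
           coefficients assume five treatments with treatment 4 as the comparator. */
        estimate 'Trt 1 vs Trt 4' treatment 1 0 0 -1 0 / bycategory exp cl;
    run;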

Here you can see the full code used to analyze the dataset using a multinomial model.
No treatment differences in the distribution of the fecal scores. No surprise, if you look at the plot based on the observed data.
And the differences. Since we are dealing with a logit function, the comparison once again has to be done via the odds ratio. More important to notice, however, is that the comparison is not done at the treatment level alone; it is also done at the fecal score level. Hence, treatments are compared on their probability of achieving a particular fecal score. Since we are doing it the multinomial way, there is no interdependence between the score categories.
And the most important plot by far. This plot shows the probability of achieving a particular score given a particular treatment.

Plotting the results is often a better way of coming to terms with your model. This graph shows the probabilities for each of the diarrhea scores across treatments. From the graph you can see that:

  1. Score 3 is almost absent
  2. Score 0 is most present
  3. Treatment differences seem to be most visible for score 2

To understand the odds ratios from the previous table, you need to get comfortable with proportions. So let's use another example, but to do so, we need to once again transform the data so that PROC GLIMMIX can process it.

The score distribution by blocks and treatments.

And below, the same code we saw before.

The code and the probability distribution. This time, one can clearly see that treatment two sticks out.

Now, the plot above is quite easy to interpret, but of course you need actual numerical estimates to make sense of the effect size of the differences between treatments. In a multinomial model, the interpretation depends on the choices you made, even more than for any other kind of model.

So, as you can see, the table is a bit different than normal, but we already knew that. Comparisons are made per score category. If you look more closely, you can see that score 0 is missing. That is because it is the reference score. If you want specific estimates for score = 0, you need to set a different reference score.
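Rerunning with a different reference score only requires changing the response option; a brief sketch (same illustrative names as above):

    /* Same model, but with score 2 as the reference category, so that
       score 0 now gets its own estimates. */
    proc glimmix data=fc_freq;
        class block treatment;
        freq count;
        model fc_score(ref='2') = treatment
              / dist=multinomial link=glogit solution oddsratio;
        random intercept / subject=block;
    run;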

The reference score determines all.

For the rest, it is pretty much straightforward. You get an estimate and, for each estimate, a mean, and that mean is the probability. Once we move toward direct comparisons and treatment differences, you will have to deal with odds ratios again.

The estimates in a table.
The estimates in a graph on the left and in the middle, and a direct comparison on the right. The comparison on the right is NOT an odds ratio. The two left plots show whether the treatments are similar, per fecal score (1 & 2). The middle plot shows that, for fecal score 2, treatments two, four, and five are different. This is confirmed by the plot on the right, in which treatments two, four, and five cross.

The influence of the reference score is easiest shown by repeating the analysis with a different reference score. Below, you see the probability plots for reference scores zero and two. Now, please look closely and see that the actual probability plot does NOT change. That is because it cannot change. However, the estimates do differ, because the scores are compared to a DIFFERENT reference score.

Probability plots stay the same regardless of the reference score. That is because they are reference-agnostic and treatment-dependent.
Now this is where you can see the difference. If you change the reference score, the treatment estimates and differences change for the same score. This is not a true change, just a change in the estimate caused by using a different anchor point.

Below you can see what changing the reference score does to the interpretation. It is quite something, so be careful when you look at these plots. In fact, I would advise you to only look at these plots in conjunction with the overall probability plot.

They are all connected. Pieces of the same puzzle. Just look at the numbers.
You need this plot to make sense of the plots above and below.
Differences are linked to the reference used.

In summary, when analyzing diarrhea scores, the multinomial model is the most straightforward: it treats the scores as mutually exclusive categories and analyzes the treatment effect per score category. Beware that the reference score affects the results, so keep thinking in terms of probabilities (graphs help) and in terms of odds ratios, which are not intuitive. You could run the model multiple times, changing the reference score and the reference treatment, to get the full picture. Another caveat is that you cannot analyze the data in a repeated-measures way; the model is only suitable for the total period.

Now, let's move to the ordinal approach.

The ordinal model is built on the same foundation as the multinomial model; however, it approaches the data quite differently. In fact, the ordinal model builds in assumptions to make analyses and comparisons easier and more direct. Let's see if that is truly the case.

The majority of multi-level variables are ordinal, which means that the categories have a specific ordering even though the "distance" between the levels is not clear. The categories are man-made, and there is no actual underlying theoretical distribution. When using an ordinal model, however, you do assume that there is a latent continuous variable underneath the categories. You also assume the proportional odds assumption, which is often not met. One way to defend the use of an ordered model is to say that underneath the observed scores there is a latent variable, Y*. This latent variable has thresholds, and crossing a threshold places you in a different category. This is why an ordinal model is often called a 'threshold model'. As a result, an ordinal model comes up with a single equation for Y*. This is also exactly the difference with the multinomial model, which allows for more than one equation since it treats the categories as separate.

A distribution on top of observed categories? Or did we use categories to describe an unobserved distribution? This is the 'problem' an ordinal model tries to deal with by turning it into an advantage. However, most of the time, ordinal models are difficult to fit because the equal spacing they assume between categories makes no sense whatsoever.

In an ordinal model, the only parameter that changes across categories is the intercept; the fixed effects remain the same for each category. The intercepts define the boundaries between categories when the linear predictor is zero. Changes in the linear predictor move the boundaries together, so that the distance between them remains CONSTANT. This is the definition of the proportional odds assumption.
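In formula form, a minimal sketch of the standard cumulative-logit parameterization (generic symbols, not taken from the SAS output):

$$\operatorname{logit}\,\Pr(Y \le j \mid x) = \alpha_j + x^{\top}\beta, \qquad j = 0, 1, \ldots, J-1,$$

where only the intercepts $\alpha_j$ differ between categories, while the slope vector $\beta$ is shared by all of them; that shared $\beta$ is exactly the proportional odds assumption.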

It is the intercept that determines the threshold for the next category. Hence, in a five-category outcome variable, we have four intercepts, as you can see in the example below.

An ordinal model will come up with a single equation for Y*. The intercepts themselves have no real meaning; they are placed on a latent distribution, which is a mathematical construct, as you could see from the pictures above.
The PROC SQL code to transform the data into the necessary structure. It looks exactly the same as the structure needed for the multinomial model, because it is!
The part of the model that I needed to change to run an ordinal model instead of a multinomial model: the cumlogit (or clogit) link. The name is apt, as it means that the probability is modeled in a cumulative way, not generally/separately as in the multinomial model.
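A hedged sketch of the ordinal version of the call (same illustrative names as before; only the link function differs from the multinomial run):

    /* Cumulative-logit (proportional odds) model on the same frequency table. */
    proc glimmix data=fc_freq;
        class block treatment;
        freq count;
        model fc_score = treatment
              / dist=multinomial link=cumlogit solution oddsratio;
        random intercept / subject=block;
    run;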
The most important table from an ordinal analysis. If you see a table like this, you already know that category three will bring you problems. Perhaps best to combine it with category two?
No differences between treatments.
Advanced code, but nothing special, to be honest. Just ways to fully understand what the model has done by requesting more extensive output.
A short summary of how to interpret the output. The biggest difference is that the comparisons are made at the treatment level, not at the treatment-by-score level. This is because the ordinal model assumes proportionality. Hence, the difference between treatments for score one is the same as for score two or three. Remember, this was not the case in the multinomial model. Truth be told, the assumption often makes little sense, but if it DOES hold, you get a much more global model.
The probabilities are cumulative, meaning that the probability for score one is actually the probability for score zero plus the probability for score one.
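As a purely illustrative example (made-up numbers, not taken from the model output): if, for one treatment, the category probabilities were 0.60, 0.25, 0.10, and 0.05 for scores 0 through 3, the cumulative probabilities would be

$$\Pr(Y \le 0) = 0.60, \quad \Pr(Y \le 1) = 0.60 + 0.25 = 0.85, \quad \Pr(Y \le 2) = 0.95, \quad \Pr(Y \le 3) = 1.$$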

The best way to deal with the complex math is to look at this plot: the cumulative probability plot. It shows the cumulative distribution of the frequencies. Here, it is clearly visible that score 0 is the most frequent and that scores 2 and 3 add only a little. Hence, it is perhaps best to combine scores 1, 2, and 3 versus score 0.

The key assumption of the proportional odds model is that the relationship BETWEEN categories remains the SAME. Hence, probabilities are not category-specific, and treatment effects move the boundaries between categories as a group. As such, the odds ratio of a treatment effect for NOT SEVERE vs SEVERE is the same as for SEVERE vs VERY SEVERE.

Remember, an ordinal model tries to find a linear predictor that predicts the change from one category to the next, but the probabilities are not category-specific. We need to test this assumption using PROC GLIMMIX.

The estimate calls are made to test the proportionality assumption. What I want to see are estimates that are similar within a threshold.

If the assumptions of the ordered cumlogit model are met, then all of the corresponding coefficients (except the intercepts) should be the same. The assumptions of the model are therefore sometimes referred to as the parallel lines or parallel regressions assumptions.
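One common informal check of the parallel-lines idea (a sketch of the general approach, not necessarily the estimate-statement route used in the article; names are illustrative) is to fit a separate binary logit for each cumulative split of the score and compare the treatment coefficients across the fits:

    /* Binary indicators for the cumulative splits Y <= 0 and Y <= 1.
       Roughly similar treatment coefficients across the two fits support
       the parallel-lines (proportional odds) assumption. Schematic only:
       in practice the data would first be aggregated over the trial period. */
    data splits;
        set pig_scores;
        where fc_score ne 9;
        le0 = (fc_score <= 0);
        le1 = (fc_score <= 1);
    run;

    proc glimmix data=splits;
        class block treatment;
        model le0(event='1') = treatment / dist=binary link=logit solution;
        random intercept / subject=block;
    run;

    proc glimmix data=splits;
        class block treatment;
        model le1(event='1') = treatment / dist=binary link=logit solution;
        random intercept / subject=block;
    run;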

The estimates are actually quite similar, except for treatments 4 and 5 at the threshold defining the boundary between scores 1 and 2 (c=1). I am also not completely sure about treatments 4 and 5 for c=0. Hence, a multinomial model is perhaps safer to use.

In summary, assuming that the data are ordinal means that there is a certain ordering of the discrete data, and that this ordering is determined by a latent continuous distribution. The data for the ordinal approach need the same setup as for the nominal approach. The results show the effect of the treatment regardless of the specific score; hence, there is no reference score to take into account. Checking whether the proportional odds assumption was upheld is tricky.

I hope you liked this post! More to come!


