Bias vs Fairness vs Explainability in AI

Last Updated on August 27, 2021 by Editorial Team

Author(s): Ed Shee


Photo by Lukas on Unsplash

Over the last few years, there has been a distinct focus on building machine learning systems that are, in some way, responsible and ethical. The terms "Bias", "Fairness" and "Explainability" come up all over the place, but their definitions are usually pretty fuzzy and they are widely misunderstood to mean the same thing. This blog aims to clear that up...

Bias

Before we look at how bias appears in machine learning, let's start with the dictionary definition for the word:

"inclination or prejudice for or against one person or group, especially in a way considered to be unfair"

Look! The definition of bias includes the word "unfair". It's easy to see why the terms bias and fairness get confused for each other a lot.

Bias can impact machine learning systems at pretty much every stage. Here's an example of how historical bias from the world around us can creep into your data:

Imagine you're building a model to predict the next word in a sequence of text. To make sure you've got lots of training data, you give it every book written in the last 50 years. You then ask it to predict the next word in this sentence:

"The CEO's name is ____".

You then notice, perhaps unsurprisingly, that your model is much more likely to predict male names for the CEO than female ones. What has happened is you've unintentionally taken the historical stereotypes that exist in our society and baked them into your model.
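To make this concrete, here is a toy sketch (hypothetical corpus and names, not real data) of how a frequency-based next-word predictor simply reproduces whatever skew exists in its training text:

```python
from collections import Counter

# Hypothetical corpus, skewed toward male CEO names the way
# decades of real-world text tend to be.
corpus = [
    "the ceo's name is john",
    "the ceo's name is john",
    "the ceo's name is michael",
    "the ceo's name is sarah",
]

# Count which word follows the prefix "the ceo's name is".
prefix = "the ceo's name is"
next_words = Counter(
    sentence[len(prefix):].strip()
    for sentence in corpus
    if sentence.startswith(prefix)
)

# The "prediction" is just the most common continuation, so the
# majority (male) name wins purely because of the skewed corpus.
prediction = next_words.most_common(1)[0][0]
print(prediction)  # -> "john"
```

A real language model is vastly more complex, but the mechanism is the same: the model has no notion of fairness, only of frequency.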

Bias doesn't just occur in the data, though; it can appear in the model too. If the data used to test a model doesn't accurately represent the real world, you end up with what's called evaluation bias.

A good example of this would be training a facial recognition system and then using photos from Instagram to test it. Your model might have really high accuracy on the test set but it is likely to underperform in the real world because the majority of Instagram users are between the ages of 18 and 35. Your model is now biased towards that age group and will perform worse on the faces of older or younger people.
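One practical way to catch evaluation bias is to slice your test accuracy by subgroup rather than trusting the headline number. A minimal sketch, using made-up results for illustration:

```python
# Hypothetical per-sample results from a face-recognition test set
# drawn mostly from younger Instagram users.
results = [
    {"age_group": "18-35", "correct": True},
    {"age_group": "18-35", "correct": True},
    {"age_group": "18-35", "correct": True},
    {"age_group": "18-35", "correct": True},
    {"age_group": "60+",   "correct": False},
]

def accuracy(samples):
    return sum(s["correct"] for s in samples) / len(samples)

# Overall accuracy looks healthy...
overall = accuracy(results)
print(overall)  # -> 0.8

# ...but slicing by group exposes the bias the headline number hides.
by_group = {}
for s in results:
    by_group.setdefault(s["age_group"], []).append(s)
per_group = {group: accuracy(samples) for group, samples in by_group.items()}
print(per_group)  # -> {'18-35': 1.0, '60+': 0.0}
```

The pattern generalises: whenever a subgroup is underrepresented in your test set, report per-group metrics alongside the aggregate.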

There are actually loads of different types of bias in machine learning; I'll cover those in a separate blog.

The word bias almost always comes with negative connotations, but it's important to note that this isn't always the case in machine learning. Having prior knowledge of the problem you're trying to solve can help you to select relevant features during modeling. This introduces human bias but can often speed up or improve the modeling process.

Photo by Emily Morter on Unsplash

Explainability

Sometimes referred to as interpretability, explainability attempts to explain how a machine learning model makes predictions. It is about interrogating a model, gathering information on why a particular prediction (or series of predictions) was made, and then presenting this information back to humans in a comprehensible manner.

There are typically two situations you'll be in when trying to explain how a model works:

  • Black Box: You have no access to or information about the underlying model. The inputs and outputs of the model are all you can use to generate an explanation.
  • White Box: You have access to the underlying model, so it's easier to provide information about exactly why a certain prediction was made.

On the whole, "white box" models tend to be simpler in design, sometimes deliberately, so that explanations can be easily generated. The downside is that using a simpler, more interpretable model might fail to capture the complexity of the relationships in your data, which means you could be faced with a tradeoff between interpretability and model performance.

When doing explainability, we're typically interested in one of two things:

  • Model View: Overall, what features are more important than others to the model?
  • Instance View: For a particular prediction, what factors contributed?

The techniques used for explainability depend on whether your model is a black box or white box, whether you're interested in the model view or instance view, and also on the type of data you're exploring. The open source library Alibi does a great job of explaining these techniques in further detail.
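As one example of a black-box, model-view technique, permutation importance only needs the ability to call the model: shuffle one feature's values and see how much the predictions move. A self-contained sketch (the "model" here is a hidden rule invented purely for illustration):

```python
import random

random.seed(0)  # make the sketch deterministic

# Treat the model as a black box: we only call predict(),
# never inspect its internals. (A hidden linear rule, for illustration.)
def predict(x):
    return 3.0 * x[0] + 0.1 * x[1]

data = [[random.random(), random.random()] for _ in range(200)]
baseline = [predict(x) for x in data]

def permutation_importance(feature):
    """Shuffle one feature's values across rows; the bigger the change
    in predictions, the more the model relies on that feature."""
    column = [row[feature] for row in data]
    random.shuffle(column)
    shuffled = [row[:] for row in data]
    for row, value in zip(shuffled, column):
        row[feature] = value
    return sum(abs(predict(x) - b) for x, b in zip(shuffled, baseline)) / len(data)

importances = [permutation_importance(f) for f in range(2)]
print(importances)  # feature 0 scores far higher than feature 1
```

Because the technique never opens the box, it works on any model you can query, which is exactly the black-box situation described above.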

Personally, I like to think of white-box models as "Interpretability" (because of the requirement for an interpretable model) and black-box models as "Explainability" (because we are attempting to explain the unknown). Sadly, however, there is no official definition and the words are often used interchangeably.

Photo by Piret Ilver on Unsplash

Fairness

Fairness is by far the most subjective of the three terms. As we did for bias, let's glance at its everyday definition before looking at how it's applied in machine learning:

"impartial and just treatment or behaviour without favouritism or discrimination."

Applying this to the context of machine learning, the definition I like to use is:

"An algorithm is fair if it makes predictions that do not favour or discriminate against certain individuals or groups based on sensitive characteristics."

Most definitions you'll see (including mine above) tend to narrow the scope to machine learning that affects humans. Typically this is where AI can have disastrous consequences, and so fairness is super important. Something like a mortgage approval or a healthcare diagnosis is such a life-changing event that it's critical we handle predictions in a fair and responsible way.

You're probably asking yourself "What's a 'sensitive characteristic', though?", which is a very good question. The interpretation of the definition depends heavily on what you class as sensitive. Some obvious examples tend to be things like race, gender, sexual orientation, disability, etc.

One approach is to just remove all "sensitive" attributes when building a model. This seems like a sensible thing to do at first, but there are actually multiple issues with this:

  • The sensitive features might actually be critical to the model. Imagine you're trying to predict the height a child will be when they are fully grown. Removing sensitive attributes like age and sex will make your predictions useless.
  • Fairness is not necessarily about being agnostic. Sometimes it's important to include sensitive features in order to favor those who might be discriminated against in other features. An example of this is university admissions, where raw grades alone may not be the best way to find the brightest pupils. Those who had access to fewer resources or a lower quality of education might have had better scores otherwise.
  • Sensitive features might be hidden in other attributes. It is often possible to determine the values for sensitive features using a combination of non-sensitive ones. For example, an applicant's full name might allow a machine learning model to infer their race, nationality, or gender.
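The third point, proxy leakage, is easy to demonstrate. A deliberately trivial sketch (made-up names and labels, not real data) showing that dropping the sensitive column doesn't remove the signal:

```python
# Hypothetical training rows: "first_name" is kept as a feature,
# "gender" is the sensitive attribute we intend to drop.
training_rows = [
    {"first_name": "John",  "gender": "male"},
    {"first_name": "Sarah", "gender": "female"},
    {"first_name": "John",  "gender": "male"},
]

# Even a trivial lookup "model" learns the name -> gender mapping.
proxy_model = {row["first_name"]: row["gender"] for row in training_rows}

# At prediction time the sensitive column is gone,
# but the proxy feature leaks it anyway.
applicant = {"first_name": "Sarah"}
inferred = proxy_model.get(applicant["first_name"])
print(inferred)  # -> "female"
```

A real model would learn softer statistical correlations rather than an exact lookup, but the effect is the same: if a proxy is predictive of the sensitive attribute, removing the column alone doesn't make the model blind to it.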

The reality is that AI fairness is an incredibly difficult field. It requires policymakers to define what "fair" looks like for each use case, which can sometimes be very subjective. Often there is also a trade-off between group fairness and individual fairness. Using the university admissions example from earlier, making your algorithm fairer for an underprivileged group who didn't have the same educational resources (group fairness) comes at the cost of those who had a good educational background and whose grades are now no longer quite good enough (individual fairness).
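Group fairness is often made measurable via metrics like demographic parity: do different groups receive positive outcomes at the same rate? A minimal sketch with invented admissions decisions:

```python
# Hypothetical admissions decisions, each tagged with a sensitive group.
decisions = [
    {"group": "A", "admitted": True},
    {"group": "A", "admitted": True},
    {"group": "A", "admitted": False},
    {"group": "B", "admitted": True},
    {"group": "B", "admitted": False},
    {"group": "B", "admitted": False},
]

def admission_rate(group):
    rows = [d for d in decisions if d["group"] == group]
    return sum(d["admitted"] for d in rows) / len(rows)

# Demographic parity difference: 0 means both groups are
# admitted at the same rate; larger gaps suggest group unfairness.
parity_gap = admission_rate("A") - admission_rate("B")
print(parity_gap)  # -> roughly 0.333 (2/3 vs 1/3)
```

Note that closing this gap says nothing about individual fairness: two applicants with identical grades in different groups may now be treated differently, which is exactly the trade-off described above.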

Summary

In summary, bias, explainability, and fairness are not the same thing. Whilst trying to explain all or part of a machine learning model, you might find that the model contains bias. The existence of that bias might even mean that your model is unfair. That doesn't, however, mean that explainability, bias, and fairness are the same thing.

TL;DR

Bias is a preference or prejudice against a particular group, individual, or feature and comes in many forms.

Explainability is the ability to explain how or why a model makes a prediction.

Fairness is the subjective practice of using AI without favoritism or discrimination, particularly pertaining to humans.


Bias vs Fairness vs Explainability in AI was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
