Understanding Bias in the Simplest Plausible Way
Last Updated on May 23, 2020 by Editorial Team
Author(s): Anirudh Dayma
I have recently started exploring Neural networks, and I came across the terms activation function and bias. Activation functions kind of made sense to me, but I found it difficult to grasp the exact essence of biases in a Neural Network.
I explored various sources, and all they had was –
Bias in Neural Networks can be thought of as analogous to the role of an intercept in linear regression.
But what the heck does this mean? I understand very well that the intercept is the point where the line crosses the y-axis, and that if we don't have an intercept, our line will always pass through the origin. But that isn't how things are in the real world, so we use an intercept to add some flexibility.
So how do biases add flexibility to Neural networks?
You would find the answer to this question by the end of this article.
The above-quoted definition is the most common one we come across when we talk about biases. So I thought I would ignore my question and go ahead with this mediocre explanation of bias (which wasn't making any sense), because ask anyone, this is the mugged-up answer people give when asked about bias. But then I came across an image, and it hit me really hard.
If I am not able to explain biases to a six-year-old, then I guess even I haven't understood it (which indeed is the truth). So I started going through some more resources, and then finally, biases started making sense. Let's understand bias with the help of a perceptron.
Perceptron
A perceptron can be imagined as something that takes in a set of binary inputs x1, x2, … and produces a single binary output.
The weights correspond to the importance of the inputs, AKA features. The output, i.e., 0 or 1, depends on whether the weighted sum of the weights and inputs is greater than some threshold.
Mathematically,

output = 0 if Σwixi ≤ threshold
output = 1 if Σwixi > threshold
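As a quick illustration (a minimal sketch in Python of my own, not code from the original article; the function name perceptron is my own choice), the rule above looks like this:

```python
# Minimal perceptron: fires (returns 1) only when the weighted sum
# of the inputs exceeds the threshold.
def perceptron(inputs, weights, threshold):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0
```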
Let's consider an example: suppose you like a person, and you want to decide whether or not you should tell that person about your feelings. You might arrive at the final decision depending on the following questions.
- Do I really love that person, or is it just an infatuation?
- What if me speaking my heart out would ruin my equation with that person?
- Is it really worth it (would it have negative consequences on your career)?
I know there might be many other questions as well, but these are common ones. So your decision on whether or not to speak your heart out depends on the above questions.
So let us consider x1, x2, and x3 as your 3 questions, with x1 = 0 if it is an infatuation and x1 = 1 if it isn't. Similarly, x2 = 1 if it won't ruin your equation with that person, and x3 = 1 if you feel that it is worth it.
It might so happen that not all the above questions carry equal importance. What you could do is assign some numbers to the questions depending on their importance/relevance. These numbers are nothing but weights.
Suppose you assign weights w1 = 3, w2 = 2, w3 = 7, where w1, w2, and w3 represent the weights for questions 1, 2, and 3, respectively. This means you are most concerned about whether or not it's worth it, as you want to focus on your career and cannot afford distractions (w3 has the highest magnitude). You set the threshold as 6, so if the weighted sum comes out greater than 6, you would speak your heart out. From these weight values, it can be seen that you aren't too worried about whether it's infatuation or whether it would ruin things: even if x1 = 1 and x2 = 1, they won't contribute much to your decision because their weights are small (3 + 2 = 5, still below the threshold). So, looking at the threshold and w3, we can say the deciding factor is x3 (question 3).
Let us consider a different scenario: imagine the weights being w1 = 10, w2 = 3, w3 = 5. Here, whether it is an infatuation or not matters more than anything else. Now say you keep the threshold low, at 2, so you would arrive at a Yes quite quickly. Say x1 = 0, x2 = 1, x3 = 0. In this case, the weighted sum (3) is greater than the threshold even though x1 and x3 are 0. So even though x1 has the largest weight, it didn't play the role of a deciding factor, because the threshold was too low. In other words, by setting a lower threshold, you are more eager to speak your heart out.
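To make the arithmetic concrete, here is the same sketch run on both scenarios (the numbers match the examples above; the code itself is my own illustration):

```python
def perceptron(inputs, weights, threshold):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# Scenario 1: w3 dominates and the threshold is 6.
# Even with x1 = 1 and x2 = 1, the sum is 3 + 2 = 5, still below 6;
# only x3 can push it over, so question 3 is the deciding factor.
print(perceptron([1, 1, 0], [3, 2, 7], threshold=6))  # 0 -> No
print(perceptron([0, 0, 1], [3, 2, 7], threshold=6))  # 1 -> Yes

# Scenario 2: the threshold drops to 2.
# x2 alone (weight 3) clears the bar, even though x1 carries weight 10.
print(perceptron([0, 1, 0], [10, 3, 5], threshold=2))  # 1 -> Yes
```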
So changing the weights and threshold changes the decision as well.
Imagine a case where we don't have a threshold: we would get a Yes as soon as the weighted sum is greater than 0. But we don't want this to happen; we want to reach a conclusion depending on the magnitude of the weighted sum. Only if the weighted sum exceeds a certain threshold should the output be Yes; otherwise it should be No. Hence, we need a threshold.
Let us simplify the equation by taking the threshold to the left-hand side:

output = 0 if Σwixi − threshold ≤ 0
output = 1 if Σwixi − threshold > 0
Similarly, we want our neuron to get activated only when the weighted sum is greater than a specific threshold. If we don't use a threshold, the neuron will get activated as soon as the weighted sum is greater than 0.
So bias b = −threshold, and using bias instead of the threshold, we get a familiar equation:

output = 0 if w·x + b ≤ 0
output = 1 if w·x + b > 0
Here the weighted sum Σwixi is written as the dot product w·x, where w and x are vectors whose components are the weights and inputs, respectively.
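As a sketch (again my own illustration, not the article's code), the same decision rule with the threshold folded into a bias term:

```python
# Same rule rewritten with bias b = -threshold: the neuron fires
# when w.x + b > 0.
def perceptron_with_bias(inputs, weights, bias):
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

# Scenario 1 revisited: threshold 6 becomes bias -6, and the output matches.
print(perceptron_with_bias([0, 0, 1], [3, 2, 7], bias=-6))  # 1 -> Yes
```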
We can think of bias as controlling how easy it is to activate the neuron. The neuron activates only if the weighted sum is greater than the threshold, and since b = −threshold, the concept reverses a bit: earlier we said the larger the threshold, the larger the weighted sum needed to activate the neuron; now, as bias is the negation of the threshold, the larger the bias, the smaller the weighted sum required to activate the neuron.
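A quick check of this reversal (hypothetical numbers of my own, with the sketch function redefined so the snippet runs on its own):

```python
def perceptron_with_bias(inputs, weights, bias):
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

weights, inputs = [3, 2, 7], [1, 0, 0]  # weighted sum = 3
# A very negative bias (a high threshold) keeps the neuron inactive...
print(perceptron_with_bias(inputs, weights, bias=-6))  # 0 -> inactive
# ...while a larger (less negative) bias lets the same sum activate it.
print(perceptron_with_bias(inputs, weights, bias=-2))  # 1 -> active
```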
In this way, bias adds flexibility to the Neural network by deciding when a neuron should get activated.
Obviously, the perceptron isn't exactly how humans make complex decisions, but this example helps in understanding biases in a simpler way.
I hope I have explained bias in the simplest plausible way. Feel free to drop comments or questions below; you can find me on LinkedIn.