# Machine Learning Interview Questions-1

Last Updated on July 19, 2023 by Editorial Team

Originally published on Towards AI.

## Careers, Machine Learning

A Machine Learning Engineer has to cover a breadth of concepts in ML, DL, probability, statistics, and coding, with a good depth of understanding. A Machine Learning Engineer interview is not only about 'what'; interviewers mostly probe 'why' and 'how'. So I have limited this discussion to two or three 'what' questions and focused on 'why' and 'how' questions.

1) How do you use k-NN for classification and regression?

A) For classification, take a majority vote over the k nearest neighbors; for regression, take the mean (or median) of the k neighbors' target values.
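The two rules above can be sketched in a few lines of Python (a toy illustration; the function name and data are made up):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k, task="classification"):
    """k-NN: majority vote for classification, mean for regression."""
    # Sort training points by Euclidean distance to the query point.
    pairs = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))
    neighbors = [y for _, y in pairs[:k]]
    if task == "classification":
        return Counter(neighbors).most_common(1)[0][0]  # majority vote
    return sum(neighbors) / k                           # mean of neighbor targets

X = [(0, 0), (0, 1), (5, 5), (6, 5)]
print(knn_predict(X, ["a", "a", "b", "b"], (0.2, 0.5), k=3))                        # "a"
print(knn_predict(X, [1.0, 2.0, 10.0, 12.0], (5.5, 5.0), k=2, task="regression"))   # 11.0
```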

2) Why do we use the word 'Regression' in Logistic Regression even though we use it for classification?

A) Logistic Regression first predicts a continuous output in (0, 1), which we can interpret as the probability of the point belonging to the positive class; we then threshold this probability to obtain a class label.

• For example, if sigmoid(W·Xq) > 0.5 we label the query point as positive, else negative.
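The decision rule above, as a minimal sketch (weights and inputs are made up):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(w, x, threshold=0.5):
    """Logistic regression decision rule: threshold the predicted probability."""
    z = sum(wi * xi for wi, xi in zip(w, x))  # W . Xq
    p = sigmoid(z)   # continuous output in (0, 1): the "regression" part
    return 1 if p > threshold else 0

w = [2.0, -1.0]
print(predict_label(w, [1.0, 0.5]))   # z = 1.5, p > 0.5, label 1
print(predict_label(w, [-1.0, 0.5]))  # z = -2.5, p < 0.5, label 0
```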

3) Explain the intuition behind boosting.

A) Train a base model h0(x) on the training data; this base model is chosen to have high bias, so it makes a relatively large number of training errors.

• Store the residual errors.
• Train the next model on the residual errors of the previous model.
• If we keep on doing this, at each stage we get residual errors and try to predict them with the next model, so our final model is F_i(x) = a0·h0(x) + a1·h1(x) + … + ai·hi(x).

4) What does it mean for the precision of a model to be zero? Is it possible to have precision equal to 0?

A) Precision represents, out of all predicted positives, how many are actually positive.

precision = (True positives)/(True positives + False positives)

• Precision equals 0 if every point the model predicts as positive is a false positive (i.e., there are no true positives).
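A minimal check of the zero-precision case (toy labels, made-up data):

```python
def precision(y_true, y_pred):
    """precision = TP / (TP + FP); defined only if at least one positive is predicted."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp)

# Every predicted positive is wrong, so precision is exactly 0.
y_true = [0, 0, 1, 1]
y_pred = [1, 1, 0, 0]
print(precision(y_true, y_pred))  # 0.0
```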

5) Why do we need calibration?

• Calibration is a must if a probabilistic class label is needed as output.
• If the metric is log-loss, which needs the P(Y_i|X_i) values, then calibration is a must.
• The probabilities output by models such as naive Bayes and SVMs are often NOT well calibrated, which can be observed by plotting a calibration (reliability) plot. Hence, we use calibration as a post-processing step to ensure that the final class probabilities are well calibrated.
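The binning behind a calibration plot can be computed directly: bucket predictions by predicted probability and compare the mean predicted probability to the observed positive rate in each bucket (a toy sketch with made-up predictions; for real use, scikit-learn provides `calibration_curve` and `CalibratedClassifierCV`):

```python
def reliability_bins(probs, labels, n_bins=5):
    """Group predictions into probability buckets and compare, per bucket,
    the mean predicted probability with the observed positive rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[idx].append((p, y))
    rows = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)    # mean predicted probability
            frac_pos = sum(y for _, y in b) / len(b)  # observed frequency
            rows.append((round(mean_p, 2), round(frac_pos, 2)))
    return rows

probs = [0.1, 0.15, 0.8, 0.9, 0.85, 0.2]
labels = [0, 0, 1, 1, 0, 1]
print(reliability_bins(probs, labels, n_bins=2))  # [(0.15, 0.33), (0.85, 0.67)]
```

A well-calibrated model would have the two numbers in each pair close to each other.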

6) Where is parameter sharing seen in deep learning?

A) Parameter sharing is the sharing of the same weights by all neurons in a particular feature map. A CNN uses the same filter weights to perform the convolution operation at every spatial location, and an RNN uses the same weights at every time step.
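A 1-D convolution makes the sharing concrete: the same small kernel (3 parameters here) is reused at every window, no matter how long the input is (a toy sketch, made-up data):

```python
def conv1d(x, kernel):
    """Apply the same kernel (shared parameters) at every position of x."""
    k = len(kernel)
    return [
        sum(kernel[j] * x[i + j] for j in range(k))
        for i in range(len(x) - k + 1)
    ]

x = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]       # only 3 parameters, reused at every window
print(conv1d(x, kernel))  # [-2, -2, -2]
```

A fully connected layer over the same input would instead need a separate weight for every (input, output) pair.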

7) How many parameters does an LSTM have?

A) 4(nm + n² + n), where n is the number of LSTM units and m is the input dimension. For the derivation, check out the following blog:

Summing up DL-1: LSTM Parameters: why 4(nm + n² + n)? (medium.com)
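The count follows from the structure: an LSTM has 4 gate computations (input, forget, output, and candidate), each with an input weight matrix (n×m), a recurrent weight matrix (n×n), and a bias vector (n). As a quick check (example sizes are made up):

```python
def lstm_param_count(n_units, input_dim):
    """4 gates, each with input weights (n x m), recurrent weights (n x n),
    and a bias vector (n)."""
    n, m = n_units, input_dim
    return 4 * (n * m + n * n + n)

# e.g. an LSTM with 128 units on 64-dimensional inputs
print(lstm_param_count(128, 64))  # 98816
```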

8) What is the Box-Cox transform? When can it be used?

• The Box-Cox transform helps us convert non-Gaussian distributed variables into approximately Gaussian distributed variables. It applies only to strictly positive values.
• It is a good idea to apply it if your model expects Gaussian-distributed features (e.g., Gaussian Naive Bayes).
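The transform itself is simple; in practice the exponent λ is chosen by maximum likelihood (e.g., `scipy.stats.boxcox` does this for you). A sketch with a hand-picked λ and made-up data:

```python
import math

def box_cox(x, lam):
    """Box-Cox transform for strictly positive x:
    (x^lam - 1) / lam for lam != 0, and log(x) for lam == 0."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1) / lam for v in x]

data = [1.0, 2.0, 4.0, 8.0]   # right-skewed, strictly positive
print(box_cox(data, 0))       # log-transform evens out the skew
print(box_cox(data, 0.5))     # square-root-like transform
```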

9) How do you use the K-S test to check whether two random variables X1 and X2 follow the same distribution?

• Plot the empirical CDF of both random variables.
• Assume the null hypothesis: the two random variables come from the same distribution.
• Take the test statistic D = supremum of |CDF(X1) − CDF(X2)| over the whole CDF range.
• The null hypothesis is rejected when D > c(α) * sqrt((n + m)/(n*m)),
• where n and m are the numbers of observations behind CDF1 and CDF2, respectively.
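The statistic D can be computed directly from the two empirical CDFs (a toy sketch with made-up samples; `scipy.stats.ks_2samp` does this, plus the p-value, in practice):

```python
def ks_statistic(x1, x2):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    points = sorted(set(x1) | set(x2))
    def ecdf(sample, v):
        return sum(1 for s in sample if s <= v) / len(sample)
    return max(abs(ecdf(x1, v) - ecdf(x2, v)) for v in points)

a = [1, 2, 3, 4, 5]
b = [1, 2, 3, 4, 5]
c = [6, 7, 8, 9, 10]
print(ks_statistic(a, b))  # 0.0  (identical samples)
print(ks_statistic(a, c))  # 1.0  (completely separated samples)
```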

10) Explain the backpropagation mechanism in dropout layers.

• While training a neural network with dropout, the forward pass is computed without the neurons selected to be dropped; the weights attached to those neurons keep the same values they had in the previous iteration, and they are not updated during backpropagation in that iteration.
• Note that the weights do not become zero; they are just ignored for that iteration.
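A toy illustration of one SGD step under dropout (assuming a per-unit keep mask; names and numbers are made up):

```python
import random

def dropout_step(weights, grads, keep_prob, lr=0.1):
    """One SGD step with dropout: a dropped unit contributes nothing to the
    forward pass, so its gradient is zero and its weight is left untouched,
    not zeroed."""
    updated = []
    for w, g in zip(weights, grads):
        keep = random.random() < keep_prob
        updated.append(w - lr * g if keep else w)  # dropped: weight unchanged
    return updated

random.seed(0)
w = [0.5, -0.3, 0.8]
g = [0.1, 0.2, -0.1]
print(dropout_step(w, g, keep_prob=0.5))  # dropped weights keep old values
```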

11) Find the output shape and number of trainable parameters after the following operations:

(7,7,512) ⇒ Flatten ⇒ Dense(512)

(7,7,512) ⇒ Conv(512, (7,7))

• For (7,7,512) ⇒ Flatten ⇒ Dense(512): trainable parameters = (7*7*512)*512 = 12,845,056 (ignoring biases); output shape = (512,)
• For (7,7,512) ⇒ Conv(512, (7,7)): trainable parameters = (7*7*512)*512 = 12,845,056 (ignoring biases); output shape = (1, 1, 512)
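The arithmetic behind both counts (biases are ignored, matching the figures above; with biases each would gain one term per output unit or filter, i.e. +512):

```python
def dense_params(flat_in, units):
    """Dense layer weights: one weight per (input, output) pair."""
    return flat_in * units

def conv_params(kh, kw, in_ch, out_ch):
    """Conv layer weights: kernel height x width x input channels, per filter."""
    return kh * kw * in_ch * out_ch

print(dense_params(7 * 7 * 512, 512))  # 12845056
print(conv_params(7, 7, 512, 512))     # 12845056
```

The counts match because a 7×7 convolution over a 7×7 input covers the entire feature map exactly once, which is why its output shape collapses to (1, 1, 512).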

12) How will you calculate P(x|y=0) in the case where x is a continuous random variable?

A) If x is a numerical feature, assume that the feature follows a Normal distribution within each class. Then we can obtain likelihood values from the probability density function (PDF), since the absolute probability of any single value of a continuous variable is zero.
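This is exactly what Gaussian naive Bayes does: fit a per-class mean and standard deviation, then use the Normal density as the likelihood (a toy sketch; the data are made up):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x, used as the likelihood P(x|y)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Fit class-conditional mean/std from the x values where y == 0.
x_given_y0 = [2.0, 2.5, 3.0, 3.5, 4.0]
mu = sum(x_given_y0) / len(x_given_y0)
sigma = math.sqrt(sum((v - mu) ** 2 for v in x_given_y0) / len(x_given_y0))
print(gaussian_pdf(3.0, mu, sigma))  # likelihood (density) of x = 3.0 under y = 0
```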

13) Explain correlation and covariance.

A) Covariance shows the direction of the linear relationship between two variables, but because it is not scale-free we cannot interpret how strongly they are related. Correlation normalizes the covariance, giving both the strength and the direction of the linear relationship on a fixed scale of [−1, 1].
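A quick illustration of the scale issue (made-up data):

```python
import math

def covariance(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def correlation(x, y):
    """Covariance normalized by both standard deviations: always in [-1, 1]."""
    sx = math.sqrt(covariance(x, x))
    sy = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sx * sy)

x = [1, 2, 3, 4]
y = [10, 20, 30, 40]      # y = 10 * x, a perfect linear relationship
print(covariance(x, y))   # 12.5 (magnitude depends on the units of y)
print(correlation(x, y))  # ~1.0 (scale-free: perfect positive relationship)
```

Rescaling y changes the covariance but leaves the correlation at 1.0.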

14) What is the problem with the sigmoid during backpropagation?

A) The derivative of the sigmoid function lies between 0 and 0.25. During the chain-rule multiplication of gradients across layers, the product tends toward zero, which causes the vanishing-gradient problem and stalls weight updates in the early layers.
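The bound on the derivative and its effect across layers can be checked directly (the 10-layer figure is just an example):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1 - s)  # maximum value 0.25, reached at z = 0

print(sigmoid_grad(0))  # 0.25
# Chain rule through 10 layers, even in the best case (0.25 per layer):
print(0.25 ** 10)       # ~9.5e-07: the gradient has effectively vanished
```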

15) What is the difference between micro-average F1 and macro-average F1 for multiclass classification?

• F1 score = 2 * Precision * Recall / (Precision + Recall)
• For 3-class classification, each class has its own true positives, false positives, true negatives, and false negatives.

a) Micro-average method

• Micro-average precision = (TP1+TP2+TP3)/(TP1+TP2+TP3+FP1+FP2+FP3)
• Micro-average recall = (TP1+TP2+TP3)/(TP1+TP2+TP3+FN1+FN2+FN3)
• Micro-average F1 = 2 * Precision * Recall / (Precision + Recall)

b) Macro-average method

• Macro-average precision = (P1+P2+P3)/3
• Macro-average recall = (R1+R2+R3)/3
• where P1 = TP1/(TP1+FP1), R1 = TP1/(TP1+FN1), and similarly for the other classes
• Macro-average F1 = 2 * Precision * Recall / (Precision + Recall)
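Both formulas above, in code (a toy 3-class sketch with made-up labels; micro pools the counts first, macro averages the per-class ratios first):

```python
def per_class_counts(y_true, y_pred, cls):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    return tp, fp, fn

def f1(p, r):
    return 2 * p * r / (p + r) if p + r else 0.0

def micro_macro_f1(y_true, y_pred, classes):
    counts = [per_class_counts(y_true, y_pred, c) for c in classes]
    # Micro: pool TP/FP/FN across classes, then compute precision/recall.
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    micro = f1(tp / (tp + fp), tp / (tp + fn))
    # Macro: per-class precision/recall, averaged, then combined.
    ps = [c[0] / (c[0] + c[1]) if c[0] + c[1] else 0.0 for c in counts]
    rs = [c[0] / (c[0] + c[2]) if c[0] + c[2] else 0.0 for c in counts]
    macro = f1(sum(ps) / len(ps), sum(rs) / len(rs))
    return micro, macro

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(micro_macro_f1(y_true, y_pred, classes=[0, 1, 2]))
```

Micro-averaging weights every prediction equally (and in single-label multiclass it equals accuracy), while macro-averaging weights every class equally, so rare classes count as much as common ones.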

16) Why does image augmentation help in image classification tasks?

A) Image data augmentation is used to expand the training data by artificially generating new images from the existing ones through transformations such as translation, scaling, mirroring, rotation, and zoom. This makes the model more robust to variations in the input images.
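Two of the transformations above, applied to a tiny 2-D "image" represented as nested lists (a toy sketch; real pipelines use libraries such as `torchvision.transforms` or Keras preprocessing layers):

```python
def horizontal_flip(img):
    """Mirror a 2-D image (list of rows) left-to-right: same label, new input."""
    return [row[::-1] for row in img]

def translate_right(img, shift, fill=0):
    """Shift pixels right by `shift`, padding the left edge with `fill`."""
    return [[fill] * shift + row[:-shift] for row in img]

img = [[1, 2, 3],
       [4, 5, 6]]
print(horizontal_flip(img))     # [[3, 2, 1], [6, 5, 4]]
print(translate_right(img, 1))  # [[0, 1, 2], [0, 4, 5]]
```

Each transformed copy keeps the original label, so the model sees the same object under many input variations.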



