Interview Questions: Object Detection
Author(s): Akula Hemanth Kumar
Originally published on Towards AI.
I am currently searching for a job as a computer vision engineer. In this article, I share the things I have learned along the way. I would like to thank Jonathan for his awesome object detection series.
“This is for my personal reference. If you find any mistakes, please comment and I will correct them.”
📌 What is the loss function in YOLO? [src]
💡 YOLO uses a sum-squared error between the predictions and the ground truth to calculate the loss. The loss function is composed of:
- The Classification loss.
- The Localization loss (errors between the predicted bounding box and the ground truth).
- The Confidence loss (the objectness of the box).
Loss function = classification loss + localization loss + confidence loss
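A minimal PyTorch-style sketch of this decomposition, simplified from YOLOv1's sum-squared-error loss (the tensor names and the pre-computed obj_mask matching predictions to ground truth are assumptions; the λ weights of 5 and 0.5 follow the YOLO paper, and the square-root trick on box widths/heights is omitted):

```python
import torch

# A simplified YOLOv1-style sum-squared-error loss. obj_mask marks grid
# cells that contain an object; predictions are assumed to be matched
# to ground truth already.
def yolo_loss(pred_box, true_box, pred_conf, true_conf, pred_cls, true_cls,
              obj_mask, lambda_coord=5.0, lambda_noobj=0.5):
    noobj_mask = 1.0 - obj_mask
    # Localization loss: squared error on box coordinates for object cells.
    localization_loss = lambda_coord * (obj_mask.unsqueeze(-1)
                                        * (pred_box - true_box) ** 2).sum()
    # Confidence loss: background cells are down-weighted by lambda_noobj.
    confidence_loss = (obj_mask * (pred_conf - true_conf) ** 2).sum() \
        + lambda_noobj * (noobj_mask * (pred_conf - true_conf) ** 2).sum()
    # Classification loss: squared error over the class probabilities.
    classification_loss = (obj_mask.unsqueeze(-1)
                           * (pred_cls - true_cls) ** 2).sum()
    return classification_loss + localization_loss + confidence_loss
```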
📌 What is the advantage of two-stage methods? [src]
💡 Two-stage methods like R-CNN first propose a small number of candidate object locations and then use a convolutional neural network to classify each candidate as one of the classes or as background. Because a dedicated classifier refines each candidate region, two-stage methods generally achieve higher accuracy than single-shot ones.
📌 What is the main problem faced by single-shot methods? [src][src]
💡 Single-shot methods like SSD suffer from an extreme class imbalance between object and background examples. SSD resamples the ratio of object to background examples during training (hard negative mining) so that training is not overwhelmed by the image background.
📌 What is Focal Loss in RetinaNet? [src]
💡 Focal Loss helps in dealing with class imbalance. Focal loss (FL) down-weights the loss for well-classified examples. So whenever the model is already good at detecting background, the loss contribution of those easy examples shrinks, which re-emphasizes training on the object classes.
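A minimal PyTorch sketch of the focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t); the α = 0.25 and γ = 2 defaults follow the RetinaNet paper:

```python
import torch
import torch.nn.functional as F

# A minimal sketch of binary (sigmoid) focal loss as in the RetinaNet paper.
# gamma down-weights well-classified examples; alpha balances positives
# against the far more numerous background negatives.
def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)          # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # The (1 - p_t)^gamma factor shrinks the loss for easy examples.
    return (alpha_t * (1 - p_t) ** gamma * ce).sum()
```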
📌 What is the loss function in SSD? [src]
💡 SSD's loss function is a combination of two critical components:
- Confidence Loss: This measures how confident the network is of the objectness of the computed bounding box. Categorical cross-entropy is used to compute this loss.
- Location Loss: This measures how far away the network's predicted bounding boxes are from the ground-truth ones in the training set. Smooth L1 loss is used here.
ssd_loss = confidence_loss + alpha * location_loss
The alpha term helps us to balance the contribution of the location loss.
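A minimal PyTorch sketch of this combination; the names, shapes, and the alpha default are illustrative assumptions, anchors are assumed to be matched to ground truth already, and hard negative mining is omitted:

```python
import torch
import torch.nn.functional as F

# A minimal sketch of an SSD-style multibox loss.
# cls_logits: (N, C) class predictions per anchor (background included)
# cls_targets: (N,) target class indices; loc_*: (N, 4) box offsets
# pos_mask: (N,) boolean mask of anchors matched to a ground-truth object
def ssd_loss(cls_logits, cls_targets, loc_preds, loc_targets, pos_mask, alpha=1.0):
    # Confidence loss: cross-entropy over the class predictions.
    confidence_loss = F.cross_entropy(cls_logits, cls_targets, reduction="sum")
    # Location loss: smooth L1, computed only over positive (matched) anchors.
    location_loss = F.smooth_l1_loss(loc_preds[pos_mask], loc_targets[pos_mask],
                                     reduction="sum")
    num_pos = pos_mask.sum().clamp(min=1)  # avoid division by zero
    return (confidence_loss + alpha * location_loss) / num_pos
```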
📌 What is FPN? [src]
💡 Feature Pyramid Network (FPN) is a feature extractor designed with a feature pyramid concept to improve accuracy and speed. Images first pass through the bottom-up CNN pathway, yielding semantically rich but low-resolution final layers. To regain resolution, FPN creates a top-down pathway by upsampling these feature maps. While the top-down pathway helps detect objects of varying sizes, spatial positions may be skewed, so lateral connections are added between the original feature maps and the corresponding reconstructed layers to improve object localization. It is currently one of the leading ways to detect objects at multiple scales, and detectors such as YOLOv3 and Faster R-CNN are built with this technique.
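A minimal PyTorch sketch of the top-down pathway with lateral connections, assuming three backbone feature maps c3, c4, c5 (the channel counts are illustrative); a detection head would then run on each returned pyramid level:

```python
import torch
from torch import nn
import torch.nn.functional as F

# A minimal sketch of an FPN neck over three backbone feature maps.
class SimpleFPN(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024), out_channels=256):
        super().__init__()
        # 1x1 lateral convs project each backbone map to a common width.
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1)
                                     for c in in_channels)
        # 3x3 convs smooth the merged maps to reduce upsampling artifacts.
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3,
                                              padding=1) for _ in in_channels)

    def forward(self, c3, c4, c5):
        p5 = self.lateral[2](c5)
        # Top-down: upsample the coarser map and add the lateral projection.
        p4 = self.lateral[1](c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lateral[0](c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        return [s(p) for s, p in zip(self.smooth, (p3, p4, p5))]
```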
📌 Why do we use data augmentation? [src]
💡 Data augmentation is a technique for synthesizing new data by modifying existing data in such a way that the target is not changed, or is changed in a known way. It is important for improving accuracy. Common augmentation techniques include flipping, cropping, adding noise, and color distortion.
For example, data augmentation gives SSD300 a substantial mAP improvement.
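A minimal torchvision sketch of a few of these techniques; for detection, the bounding boxes must be transformed along with the image, which is omitted here:

```python
from torchvision import transforms

# A minimal image augmentation pipeline: flipping, cropping, color distortion.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomResizedCrop(300),            # 300x300, as in SSD300
    transforms.ColorJitter(brightness=0.5, contrast=0.5,
                           saturation=0.5, hue=0.1),
    transforms.ToTensor(),
])
# augmented = augment(pil_image)  # apply to a PIL image
```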
📌 What is the advantage of SSD over Faster R-CNN? [src]
💡 SSD speeds up the process by removing the need for the region proposal network (RPN) used in Faster R-CNN.
📌 What are the metrics used for object detection? [src]
💡 mAP (mean Average Precision) is the most popular metric for measuring the accuracy of object detectors. Average precision is the average precision value over recall values from 0 to 1, i.e., the area under the precision-recall curve; mAP averages this value over all classes.
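A minimal NumPy sketch of VOC-style average precision, assuming per-class precision and recall arrays computed at successively lower confidence thresholds:

```python
import numpy as np

# Area under the precision-recall curve with VOC-style smoothing.
def average_precision(precision, recall):
    # Add sentinel points at both ends of the curve.
    p = np.concatenate(([0.0], precision, [0.0]))
    r = np.concatenate(([0.0], recall, [1.0]))
    # Make precision monotonically non-increasing.
    for i in range(p.size - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangle areas where recall changes.
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))
```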
📌 What is NMS? [src]
💡 Non-Max Suppression (NMS) is a technique used in many computer vision object detection algorithms. It is a class of algorithms that selects one bounding box out of many overlapping boxes for the same object, applied per class.
NMS implementation (a minimal sketch follows the steps below):
- Sort the prediction confidence scores in decreasing order.
- Starting from the top score, discard the current prediction if any previously kept prediction of the same class has IoU > threshold (generally 0.5) with it.
- Repeat the above step until all predictions are checked.
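A minimal NumPy sketch of these steps for a single class; multi-class NMS simply runs this once per class:

```python
import numpy as np

# Greedy single-class NMS. boxes: (N, 4) array of [x1, y1, x2, y2];
# scores: (N,) array of confidences. Returns indices of kept boxes.
def nms(boxes, scores, iou_threshold=0.5):
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]          # sort by decreasing confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)                      # keep the top remaining box
        # IoU of box i with every other remaining box (vectorized).
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        ious = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop predictions that overlap box i more than the threshold.
        order = order[1:][ious <= iou_threshold]
    return keep
```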
📌 What is IoU? [src]
💡 NMS uses the concept of Intersection over Union (IoU). IoU is the area of intersection divided by the area of union of two bounding boxes: the ground-truth box and the predicted box.
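A minimal sketch in plain Python, assuming boxes in [x1, y1, x2, y2] format:

```python
# IoU of two axis-aligned boxes given as [x1, y1, x2, y2].
def iou(box_a, box_b):
    # Coordinates of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# e.g. iou([0, 0, 10, 10], [5, 5, 15, 15]) == 25 / 175 ≈ 0.143
```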
📌 When do you say that an object detection method is efficient?
💡 Computational efficiency is commonly measured in FLOPs (the number of floating-point operations required for one inference), considered together with accuracy (mAP).
For example, detectors such as EfficientDet and YOLOv3 reach strong accuracy with comparatively few FLOPs, so they are considered efficient.
📌 Some questions cover hands-on experience with custom object detection.
💡 Try out the Monk Object Detection library:
Tessellate-Imaging/Monk_Object_Detection – a one-stop repository for low-code, easily-installable object detection pipelines (github.com).
I am extremely passionate about computer vision and deep learning. I am an open-source contributor to Monk libraries. Give us a ⭐️ on our GitHub repo if you like Monk.
You can also see my other writings at: Akula Hemanth Kumar – Medium (medium.com).