EfficientDet: When Object Detection Meets Scalability and Efficiency

Last Updated on July 20, 2023 by Editorial Team

Author(s): Aniket Maurya

Originally published on Towards AI.

EfficientDet, a highly efficient and scalable state of the art object detection model developed by Google Research, Brain Team. It is not just a single model. It has a family of detectors which achieve better accuracy with an order-of-magnitude fewer parameters and FLOPS than previous object detectors.

EfficientDet paper has mentioned its 7 family members.

Comparison of EfficientDet detectors[0–6] with other SOTA object detection models.

EfficientDet: When Object Detection Meets Scalability and Efficiency — Source: arXiv:1911.09070v1

Quick Overview of the Paper

EfficientNet is the backbone architecture used in the model. EfficientNet is also written by the same authors at Google. Conventional CNN models arbitrarily scaled network dimensions- width, depth and resolution. EfficientNet uniformly scales each dimension with a fixed set of scaling coefficients. It surpassed SOTA accuracy with 10x efficiency.
BiFPN: While fusing (applying residual or skip connections) different input features, most of the works simply summed them up without any distinction. Since both input features are at the different resolutions they don’t equally contribute to the fused output layer. The paper proposes a weighted bi-directional feature pyramid network (BiFPN), which introduces learnable weights to learn the importance of different input features.
Compound Scaling: For higher accuracy previous object detection models relied on — bigger backbone or larger input image sizes. Compound Scaling is a method that uses a simple compound coefficient φ to jointly scale-up all dimensions of the backbone network, BiFPN network, class/box network, and resolution.

Combining EfficientNet backbones with our propose BiFPN and compound scaling, we have developed a new family of object detectors, named EfficientDet, which consistently achieve better accuracy with an order-of-magnitude fewer parameters and FLOPS than previous object detectors.

BiFPN

Conventional FPN (Feature Pyramid Network) is limited by the one-way information flow. PANet added an extra bottom-up path for information flow. PANet achieved better accuracy but with the cost and more parameters and computations. The paper proposed several optimizations for cross-scale connections:

Remove Nodes that only have one input edge.
If a node has only one input edge with no feature fusion, then it will have less contribution to the feature network that aims at fusing different features.
Add an extra edge from the original input to output node if they are at the same level, in order to fuse more features without adding much cost.
Treat each bidirectional (top-down & bottom-up) path as one feature network layer, and repeat the same layer multiple times to enable more high-level feature fusion.

Weighted Feature Fusion

While multi-scale fusion, input features are not simply summed up. The authors proposed to add additional weight for each input during feature fusion and let the network to learn the importance of each input feature. Out of three weighted fusion approaches —
Unbounded fusion:

Source: https://arxiv.org/abs/1911.09070

Where W is a learnable weight that can be a scalar (per-feature), a vector (per-channel), or a multi-dimensional tensor (per-pixel). Since the scalar weight is unbounded, it could potentially cause training instability. So, Softmax-based fusion was tried for normalized weights.
Softmax-based fusion:

As softmax normalizes the weights to be the probability of range 0 to 1 which can denote the importance of each input. The softmax leads to a slowdown on GPU.
Fast normalized fusion:

Є is added for numeric stability. It is 30% faster on GPU and gave almost as accurate results as softmax.

Final BiFPN integrates both the bidirectional cross-scale connections and the fast normalized fusion.

EfficientDet Architecture

EfficientDet follows one-stage-detection paradigm. A pre-trained EfficientNet backbone is used with BiFPN as the feature extractor. BiFPNN takes {P3, P4, P5, P6, P7} features from the EfficientNet backbone network and repeatedly applies bidirectional feature fusion.
The fused features are fed to a class and bounding box network for predicting object class and bounding box.

EfficientDet: Scalable and Efficient Object Detection

Model efficiency has become increasingly important in computer vision. In this paper, we systematically study various…

arxiv.org

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for…

arxiv.org

Path Aggregation Network for Instance Segmentation

The way that information propagates in neural networks is of great importance. In this paper, we propose Path…

arxiv.org

Deep Residual Learning for Image Recognition

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of…

arxiv.org

Hope you liked the article.

U+1F449 Twitter: https://twitter.com/aniketmaurya
U+1F449 Mail: aniketmaurya@outlook.com

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

15 engineers. 100,000+ students. Towards AI Academy teaches what actually survives production.

Start free — no commitment:

→ Agents Architecture Cheatsheet — 3 years of architecture decisions in 6 pages

Our courses:

→ AI Engineering Certification — 90+ lessons from project selection to deployed product. The most comprehensive practical LLM course out there.

→ Agent Engineering Course — Hands on with production agent architectures, memory, routing, and eval frameworks — built from real enterprise engagements.

→ AI for Work — Understand, evaluate, and apply AI for complex work tasks.

Note: Article content contains the views of the contributing authors and not Towards AI.

Frequently Used, Contextual References

Resources

EfficientDet: When Object Detection Meets Scalability and Efficiency

Author(s): Aniket Maurya

Quick Overview of the Paper

BiFPN

Weighted Feature Fusion

EfficientDet Architecture

EfficientDet: Scalable and Efficient Object Detection

Model efficiency has become increasingly important in computer vision. In this paper, we systematically study various…

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for…

Path Aggregation Network for Instance Segmentation

The way that information propagates in neural networks is of great importance. In this paper, we propose Path…

Deep Residual Learning for Image Recognition

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of…

Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

Recent Posts

RNNs Cannot Think What Transformers Think Cheaply. ICLR 2026 Proved the Gap Is Exponential.

Time Series Made So Easy My Aunt Got It on the Second Read

Claude Cowork 101

Is 3-Bit KV Cache the Holy Grail? A Reality Check on Google’s TurboQuant

LangGraph Multi-Agent Architecture: Building a Self-Critiquing AI Debate System

AutoML on Autopilot

I Ran This Open-Source AI Tool on a Messy Codebase and Got 71x Fewer Tokens — Here Is Exactly What Happened

Month in 4 Papers (April 2026)

Comprehensive AI Engineering and AI for Work certifications

Company

CONTACT US

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Frequently Used, Contextual References

Resources

EfficientDet: When Object Detection Meets Scalability and Efficiency

Author(s): Aniket Maurya

Quick Overview of the Paper

BiFPN

Weighted Feature Fusion

EfficientDet Architecture

EfficientDet: Scalable and Efficient Object Detection

Model efficiency has become increasingly important in computer vision. In this paper, we systematically study various…

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for…

Path Aggregation Network for Instance Segmentation

The way that information propagates in neural networks is of great importance. In this paper, we propose Path…

Deep Residual Learning for Image Recognition

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of…

Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

Related posts

Recent Posts

Comprehensive AI Engineering and AI for Work certifications

Company

CONTACT US

GDPR CCPA Statement