Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Read by thought-leaders and decision-makers around the world. Phone Number: +1-650-246-9381 Email: [email protected]
228 Park Avenue South New York, NY 10003 United States
Website: Publisher: https://towardsai.net/#publisher Diversity Policy: https://towardsai.net/about Ethics Policy: https://towardsai.net/about Masthead: https://towardsai.net/about
Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Founders: Roberto Iriondo, , Job Title: Co-founder and Advisor Works for: Towards AI, Inc. Follow Roberto: X, LinkedIn, GitHub, Google Scholar, Towards AI Profile, Medium, ML@CMU, FreeCodeCamp, Crunchbase, Bloomberg, Roberto Iriondo, Generative AI Lab, Generative AI Lab Denis Piffaretti, Job Title: Co-founder Works for: Towards AI, Inc. Louie Peters, Job Title: Co-founder Works for: Towards AI, Inc. Louis-François Bouchard, Job Title: Co-founder Works for: Towards AI, Inc. Cover:
Towards AI Cover
Logo:
Towards AI Logo
Areas Served: Worldwide Alternate Name: Towards AI, Inc. Alternate Name: Towards AI Co. Alternate Name: towards ai Alternate Name: towardsai Alternate Name: towards.ai Alternate Name: tai Alternate Name: toward ai Alternate Name: toward.ai Alternate Name: Towards AI, Inc. Alternate Name: towardsai.net Alternate Name: pub.towardsai.net
5 stars – based on 497 reviews

Frequently Used, Contextual References

TODO: Remember to copy unique IDs whenever it needs used. i.e., URL: 304b2e42315e

Resources

Take our 85+ lesson From Beginner to Advanced LLM Developer Certification: From choosing a project to deploying a working product this is the most comprehensive and practical LLM course out there!

Publication

EfficientDet: When Object Detection Meets Scalability and Efficiency
Latest   Machine Learning

EfficientDet: When Object Detection Meets Scalability and Efficiency

Last Updated on July 20, 2023 by Editorial Team

Author(s): Aniket Maurya

Originally published on Towards AI.

EfficientDet, a highly efficient and scalable state of the art object detection model developed by Google Research, Brain Team. It is not just a single model. It has a family of detectors which achieve better accuracy with an order-of-magnitude fewer parameters and FLOPS than previous object detectors.

EfficientDet paper has mentioned its 7 family members.

Comparison of EfficientDet detectors[0–6] with other SOTA object detection models.

Source: arXiv:1911.09070v1

Quick Overview of the Paper

  1. EfficientNet is the backbone architecture used in the model. EfficientNet is also written by the same authors at Google. Conventional CNN models arbitrarily scaled network dimensions- width, depth and resolution. EfficientNet uniformly scales each dimension with a fixed set of scaling coefficients. It surpassed SOTA accuracy with 10x efficiency.
  2. BiFPN: While fusing (applying residual or skip connections) different input features, most of the works simply summed them up without any distinction. Since both input features are at the different resolutions they don’t equally contribute to the fused output layer. The paper proposes a weighted bi-directional feature pyramid network (BiFPN), which introduces learnable weights to learn the importance of different input features.
  3. Compound Scaling: For higher accuracy previous object detection models relied on β€” bigger backbone or larger input image sizes. Compound Scaling is a method that uses a simple compound coefficient Ο† to jointly scale-up all dimensions of the backbone network, BiFPN network, class/box network, and resolution.

Combining EfficientNet backbones with our propose BiFPN and compound scaling, we have developed a new family of object detectors, named EfficientDet, which consistently achieve better accuracy with an order-of-magnitude fewer parameters and FLOPS than previous object detectors.

BiFPN

Source: arXiv:1911.09070v1 β€” figure 2

Conventional FPN (Feature Pyramid Network) is limited by the one-way information flow. PANet added an extra bottom-up path for information flow. PANet achieved better accuracy but with the cost and more parameters and computations. The paper proposed several optimizations for cross-scale connections:

  1. Remove Nodes that only have one input edge.
    If a node has only one input edge with no feature fusion, then it will have less contribution to the feature network that aims at fusing different features.
  2. Add an extra edge from the original input to output node if they are at the same level, in order to fuse more features without adding much cost.
  3. Treat each bidirectional (top-down & bottom-up) path as one feature network layer, and repeat the same layer multiple times to enable more high-level feature fusion.

Weighted Feature Fusion

While multi-scale fusion, input features are not simply summed up. The authors proposed to add additional weight for each input during feature fusion and let the network to learn the importance of each input feature. Out of three weighted fusion approaches β€”
Unbounded fusion:

Source: https://arxiv.org/abs/1911.09070

Where W is a learnable weight that can be a scalar (per-feature), a vector (per-channel), or a multi-dimensional tensor (per-pixel). Since the scalar weight is unbounded, it could potentially cause training instability. So, Softmax-based fusion was tried for normalized weights.
Softmax-based fusion:

Source: https://arxiv.org/abs/1911.09070

As softmax normalizes the weights to be the probability of range 0 to 1 which can denote the importance of each input. The softmax leads to a slowdown on GPU.
Fast normalized fusion:

Source: https://arxiv.org/abs/1911.09070

Π„ is added for numeric stability. It is 30% faster on GPU and gave almost as accurate results as softmax.

Final BiFPN integrates both the bidirectional cross-scale connections and the fast normalized fusion.

EfficientDet Architecture

Source: arXiv:1911.09070v1 β€” figure 3

EfficientDet follows one-stage-detection paradigm. A pre-trained EfficientNet backbone is used with BiFPN as the feature extractor. BiFPNN takes {P3, P4, P5, P6, P7} features from the EfficientNet backbone network and repeatedly applies bidirectional feature fusion.
The fused features are fed to a class and bounding box network for predicting object class and bounding box.

EfficientDet: Scalable and Efficient Object Detection

Model efficiency has become increasingly important in computer vision. In this paper, we systematically study various…

arxiv.org

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for…

arxiv.org

Path Aggregation Network for Instance Segmentation

The way that information propagates in neural networks is of great importance. In this paper, we propose Path…

arxiv.org

Deep Residual Learning for Image Recognition

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of…

arxiv.org

Hope you liked the article.

U+1F449 Twitter: https://twitter.com/aniketmaurya
U+1F449 Mail: [email protected]

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.

Published via Towards AI

Feedback ↓