

ECCV 2020 Best Paper Award | A New Architecture For Optical Flow

Last Updated on January 6, 2023 by Editorial Team

Author(s): Louis Bouchard

Computer Vision, Research

Photo by Cris Ovalle on Unsplash

The ECCV 2020 Best Paper Award goes to a Princeton team.
They developed a new end-to-end trainable model for optical flow.
Their method beats the accuracy of state-of-the-art architectures across multiple datasets and is far more efficient.

They even made the code available for everyone on their GitHub!
Let’s see how they achieved that.

Paper’s introduction

https://github.com/princeton-vl/RAFT

The ECCV 2020 conference happened last week.
A ton of new research papers in the field of computer vision were released just for this conference.
Here, I will be covering the “Best Paper Award” winner, which went to a Princeton team.

In short, they developed a new end-to-end trainable model for optical flow called “RAFT: Recurrent All-Pairs Field Transforms for Optical Flow.”
Their method achieves state-of-the-art accuracy across multiple datasets and is far more efficient.

What is optical flow?

Gif by: https://gfycat.com/fr/wetcreepygecko

First, I will quickly explain what optical flow is.
It is defined as the pattern of apparent motion of objects in a video, which, in other terms, means the motion of objects between consecutive frames of a sequence.
It captures the relative motion between the objects and the scene.
It does that by using the temporal structure found in a video in addition to the spatial structure found in each frame.

Gif by: https://nanonets.com/blog/optical-flow/

As you can see, you can easily calculate the optical flow of a video using OpenCV’s functions:

import cv2
import numpy as np

cap = cv2.VideoCapture("vtest.avi")

ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)

# HSV image used to visualize the flow: hue encodes direction, value encodes magnitude.
hsv = np.zeros_like(frame1)
hsv[..., 1] = 255

while True:
    ret, frame2 = cap.read()
    if not ret:  # stop cleanly at the end of the video
        break
    next = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    # Dense optical flow between the previous and current frame (Farneback's method).
    flow = cv2.calcOpticalFlowFarneback(prvs, next, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # Convert the (dx, dy) flow vectors to polar coordinates for visualization.
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2  # OpenCV hue range is 0-180
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    cv2.imshow('frame2', rgb)
    k = cv2.waitKey(30) & 0xff
    if k == 27:  # Esc quits
        break
    elif k == ord('s'):  # 's' saves the current frame and its flow visualization
        cv2.imwrite('opticalfb.png', frame2)
        cv2.imwrite('opticalhsv.png', rgb)
    prvs = next

cap.release()
cv2.destroyAllWindows()

It just takes a couple of lines of code to generate it on a live feed. Here are the results you get from a normal video frame using this short snippet:

Code & Image by: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_video/py_lucas_kanade/py_lucas_kanade.html

It is super cool and really useful for many applications.

Gif by: https://nanonets.com/blog/optical-flow/

Applications include traffic analysis, vehicle tracking, object detection and tracking, robot navigation, and much more. The only problem is that it is quite slow and needs a lot of computing resources.

This new paper helps with both of these problems while producing even more accurate results!

What is this paper? What did the researchers do exactly?

Now, let’s dive a bit deeper into what this paper is all about and how it improves on current state-of-the-art approaches.
They improved on the state-of-the-art methods in four ways.

Image by: https://arxiv.org/pdf/2003.12039.pdf

First, their network can be trained end-to-end directly on optical flow, instead of requiring an embedding loss between pixels, which makes it much more efficient.
Then, regarding the flow prediction, current methods predict it directly between a pair of frames.
Instead, RAFT saves a lot of computation time by maintaining and updating a single high-resolution flow field.
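
To make the pattern concrete, here is a minimal, hypothetical sketch of that idea in NumPy: one flow field is kept throughout and refined by a sequence of residual updates. The update_operator below is a toy stand-in for RAFT's learned update block, not the authors' code.

import numpy as np

H, W, num_iters = 64, 64, 8  # toy sizes for illustration

def update_operator(flow):
    # Toy stand-in for RAFT's learned update block: it nudges the field
    # toward a fixed target just to show the residual-update pattern.
    target = np.ones_like(flow)
    return 0.5 * (target - flow)

flow = np.zeros((H, W, 2), dtype=np.float32)  # one high-resolution field, kept throughout
for _ in range(num_iters):
    flow = flow + update_operator(flow)  # each iteration adds a residual refinement
print(float(flow.mean()))  # approaches 1.0 as the updates accumulate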

With this flow field, they use a GRU block, which is similar to an LSTM block, to refine their optical flow iteratively, as the best current approaches do.

Image by: http://dprogrammer.org/rnn-lstm-gru

This block lets them share weights across these iterations while still allowing the estimate to converge toward a fixed flow field during training.
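
As a sketch, such a convolutional GRU block can be written in a few lines of PyTorch. The structure below mirrors the convolutional GRU in the authors' public code, but the dimensions are placeholders chosen for the example, not their exact configuration.

import torch
import torch.nn as nn

class ConvGRU(nn.Module):
    # Convolutional GRU: the same weights are reused at every refinement step.
    def __init__(self, hidden_dim=128, input_dim=192):
        super().__init__()
        self.convz = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)
        self.convr = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)
        self.convq = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)

    def forward(self, h, x):
        hx = torch.cat([h, x], dim=1)
        z = torch.sigmoid(self.convz(hx))  # update gate
        r = torch.sigmoid(self.convr(hx))  # reset gate
        q = torch.tanh(self.convq(torch.cat([r * h, x], dim=1)))  # candidate state
        return (1 - z) * h + z * q  # blend previous and candidate state

gru = ConvGRU()
h = torch.zeros(1, 128, 46, 62)  # hidden state carrying the flow features
x = torch.zeros(1, 192, 46, 62)  # per-step input (correlation + context features)
h = gru(h, x)  # the same module, hence the same weights, at every iteration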

Image by: https://blog.clairvoyantsoft.com/the-ascent-of-gradient-descent-23356390836f?gi=9b683d504450

The last distinction between their technique and the other approaches is that instead of explicitly defining a gradient with respect to an optimization objective via backpropagation, they retrieve features from correlation volumes to propose the descent direction.

How have they done that?

Image by: https://arxiv.org/pdf/2003.12039.pdf

All these improvements were made possible by their new architecture.
It is basically composed of three main components.

Image by: https://arxiv.org/pdf/2003.12039.pdf

At first, there is an encoder that extracts per-pixel features from the two frames, along with another encoder that extracts features only from the first frame, in order to understand the context of the image.

Image by: https://arxiv.org/pdf/2003.12039.pdf

Then, using all pairs of feature vectors, they generate a 4-dimensional correlation volume over the width and height of both frames.
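
As a toy illustration of that step, the random tensors below stand in for the encoder's per-pixel features; the dot product between every pixel of frame 1 and every pixel of frame 2 yields the 4D volume (shapes are made up for the example):

import torch

C, H, W = 32, 16, 16  # toy sizes; RAFT's encoder actually works at 1/8 resolution
f1 = torch.randn(C, H, W)  # per-pixel features of frame 1 (stand-in)
f2 = torch.randn(C, H, W)  # per-pixel features of frame 2 (stand-in)

# Correlate every pixel of frame 1 with every pixel of frame 2.
corr = torch.einsum('chw,cij->hwij', f1, f2) / C ** 0.5
print(corr.shape)  # torch.Size([16, 16, 16, 16]): a 4D all-pairs volume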

Image by: https://arxiv.org/pdf/2003.12039.pdf

Finally, they use an update operator that recurrently updates the optical flow.
This is where the GRU block sits.
It retrieves values from the correlation volumes and iteratively updates the flow field.
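
A deliberately simplified sketch of that lookup is shown below: each pixel reads the correlation value at the location its current flow estimate points to. RAFT itself samples bilinearly from a multi-scale correlation pyramid; nearest-neighbour indexing is used here only to keep the example short.

import torch

H = W = 16
corr = torch.randn(H, W, H, W)  # all-pairs correlation volume (toy values)
flow = torch.zeros(H, W, 2)  # current flow estimate: (dx, dy) per pixel

ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing='ij')
tx = (xs + flow[..., 0].round().long()).clamp(0, W - 1)  # target x of each pixel
ty = (ys + flow[..., 1].round().long()).clamp(0, H - 1)  # target y of each pixel

looked_up = corr[ys, xs, ty, tx]  # one correlation value per source pixel
print(looked_up.shape)  # torch.Size([16, 16])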

Results

Just look at how sharp the results are, all while being faster than current approaches!

Watch the video showing the results:

They even made the code available for everyone on their GitHub!
I linked it below if you’d like to try it out.

Of course, this was a simple overview of ECCV 2020’s Best Paper Award winner.
I strongly recommend reading the paper linked below for more information.

OpenCV Optical Flow tutorial: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_video/py_lucas_kanade/py_lucas_kanade.html
The paper: https://arxiv.org/pdf/2003.12039.pdf
GitHub with code: https://github.com/princeton-vl/RAFT

If you like my work and want to support me, I’d greatly appreciate it if you follow me on my social media channels:

  • The best way to support me is by following me on Medium.
  • Subscribe to my YouTube channel.
  • Follow my projects on LinkedIn.
  • Learn AI together, join our Discord community, share your projects, papers, and best courses, find Kaggle teammates, and much more!


ECCV 2020 Best Paper Award | A New Architecture For Optical Flow was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
