
ECCV 2020 Best Paper Award | A New Architecture For Optical Flow

Last Updated on January 6, 2023 by Editorial Team

Author(s): Louis Bouchard


Photo by Cris Ovalle on Unsplash

ECCV 2020 Best Paper Award Goes to a Princeton Team.
They developed a new end-to-end trainable model for optical flow.
Their method beats state-of-the-art architectures' accuracy across multiple datasets and is far more efficient.

They even made the code available for everyone on their GitHub!
Let's see how they achieved that.

Paper's introduction

https://github.com/princeton-vl/RAFT

The ECCV 2020 conference happened last week, and a ton of new research papers in the field of computer vision were released just for it.
Here, I will be covering the Best Paper Award, which went to a Princeton team.

In short, they developed a new end-to-end trainable model for optical flow called "RAFT: Recurrent All-Pairs Field Transforms for Optical Flow."
Their method achieves state-of-the-art accuracy across multiple datasets and is far more efficient.

What is optical flow?

Gif by: https://gfycat.com/fr/wetcreepygecko

First, I will quickly explain what optical flow is.
It is defined as the pattern of apparent motion of objects in a video, which, in other terms, means the motion of objects between consecutive frames of a sequence.
It captures the relative motion between the objects and the scene by using the temporal structure found in a video in addition to the spatial structure found in each frame.

Gif by: https://nanonets.com/blog/optical-flow/

As you can see, you can easily calculate the optical flow of a video using OpenCV's functions:

import cv2
import numpy as np

cap = cv2.VideoCapture("vtest.avi")

# Read the first frame and convert it to grayscale
ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)

# HSV image used to visualize the flow: hue encodes direction,
# value encodes magnitude, saturation is fixed at maximum
hsv = np.zeros_like(frame1)
hsv[..., 1] = 255

while True:
    ret, frame2 = cap.read()
    if not ret:  # stop when the video ends
        break
    next_frame = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)

    # Dense optical flow between the previous and current frames (Farneback's method)
    flow = cv2.calcOpticalFlowFarneback(prvs, next_frame, None, 0.5, 3, 15, 3, 5, 1.2, 0)

    # Convert the (dx, dy) flow vectors to polar coordinates for visualization
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    hsv[..., 0] = ang * 180 / np.pi / 2  # direction -> hue
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)  # magnitude -> value
    rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    cv2.imshow('frame2', rgb)
    k = cv2.waitKey(30) & 0xff
    if k == 27:  # Esc quits
        break
    elif k == ord('s'):  # 's' saves the current frame and its flow visualization
        cv2.imwrite('opticalfb.png', frame2)
        cv2.imwrite('opticalhsv.png', rgb)
    prvs = next_frame

cap.release()
cv2.destroyAllWindows()

It takes just a couple of lines of code to generate it on a live feed. Here are the results you get from a normal video frame using this short piece of code:

Code & Image by: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_video/py_lucas_kanade/py_lucas_kanade.html

It is super cool and really useful for many applications.

Gif by: https://nanonets.com/blog/optical-flow/

Applications include traffic analysis, vehicle tracking, object detection and tracking, robot navigation, and much more. The only problem is that computing optical flow is quite slow and needs a lot of computing resources.

This new paper helps with both of these problems while producing even more accurate results!

What is this paper? What did the researchers do exactly?

Now, let's dive a bit deeper into what this paper is all about and how it improves on the current state-of-the-art approaches.
They improved on the state-of-the-art methods in four ways.

Image by: https://arxiv.org/pdf/2003.12039.pdf

First, the network can be trained directly on optical flow instead of requiring an embedding loss between pixels, which makes training much more efficient (a sketch of such a supervised loss follows this paragraph).
Then, regarding the flow prediction, current methods directly predict it between a pair of frames.
Instead, the authors cut the computation time considerably by maintaining and updating a single high-resolution flow field.
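
To make the first point concrete, here is a minimal sketch of the kind of supervised sequence loss the paper describes: an L1 distance between the ground truth and every intermediate flow estimate, with later iterations weighted more heavily (the paper uses gamma = 0.8). The tensor shapes and names here are illustrative, not the authors' exact code.

import torch

def sequence_loss(flow_preds, flow_gt, gamma=0.8):
    # flow_preds: list of (B, 2, H, W) flow estimates, one per update iteration
    # flow_gt: the (B, 2, H, W) ground-truth flow
    n = len(flow_preds)
    loss = 0.0
    for i, pred in enumerate(flow_preds):
        weight = gamma ** (n - i - 1)  # the final prediction gets weight 1
        loss = loss + weight * (pred - flow_gt).abs().mean()
    return loss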

To refine this flow field iteratively, as the best current approaches do, they use a GRU block, which is similar to an LSTM block.

Image by: http://dprogrammer.org/rnn-lstm-gru

This block allows them to share the same weights between these iterations while still letting the flow estimate converge during training.
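
For illustration, here is a minimal convolutional GRU cell in PyTorch, of the kind RAFT's update operator is built around; the 3x3 kernel and channel sizes are assumptions for this sketch, not necessarily the authors' exact configuration.

import torch
import torch.nn as nn

class ConvGRU(nn.Module):
    # A GRU whose gates are 3x3 convolutions instead of fully-connected
    # layers, so the hidden state stays a 2D feature map over the image.
    def __init__(self, hidden_dim=128, input_dim=192):
        super().__init__()
        self.convz = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)  # update gate
        self.convr = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)  # reset gate
        self.convq = nn.Conv2d(hidden_dim + input_dim, hidden_dim, 3, padding=1)  # candidate state

    def forward(self, h, x):
        hx = torch.cat([h, x], dim=1)
        z = torch.sigmoid(self.convz(hx))  # how much of the state to overwrite
        r = torch.sigmoid(self.convr(hx))  # how much of the old state to reuse
        q = torch.tanh(self.convq(torch.cat([r * h, x], dim=1)))
        return (1 - z) * h + z * q  # blend the old state with the candidate

Since the same cell, with the same weights, is applied at every refinement step, the parameter count stays constant no matter how many iterations are run.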

Image by: https://blog.clairvoyantsoft.com/the-ascent-of-gradient-descent-23356390836f?gi=9b683d504450

The last distinction between their technique and other approaches is that, instead of explicitly defining a gradient with respect to an optimization objective via backpropagation, their update operator retrieves features from correlation volumes to propose the descent direction.

How did they do that?

Image by: https://arxiv.org/pdf/2003.12039.pdf

All these improvements were made possible by their new architecture.
It is basically composed of three main components.

Image by: https://arxiv.org/pdf/2003.12039.pdf

First, a feature encoder extracts the per-pixel features from the two input frames, along with another context encoder that extracts features only from the first frame, in order to understand the context of the image.

Image by: https://arxiv.org/pdf/2003.12039.pdf
Then, using all pairs of feature vectors, they generate a 4-dimensional correlation volume, whose entries are the dot products between those feature vectors, spanning the width and height of both frames.
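
A sketch of that computation, assuming PyTorch feature maps of shape (B, D, H, W): every pair of feature vectors is compared with a dot product, which a single matrix multiplication can compute. The real model also pools this volume into a multi-level pyramid, omitted here.

import torch

def correlation_volume(fmap1, fmap2):
    # All-pairs dot products between the feature vectors of the two frames,
    # producing one (H, W, H, W) correlation volume per batch element.
    b, d, h, w = fmap1.shape
    f1 = fmap1.view(b, d, h * w)
    f2 = fmap2.view(b, d, h * w)
    corr = torch.matmul(f1.transpose(1, 2), f2)  # (B, H*W, H*W)
    corr = corr / d ** 0.5  # normalize by the feature dimension
    return corr.view(b, h, w, h, w)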

Image by: https://arxiv.org/pdf/2003.12039.pdf

Finally, they use an update operator that recurrently updates the optical flow.
This is where the GRU block sits.
It retrieves values from the correlation volumes and iteratively updates the flow field.
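
Putting the pieces together, the refinement loop looks roughly like the following. It reuses the hypothetical ConvGRU from the sketch above; the correlation lookup is replaced by a random tensor because the pyramid sampling logic is beyond this overview.

import torch
import torch.nn as nn

b, h, w = 1, 46, 62  # e.g. a 368x496 image at 1/8 resolution
hidden_dim, input_dim = 128, 192  # assumed sizes, matching the ConvGRU sketch

gru = ConvGRU(hidden_dim, input_dim)
flow_head = nn.Conv2d(hidden_dim, 2, 3, padding=1)  # maps the hidden state to a flow update

hidden = torch.zeros(b, hidden_dim, h, w)  # in RAFT, the context encoder initializes this
flow = torch.zeros(b, 2, h, w)  # the flow field starts at zero

flow_preds = []
for _ in range(12):  # a fixed number of refinement iterations
    # In the real model, `inp` combines motion features looked up from the
    # correlation volume at the current flow estimate with context features.
    inp = torch.randn(b, input_dim, h, w)  # placeholder for those features
    hidden = gru(hidden, inp)  # recurrent update with shared weights
    flow = flow + flow_head(hidden)  # predict and apply a residual flow update
    flow_preds.append(flow)  # collected for the sequence loss during training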

Results

Just look at how sharp the results are, all while being faster than current approaches!

Watch the video showing the results:

As mentioned, they made the code available for everyone on their GitHub, which I linked below if you'd like to try it out.

Of course, this was a simple overview of this ECCV 2020 Best Paper Award winner.
I strongly recommend reading the paper, linked below, for more information.

OpenCV Optical Flow tutorial: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_video/py_lucas_kanade/py_lucas_kanade.html
The paper: https://arxiv.org/pdf/2003.12039.pdf
GitHub with code: https://github.com/princeton-vl/RAFT

If you like my work and want to support me, I'd greatly appreciate it if you follow me on my social media channels:

  • The best way to support me is by following me on Medium.
  • Subscribe to my YouTube channel.
  • Follow my projects on LinkedIn.
  • Learn AI together, join our Discord community, share your projects, papers, and best courses, find Kaggle teammates, and much more!


ECCV 2020 Best Paper Award | A New Architecture For Optical Flow was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
