How To Get Started With Computer Vision In 2023?

Last Updated on July 25, 2023 by Editorial Team

Author(s): Hasib Zunair

Originally published on Towards AI.

A zero to a non-zero roadmap to becoming a computer vision engineer or researcher in 2023. Know what to learn and how to apply the learned skills in real-world projects to get into industry or academia.

Source: Image by possessedphotography at Unsplash.

Motivation

Computer vision (CompVis) is a field of artificial intelligence (AI) that involves training computers to interpret and understand images and videos. Practical applications of CompVis span from industrial manufacturing robots, self-driving cars, and video surveillance to medical imaging and augmented reality. In many cases, CompVis can automate tasks and save time and effort for us Neanderthals, which makes it useful in practice. In some cases, it even outperforms humans, making CompVis a vital tool for many industries. [1]

In this article, I’ll share a roadmap that you can use to get started with CompVis, either in industry or academia. First, I will share some free and publicly available learning resources. Then I will talk about platforms where you can apply the learned skills to build your portfolio. Whether you are new or already have some experience, this guide can help you get better in this exciting and rapidly evolving field!

This article is organized as follows:

  1. Learning resources
  2. Online competitions
  3. Industry and academic collaborations

Let’s get started!

Learning resources

In this section, I’ll go over five resources. The first three are ones you should consider taking to get a good understanding of both the theory and the practice behind building CompVis systems. They increase your depth as a CompVis practitioner. The next two give you an idea of the various tasks and learning paradigms in CompVis. They increase your breadth.

Deep Learning Specialization consists of five courses that teach you the foundations of deep learning applied to CompVis, natural language processing, etc. It covers both theoretical and practical concepts for building, training, and testing deep learning models. You’ll get to build and train your own models via the course assignments. Take your time to finish all five courses sincerely!

CS231n: Deep Learning for Computer Vision dives deep into the details of image classification architectures, with a focus on learning end-to-end models. It consists of hands-on assignments that let you implement and train your own CompVis models on a real-world problem of your choice. It also covers practical engineering tips and tricks for training and fine-tuning deep learning models.

Deep Learning in Computer Vision with PyTorch gives you a quick and easy walkthrough of training and testing image classification and semantic segmentation algorithms on your own datasets. Finally, it shows you how to build and run a simple web interface so that anyone can use your newly trained models. (Shameless self-publicity!)

Deep Learning for Computer Vision, Justin Johnson covers implementing, training, and debugging neural networks and provides an in-depth understanding of cutting-edge research in CompVis. It covers CompVis tasks like object detection, semantic segmentation, 3D vision, and generative models, as well as reinforcement learning.

Deep Learning in Computer Vision, Prof. Kosta Derpanis is a more recent course that covers a range of topics like action recognition, vision and language, and graph neural networks. It also covers learning paradigms like metric learning and self-supervised learning.

Source: Photo by author. Deep Learning Specialization Certificate. The five courses represent five infinity stones! What’s the sixth one? 😉

Some other learning resources that could be useful to look at:

  1. Roboflow tutorials on using SOTA computer vision models
  2. Hugging Face Tasks
  3. Hugging Face Transformers Tutorials

There are a lot of code examples in the three links. Once you’ve done the courses I mentioned above, you’ll already know what you need from them, so it will not be overwhelming. Pick your poison!

Online competitions

Next, I’ll list some past competitions/challenges that you can work through yourself to apply the skills you learned from the courses mentioned above. This will also give you an idea of how online competitions work (e.g., get data, train models, test and analyze, submit results, and iterate). Then, I’ll name some competition platforms that also host challenges from popular CompVis conferences, where you could start your first online competition!
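To make that loop a bit more concrete, here is a minimal sketch of the get data, train, test, and submit steps using only the Python standard library. Everything in it is an assumption made for illustration: the toy data (one made-up "brightness" number per image), the threshold "model", and the `id,label` column names of the submission file. A real competition defines its own data and submission format, and you would train an actual CompVis model instead.

```python
import csv
import io
import random


def make_toy_data(n, seed=0):
    """Stand-in for 'get data': (feature, label) pairs with labels 0 and 1."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        # Class 1 tends to have a larger feature value than class 0.
        feature = rng.gauss(0.3 + 0.4 * label, 0.1)
        data.append((feature, label))
    return data


def class_mean(data, label):
    values = [x for x, y in data if y == label]
    return sum(values) / len(values)


def train_threshold(train):
    """Stand-in for 'train': midpoint between the two class means."""
    return (class_mean(train, 0) + class_mean(train, 1)) / 2


def predict(threshold, feature):
    return 1 if feature > threshold else 0


def accuracy(threshold, data):
    """Stand-in for 'test and analyze': fraction of correct predictions."""
    correct = sum(predict(threshold, x) == y for x, y in data)
    return correct / len(data)


def write_submission(threshold, test_features):
    """Stand-in for 'submit': build an id,label CSV (hypothetical format)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "label"])
    for i, x in enumerate(test_features, start=1):
        writer.writerow([i, predict(threshold, x)])
    return buffer.getvalue()


data = make_toy_data(200)
train, valid = data[:150], data[150:]  # hold out part of the data for validation
threshold = train_threshold(train)
print(f"validation accuracy: {accuracy(threshold, valid):.2f}")
print(write_submission(threshold, [0.2, 0.8]))
```

The "iterate" step is then a matter of changing the model or the data processing, checking the validation score again, and only submitting when it improves.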

Dogs vs. Cats: An image classification task where you will build a model to predict whether an image shows a dog or a cat.

Flower Classification with TPUs: A similar task to Dogs vs. Cats, but with many classes. This is known as multi-class image classification. Here you will build a model to classify over 100 types of flowers. Instead of using GPUs, you’ll get familiar with using TPUs.
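As a rough illustration of what multi-class means at prediction time, the sketch below turns one raw score per class into probabilities with a softmax and picks the most likely class. The class names and score values are made up for the example, and softmax is just the usual choice for this step, not something specific to this competition.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow in exp().
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_class(scores, class_names):
    """Return the most likely class name and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


class_names = ["daisy", "rose", "tulip"]  # hypothetical flower classes
scores = [1.0, 3.0, 0.5]                  # made-up model outputs for one image
name, prob = predict_class(scores, class_names)
print(name, round(prob, 3))
```

With over 100 flower types, the only change is that the score list has one entry per type.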

Carvana Image Masking Challenge: A semantic segmentation task where the goal is to develop a model that removes the photo studio background from images of cars. This is similar to image classification but at the pixel level: each pixel is assigned a class label, which leads to a final output mask of the desired object (i.e., the car).
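The idea of classifying every pixel can be sketched in a few lines. Assume a model has already produced one score map per class (here just two hypothetical classes, background and car, on a tiny 2x3 image with made-up numbers). The mask is obtained by picking, for each pixel, the class with the highest score.

```python
def scores_to_mask(scores):
    """Turn per-class score maps into a label mask.

    scores[c][y][x] is the score of class c at pixel (y, x);
    the result mask[y][x] is the index of the highest-scoring class.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        mask.append(row)
    return mask


# Made-up score maps for a 2x3 image: class 0 = background, class 1 = car.
background = [[0.9, 0.2, 0.1],
              [0.8, 0.3, 0.7]]
car = [[0.1, 0.8, 0.9],
       [0.2, 0.7, 0.3]]

mask = scores_to_mask([background, car])
for row in mask:
    print(row)
```

Pixels labeled 1 form the car mask, and everything labeled 0 is the background to remove.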

Global Wheat Detection: An object detection problem where the goal is to build a model that localizes wheat heads (i.e., draws bounding boxes around them) in outdoor images of wheat plants.
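A common way to judge how well a predicted bounding box matches a ground-truth box is intersection over union (IoU): the area where the two boxes overlap divided by the area they cover together. The sketch below assumes boxes are given as `(x_min, y_min, x_max, y_max)` tuples, which is a choice made for this example. Check the competition's own data description for the actual box format and scoring rules.

```python
def box_area(box):
    """Area of an (x_min, y_min, x_max, y_max) box; 0 if it is empty."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a, box_b):
    """Intersection over union of two boxes, a value between 0 and 1."""
    # Corners of the overlapping region (may describe an empty box).
    inter_box = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter = box_area(inter_box)
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)     # made-up predicted box
ground_truth = (1, 1, 3, 3)  # made-up annotated box
print(round(iou(predicted, ground_truth), 3))
```

An IoU of 1 means the boxes coincide, and 0 means they do not overlap at all.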

RSNA STR Pulmonary Embolism Detection: The previous classification tasks deal with 2D images. In this challenge, the goal is to detect and classify abnormalities in chest CT scans, which are 3D images. This is 3D image classification.

ML Competition Platforms: The above competitions are hosted on Kaggle, which is the most popular competition platform. There are other platforms that host different competitions you could take part in. I’ll go over a few:

  1. Grand Challenge: Mostly for biomedical imaging problems. MICCAI workshops host competitions here.
  2. AIcrowd: Businesses, universities, government agencies, or NGOs host various challenges here, and so do NeurIPS workshops.

You can also look at CodaLab and Eval.ai. To stay up-to-date with ongoing competitions, see mlcontests. GPU issues? You’ve got Kaggle kernels and Google Colaboratory.

Industry and academic collaborations

In this final section, I’ll talk about ways to get into industry and academic collaborations. Doing a few online competitions builds your intuition for building CompVis systems, as they are mostly based on real-world data. From there, you can either go towards industry to work on business problems or towards academia to conduct research.

Omdena AI: I asked perplexity.ai what Omdena is, and this is what it said:

Omdena AI is a collaborative platform that builds AI and data science solutions to real-world problems. It is a community-first organization that empowers AI engineers worldwide to become change makers and helps mission-driven organizations and startups build impactful AI solutions through global collaboration. Omdena AI conducts challenges that bring together data scientists from around the world to work on specific projects, such as detecting wildfires in the Amazon.

Basically, it is a platform where you get to work with companies on real-world problems. One caveat is that, in the beginning, the work you do is unpaid. However, as you finish a couple of projects (each with a different company), you build your portfolio and can get into the Omdena Top Talent program, where you get paid to work on projects or even work full-time! As a starter, I think this is the closest you can get to working with people in the industry, apart from getting an internship! This is an effective way for someone (even you!) to build experience on real-world problems and break into the industry.

Your University: That’s right, your university! This seems very obvious, but I get this a lot. You can collaborate with your university professors, possibly as a research assistant, if you want to focus more on CompVis research and aim for good publications. This worked for me when I first started CompVis research. I’ll leave that story for another piece! Here’s what you can do. First, narrow down the professors in your university that you’d like to work with. Have a look at their research profiles and the topics they work on, and see if you’re actually interested in those. Then, email all of them saying you would like to work with them. It helps to mention which topics. It’s alright if you do not hear from most of them. This becomes a bit easier if you already know them in person and have taken their classes; just go to their offices! And that’s how you get into academia!

Conclusion

In this post, I talked about ways to get started with computer vision as a beginner and break into industry or academia. I mentioned resources to learn the fundamentals of computer vision, as well as platforms to apply your new knowledge via online competitions and even get into industry/academic collaborations.

I am currently writing this piece on a layover in Doha as I am traveling from Montreal, Canada to Dhaka, Bangladesh. To people who have asked me “how to get started with computer vision”, this one is for you! Good luck.

About the author

Aloha! I am a Ph.D. candidate at Concordia University in Montreal, Canada, working on computer vision problems. I also work part-time at Décathlon, where I help build data-driven tools to transform sports images and videos into actionable intelligence. If you’re interested in learning more about me, please visit my webpage here.

References

[1] Harl, Max, et al. “A Light in the Dark: Deep Learning Practices for Industrial Computer Vision”. In arXiv, 2022.
