9 Lessons From the Tesla AI Team
Author(s): Gary Chan
Originally published on Towards AI.
What contributes to Tesla's success, and what you can do about it
While OpenAI is famous for its success in NLP and DeepMind is well known for RL and decision-making, Tesla is definitely one of the most impactful companies in computer vision. Even if you are not a computer vision practitioner, there is a lot you can learn from how Tesla builds production-grade AI.
Tesla's AI team does not publish engineering blog posts the way Uber or Airbnb do, so it is quite hard for outsiders to understand what they did and how they got to where they are today. However, the 2021 Tesla AI Day revealed techniques and lessons that all AI practitioners and companies can learn from.
Here are the lessons.
I. Business
1. Creative (re)use of cutting-edge technology
Tesla announced a new product: a humanoid robot called the Tesla Bot. While vehicles and robots may look vastly different, self-driving cars and AI-powered robots actually share many building blocks. For instance, both Tesla cars and Tesla Bots require sensors, batteries, real-time computation and analysis of incoming data, and instant decision-making. Thus, Tesla's in-house AI chips and hardware can be used in both products.
In terms of software and algorithms, both products need a vision system and a planning system, so the Tesla car team can share its software codebase with the Tesla Bot team. The economies of scale drive the average cost of development lower and potentially make Tesla more competitive in the market.
Another benefit for the Tesla AI team is that data collected by Tesla Bots can also be used to train Tesla's self-driving cars.
In fact, there are other examples of reusing advanced in-house technology. One of them is reCAPTCHA. Luis von Ahn first co-invented CAPTCHA, which lets people prove they are human by typing distorted letters displayed on the screen. Later, he challenged himself to make CAPTCHA more useful, partnered with The New York Times, and invented reCAPTCHA, a system that shows users words from scanned texts that OCR software failed to recognize and asks them to type what they see. That slight adjustment was enough to help digitize archives that would otherwise have taken years of manual transcription.
If your company owns a unique technology, think about its nature and who else might benefit from it. Devote some mental resources to the question, and you may find your next business line.
II. Improvement and Progression
2. Iterative progress
In the Q&A session, Elon Musk said,
In general, innovation is how many iterations and what is the average progress between each iteration. And so if you can reduce the time between iterations, the rate of improvement is much better.
So, if it takes like sometimes a couple days for a model to train versus a couple hours, that's a big deal.
If you are an ML engineer or data scientist, I bet you would agree that he could not be more right. ML people spend most of their time not on modeling, but on things like visualizing results, error analysis, cleaning data, studying data, getting more data, and so on. Without the right tools, it takes much longer for developers to make progress. Another common scenario is that multiple developers on the same team, or multiple teams in the same company, independently build similar tools to make their own lives easier.
This explains why Tesla has a tooling team and builds a lot of things in-house, which is the next lesson.
3. A specifically designed system is better than a general one
Tesla understands that GPUs are general-purpose hardware and that its massive computations could run much faster on purpose-built chips. That's why Tesla built its own AI hardware: the custom D1 chip and the Dojo supercomputer assembled from it.
Beyond the chips, Tesla also built its own AI compute infrastructure, which runs a million experiments a week, and its own debugging tools to visualize the results. During the presentation, Andrej Karpathy mentioned that the data label management tool turned out to be critical and that the team is proud of it.
If your team lacks the resources to build its own tools, join the open-source community and build what you need on top of an existing project. If your problem is new, or you have a better way of doing things, do the hard part: develop the initial prototype and write good documentation, so that people who share your problem can help you improve it.
4. Mistakes are INEVITABLE. Learn from them.
If you do a little research on the backgrounds of the people standing on the stage, you will soon find that they are all extremely smart people who hold Ph.D.s, graduated from top-tier universities, or have done something impressive in the past.
These people make mistakes too. Andrej Karpathy shared that they partnered with a third-party data provider in the beginning, presumably to get data faster and reduce costs. However, they soon found that relying on a third party for something so critical, with all the back-and-forth communication involved, was just "not going to cut it". They then brought labeling in-house, and now they have more than 1,000 labelers.
The lesson here is that innovation and technological progress are, as always, a trial-and-error process. Mistakes are part of it. If you avoid mistakes and blame others for failures, you are not learning and you will not make progress.
III. AI Practice
5. A neural network is a Lego block
During the Q&A session, Ashok Elluswamy described neural networks as just one block in the system that can be combined with anything. He explained that you can bake search and optimization for planning directly into the network architecture, or combine physics-based blocks (models) with neural network blocks to make a hybrid system.
I think the idea of combining non-neural-network models with neural networks in one trainable system is quite interesting, and it is definitely worth trying.
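To make the Lego-block idea concrete, here is a minimal sketch of a hybrid system. This is a toy example of my own, not Tesla's architecture: a fixed physics block (a constant-velocity motion model) is combined with a neural network that learns only a residual correction, and the whole thing trains end to end.

```python
import torch
import torch.nn as nn

class HybridDynamics(nn.Module):
    """A hybrid Lego-block model: analytic physics plus a learned residual."""

    def __init__(self, state_dim: int = 4, dt: float = 0.1):
        super().__init__()
        self.dt = dt
        # Neural block: learns whatever the physics model gets wrong.
        self.residual = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, state_dim)
        )

    def physics_step(self, state: torch.Tensor) -> torch.Tensor:
        # Physics block: constant-velocity model; state = (x, y, vx, vy).
        x, y, vx, vy = state.unbind(dim=-1)
        return torch.stack((x + vx * self.dt, y + vy * self.dt, vx, vy), dim=-1)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # Prediction = physics + learned correction; gradients flow through
        # both blocks, so the hybrid system is trainable end to end.
        return self.physics_step(state) + self.residual(state)

model = HybridDynamics()
next_state = model(torch.randn(8, 4))  # a batch of 8 toy states
```

The appeal of the design is that the network does not need to rediscover known physics; it only has to learn what the analytic model misses.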
6. HydraNets
The idea of the HydraNet dates back to 2018, which is a long time ago by AI-community standards. Still, I think the idea is great and useful in many scenarios. Andrej explained that a HydraNet lets all tasks share a common backbone while de-coupling the task-specific heads, and that you can cache the intermediate features to save computation.
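As a rough sketch of the pattern (my own toy code with made-up heads and dimensions, not Tesla's actual architecture), a HydraNet-style model in PyTorch might look like this:

```python
import torch
import torch.nn as nn

class HydraNet(nn.Module):
    """One shared backbone, many de-coupled task heads."""

    def __init__(self, feature_dim: int = 256):
        super().__init__()
        # Shared backbone (a tiny stand-in for a real feature extractor).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, feature_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Task-specific heads hanging off the same features.
        self.heads = nn.ModuleDict({
            "objects": nn.Linear(feature_dim, 10),
            "lanes": nn.Linear(feature_dim, 4),
            "traffic_lights": nn.Linear(feature_dim, 3),
        })

    def forward(self, x: torch.Tensor) -> dict:
        # Compute the shared features once; every head reuses them.
        features = self.backbone(x)
        return {name: head(features) for name, head in self.heads.items()}

model = HydraNet()
outputs = model(torch.randn(1, 3, 64, 64))
print({name: out.shape for name, out in outputs.items()})
```

The shared features are also what you would cache: heads can then be fine-tuned without re-running the expensive backbone, which, as I understand it, is the computation-saving trick Andrej described.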
7. Simulation as a solution to insufficient data
Label imbalance is common and found everywhere. Minority-class data are very difficult, if not impossible, to get. However, when deploying AI in the real world, it is always the edge cases that matter, because getting them wrong can have severe and unwanted consequences.
Simulation is one of the data augmentation techniques for generating new data, but it is easier said than done. At Tesla, the simulation team used techniques like ray tracing to generate video data so realistic that I personally could not tell at first glance whether it was real.
I think this technique is really Tesla's secret weapon, since it makes getting lots of unusual data extremely easy. Now they can generate video of scenarios like a couple running with a dog on a highway, which is improbable but definitely possible.
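Generating photorealistic video is out of reach for most teams, but the underlying recipe, blending scarce synthetic edge cases into abundant real data, is easy to sketch. Below is a minimal, hypothetical example (random tensors stand in for real and simulated footage) that oversamples the simulated data so every batch regularly contains edge cases:

```python
import torch
from torch.utils.data import (ConcatDataset, DataLoader, TensorDataset,
                              WeightedRandomSampler)

# Hypothetical stand-ins: abundant real footage, scarce simulated edge cases.
real = TensorDataset(torch.randn(10_000, 3, 64, 64),
                     torch.randint(0, 10, (10_000,)))
simulated = TensorDataset(torch.randn(500, 3, 64, 64),
                          torch.randint(0, 10, (500,)))
combined = ConcatDataset([real, simulated])

# Upweight the rare simulated samples so batches regularly see edge cases.
weights = torch.cat([
    torch.full((len(real),), 1.0),
    torch.full((len(simulated),), len(real) / len(simulated)),
])
sampler = WeightedRandomSampler(weights, num_samples=len(combined),
                                replacement=True)
loader = DataLoader(combined, batch_size=32, sampler=sampler)
```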
By the way, what do you think about the idea of simulation as a service?
8. 99.9% of the time you don't need machine learning
An audience member asked whether Tesla uses machine learning in its manufacturing design or other engineering processes, and Elon's answer gave this lesson its title.
I would not comment on the 99.9% figure because it depends on what you are talking about. For example, you certainly do not need ML to find your highest-spending customers; all you need is a sorting algorithm.
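To make that example concrete, this is literally all it takes (with made-up numbers):

```python
# Finding your highest-spending customers: sorting, not machine learning.
spend = {"alice": 1200.0, "bob": 340.5, "carol": 980.0, "dave": 2150.0}
top3 = sorted(spend.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top3)  # [('dave', 2150.0), ('alice', 1200.0), ('carol', 980.0)]
```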
Nevertheless, I do see this as a common mistake. A set of conditions has to be satisfied for ML to work. If they are not, there is a whole toolbox of other techniques for solving data science problems: genetic algorithms, mathematical modeling, scheduling algorithms, and so on.
When you have a hammer, everything looks like a nail.
9. Data and compute
Ian Glow, manager of the Autopilot Simulation team, mentioned a recent paper on photorealism enhancement showing the latest achievements, and said his team could do a lot more than the team that published it, because Tesla has far more data, computing power, and human resources.
It is just another example of how essential data and compute are for deep learning. The lesson here is that if you do need deep learning in production, spend time thinking about how to acquire data and use compute effectively and efficiently, and keep doing so until the average cost of acquiring and using data becomes negligible.
Conclusion
While many people focus on the implementation details of the deep learning models, I think the big ideas, the lessons, and the thinking behind them are equally valuable. I hope this article brings you something new and helps you develop a better ML practice.
Cheers.
If you like this article, please 👏👏😃. Your support is my biggest motivation 💪.