AWS re:Invent 2020 Machine Learning Keynote Recap & Highlights

Sharing Thoughts and Points to Ponder

Author(s): Juv Chan

Welcome to the first-ever Machine Learning Keynote at AWS re:Invent. It is a 2-hour virtual session delivered by Dr. Swami Sivasubramanian, VP of Amazon Machine Learning, on the latest developments, launches, and demos in AWS machine learning and AI, as well as customer insights and success stories. Let’s recap the key highlights in chronological order.

250+ New Features Released in 2020

Counted in working days, AWS released a new ML/AI feature slightly more than once every working day on average in 2020! Even counted in calendar days, that is still less than 2 days per feature on average. The pace of AWS ML/AI innovation has been incredibly fast this year.
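The arithmetic behind that estimate can be checked with a short sketch. It assumes exactly 250 features (the lower bound of the "250+" figure) and treats every Monday to Friday in 2020 as a working day, ignoring public holidays. Both are simplifying assumptions, not figures from the keynote.

```python
from datetime import date, timedelta

# Assumption: use the lower bound of the "250+" figure.
FEATURES_2020 = 250


def count_days(year):
    """Return (calendar_days, weekdays) for the given year.

    Weekdays are Monday to Friday; public holidays are NOT excluded.
    """
    day = date(year, 1, 1)
    calendar_days = 0
    weekdays = 0
    while day.year == year:
        calendar_days += 1
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            weekdays += 1
        day += timedelta(days=1)
    return calendar_days, weekdays


calendar_days, weekdays = count_days(2020)
days_per_feature_working = weekdays / FEATURES_2020
days_per_feature_calendar = calendar_days / FEATURES_2020

print(f"{days_per_feature_working:.2f} working days per feature")
print(f"{days_per_feature_calendar:.2f} calendar days per feature")
```

Under these assumptions, 2020 has 262 weekdays and 366 calendar days, which works out to about 1.05 working days or 1.46 calendar days per feature.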

What do you think?

92% Cloud-based TensorFlow & 91% Cloud-based PyTorch runs on AWS

The information above is from Nucleus Research report U192, Guidebook: Deep Learning on AWS (November 23, 2020). The percentage of cloud-based PyTorch that runs on AWS should be 90% rather than 91%. Separately, the horizontal bar chart below from Kaggle’s State of Machine Learning and Data Science 2020 survey also shows that AWS is the most popular cloud platform among enterprise data scientists.

Enterprise Cloud Computing, Kaggle’s State of Machine Learning and Data Science 2020 survey

Habana Gaudi-based Amazon EC2 Instances

Habana Gaudi-based Amazon EC2 instances will be available in the first half of 2021. Habana Gaudi AI processors are built specifically for ML training workloads and can deliver up to 40% better price/performance than current GPU-based Amazon EC2 instances.

Learn more from this press release and announcement blog.

AWS Trainium

AWS Trainium will be available in 2021. It is a new custom machine learning training chip designed and built by AWS to deliver the most cost-effective ML training in the cloud. I think it will be exciting to compare its price/performance with other AI training accelerator ASICs such as Habana Gaudi and Cloud TPU.

Faster Distributed Training on Amazon SageMaker

Amazon SageMaker distributed training is now generally available. It enables distributed training on Amazon SageMaker to complete up to 40% faster at no additional cost. It introduces two new distributed training libraries: one for data parallelism and one for model parallelism.

A convincing showcase is that AWS and NVIDIA have achieved the world’s fastest training times for Mask R-CNN and T5-3B with this feature.

Mask R-CNN (Region-based Convolutional Neural Network) is a state-of-the-art (SOTA) deep neural network architecture for instance segmentation in computer vision.

T5-3B (Text-To-Text Transfer Transformer, 3 billion parameters) is a SOTA Natural Language Processing (NLP) model from Google, pretrained on the Colossal Clean Crawled Corpus (C4) dataset. It achieves near-human performance on multiple NLP tasks on the SuperGLUE benchmark.
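For readers new to the term, the general idea behind data parallelism can be shown with a toy sketch. This is not the SageMaker library or its API. It is a plain-Python illustration in which the function names, the 1-D linear model, and the sample data are all made up for this example. Each simulated worker computes a gradient on its own shard of the batch, and the results are averaged, which is the job an all-reduce step does on real hardware.

```python
def gradient(w, xs, ys):
    """Mean-squared-error gradient for the 1-D linear model y = w * x."""
    n = len(xs)
    return (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))


def data_parallel_gradient(w, xs, ys, num_workers):
    """Split the batch into equal shards, compute one gradient per shard,
    then average the per-shard gradients (a stand-in for all-reduce)."""
    shard = len(xs) // num_workers
    assert shard * num_workers == len(xs), "batch must split evenly"
    local_grads = [
        gradient(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_workers)
    ]
    return sum(local_grads) / num_workers


# Hypothetical toy data: the true relationship is y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
w = 0.5  # current (wrong) weight

full = gradient(w, xs, ys)
parallel = data_parallel_gradient(w, xs, ys, num_workers=4)
print(full, parallel)
```

With equal-sized shards, the averaged per-worker gradient matches the full-batch gradient, so each training step gives the same update while the work is spread across workers.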

Learn more from this announcement blog, and the Get Started with Distributed Training guide.

Amazon SageMaker Data Wrangler

Amazon SageMaker Data Wrangler is now generally available. It enables faster and easier data preparation for machine learning via a visual interface.

Learn more from this announcement blog, and the Get Started with Data Wrangler guide.

Amazon SageMaker Feature Store

Amazon SageMaker Feature Store is now generally available. It serves as a fully managed repository to store, discover, and share machine learning (ML) features. This enables ML features to be reused, which saves time and cost in machine learning workflows.

Learn more from this announcement blog, and the Get Started with Feature Store guide.

Amazon SageMaker Clarify

Amazon SageMaker Clarify is now generally available. It detects bias in both data and models, and provides model explainability to help understand model behaviour. This feature is useful in improving model fairness and transparency for building safer and more responsible AI solutions.

Learn more from this announcement blog and the Model Fairness and Explainability guide.

Deep Profiling for Amazon SageMaker Debugger

Deep Profiling for Amazon SageMaker Debugger is now generally available. It enables deep profiling of machine learning training jobs. This feature is useful for identifying training bottlenecks and monitoring system resource utilization.

Learn more from this announcement blog and the Identifying Bottlenecks, Improve Resource Utilization and Reduce ML Training Costs with Deep Profiling feature in Amazon SageMaker Debugger blog.

Amazon SageMaker Pipelines

Amazon SageMaker Pipelines is now generally available. It is the first purpose-built continuous integration and continuous delivery (CI/CD) service for machine learning. This feature enables automated end-to-end MLOps workflows with built-in or custom MLOps templates. Below is a demo of the SageMaker Pipelines MLOps workflow.

Learn more from this announcement blog and this SageMaker Pipelines get started guide.

Amazon SageMaker Edge Manager

Amazon SageMaker Edge Manager is now generally available. It simplifies the management of ML models across fleets of edge devices such as smart cameras, robots, personal computers, and mobile devices.

Learn more from this announcement blog, and this Edge Manager get started guide.

Amazon Redshift ML

Amazon Redshift ML is now available in preview. It enables data analysts and database developers to leverage SQL to create and train ML models from data in Amazon Redshift and use these models to make in-database predictions. Amazon Redshift is the most popular, fully managed, and petabyte-scale data warehouse.

Learn more from this announcement blog, and this Redshift ML get started guide.

Amazon Neptune ML

Amazon Neptune ML is now generally available. It uses Graph Neural Networks (GNNs) to make easy, fast, and more accurate predictions on graphs. Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets.

Learn more from this announcement blog and this Using Neptune ML on graphs guide.

Amazon QuickSight Q

Amazon QuickSight Q is now available in preview. It is a natural language search service for business intelligence that allows business users to ask data questions in plain language and get answers instantly. Amazon QuickSight is a scalable, serverless, embeddable, machine learning-powered business intelligence (BI) service built for the cloud. Below are demos of QuickSight dashboards showing the results for questions that a user asked in plain English.

Learn more from this announcement blog.

Amazon Lookout for Metrics

Amazon Lookout for Metrics is now available in preview. It is a service that automatically detects and diagnoses anomalies from metrics such as a dip in product sales or a sudden increase in qualified sales leads. It also provides root cause analysis that enables businesses to take actions faster to deal with the anomalies.

Learn more from this announcement blog.

Amazon Monitron

Amazon Monitron is now generally available. It is an end-to-end predictive maintenance service that monitors industrial machinery and automatically detects potential failures to minimize unplanned downtime. The Amazon Monitron Starter Kit is available now.

Learn more from this announcement blog.

Amazon Lookout for Equipment

Amazon Lookout for Equipment is now available in preview. It is an anomaly detection service that allows customers with existing equipment sensors to use AWS ML models to detect abnormal equipment behaviour and enable predictive maintenance.

Learn more from this announcement blog.

Amazon Lookout for Vision

Amazon Lookout for Vision is now available in preview. It is a service that spots visual defects and anomalies in products using computer vision (CV) to automate quality inspection for manufacturing.

Learn more from this announcement blog and the Lookout for Vision developer guide.

AWS Panorama Appliance

AWS Panorama Appliance is now available in preview as part of AWS Panorama. It is a hardware device that adds computer vision (CV) capability to existing internet protocol (IP) cameras that weren’t built to accommodate CV. It turns existing IP cameras into smart cameras that can run CV models on multiple concurrent video streams with low latency and high data privacy.

Learn more from this announcement blog.

AWS Panorama SDK

AWS Panorama SDK is now available in preview as part of AWS Panorama. It is a software development kit (SDK) that enables third-party manufacturers to build new cameras that run CV models at the edge for tasks such as object detection, facial recognition, activity recognition, and more.

Learn more from this announcement blog.

Amazon HealthLake

Amazon HealthLake is now available in preview. It is a fully managed, HIPAA-eligible service that allows healthcare and life sciences customers to aggregate their health data from different silos and formats into a centralized AWS data lake at petabyte scale.

Learn more from this announcement blog, and the HealthLake get started guide.

Machine Learning Education

Here is a list of AWS public resources, Massive Open Online Courses (MOOCs) offered with third parties such as Coursera, edX, and Udacity, and educational devices for anyone who is interested in machine learning education.

  1. AWS Machine Learning University
  2. AWS Machine Learning
  3. AWS Machine Learning Training Library
  4. AWS Ramp-Up Guide: Machine Learning
  5. AWS Educate: Machine Learning Scientist Career Pathway
  6. Udacity-AWS: Machine Learning Engineer Nanodegree
  7. Coursera-AWS: Getting Started with AWS Machine Learning
  8. edX-AWS: Amazon SageMaker: Simplifying Machine Learning Application Development
  9. AWS DeepLens
  10. AWS DeepRacer
  11. AWS DeepComposer

The Five Tenets

  1. Provide firm foundations
  2. Create the shortest path to success
  3. Expand machine learning to more builders
  4. Solve real business problems, end-to-end
  5. Learn continuously

Some of these tenets are aligned with one or more of Amazon’s Leadership Principles, such as Learn and Be Curious (Tenet 5).

Final Thoughts

We can see that many of this year's new AWS ML/AI innovations are centered on Amazon SageMaker and industrial machine learning services. It is no surprise that Amazon SageMaker has become one of the fastest-growing services in AWS history. If AWS keeps up or accelerates its pace of ML/AI innovation, I am positive that it will maintain its lead in the Gartner Magic Quadrant for Cloud AI Developer Services.

Gartner’s Magic Quadrant for Cloud AI Developer Services (Feb. 2020)
