Machine Learning in the Cloud using Azure ML Studio
Author(s): Ranganath Venkataraman
Model training and deployment for a new Nanodegree from Udacity and Microsoft
After completing a Foundations course launched in July that enrolled roughly 10,000 students worldwide, I was one of 300 selected in October for a scholarship to get a Nanodegree. After completing the capstone yesterday and graduating, I'm writing this article to document my experience with the mechanics of AzureML. Here's the link to my project's repo.
Context
As evident in my previous posts, my self-taught journey to date focused on building predictors and visualizations using machine learning algorithms with Python. My interface was the Jupyter notebook, and my work lived on my laptop. As I was getting ready to learn and practice deploying models, I started the Nanodegree that, among other concepts, covered the deployment of models in Azure.
How this article is organized
Topics covered in this article about the Nanodegree's Capstone project are:
- Purpose
- Setting up for the project
- Development: HyperDrive and AutoML
- Deployment
- Conclusions
Throughout this article, I'll discuss points that were good learning opportunities for my fellow students and me.
Purpose
The capstone project is the culmination of training on various concepts; therefore, it tested our ability to:
- run a HyperDrive experiment to comb through the hyperparameter search space to find settings that maximize the performance of a pre-selected scikit-learn algorithm on a problem of choice
- use AutoML to find the best algorithm that maximizes performance on the aforementioned problem
- deploy the best-performing model as an active endpoint that can be queried by anybody, getting the product off my laptop so anyone can use it
Setting up for the Capstone project
The first step was to select a problem and dataset: I chose the problem of predicting vehicle fuel efficiency with the Auto MPG dataset from the UCI repository. Since I was familiar with this dataset from a prior post, I could focus on the mechanics of Azure's Machine Learning (ML) Studio.
Microsoft and Udacity provide a virtual machine for the capstone: a Standard_DS3_v2 with 14 GB RAM, on compute that can provide up to 4 CPU nodes. After launching a workspace that uses this virtual machine, the next setting-up step is to upload and register the dataset, which at this point is a .csv file on my desktop.
Once registration is complete, AzureML provides the user with the code needed to consume the registered dataset, as seen in Figure 1 below. The line declaring the Experiment is my own addition and is not necessary for consuming a registered dataset.
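For readers who want to reproduce this step, here is a minimal sketch of that consumption pattern; the dataset name 'auto-mpg' and the experiment name are illustrative placeholders, not the exact values from the project:

```python
from azureml.core import Workspace, Dataset, Experiment

# Connect to the workspace (reads the config.json downloaded from the portal)
ws = Workspace.from_config()

# Consume the registered dataset; 'auto-mpg' is an assumed registration name
dataset = Dataset.get_by_name(ws, name='auto-mpg')
df = dataset.to_pandas_dataframe()

# The Experiment declaration mentioned above; the name is illustrative
experiment = Experiment(workspace=ws, name='capstone-experiment')
```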
With the dataset ready for use and a workspace launched on a running virtual machine "compute," we are ready to develop the project.
Development: HyperDrive
Setting up the HyperDrive experiment for this problem in Azure ML Studio entails setting the search space for model hyperparameters (similar to GridSearchCV) and selecting an early stopping policy to halt the experiment's runs upon achieving a certain performance factor. These two factors, along with a maximum number of trials, provide the means of configuring the HyperDrive run, as observed in Figure 2.
Other settings needed to configure the HyperDrive run include the primary metric name, our goal for that metric (i.e., maximize or minimize), and the estimator. The number of max_concurrent_runs must be less than or equal to the maximum number of nodes that the available compute can provision; in this case, 3 < 4.
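As a hedged sketch of how these pieces fit together (the hyperparameter ranges, the metric name 'R2', and the cluster name are illustrative stand-ins for the values in Figure 2):

```python
from azureml.train.hyperdrive import (BanditPolicy, HyperDriveConfig,
                                      PrimaryMetricGoal,
                                      RandomParameterSampling, choice, uniform)
from azureml.train.sklearn import SKLearn

# Search space over tree depth and learning rate (ranges are illustrative)
param_sampling = RandomParameterSampling({
    '--max_depth': choice(2, 3, 4, 5),
    '--learning_rate': uniform(0.01, 0.3),
})

# Early stopping: cancel runs trailing the best run by a slack factor
early_stop = BanditPolicy(evaluation_interval=2, slack_factor=0.1)

# Estimator wraps the entry script that trains the scikit-learn model
compute_target = ws.compute_targets['cpu-cluster']   # cluster name assumed
estimator = SKLearn(source_directory='.',
                    entry_script='train_11_29_FINAL.py',
                    compute_target=compute_target)

hyperdrive_config = HyperDriveConfig(
    estimator=estimator,
    hyperparameter_sampling=param_sampling,
    policy=early_stop,
    primary_metric_name='R2',    # must match the name logged by the script
    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
    max_total_runs=20,           # max number of trials
    max_concurrent_runs=3)       # 3 <= 4 nodes available on the compute
```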
Continuing with this article's focus on the mechanics of AzureML, let's discuss the estimator. As you see above, it's declared using AzureML's SKLearn class and takes an entry script as input. That entry script is the actual code that cleans the data, splits it into training and testing sets, trains the pre-selected scikit-learn algorithm, and evaluates performance on the test set. Let's look at part of the entry script in Figure 3 below, with the whole code available in the train_11_29_FINAL.py script, found in the project repo.
As observed above, I chose to vary tree depth and learning rate when searching for the best model.
I also used the default scorer of GradientBoostingRegressor, which returns an R-squared value. This score must be logged in the entry script under the same name used as the primary_metric_name in the HyperDriveConfig; compare Figures 2 and 3.
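A condensed sketch of that entry-script pattern follows; the real train_11_29_FINAL.py does more cleaning, and the file and column names here are assumptions:

```python
import argparse
import os

import joblib
import pandas as pd
from azureml.core.run import Run
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Hyperparameters arrive as script arguments from HyperDrive
parser = argparse.ArgumentParser()
parser.add_argument('--max_depth', type=int, default=3)
parser.add_argument('--learning_rate', type=float, default=0.1)
args = parser.parse_args()

run = Run.get_context()

# Cleaning condensed to one line here; the real script does more (see repo)
df = pd.read_csv('auto-mpg.csv').dropna()        # file name assumed
X, y = df.drop(columns='mpg'), df['mpg']         # label column name assumed
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = GradientBoostingRegressor(max_depth=args.max_depth,
                                  learning_rate=args.learning_rate)
model.fit(X_train, y_train)

# .score() returns R-squared; log it under the SAME name used as
# primary_metric_name in the HyperDriveConfig
run.log('R2', model.score(X_test, y_test))

os.makedirs('outputs', exist_ok=True)
joblib.dump(model, 'outputs/model.joblib')
```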
Experiments are "submitted" to launch HyperDrive runs. When these runs fail, the main console returns messages that can be inadequate for diagnosis. Instead, the user opens the given link to the list of HyperDrive runs, clicks on a failed run, and opens that failed run's output logs. Those logs point out the specific line with the error.
The best-performing run can then be retrieved from the HyperDrive experiment, as seen in Figure 4 below, with the corresponding hyperparameter settings specified in an instance of the algorithm and saved to the active directory as a model using joblib. Figure 5 shows the ID and performance of the best run.
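Sketched against Figures 4 and 5, the retrieval pattern looks roughly like this; the winning hyperparameter values and file names below are illustrative:

```python
import joblib
from sklearn.ensemble import GradientBoostingRegressor

# Submit the experiment with the config from the earlier sketch and wait
hyperdrive_run = experiment.submit(hyperdrive_config)
hyperdrive_run.wait_for_completion(show_output=True)

# Retrieve the top run; its ID and metrics correspond to Figure 5
best_run = hyperdrive_run.get_best_run_by_primary_metric()
print(best_run.id, best_run.get_metrics())

# Re-instantiate the algorithm with the winning settings (values illustrative),
# refit on the cleaned data, and save to the working directory with joblib
df = dataset.to_pandas_dataframe().dropna()      # cleaning condensed; see repo
X, y = df.drop(columns='mpg'), df['mpg']         # label column name assumed
best_model = GradientBoostingRegressor(max_depth=4, learning_rate=0.1).fit(X, y)
joblib.dump(best_model, 'hyperdrive_best_model.joblib')
```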
A HyperDrive-optimized GradientBoostingRegressor has an R-squared value of 0.82.
Development: AutoML
AutoML in Azure ML Studio entails cleaning the dataset, specifying configuration settings through AutoMLConfig, and submitting an experiment with these settings. Figure 6 below goes through this process; because no training script is submitted with AutoML, we have to sanitize the data before feeding it in as training_data.
The entire output is in the linked repo. Getting the best model from the AutoML experiment reveals that it's an XGBoost regressor producing an R-squared value of 0.87. This top-performing model is saved to the active directory using joblib.dump.
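A hedged sketch of that AutoML flow; the label column name, timeout, and the clean_ds variable (standing for the sanitized dataset) are assumptions in place of the actual values in Figure 6 and the repo:

```python
import joblib
from azureml.train.automl import AutoMLConfig

# clean_ds stands for the sanitized dataset registered earlier (see repo)
automl_config = AutoMLConfig(
    task='regression',
    primary_metric='r2_score',
    training_data=clean_ds,
    label_column_name='mpg',            # target column name assumed
    compute_target=compute_target,
    experiment_timeout_minutes=30,      # illustrative budget
    max_concurrent_iterations=3)

automl_run = experiment.submit(automl_config)
automl_run.wait_for_completion(show_output=True)

# Best child run and fitted model; here an XGBoost regressor (R-squared 0.87)
best_automl_run, fitted_model = automl_run.get_output()
joblib.dump(fitted_model, 'automl_best_model.joblib')
```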
Deployment
With the best-performing AutoML model saved to the working directory, I then registered the model and created an environment with the packages required to deploy the registered model as a web service. This web service is an active HTTP endpoint that can be queried with data to receive the model's prediction.
Registering the model creates a "container" with the application and the libraries/dependencies needed for execution. The registered model is also saved in the working directory for subsequent use in deployment.
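As a short sketch, registration can go through the Model class; the model name and path here are illustrative:

```python
from azureml.core.model import Model

# Register the saved AutoML model; name and path are illustrative
registered_model = Model.register(workspace=ws,
                                  model_path='automl_best_model.joblib',
                                  model_name='automl-mpg-model')
```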
The Model class's deploy method, as used in this project, takes the following inputs: a workspace, the registered model, an inference configuration, and a deployment configuration. Let's take these one at a time:
- workspace = already defined earlier
- registered model = covered above
- inference configuration = represents settings of the "compute" used for deployment. The InferenceConfig class is used, which, for our purposes, takes two inputs: an entry script and a curated environment. We'll cover both of these shortly.
- deployment configuration = configures the web service that hosts the registered model and entry script. For our purposes, we'll use an Azure Container Instance.
Now let's discuss the inputs to the InferenceConfig class:
- the entry script takes data from the user and passes it to the model. The resulting prediction from the model is "returned" as the output of the entry script; see below for a sketch of the score.py entry script, which can be found in this project's repo.
- the curated environment is expected to support the algorithms used by the model. In my case, since the best AutoML model is an XGBoost regressor, I chose the 'AzureML-AutoML' curated environment, which covers this algorithm.
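The actual score.py lives in the project repo; below is a minimal sketch of the init()/run() pattern it follows, with the model file name as an assumption:

```python
import json
import os

import joblib
import pandas as pd

def init():
    # Runs once when the service starts: load the registered model
    global model
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'),
                              'automl_best_model.joblib')   # file name assumed
    model = joblib.load(model_path)

def run(raw_data):
    # Runs per request: unpack the JSON payload back into a dataframe
    data = pd.DataFrame(json.loads(raw_data)['data'])
    predictions = model.predict(data)
    return predictions.tolist()
```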
Figure 9 below shows the creation of the environment, the inference configuration, and the deployment configuration. These are then fed, along with the registered model, into the deploy method of the Model class.
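A hedged sketch of those three pieces plus the deploy call, paralleling Figure 9 (the service name and ACI sizing are illustrative):

```python
from azureml.core import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

# Curated environment covering the XGBoost-based AutoML model
env = Environment.get(workspace=ws, name='AzureML-AutoML')

# Entry script plus environment form the inference configuration
inference_config = InferenceConfig(entry_script='score.py', environment=env)

# Azure Container Instance hosts the web service; sizing is illustrative
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                       memory_gb=1)

service = Model.deploy(workspace=ws,
                       name='auto-mpg-service',     # service name assumed
                       models=[registered_model],
                       inference_config=inference_config,
                       deployment_config=deployment_config)
service.wait_for_deployment(show_output=True)
print(service.state, service.scoring_uri)
```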
As seen below, deployment was successful with a healthy endpoint. On request, AzureML provides the scoring URI for use in querying the model.
Passing a request to this scoring URI runs the entry script, which loads the registered model from where it was saved in the init() function. The run() function then executes to actually generate a prediction on the data passed to the active endpoint.
Figure 11 demonstrates that any user with the files in the project repo can pass in data on a car and get the predicted mileage. A key learning for the group was the importance of passing a dataframe that can be serialized into JSON for transfer to the score script, which then unpacks, i.e., loads, the JSON and converts it back into a dataframe.
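To make that dataframe-to-JSON round trip concrete, here is a hedged example of querying the endpoint; the feature names follow the raw Auto MPG dataset and may differ from the cleaned columns in the repo:

```python
import json

import pandas as pd
import requests

# One car's features as a dataframe, serialized to JSON for the request
sample = pd.DataFrame([{'cylinders': 4, 'displacement': 140.0,
                        'horsepower': 90.0, 'weight': 2264,
                        'acceleration': 15.5, 'model year': 76,
                        'origin': 1}])
payload = json.dumps({'data': sample.to_dict(orient='records')})

headers = {'Content-Type': 'application/json'}
response = requests.post(service.scoring_uri, data=payload, headers=headers)
print(response.json())   # predicted MPG for the sample car
```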
Conclusions
I entered this Nanodegree with no experience in deploying models and have come out with experience in one medium. While challenging, this Nanodegree was enjoyable, and I'm looking forward to learning deployment with other tools.