In-depth Azure Machine Learning Model Train, Test, and Deploy Pipelines on Cloud With Endpoints for Web APIs

Last Updated on November 20, 2022 by Editorial Team

Author(s): Amit Chauhan

An image by the Author

The workspace consists of various artifacts:

  • Managed resources: These include compute instances and compute clusters.
  • Linked Services:
  1. Data Stores: Services that store various kinds of data, for example Blob storage, Hive storage, and SQL databases.
  2. Compute targets: These are the machines where we run our model and do the training and testing.
  • Assets:
  1. Environments
  2. Experiments
  3. Pipeline
  4. Datasets
  5. Models
  6. Endpoints

The workspace relies on several dependencies. It produces logs, notebooks, asset entries, and so on, and it needs storage for them.

  • Dependencies
  1. Azure Storage account: Used for the administration and operation of the workspace.
  2. Azure Container Registry: Used when we deploy our model to production; it holds the Docker images.
  3. Azure Key Vault: Stores keys, secrets, and other private information.
  4. Azure Application Insights: Monitors our machine learning applications and reports information such as response time, requests, failure conditions, and performance.

Basic concepts

  • Datasets

A dataset is a collection of data organized in rows and columns. Azure offers many ways to upload or fetch a dataset for machine learning experiments.

  • Data-stores

When we want to bring in a dataset from the local system, we need somewhere to store it, and that is where the data store comes into the picture. A data store is just a connection to one of the various storage types, such as a storage account, a database, or an analytics service like a data lake.

  • Various storage types

Azure supports Blob storage, file storage, Data Lake, Azure SQL, Azure PostgreSQL, MySQL, and Azure Databricks.

Creating the machine learning workspace

Follow the steps below to create the workspace:

  1. Open the Azure dashboard, search for the Machine Learning resource, click on it, and then click Create. If you don't have an Azure account, follow the link below.

How to Open an Azure cloud account with Debit Card

A simple and easy process for all data scientist

amitprius.medium.com

An image by the Author

2. Fill in all the information.

If there is no resource group, create a new one. When we enter the workspace name, the other fields, such as the key vault, storage account, and Application Insights, are filled in automatically. We will keep the container registry set to 'None' for now because it is only required at deployment time.

An image by the Author

We can choose any region, but if we have a large amount of data, choosing the nearest region gives faster data transfer.

  • In the Networking option, choose public access for practicing the experiment.
  • In the Advanced option, there are many settings; keep them as they are. If we enable the data impact setting, we are telling Microsoft that the data we will upload is sensitive.

3. Once validation has passed, click Create to make the workspace. This creates the four resources shown below.

An image by the Author

4. Now, click 'Go to resource', and the workspace dashboard will open with the Launch studio option, as shown below.

An image by the Author

In the above image, the Access control (IAM) option is used to give more users access to this workspace.

Launching the machine learning studio

  1. After creating the workspace, it's time to launch the ML studio, which will look like the image below.
An image by the Author

The Author section in the above image is where we create machine learning experiments and pipelines.

2. Make a new storage account to keep our files separate from those in other storage accounts.

An image by the Author

3. Now, create a container inside this storage account.

An image by the Author

4. Now, create a data store in the ML studio that will connect to this newly made storage account.

An image by the Author

5. Fill in the information.

An image by the Author

To get the access key, go to the new storage account created in step 2 and copy the key from the Access keys option, as shown below.

An image by the Author

Now, click on the create button of the data store. The data store is created and registered with the workspace along with the storage account.

6. Now, upload the dataset to the container we created in the storage account in step 3.

An image by the Author

We can also check the file through the Storage browser option in the storage account.

An image by the Author

7. Now, create the dataset and choose the file from the data stores.

An image by the Author

Click the Next button. When we select 'From Azure storage', other options appear on the left side. We choose this option because our storage is of the blob type.

An image by the Author
An image by the Author
An image by the Author
An image by the Author

Now, we can deselect the Loan_ID and Gender columns in the Schema option.

An image by the Author

Our data is now available as a dataset.

An image by the Author
An image by the Author
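The column deselection done in the Schema option can be pictured with a minimal sketch in plain Python. It is only an illustration of the idea, not what the studio runs internally. `Loan_ID`, `Gender`, and `loan_status` are column names mentioned in this article; the `ApplicantIncome` column and all sample values are made up.

```python
import csv
import io

# Hypothetical sample rows. Only Loan_ID, Gender, and loan_status are column
# names taken from the article; the remaining column and all values are made up.
SAMPLE_CSV = """Loan_ID,Gender,ApplicantIncome,loan_status
LP001,Male,5849,Y
LP002,Female,4583,N
"""


def drop_columns(csv_text, excluded):
    """Return the rows of csv_text as dicts without the excluded columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in reader
    ]


rows = drop_columns(SAMPLE_CSV, excluded={"Loan_ID", "Gender"})
print(rows[0])  # {'ApplicantIncome': '5849', 'loan_status': 'Y'}
```

Identifier-like columns such as Loan_ID carry no predictive signal, which is one reason to leave them out of the dataset.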

Compute Resources

In this section, we discuss the managed resources, i.e., the compute instances and compute clusters that come with the machine learning workspace.

These are just different names for computers and virtual machines. The compute target is listed under linked services in the workspace.

Why do we need computing resources?

Any machine learning model needs a compute resource on which to train.

  1. Compute instance: A virtual machine (server) used for cloud computation. It is more than a bare machine: it is connected to the workspace and comes with Python, R, Docker, and the Azure ML SDK configured. The default storage account created with the workspace is attached to this instance, which means we can access all the notebooks and other data stored there. It is mostly used during development for training, testing, and inferencing. Inference here means creating endpoints for web services.
  2. Compute clusters: Also a managed resource, a compute cluster is a group of virtual machines. We can use clusters from all three authoring options, i.e., compute instance, designer, or AutoML, for training and for limited deployment.
  3. Compute targets: These consist of remote/attached compute and inference clusters.
  • Remote compute: Its primary use is training and testing before deployment. We can use local machines, compute instances, or virtual machines as a compute target. We can also run batch inferencing on compute clusters.
  • Inference clusters: Used to make real-time predictions in production with our model. They can be Azure Kubernetes Service (AKS) clusters.

Creating the compute clusters

  1. Go to the compute tab and find the compute cluster option in ML studio as shown below:
An image by the Author

2. Now, choose a low-cost compute size, as we are just practicing.

An image by the Author

3. Give the cluster a name and click the Create button.

An image by the Author

4. Also, create a compute instance as shown below with the same information as above.

An image by the Author

What is a pipeline?

A pipeline is a series of steps, or a workflow, running from data processing to deployment.

An image by the Author
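As a rough mental model, a pipeline can be thought of as an ordered list of steps in which each step receives the output of the previous one. The sketch below is plain Python and not the Azure designer itself; the two step functions and the sample rows are hypothetical.

```python
from functools import reduce


def select_columns(rows):
    # Hypothetical step: keep every column except the row identifier.
    return [{k: v for k, v in row.items() if k != "Loan_ID"} for row in rows]


def clean_missing(rows):
    # Hypothetical step: drop rows that contain an empty value.
    return [row for row in rows if all(v != "" for v in row.values())]


def run_pipeline(data, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda current, step: step(current), steps, data)


# Made-up sample data for illustration only.
rows = [
    {"Loan_ID": "LP001", "income": "5849", "loan_status": "Y"},
    {"Loan_ID": "LP002", "income": "", "loan_status": "N"},
]
result = run_pipeline(rows, [select_columns, clean_missing])
print(result)  # [{'income': '5849', 'loan_status': 'Y'}]
```

In the designer, the same ordering is expressed by connecting the output port of one component to the input port of the next.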

We may need to create compute instances for the cleaning and training stages.

Creating a new training pipeline using a designer from the ML studio dashboard

  1. Click on the start now button in the designer option as shown below:
An image by the Author

2. Now click on the plus button to create the pipeline.

An image by the Author

3. The pipeline interface will open after clicking on the plus button.

An image by the Author

4. In our data option we have our dataset as shown in the below image:

An image by the Author

5. Just drag and drop the data to the right side.

An image by the Author

6. Now, find the Select Columns in Dataset component in the component list and drop it on the right side. Connect the output of the dataset to the input of Select Columns in Dataset.

An image by the Author

7. Now, double-click on Select Columns in Dataset and then click on Edit columns.

An image by the Author

8. Now add the Clean Missing Data component and choose the columns through its Edit column button.

An image by the Author

9. Now add the Split Data component, set the fraction to '0.7', and select 'loan_status' as the stratification column through Edit column.

An image by the Author

10. Now, we are ready to train our two-class logistic regression model.

An image by the Author

11. Once all the components are on the canvas, our pipeline is complete. Now go to Settings to choose a compute cluster.

An image by the Author

12. Suppose our model is large and training needs a different compute cluster. In that case, we can select the Train Model component and choose another compute in Run Settings as shown below:

An image by the Author

13. Our pipeline is complete, so we can click the Submit button to run it.

An image by the Author

Our pipeline run completed successfully.

14. After the run completes, right-click the Evaluate Model component to see the ROC curves.

An image by the Author

The evaluation also includes the confusion matrix.

An image by the Author
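To make the split in step 9 and the evaluation in step 14 more concrete, here is a minimal sketch in plain Python. It shows the general technique only: a stratified split that keeps the class balance of `loan_status` in both parts, and the confusion-matrix counts from which one point of a ROC curve is derived. The function names, the seed, the "Y"/"N" labels, and the sample data are assumptions made for illustration; the designer components may differ in detail.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, fraction, seed=0):
    """Split rows in two, keeping roughly `fraction` of each class in the first part."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    rng = random.Random(seed)
    first, second = [], []
    for group in by_class.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * fraction)
        first.extend(shuffled[:cut])
        second.extend(shuffled[cut:])
    return first, second


def confusion_matrix(actual, predicted, positive="Y"):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def roc_point(counts):
    """Return (false positive rate, true positive rate): one point on a ROC curve.

    Assumes the data contains at least one positive and one negative example.
    """
    fpr = counts["fp"] / (counts["fp"] + counts["tn"])
    tpr = counts["tp"] / (counts["tp"] + counts["fn"])
    return fpr, tpr


# Hypothetical data: 10 approved ("Y") and 10 rejected ("N") loans.
rows = [{"id": i, "loan_status": "Y" if i % 2 == 0 else "N"} for i in range(20)]
train, test = stratified_split(rows, "loan_status", fraction=0.7)
print(len(train), len(test))  # 14 6

# Hypothetical predictions compared against the true labels.
counts = confusion_matrix(
    actual=["Y", "Y", "N", "N", "Y", "N"],
    predicted=["Y", "N", "N", "Y", "Y", "N"],
)
print(counts, roc_point(counts))
```

A full ROC curve repeats the last calculation for many probability thresholds and plots the resulting points.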

Creating Inference Pipeline and deploying it as a web service

  1. After completing the pipeline, we can get the create inference pipeline option as shown below:
An image by the Author

2. When we choose real-time inference, Azure makes some changes to the pipeline as shown below:

An image by the Author

3. We made some changes to the pipeline as shown below:

An image by the Author

4. Now click the Submit button to run the real-time inference pipeline. After the run completes, we can check the scores.

An image by the Author

We get both the scored labels and the probabilities.

5. However, we need only the scored labels in the web service output, so we add a Select Columns in Dataset component to the canvas to limit the output.

An image by the Author

6. To make predictions, we need to deploy the model in the cloud. If the Deploy button is not shown, refresh the page after the submitted run completes.

Before deployment, however, we need an Azure Kubernetes Service (AKS) cluster, so we create a new inference cluster from the ML studio.

An image by the Author

Choose the virtual machine configuration.

An image by the Author

Write the name of the cluster for the endpoint.

An image by the Author

Now click the Create button to make the cluster. After a few minutes, the creation completes successfully.

An image by the Author

After the inference cluster is created, it's time to deploy the model and make the endpoints.

An image by the Author

Give the endpoint a name, select the newly made inference cluster, and click the Deploy button as shown below:

An image by the Author

After a few minutes, the deployment is complete, and in the Endpoints section we will see the endpoint in a healthy state.

An image by the Author

In the test section, we can make the prediction based on the input parameters.

An image by the Author

In the Consume section, we find the endpoint URL that we can call to get scores.

An image by the Author
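Calling the deployed endpoint from Python could look roughly like the sketch below, which uses only the standard library. The URL, the key, the input column names, and the `WebServiceInput0`/`WebServiceOutput0` payload shape are all assumptions made for illustration; the sample code shown in the Consume section of your own endpoint is the authoritative reference. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical values: copy the real URL and key from the endpoint's Consume section.
SCORING_URL = "https://example.invalid/score"
API_KEY = "<your-endpoint-key>"


def build_request(records, url=SCORING_URL, key=API_KEY):
    """Build (but do not send) a POST request carrying the input records as JSON."""
    # Assumed payload shape; verify it against the Consume section sample code.
    payload = {"Inputs": {"WebServiceInput0": records}, "GlobalParameters": {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + key,
        },
        method="POST",
    )


def scored_labels(response_body):
    """Pull the 'Scored Labels' value out of each returned record."""
    # Assumed response shape, mirroring the request above.
    results = json.loads(response_body)["Results"]["WebServiceOutput0"]
    return [record["Scored Labels"] for record in results]


# Hypothetical input columns and values.
request = build_request([{"ApplicantIncome": 5849, "Credit_History": 1}])
# To actually call the service: urllib.request.urlopen(request).read()

# A made-up response body, used here instead of a real network call.
fake_body = (
    '{"Results": {"WebServiceOutput0": '
    '[{"Scored Labels": "Y", "Scored Probabilities": 0.81}]}}'
)
print(scored_labels(fake_body))  # ['Y']
```

Keeping the key out of source code, for example in an environment variable, is a sensible habit once the endpoint is used outside of practice.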

Conclusion

This article walks through the workflow of a machine learning pipeline in the Azure cloud. The important thing is to take care of the connections between the services that are linked together.

I hope you like the article. Reach me on my LinkedIn and Twitter.
