Azure Cognitive Services Sentiment Analysis v3.0 using Databricks PySpark

Last Updated on July 19, 2023 by Editorial Team

Author(s): Rory McManus

Originally published on Towards AI.

Cloud Computing, Natural Language Processing

Azure Cognitive Services Text Analytics is a great tool you can use to quickly evaluate a text data set for positive or negative sentiment. For example, a service provider can quickly and easily evaluate reviews as positive or negative and rank them based on the sentiment score detected.

As more and more businesses rely on electronic communications with their clients, understanding the overall sentiment attached to your product, service or image has never been more important. Sentiment analysis allows companies to automatically detect sentiment in any text (reviews, insurance claims, triaging, etc.) in a fast and highly scalable way.

My latest project was with a property management company with the aim of using the sentiment scores from client feedback on properties to identify and prioritise major issues, which enabled a quicker resolution to issues and improved customer service.

Today I’m going to go through how to call Azure Cognitive Services Text Analytics from a Databricks PySpark notebook to analyze the sentiment of COVID-19 tweets, returning sentiment scores and an indicator of whether each tweet is positive or negative.

What is Azure Cognitive Services Text Analytics?

Cognitive Services are a set of machine learning algorithms that Microsoft has developed to solve problems in the field of Artificial Intelligence (AI). Developers can consume these algorithms through standard REST calls over the Internet to the Cognitive Services APIs in their Apps, Websites, or Workflows.

For this article, we will focus on the Sentiment Analysis feature of the Text Analytics API, which evaluates text and returns sentiment scores and labels for each document and sentence. This is useful for detecting positive and negative sentiment in social media, client reviews, discussion forums, and more, in any of the languages the service supports.

Consuming the Sentiment Analysis API using PySpark

To analyse text and return a sentiment analysis for our data, the code needs to complete the following steps:

  1. Import a dataset with a text column.
  2. Set a parameter to identify the input dataset text column name making our code dynamic.
  3. Set Azure Cognitive Services API and Key.
  4. Create input Dataframe ready for the API post with an Id and Text column only.
  5. Convert Dataframe to JSON ready for the API Post.
  6. Post the JSON document to the Sentiment Analysis API.
  7. Flatten JSON API response into Dataframe with rows and columns.
  8. Join Dataframe with the original dataset to produce the final dataset and display for analysis.

Steps

1. Add the following imports to your PySpark notebook and create the input DataFrame by importing a COVID-19 tweet dataset.

2. Create the text column parameter and set it to the name of the column you want analyzed.

3. For the purpose of this demonstration, we will set the Sentiment Analysis API parameters manually. Please be aware that storing the key in Azure Key Vault would be a more secure method.
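
As a minimal sketch of this step, the settings could be read from environment variables instead of being typed into the notebook. The variable names `TEXT_ANALYTICS_ENDPOINT` and `TEXT_ANALYTICS_KEY` and the helper names `sentiment_url` and `load_settings` are assumptions made for illustration, not part of the original notebook. The `/text/analytics/v3.0/sentiment` path is the v3.0 sentiment route on the resource endpoint.

```python
import os


def sentiment_url(endpoint):
    """Build the v3.0 sentiment URL from a Cognitive Services resource endpoint."""
    return endpoint.rstrip("/") + "/text/analytics/v3.0/sentiment"


def load_settings(env=os.environ):
    """Read the API settings from a mapping (the process environment by default).

    The two variable names below are hypothetical; use whatever names
    your own workspace defines.
    """
    endpoint = env["TEXT_ANALYTICS_ENDPOINT"]
    subscription_key = env["TEXT_ANALYTICS_KEY"]
    return {"url": sentiment_url(endpoint), "subscription_key": subscription_key}
```

On Databricks, a Key Vault-backed secret scope read with `dbutils.secrets.get` would be a more secure source for the key than either a hard-coded value or an environment variable.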

4. The payload to the API consists of a list of JSON documents, each containing an id, a language and a text attribute. The text attribute stores the text to be analyzed, language is the language of the text, and id can be any value that is unique within the request. Therefore we need to add an id column and select only the id, language and text columns for the API payload.
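
The shape of these documents can be sketched in plain Python, assuming the rows have already been brought to the driver as dictionaries (for example with `row.asDict()` on collected rows). The helper name `to_documents`, the default language `"en"` and the choice to skip empty texts are assumptions for illustration, not the original notebook code.

```python
def to_documents(rows, text_col, language="en"):
    """Turn rows (dicts) into API documents with id, language and text only.

    Every row gets an id from its 1-based position, so results can be
    matched back to the input later. Rows with missing or blank text are
    skipped, since they carry nothing to analyse.
    """
    documents = []
    for position, row in enumerate(rows, start=1):
        text = row.get(text_col)
        if not isinstance(text, str) or not text.strip():
            continue
        documents.append({"id": str(position), "language": language, "text": text})
    return documents
```

In a DataFrame the id column would typically come from something like `monotonically_increasing_id` or `row_number` rather than `enumerate`; the point here is only the three-attribute document shape.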

5. Convert the DataFrame dfCog into a DataFrame of JSON strings in the correct format for the API.
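
A minimal sketch of the target format, in plain Python: each request body is a JSON object with a single `documents` array. The helper name `to_payloads` is hypothetical, and the default batch size of 10 is an assumption that should be checked against the service's current per-request limits.

```python
import json


def to_payloads(documents, batch_size=10):
    """Split documents into batches and serialise each batch as a JSON body."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    payloads = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        # The API expects the list wrapped in a top-level "documents" key.
        payloads.append(json.dumps({"documents": batch}))
    return payloads
```

Batching keeps each request within the service limits and means a single failed call only affects a handful of rows.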

6. Post the JSON payload to the API passing in the subscription_key, endpoint and document.
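
One way to sketch the POST using only the Python standard library is shown here. The function names `build_request` and `post_sentiment` are hypothetical, and `urllib` is used purely to keep the example dependency-free; the original notebook may well have used a different HTTP client. `Ocp-Apim-Subscription-Key` is the header in which Cognitive Services expects the key.

```python
import json
import urllib.request


def build_request(url, subscription_key, payload):
    """Create a POST request carrying one JSON payload string."""
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post_sentiment(url, subscription_key, payload, timeout=30):
    """Send the payload and return the parsed JSON response.

    urlopen raises urllib.error.HTTPError for non-2xx responses, so a bad
    key or malformed payload surfaces as an exception.
    """
    request = build_request(url, subscription_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the request construction separate from the network call makes the headers and body easy to inspect before anything is sent.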

7. Now that we have the response returned as JSON, we must flatten the document into rows and columns.
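
The flattening can be sketched in plain Python as follows, assuming the v3.0 response shape in which each entry of `documents` carries an `id`, a `sentiment` label and a `confidenceScores` object, and failed documents are listed under `errors`. The helper name `flatten_response` and the output column names are assumptions for illustration.

```python
def flatten_response(response):
    """Flatten a sentiment response into one flat dict per document."""
    rows = []
    for doc in response.get("documents", []):
        scores = doc.get("confidenceScores", {})
        rows.append({
            "id": doc["id"],
            "sentiment": doc.get("sentiment"),
            "positive": scores.get("positive"),
            "neutral": scores.get("neutral"),
            "negative": scores.get("negative"),
            "error": None,
        })
    # Documents the service rejected come back separately; keep them as
    # rows too so they are not silently lost in the final join.
    for err in response.get("errors", []):
        rows.append({
            "id": err.get("id"),
            "sentiment": None,
            "positive": None,
            "neutral": None,
            "negative": None,
            "error": err.get("error", {}).get("message"),
        })
    return rows
```

Each resulting dictionary maps directly onto one DataFrame row, with sentence-level detail left out for brevity.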

8. Finally, we can join the analyzed dataset to the input dataset, drop the added id column, and display the final output.
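
The join logic can be illustrated in plain Python as a left join keyed on the position-based id, assuming ids were assigned as 1-based row positions when the payload was built. The helper name `join_sentiment` and the column names are hypothetical; on DataFrames the equivalent would be a left join on the id column followed by dropping it.

```python
def join_sentiment(rows, sentiment_rows):
    """Attach sentiment results to the original rows, matched by id.

    The id is only used for matching and never copied to the output,
    which mirrors dropping the helper column after the join. Rows without
    a result keep None in the sentiment columns.
    """
    by_id = {result["id"]: result for result in sentiment_rows}
    joined = []
    for position, row in enumerate(rows, start=1):
        match = by_id.get(str(position), {})
        combined = dict(row)
        for column in ("sentiment", "positive", "neutral", "negative"):
            combined[column] = match.get(column)
        joined.append(combined)
    return joined
```

A left join is used so that input rows the API skipped or rejected still appear in the final dataset.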

The final result provides an overall sentiment label together with confidence scores between 0.0 and 1.0 for the positive, neutral and negative classes, with a higher score indicating greater confidence in that class.

I have turned this into a reusable PySpark function. If you would like a copy, please drop me a message and I can send you a link to my private Git repo.

I hope this was helpful in saving you time understanding Azure Cognitive Sentiment Analysis and PySpark. Any thoughts, questions, corrections, and suggestions are very welcome 🙂

If you liked this article, here are some other articles you may enjoy:

Databricks PySpark Type 2 SCD Function for Azure Synapse Analytics

Slowly Changing Dimensions (SCD) is a commonly used dimensional modeling technique used in data warehousing to capture…

pub.towardsai.net

Databricks: Upsert to Azure SQL using PySpark

An Upsert is an RDBMS feature that allows a DML statement’s author to automatically either insert a row, or if the row…

rorymcmanus.medium.com
