Key Phrase Extraction and Visualization : Python and Power BI

Last Updated on January 6, 2023 by Editorial Team

Last Updated on March 6, 2021 by Editorial Team

Author(s): Jayant Kumar Kodwani

Natural Language Processing

Key Phrase Extraction and Visualization: Python and Microsoft Power BI

Discover insights in unstructured text

Implementing RAKE algorithm in Python and Power BI integration

Key-Phrase Extraction, Photo by Rabie Madaci on Unsplash

We live in an age where data is the new currency! It has helped make the big tech companies some of the richest in the world, and data may well be one of the best investments of the next few decades. So, what do these companies do with this data? How can anyone handle the volume of unstructured text coming from Facebook posts, Twitter, or LinkedIn? To a layman, scanning or sampling might sound like a good idea. However, data scientists know the risks of sampling and the pain of scanning text by text, row by row, and word by word 😬. This is where data experts use “Key-phrase Extraction”.

Key-phrase extraction evaluates unstructured text and returns a list of key phrases. For example, given the input text “The food was delicious and there were wonderful staff”, a key-phrase extraction service returns the main talking points: “food” and “wonderful staff”.

What will we Discuss?

In this story, we will extract key-phrases from a sample dataset using the RAKE algorithm in Python and then visualize them in Microsoft Power BI.

Here is the link for the sample data that we will use: Sample Data

What is RAKE?

RAKE is short for Rapid Automatic Keyword Extraction. It is a domain-independent keyword extraction algorithm that tries to determine key phrases in a body of text by analyzing how frequently each word appears and how often it co-occurs with other words in the text.
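To make the frequency and co-occurrence idea concrete, here is a minimal from-scratch sketch of RAKE-style scoring. It is not the python-rake package used later in this story. The tiny `STOP_WORDS` set and the function names are assumptions made for illustration only. Each word is scored as degree divided by frequency, and a phrase's score is the sum of its word scores.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption); a real list is much longer.
STOP_WORDS = {"the", "was", "and", "there", "were", "a", "of", "is", "in", "to"}


def candidate_phrases(text, stop_words):
    """Split text into candidate phrases at punctuation and stop words."""
    phrases = []
    # Punctuation always ends a candidate phrase.
    for fragment in re.split(r'[.,;:!?()"\n]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in stop_words:
                # A stop word closes the phrase collected so far.
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_scores(text, stop_words=STOP_WORDS):
    """Return (phrase, score) pairs, highest score first."""
    phrases = candidate_phrases(text, stop_words)
    freq = defaultdict(int)    # how often each word appears
    degree = defaultdict(int)  # co-occurrences within phrases, counting the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {word: degree[word] / freq[word] for word in freq}
    scored = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scored.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for phrase, score in rake_scores(
        "The food was delicious and there were wonderful staff"
    ):
        print(f"{score:4.1f}  {phrase}")
```

With this small stop-word list, the multi-word phrase “wonderful staff” ranks first because each of its words co-occurs with another content word, while single-word candidates such as “food” score lower.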

Resources Required

A Python environment (e.g. the Spyder IDE)
Microsoft Power BI Desktop (Pro License)
(OPTIONAL) Microsoft Azure Subscription (Free Trial or Paid) to correlate key-phrases together with sentiments.

Are you ready?? Here we go 🏄

Step 1: Install RAKE package and store stop wordlist

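1.1 Installation: Open your Python environment (e.g. Spyder 🐍) and run the command below to install the RAKE package. The leading “!” is for IPython-style consoles such as Spyder's; drop it when running pip from a regular terminal.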

!pip install python-rake==1.4.4

Installing RAKE algorithm package in Spyder Python instance

1.2 Create stop wordlist: Stop words are words that generally do not help in text analysis. They are typically dropped by information retrieval systems and excluded from text analyses because they carry little meaning on their own. Words that do carry meaning related to the text are called content-bearing words, or content words. You can download the stopwords list here and customize it as per your requirements. Save it at the desired location and copy the path for configuring the Python script.
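Customizing the list can be as simple as adding or removing words after reading the file. The sketch below assumes a plain-text file with one stop word per line and optional comment lines starting with “#”; the actual file you download may be laid out differently. The function name and its `extra` and `keep` parameters are hypothetical and not part of any RAKE package.

```python
from pathlib import Path


def load_stop_words(path, extra=(), keep=()):
    """Load stop words from a text file and apply simple customizations.

    Assumed file format: one word per line, '#' lines are comments.
    extra: additional words to treat as stop words.
    keep:  words to remove from the list so they can appear in key phrases.
    """
    words = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        word = line.strip().lower()
        if word and not word.startswith("#"):
            words.add(word)
    words.update(w.lower() for w in extra)
    words.difference_update(w.lower() for w in keep)
    return words
```

For example, `load_stop_words(path, extra=["please"], keep=["not"])` would drop polite filler from the results while letting negations such as “not working” survive as part of a phrase.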

Step 2: Open Power BI, Import Data & Configure Python script

2.1 Power BI Data Import: Open a new instance of Power BI Desktop >> Import Data from Excel (Sample Data) >> Browse to the sample data file >> Import data >> Select “Run Python script” in Power Query Editor (under Transform)

Calling “Run Python script” in Power Query Editor

2.2 Prepare your Python Script: You can use the Python script below and customize it by replacing the path to the stopwords list in row 11.

Also, you can restrict the number of key-phrases to be extracted by modifying the count in row 31 (e.g. replace [-1:] with [-5:] to get up to 5 key-phrases from 1 text input).
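The effect of changing the slice can be sketched as follows. This assumes the scored phrases are sorted in ascending order of score before slicing, which is what a negative slice such as [-5:] suggests but is an assumption about the script. The helper name `top_phrases` is hypothetical.

```python
def top_phrases(scored_phrases, n=1):
    """Return up to n highest-scoring (phrase, score) pairs, best first."""
    if n <= 0:
        # Guard: a slice of [-0:] would return the whole list, not an empty one.
        return []
    ascending = sorted(scored_phrases, key=lambda pair: pair[1])
    # The last n items of an ascending list are the n best; reverse them
    # so the strongest phrase comes first.
    return ascending[-n:][::-1]
```

If a text yields fewer than n candidate phrases, the slice simply returns all of them, which is why the count is an upper bound (“up to 5”) rather than a guarantee.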

Once done with the customization, you can apply the script and expand the “Rake_Final_Output” dataset. You can then save and close the Power Query Editor to apply the script. This is what your dataset looks like after the new fields for key-phrases and their scores are added.

Power BI Dataset with Key-phrases and Scores

Step 3: Power BI Integration and Visualization

Now comes the fun part that we all love, the visualizations! 💝

To visualize the key-phrases, I would recommend using a Word Cloud ☁️ together with tables, preferably combined with sentiment analysis 😃, so you can relate the key-phrases to positive, neutral, and negative sentiments.

You can download a sample Power BI template which integrates sentiment analysis and key-phrase extraction, all packed together in a single Power BI file.

As you can see in the example below, we have “Top 10 Key phrases with negative sentiments” where phrases like “Slower Connections” and “restart 10 times” are directly associated with negative sentiments 😢

Word cloud with correlation of Negative Sentiments

Similarly, we have “Top 10 Key phrases with positive sentiments” where phrases like “explained neatly” and “great in depth knowledge” are directly correlated to positive sentiments 😃.

Word Cloud with correlation of Positive Sentiments
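In Power BI, these “Top 10” views come from the visuals and their filters. For readers who want to check the numbers outside Power BI, the same aggregation can be sketched in plain Python. The input shape, a sequence of (sentiment label, key phrase) pairs, and the function name are assumptions for illustration; the phrases in the usage example are taken from the word clouds above, and their counts are made up.

```python
from collections import Counter, defaultdict


def top_phrases_by_sentiment(rows, n=10):
    """Count key phrases per sentiment label and keep the n most frequent.

    rows: iterable of (sentiment_label, key_phrase) pairs.
    Returns {sentiment_label: [(phrase, count), ...]}.
    """
    counts = defaultdict(Counter)
    for sentiment, phrase in rows:
        # Normalize case and whitespace so variants are counted together.
        counts[sentiment][phrase.strip().lower()] += 1
    return {label: counter.most_common(n) for label, counter in counts.items()}


if __name__ == "__main__":
    sample = [
        ("negative", "Slower Connections"),
        ("negative", "slower connections"),
        ("negative", "restart 10 times"),
        ("positive", "explained neatly"),
        ("positive", "great in depth knowledge"),
    ]
    for label, phrases in top_phrases_by_sentiment(sample).items():
        print(label, phrases)
```

The counts per phrase play the same role as the word sizes in the word cloud: the more often a phrase appears under a sentiment, the larger it is drawn.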

Conclusion

We learned 📘 how to apply the RAKE algorithm to extract key-phrases and integrate the analysis into Microsoft Power BI to develop visualizations.

You could use other datasets and customize the code to see what suits your use case best! 👍

Have you come across a different approach for key-phrase extraction? Please drop it in the comments!

References

[1] https://docs.microsoft.com/en-us/azure/cognitive-services/text-analytics/tutorials/tutorial-power-bi-key-phrases

[2] https://towardsdatascience.com/analyzing-and-visualizing-sentiments-from-unstructured-raw-data-c263ba96cc2c

[3] Data source: prepared manually by the Author

Key Phrase Extraction and Visualization : Python and Power BI was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
