Emoticon and Emoji in Text Mining
Last Updated on July 20, 2023 by Editorial Team
Author(s): Dhilip Subramanian
Originally published on Towards AI.
Converting Emoticon and Emoji into word form using Python
In today's online communication, emojis and emoticons are becoming a primary way to communicate quickly and precisely with anyone around the world. Both play an essential part in text analysis.
Emojis and emoticons appear most often in social media, emails, and text messages, though they may be found in any type of electronic communication. For some textual analyses, we might need to remove them. For others, especially sentiment analysis, they carry valuable information, and removing them might not be the right solution.
For example, a company may want to find out how people feel on social media about a new product, a new campaign, or the brand itself. Emojis can help identify where consumer engagement needs to improve by revealing users' moods, attitudes, and opinions. Analyzing emojis and emoticons lets us capture people's emotions, which is vital for companies that want to understand their customers' feelings better.
Collecting and analyzing data on emojis and emoticons gives companies useful insights. Hence, we will convert them into word form so they can be used in modeling processes. In this blog, we will see how to convert both emojis and emoticons into word form using Python.
What is an Emoji? 🙂 🙁
An emoji is an image small enough to insert into text that expresses an emotion or idea. The word emoji essentially means "picture-character" (from Japanese e, "picture," and moji, "letter, character").
What is an Emoticon? :-]
An emoticon is a representation of a human facial expression using only keyboard characters such as letters, numbers, and punctuation marks.
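To make the distinction concrete, here is a small sketch using only built-in Python string methods. The two sample strings are chosen for illustration: an emoji is a single Unicode character outside the ASCII range, while an emoticon is a short run of ordinary keyboard characters.

```python
emoji = "\U0001F642"   # slightly smiling face, a single Unicode code point
emoticon = ":-]"       # three ordinary keyboard characters

# An emoji is one non-ASCII character; an emoticon is several ASCII ones.
print(len(emoji), emoji.isascii())        # 1 False
print(len(emoticon), emoticon.isascii())  # 3 True
```

This difference is why the two are handled separately later on: emojis can be matched character by character, while emoticons have to be matched as multi-character patterns.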
Here, I have used a library called emot. For more details on this library, please check its GitHub repo. It has a good collection of emoticons and emojis with the corresponding words, which I have used to convert the emojis and emoticons into words.
Emoji into word form
Code
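# Installing the emot library (notebook syntax)
!pip install emot

# Importing libraries
import re
# UNICODE_EMO and EMOTICONS are the dictionary names in the emot release used here;
# check the repo if this import fails on a different release
from emot.emo_unicode import UNICODE_EMO, EMOTICONS

# Function for converting emojis into words
def convert_emojis(text):
    for emot in UNICODE_EMO:
        # Strip commas and colons from the description, then join its words with underscores
        text = text.replace(emot, "_".join(UNICODE_EMO[emot].replace(",", "").replace(":", "").split()))
    return text

# Example
text1 = "Hilarious 😂. The feeling of making a sale 😎, The feeling of actually fulfilling orders 😒"
convert_emojis(text1)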
Output
'Hilarious face_with_tears_of_joy. The feeling of making a sale smiling_face_with_sunglasses, The feeling of actually fulfilling orders unamused_face'
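If installing emot is not an option, a rough alternative can be sketched with Python's standard `unicodedata` module. This is not the emot approach: it assumes that characters in the Unicode category "So" (Symbol, other) are the ones worth converting, and it uses official Unicode character names, which can differ from the descriptions in emot's dictionary. The function name `emoji_to_words` is made up for this illustration.

```python
import unicodedata

def emoji_to_words(text):
    """Replace each 'Symbol, other' character with its Unicode name."""
    pieces = []
    for ch in text:
        # Category "So" covers most emoji, but also symbols such as the copyright sign.
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            # Fall back to the original character if it has no name.
            pieces.append(name.lower().replace(" ", "_") if name else ch)
        else:
            pieces.append(ch)
    return "".join(pieces)

print(emoji_to_words("Hilarious 😂. The feeling of making a sale 😎"))
# Hilarious face_with_tears_of_joy. The feeling of making a sale smiling_face_with_sunglasses
```

Because it walks the text once instead of looping over every dictionary entry, this sketch also avoids one `replace` call per known emoji, at the cost of being less precise about what counts as an emoji.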
Emoticon into word form
Code
# Function for converting emoticons into words
def convert_emoticons(text):
    for emot in EMOTICONS:
        # Each key is passed to re.sub as a regular-expression pattern
        text = re.sub(u'(' + emot + ')', "_".join(EMOTICONS[emot].replace(",", "").split()), text)
    return text

# Example
text = "Hello :-) :-)"
convert_emoticons(text)
Output
'Hello Happy_face_smiley Happy_face_smiley'
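The same idea can be sketched without emot, using a small hand-written dictionary. Everything named here (`SAMPLE_EMOTICONS`, `convert_sample_emoticons`, and the descriptions) is hypothetical and only for illustration. The sketch assumes the keys are plain text, so it escapes them with `re.escape` and builds one compiled pattern instead of calling `re.sub` once per emoticon.

```python
import re

# Hypothetical mini-dictionary for illustration; emot ships a far larger one.
SAMPLE_EMOTICONS = {
    ":-)": "happy face",
    ":)": "happy face",
    ":-(": "sad face",
    ":-))": "very happy face",
}

# Escape regex metacharacters and try longer emoticons first,
# so ":-))" is not consumed as ":-)" followed by a stray ")".
_pattern = re.compile(
    "|".join(re.escape(e) for e in sorted(SAMPLE_EMOTICONS, key=len, reverse=True))
)

def convert_sample_emoticons(text):
    """Swap each known emoticon for its underscore-joined description."""
    return _pattern.sub(
        lambda m: "_".join(SAMPLE_EMOTICONS[m.group(0)].split()), text
    )

print(convert_sample_emoticons("Hello :-) and :-))"))
# Hello happy_face and very_happy_face
```

The function above passes the dictionary keys straight to `re.sub`, which relies on them already being valid regular expressions. With plain-text keys, as assumed in this sketch, characters such as `)` and `(` must be escaped first.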
Note:
Whether to remove or convert emojis and emoticons depends purely on the business use case.
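When the use case calls for removal rather than conversion, a regular expression over Unicode ranges is one possible sketch. The ranges below are an assumed, non-exhaustive selection of blocks that hold most emoji, and `remove_emojis` is a name made up for this example. Adjust the ranges to your data.

```python
import re

# Assumed, non-exhaustive set of Unicode blocks that hold most emoji.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # symbols and pictographs
    "\U0001F600-\U0001F64F"  # emoticon faces
    "\U0001F680-\U0001F6FF"  # transport and map symbols
    "\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]+"
)

def remove_emojis(text):
    """Delete emoji characters and collapse the whitespace left behind."""
    stripped = EMOJI_PATTERN.sub("", text)
    return re.sub(r"\s+", " ", stripped).strip()

print(remove_emojis("Great product 😂 will buy again 😎"))
# Great product will buy again
```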
Thanks for reading. Keep learning and stay tuned for more!