A Chatbot With the Least Number of Lines Of Code
Last Updated on September 30, 2022 by Editorial Team
Author(s): Chinmay Bhalerao
Originally published on Towards AI the World’s Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses.
Chatbot and NLP in the simplestΒ form
Natural language processing involves creating meaningful sentences and phrases using Natural Language. Text Realization, Sentence Planning, and Text Planning are all involved. Planning a text includes locating the pertinent information in a knowledge base. Planning a sentence involves selecting the necessary words, creating meaningful phrases, and establishing the sentenceβs mood. The process of translating a sentence plan into text is called text realization. The chatbot is a computer program that attempts to have a conversation with a human user in natural language. A chatbot is one of the biggest applications of Natural Language Processing [NLP].NLP is divided into three basic partsΒ :
- Natural language generation(NLG)
- Natural language understanding(NLU)
- Natural language interaction(NLI)
The aim of this blog is to make a workable chatbot with very few lines of code. Let's start building.
Importing libraries:
I am assuming you already have the python interpreter installed. Importing basic libraries such as pandas and NumPy for data handling and data manipulations. Then I used nltk [Natural language tool kit] for word and sentence data operations.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import nltk
import random
import string
import warnings
warnings.filterwarnings('ignore')
Lets download βpunktβ pakage. It creates a list of sentences from a text by utilizing an unsupervised algorithm to find words that start sentences, collocations, and abbreviations.
nltk.download('punkt',quiet = True)
Dataset forΒ chatbot:
I am creating this chatbot completely dedicated to Asthma. Because there might be many doubts regarding Asthma in people and the dedicated bot is the best way to resolve doubts. This can be treated as a medicalΒ bot.
For this particular chatbot, the dataset is a text file that includes all information related to Asthma. We can use web scrapping here directly but for the sake of simplicity, I already web-scraped data regarding asthma from government and trustest websites which are mentioned below:
Although I took only 3 websites but you can take as many as you can to make predictions related to words more accurate. After scrapping, data will be stored in a single text file and supplied for further processing.
with open("Asthma_data_1.txt",'r',encoding = 'utf8') as f:
Data = f.read()
print(Data)
As preprocessing is very important for data sets to make them purer and state-of-the-art for the processing of models for better results. Natural language processing allows us to do tokenization and lemmatization.
Tokenization: Tokenized words in each sentence into their root word so that they will work as chunks toΒ process.
Lemmatization: Lemmatization works the same as tokenization, but the difference is it converts into a word that has meaning. This means converting them into smaller chunks and then making them useful or understandable words. So we are doing tokenization here. We can directly import sent_tokenize fromΒ nltk.
# Tokenizing the data
from nltk import sent_tokenize
Data = sent_tokenize(Data)
Data
Now it's time to work on the initial conversation. We will assign a few words that we use in daily life to start a conversation. It's like greetings. There are a lot of scopes to improve the following code, like adding as many as greeting you can, then making greetings in lower case to make it easy for processing. but for now, I am just assigning a few words to show theΒ demo.
# This function generates the greeting responce
Corpus = Data
def greeting_responce(Text):
# Lowering the text
Text = Text.lower()
# Bot greetings responce
bot_greeings = ['Namaskar','Hi','Hey there','Namaskar','Hello',]
# User greetings
user_greetings = ['hi','hey','hello','ola','greetings','wassup','namaskar']
for word in Text.split():
if word in user_greetings:
return random.choice(bot_greeings)
Defining the function for indexΒ sort
def index_sort(list_var):
length = len(list_var)
list_index = list(range(0,length))
x = list_var
for i in range(length):
for j in range(length):
if x[list_index[i]] > x[list_index[j]]:
#Swap
Temp = list_index[i]
list_index[i] = list_index[j]
list_index[j] = Temp
return list_index
Now we reached the final stage of creating a chatbot. This is particularly to map the similarity between a word that the user asked for and its most similar word from the database. We are using cosine-similarity here.
#Create bot response
from sklearn.feature_extraction.text import CountVectorizer,TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
def bot_response(user_input):
user_input=user_input.lower()
Corpus.append(user_input)
bot_response=''
cm=TfidfVectorizer().fit_transform(Corpus)
similarity_scores=cosine_similarity(cm[-1],cm)
similarity_scores_list=similarity_scores.flatten()
index=index_sort(similarity_scores_list)
index=index[1:]
response_flag=0
j = 0
for i in range(len(index)):
if similarity_scores_list[index[i]]>0.0:
bot_response=bot_response+''+ Corpus[index[i+1]] + ' ' + Corpus[index[i+2]] + ' ' + Corpus[index[i+3]]
response_flag=1
j=j+1
if j > 2:
break
if response_flag==0:
bot_response=bot_response+''+"I apologize I dont understand"
Corpus.remove(user_input)
return bot_response
InteractionΒ :
Now the final chunk of code is where you can start the chat with the chatbot. You can write basic statements before chatting to know the user what this chatbot isΒ doing.
#Start the chat with Chatbot
print("Doc Bot: Hii....!!!, I am here to help you.")
print(" I am Doctor for short, I will tell about your doubts regarding 'Asthma'. ")
print("NOTE: My suggestions are only the results of google search and only for knowledge purpose.Please seek medical attention and Medication in case of Asthma")
exit_list=['exit','bye','see you later','quit','break','no','thanks','thanks alot']
while(True):
user_input=input("You : ")
if user_input.lower() in exit_list:
print("Doc Bot: Ok, Thanks, Bye...")
print(" Stay Home, Stay Safe!!!")
break
elif user_input.lower() in ['ok','hmm','its ok']:
print('Doc Bot: Can I tell you more about it')
elif user_input.lower() in ['yes','ok']:
print('Doc Bot:'+bot_response(user_input))
else:
if greeting_responce(user_input) != None:
print('Doc Bot:'+ greeting_responce(user_input))
else:
print("Doc Bot:"+bot_response(user_input))
There is a wide variety of words that can be encountered and not known by Bot. This is where data plays an important role. Medical bots are tested very drastically before production as they give very precious information that must be verified by the proper person. The next strategy can be to use a GUI tool or a hosting website to host this bot. That's completely the userβs choice. So with very bare minimum lines of code, we created a simpleΒ chatbot.
How to test theΒ bot?
The way that I followed for testing is creating a google form and sending it to all your near-ones/colleagues/friends. The form will ask them about their doubts regarding Asthma and the weirdest and strangest questions you can test on the bot. After answers, you can understand if more data is required or not! And that will be real-life testing for aΒ bot.
If you find this insightful
If you found this article insightful, follow me on Linkedin and medium. you can also subscribe to get notified when I publish articles. Let's create a community! Thanks for yourΒ support!
A Chatbot With the Least Number of Lines Of Code was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
Join thousands of data leaders on the AI newsletter. Itβs free, we donβt spam, and we never share your email address. Keep up to date with the latest work in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.
Published via Towards AI