Interactive COVID-19 Dashboard With Chatbot and Prediction Capabilities
Last Updated on January 11, 2021 by Editorial Team
Author(s): Daksh Trehan
Data Visualization
A practical way to show off machine learning skills and help people around the globe.
COVID-19 is arguably the defining event of the decade, and the vague or misleading information circulating about it is a real concern. In response, many data visualization researchers and practitioners have built widely used tools to explain the situation to the public.
Joining them, we've designed a live, interactive COVID-19 dashboard that includes dynamic visualizations of frequently updated worldwide data, a chatbot that answers newcomers' questions, and a predictor that uses machine learning techniques to forecast active cases, recovered cases, and deaths, both worldwide and for individual countries.
Table of Contents:
- Data Procurement and Preparation
- Dynamic Deployment of Stats
- Chatbot
- Predicting the Outbreak
- Link to Resources
Data Procurement and Preparation
The dataset used for the prediction and modeling tasks comes from the "2019 Novel Corona Virus Visual Dashboard" repository managed by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University and supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL).
The fetched data is organized by parameters such as State, Country, Latitude, Longitude, and Date. A separate dataset is used to track Confirmed, Death, and Recovered cases.
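As a rough illustration of this preparation step, the sketch below reshapes a wide table (one column per date) into per-country daily totals. The column names, the layout, and the three sample rows are assumptions made for illustration; they are not taken from the repository or from the dashboard's code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in a wide layout: one row per region, one column per date.
SAMPLE_CSV = """State,Country,Lat,Long,1/22/20,1/23/20,1/24/20
Hubei,China,30.97,112.27,444,444,549
Beijing,China,40.18,116.41,14,22,36
,Italy,41.87,12.56,0,0,0
"""

# Columns that identify a region rather than hold a daily count (assumed names).
ID_COLUMNS = {"State", "Country", "Lat", "Long"}


def country_totals(csv_text):
    """Return {country: {date: total}} summed over all states of a country."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue
            totals[row["Country"]][column] += int(value)
    return {country: dict(by_date) for country, by_date in totals.items()}


confirmed = country_totals(SAMPLE_CSV)
print(confirmed["China"])
```

Summing state-level rows into one series per country gives the kind of per-country time series that a country-level predictor would need as input.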
Dynamic Deployment of Stats
The data fetched from the above-mentioned repository is cleaned and made usable with several data cleaning techniques.
To present the stats in a dynamic and lively way, the dashboard uses JavaScript, and the application is hosted on the web using Heroku.
Salient Features of Dashboard:
Chatbot
To make our dashboard more useful to newcomers and people with limited knowledge of the coronavirus, we've added a chatbot that can answer questions about the pandemic.
The data is collected from the Frequently Asked Questions section of the official website of the Centers for Disease Control and Prevention (CDC) using the requests and BeautifulSoup libraries. It comprises 70 questions about general awareness of the 2019 novel coronavirus. The questions and their answers are collected separately and dumped into JSON files, which are then combined into a single data frame.
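The aggregation step might look roughly like the sketch below, which pairs questions with answers loaded from two JSON documents. The JSON layout (two parallel lists), the placeholder strings, and the helper name `build_faq_records` are all assumptions for illustration; the actual file structure used by the dashboard is not described here.

```python
import json

# Hypothetical contents of the two JSON dumps (placeholder text, not CDC content).
questions_json = '["What is COVID-19?", "How does the virus spread?"]'
answers_json = '["Placeholder answer one.", "Placeholder answer two."]'


def build_faq_records(questions_text, answers_text):
    """Pair each question with its answer, refusing mismatched lengths."""
    questions = json.loads(questions_text)
    answers = json.loads(answers_text)
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    return [{"question": q, "answer": a} for q, a in zip(questions, answers)]


faq = build_faq_records(questions_json, answers_json)
print(faq[0])
```

A list of records like this can be handed directly to a data frame constructor, with one row per question-answer pair.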
For the chatbot itself, we used a bag-of-words model with TF-IDF vectorization. Textual data can't be fed directly to a model; it first has to be converted into feature vectors, and this is where TF-IDF helps. TF-IDF stands for "Term Frequency-Inverse Document Frequency", and it assigns each word a score built from those two components. Words like "the" and "is" may appear very often in our documents, but they tell us little about any one of them, so they shouldn't dominate the encoded vector. The goal of TF-IDF is to give higher scores to the more informative words: "Term Frequency (TF)" measures how often a word occurs in a document, whereas "Inverse Document Frequency (IDF)" scales down the score of words that occur in many documents.
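To make the scoring concrete, here is a minimal pure-Python sketch of one common TF-IDF formulation (term count divided by document length, multiplied by the log of total documents over documents containing the word). It is not the dashboard's code; library implementations such as scikit-learn's `TfidfVectorizer` use a smoothed and normalized variant, so their numbers will differ. The three questions are made-up examples.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [w.strip(".,?!") for w in text.lower().split() if w.strip(".,?!")]


def tfidf_vectors(documents):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({word for doc in tokenized for word in doc})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for doc in tokenized for word in set(doc))
    idf = {word: math.log(n_docs / doc_freq[word]) for word in vocabulary}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[w] / len(doc) * idf[w] for w in vocabulary])
    return vocabulary, vectors


# Made-up questions standing in for the FAQ corpus.
questions = [
    "What is the coronavirus?",
    "How does the coronavirus spread?",
    "What are the symptoms?",
]
vocab, vectors = tfidf_vectors(questions)
weights = dict(zip(vocab, vectors[1]))
print(weights["the"], weights["spread"])
```

In this toy corpus "the" occurs in every question, so its IDF is log(3/3) = 0 and it contributes nothing, while "spread" occurs in only one question and receives the largest weight in that question's vector.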
Users are unlikely to phrase a question exactly as it is stored in our corpus; we can expect the meaning to match, but not the wording. To handle this, we use cosine similarity, which measures how similar two texts are regardless of their length. It is the cosine of the angle between the two texts' vectors in a multidimensional space, so the stored question whose vector points in the direction closest to the user's query is treated as the best match.
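The matching step can be sketched as follows. The vectors are toy values standing in for TF-IDF encodings, and the helper name `best_match` is hypothetical; this is an illustration of cosine-similarity retrieval in general, not the dashboard's implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query_vector, corpus_vectors):
    """Index and score of the corpus vector most similar to the query."""
    scores = [cosine_similarity(query_vector, v) for v in corpus_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy feature vectors standing in for TF-IDF encodings of stored questions.
corpus = [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0], [2.0, 0.0, 4.5]]
query = [10.0, 0.0, 20.0]  # same direction as corpus[0], but ten times longer

index, score = best_match(query, corpus)
print(index, round(score, 3))
```

The query is ten times longer than the first stored vector but points in the same direction, so it still scores highest. This is the sense in which cosine similarity ignores text size.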
Predicting the Outbreak
Another salient feature of our dashboard is the prediction of active, recovered, and death cases. The case counts are continuous values, so the task is well suited to regression analysis, which predicts a continuous dependent variable from one or more independent variables. The relationship between the dependent and independent variables is captured by the coefficients of the regression equation.
Linear Regression is a supervised learning method, so it has to be trained on past data. We collected data from "1 Jan 2020" onward and supplied the actual values so that the model could fit a hyperplane and predict future values for Active, Recovered, and Death cases separately.
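A minimal sketch of the idea, assuming a single feature (the day index) so that the fitted hyperplane reduces to a straight line. The counts are invented for illustration and the function name `fit_line` is hypothetical; the dashboard's actual features and training code are not shown here.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical cumulative counts for five consecutive days (not real data).
days = [0, 1, 2, 3, 4]
cases = [100, 150, 200, 250, 300]

slope, intercept = fit_line(days, cases)
forecast_day_7 = slope * 7 + intercept  # extrapolate three days past the data
print(slope, intercept, forecast_day_7)
```

Here the slope is the regression coefficient mentioned above: the estimated change in cases per additional day. A straight line extrapolates poorly when growth is not linear, which is one reason to compare it with other regressors.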
SVM Regression followed the same procedure. SVM is primarily used as a classifier, but its regression variant fits a function so that as many data points as possible fall within a margin around it, instead of using the margin to separate classes, which makes it usable for predictive modeling.
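What sets support vector regression apart from least squares is its epsilon-insensitive loss: predictions that land within a margin of width epsilon around the true value are not penalized at all. The sketch below shows only that loss function, with invented numbers; it is not a full SVR solver, and in practice a library implementation (for example scikit-learn's `SVR`) would do the fitting.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Mean SVR-style loss: errors smaller than epsilon cost nothing."""
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


actual = [100.0, 150.0, 200.0, 250.0]     # hypothetical observed counts
predicted = [104.0, 149.0, 215.0, 250.0]  # hypothetical model outputs

loss_with_margin = epsilon_insensitive_loss(actual, predicted, epsilon=5.0)
loss_without_margin = epsilon_insensitive_loss(actual, predicted, epsilon=0.0)
print(loss_with_margin, loss_without_margin)
```

With epsilon set to 5, only the third prediction (off by 15) falls outside the margin and contributes to the loss; with epsilon set to 0 the function reduces to the mean absolute error.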
Predicting Recovered Worldwide Cases:
Predicting worldwide Death Cases:
Predicting Death cases for the US:
Link to Resources
Link to the repository: dakshtrehan/Interactive-Covid-19-Dashboard
Link to Dashboard: http://interactivecovid19dashboard.herokuapp.com/
Link to Published Paper: COVID 19 Trend Analysis using Machine Learning Techniques (IJSER Journal Publication)
Link to Portfolio: www.dakshtrehan.com
Feel free to connect:
LinkedIn ~ https://www.linkedin.com/in/dakshtrehan
Follow for further Machine Learning/Deep Learning blogs.
Medium ~ https://medium.com/@dakshtrehan
Want to learn more?
Are You Ready to Worship AI Gods?
Detecting COVID-19 Using Deep Learning
The Inescapable AI Algorithm: TikTok
GPT-3: AI overruling started?
Tinder+AI: A perfect Matchmaking?
An insider's guide to Cartoonization using Machine Learning
Reinforcing the Science Behind Reinforcement Learning
Decoding science behind Generative Adversarial Networks
Understanding LSTM's and GRU's
Recurrent Neural Network for Dummies
Convolution Neural Network for Dummies
Cheers
Interactive COVID-19 Dashboard With Chatbot and Prediction Capabilities was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.