Real-Time Stock News Sentiment Analyzer
Last Updated on May 24, 2022 by Editorial Team
Author(s): Raviraj Shinde
Investing in the stock market is a great way to tackle inflation. Inflation refers to the rise in the prices of most goods and services of daily or common use, such as food, clothing, housing, recreation, transport, consumer staples, etc. Basically, with 100 rupees you won’t be able to buy as many vada pavs (wadapavs) as you could last year.
In the pandemic-struck financial year of 2020–2021, a whopping 142 lakh (14.2 million) new investors started trading in the stock market.
One key skill required to make good investments in the stock market is being able to correctly analyze news related to the finance and business sectors. Which company is diversifying into new sectors, and which company is showing signs of heading towards bankruptcy? You need to keep yourself updated with every little deal and fallout happening in the market. Financial news can be a little tricky to understand, especially for those who are new to the financial world.
To make the process a little easier, I plan to build a tool that extracts the latest headlines for every stock on the Indian stock market and runs them through a sentiment analyzer specially trained on financial news. The tool then creates an aggregate sentiment for each stock, to help a newbie stock investor understand the news better.
Real-Time Stock News Analysis
- HuggingFace Transformers library
Getting the Data
Let us begin with web-scraping finance news from a trusted source.
After looking through multiple websites, I found that tickertape lists its stocks alphabetically and uses a tick (keyword) in its URL to navigate to a particular stock. This kind of URL scheme proves quite useful for web-scraping data.
First, let us create a list of all the ticks from tickertape.
I have created a get_ticks() function to extract all the ticks in one go from tickertape/stocks. Here every stock’s name links to its screening page. The link is of the format “/stocks/stock-name-tick”.
We need to extract this tick to navigate to each stock seamlessly.
Using BeautifulSoup and urllib I have extracted both the tick and name of the stock and stored them in a Pandas DataFrame in alphabetical order.
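The link-parsing step can be sketched roughly as follows. This is a minimal illustration rather than the exact get_ticks() from the project: it uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra installs, it works on an HTML string instead of downloading the page, and it assumes the tick is the part of the link after the last hyphen. The stock names and ticks in sample_html are made up.

```python
from html.parser import HTMLParser


class StockLinkParser(HTMLParser):
    """Collect (name, tick) pairs from anchors whose href starts with /stocks/."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished (name, tick) pairs
        self._tick = None   # tick of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href") or ""
        if tag == "a" and href.startswith("/stocks/"):
            # Assumption: links look like /stocks/stock-name-TICK,
            # so the tick is whatever follows the last hyphen of the slug.
            slug = href.rstrip("/").rsplit("/", 1)[-1]
            self._tick = slug.rsplit("-", 1)[-1]
            self._text = []

    def handle_data(self, data):
        if self._tick is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._tick is not None:
            name = "".join(self._text).strip()
            self.rows.append((name, self._tick))
            self._tick = None


def get_ticks(html):
    """Return (name, tick) pairs sorted alphabetically by stock name."""
    parser = StockLinkParser()
    parser.feed(html)
    return sorted(parser.rows)


# Made-up sample markup standing in for the downloaded stocks page.
sample_html = """
<ul>
  <li><a href="/stocks/zeta-motors-ZETA">Zeta Motors</a></li>
  <li><a href="/stocks/acme-industries-ACME">Acme Industries</a></li>
  <li><a href="/about">About</a></li>
</ul>
"""
tick_rows = get_ticks(sample_html)
print(tick_rows)
```

The resulting list of tuples can be handed to pandas.DataFrame(tick_rows, columns=["name", "tick"]) to get the DataFrame described above.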
The output looks like this:
Now, moving on to the news part. For this, we follow a pattern similar to get_ticks(). News related to each stock is located at “tickertape/stock-tick/news?checklist=basic&ref=stock-overview_overview-sections&type=news”. Again, by using BeautifulSoup and urllib, we can extract this news.
We run a for loop that replaces the stock-tick part of the URL with each element of the tick column of our DataFrame. This lets us extract the news for each stock and store it in a list called news.
Now that we have the news in the format we need, we can move on to the sentiment analysis part.
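As a rough sketch of that loop, the snippet below builds the news URL for each tick and collects one list of headlines per stock. The base address is a placeholder, since the full site root is not spelled out here, and fetch_headlines is a hypothetical callable standing in for the download-and-parse step, so the example runs offline.

```python
from urllib.parse import urlencode

# Placeholder base address; swap in the real site root when scraping.
BASE_URL = "https://example.com"

# Query parameters taken from the news URL pattern described above.
NEWS_QUERY = {
    "checklist": "basic",
    "ref": "stock-overview_overview-sections",
    "type": "news",
}


def news_url(tick, base_url=BASE_URL):
    """Build the news-page URL for one tick."""
    return f"{base_url}/{tick}/news?{urlencode(NEWS_QUERY)}"


def collect_news(ticks, fetch_headlines):
    """Return one list of headlines per tick, in the same order as ticks."""
    news = []
    for tick in ticks:
        news.append(fetch_headlines(news_url(tick)))
    return news


def fake_fetch(url):
    # Hypothetical stand-in for the real download-and-parse step.
    return [f"headline from {url}"] if "/ACME/" in url else []


news = collect_news(["ACME", "ZETA"], fake_fetch)
print(news)
```

Passing the fetch step in as a function keeps the URL logic easy to check on its own, and makes it simple to add delays or error handling around the real requests later.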
Sentiment Analysis of Financial News
For this project, I plan to use a pre-trained model known as FinBERT.
FinBERT is a model specially designed to work on financial news and texts. It is based on the BERT architecture. With the HuggingFace Transformers library, FinBERT becomes really easy to use.
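Loading the model with Transformers might look like the sketch below. The checkpoint name is an assumption on my part: several FinBERT checkpoints exist on the HuggingFace Hub, and "yiyanghkust/finbert-tone" is used here only as one example. The import sits inside the function so that the snippet can be read and imported even where transformers is not installed.

```python
# Assumption: this particular FinBERT checkpoint. Other FinBERT checkpoints
# load the same way but may order their output labels differently.
MODEL_NAME = "yiyanghkust/finbert-tone"


def load_finbert(model_name=MODEL_NAME):
    """Download (or load from the local cache) the tokenizer and classifier."""
    # Imported lazily so this file can be imported without transformers.
    from transformers import BertForSequenceClassification, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    finbert = BertForSequenceClassification.from_pretrained(model_name)
    finbert.eval()  # evaluation mode: turns off dropout
    return tokenizer, finbert
```

Calling tokenizer, finbert = load_finbert() downloads the weights on first use, so it needs an internet connection the first time it runs.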
For more details on BERT and FinBERT you can refer to my blog:
Now that FinBERT has been loaded we start the process of analysis.
As FinBERT returns sentiment in a numeric format, we need to map the output to a more human-friendly format. For this, we will create a Python dictionary called labels.
Now the good part. Here we use the tokenizer object to convert each headline into the token IDs the model expects, and then pass the output from the tokenizer to the finbert object for sentiment analysis.
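A small sketch of that mapping is shown below. The index order is an assumption that depends on the checkpoint you load, so it is worth confirming against finbert.config.id2label before relying on it. The helper label_from_scores is a hypothetical name used only for this illustration.

```python
# Assumed index-to-label order; confirm it against finbert.config.id2label
# for the checkpoint you actually load.
labels = {0: "neutral", 1: "positive", 2: "negative"}


def label_from_scores(scores):
    """Return the label whose score is highest (an argmax over one row)."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best_index]


# One row of made-up scores: index 1 is the largest, so this is "positive".
example = label_from_scores([0.1, 2.3, -0.7])
print(example)
```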
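The tokenize-then-classify step could be sketched like this. It is a minimal version, assuming tokenizer and finbert are the objects returned by the Transformers library and that labels maps class indices to names; classify_headlines is a hypothetical helper name.

```python
def classify_headlines(headlines, tokenizer, finbert, labels):
    """Return one sentiment label per headline."""
    # Pad and truncate so headlines of different lengths fit in one batch,
    # and ask for PyTorch tensors ("pt") because the model expects them.
    inputs = tokenizer(headlines, return_tensors="pt", padding=True, truncation=True)

    # The model returns one row of scores (logits) per headline.
    # For large batches, wrapping this call in torch.no_grad() saves memory.
    logits = finbert(**inputs).logits

    # Pick the index of the highest score in each row, then map it to a label.
    predicted = logits.argmax(dim=-1).tolist()
    return [labels[i] for i in predicted]
```

The tokenizer needs at least one headline, so stocks with an empty news list should be skipped before calling this function.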
We do this for every stock in our list. The sentiment for each headline is stored in a list so that we can later group them to create an aggregate sentiment for each stock. We store the output from the function in a list of lists called tot_val. Some lists are empty because the website has no news for that stock. For those stocks, I just return neutral as the sentiment.
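One way to build tot_val, including the neutral fallback for stocks without news, is sketched below. The classify argument is whatever function turns a list of headlines into a list of labels. A toy keyword-based stand-in is used here instead of FinBERT so the loop can be run on its own, and the headlines are made up.

```python
def sentiments_per_stock(news, classify):
    """Build tot_val: one list of headline sentiments per stock."""
    tot_val = []
    for headlines in news:
        if headlines:
            tot_val.append(classify(headlines))
        else:
            # No news found for this stock: fall back to neutral.
            tot_val.append(["neutral"])
    return tot_val


def keyword_classifier(headlines):
    # Toy stand-in for FinBERT, used only to exercise the loop.
    return ["positive" if "profit" in h.lower() else "negative" for h in headlines]


tot_val = sentiments_per_stock(
    [["Profit doubles", "Plant shut down"], []],
    keyword_classifier,
)
print(tot_val)
```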
Next, we create an aggregate of the sentiments by combining the sentiments of all the headlines for each stock. For this purpose, I simply add +1 to the agg variable if the headline is positive and -1 if the headline is negative. Based on the final value of the agg variable, I assign positive, negative, or neutral to the stock.
Finally, we pass our list of sentiments through get_sent() and obtain the aggregate sentiments. We store these sentiments in a list, which we then assign to a sentiment column of our original tick_df DataFrame.
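The aggregation rule can be sketched as below. The exact cut-offs are an assumption: this version treats any total above zero as positive, any total below zero as negative, and exactly zero as neutral, and it takes the headline sentiments of a single stock as input.

```python
def get_sent(sentiments):
    """Collapse one stock's headline sentiments into a single label."""
    agg = 0
    for sentiment in sentiments:
        if sentiment == "positive":
            agg += 1
        elif sentiment == "negative":
            agg -= 1
        # Neutral headlines leave agg unchanged.

    # Assumed thresholds: the sign of agg decides the overall label.
    if agg > 0:
        return "positive"
    if agg < 0:
        return "negative"
    return "neutral"


print(get_sent(["positive", "negative", "positive"]))
```

Applied to a list of lists, this becomes [get_sent(s) for s in tot_val]. A simple count like this weighs every headline equally, so one strongly worded headline counts the same as a mild one.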
The output looks like this:
Thus we have successfully managed to create a sentiment analyzer for the Indian stock market based on financial news.
- This project is for educational purposes only and should not be used for making investment decisions.
- Also, this blog is in no way a sponsorship for tickertape or any other entity mentioned in the blog.
The entire code is available on my git profile:
Thanks for reading! 😄
Published via Towards AI