Real-Time Stock News Sentiment Analyzer

Author(s): Raviraj Shinde

Natural Language Processing

Investing in the stock market is a great way of tackling inflation. Inflation refers to the rise in the prices of most goods and services of daily or common use, such as food, clothing, housing, recreation, transport, and consumer staples. Put simply, 100 rupees won’t buy as many vada pavs (wadapavs) today as it did last year.

In the pandemic-struck financial year of 2020–2021, a whopping 142 lakh new investors started trading in the stock market.

One key skill required to make good investments in the stock market is the ability to correctly analyze news about the finance and business sectors. Which company is diversifying into new sectors, and which one is showing signs of heading towards bankruptcy? You need to keep yourself updated on every little deal and fallout happening in the market. Financial news can be a little tricky to understand, especially for those who are new to the financial world.

To make the process a little easier, I plan to build a tool that extracts the latest headlines for every stock on the Indian stock market and runs them through a sentiment analyzer trained specifically on financial news. The tool then creates an aggregate sentiment for each stock to help a new investor understand the news better.

Real-Time Stock News Analysis

Requirements:

  1. BeautifulSoup
  2. HuggingFace Transformers library
  3. Urllib
  4. Numpy
  5. Pandas

Getting the Data

Let us begin with web-scraping finance news from a trusted source.

After looking through multiple websites, I found that tickertape lists its stocks alphabetically and uses a tick (a keyword) in its URL to navigate to a particular stock. This kind of interface is quite useful for web scraping.

First, let us create a list of all the ticks from tickertape.

https://medium.com/media/96d44ff10ea473a0e420ee310eb13565/href

I have created a get_ticks() function to extract all the ticks in one go from tickertape/stocks. On that page, every stock’s name links to its screening page, and the link has the format “/stocks/stock-name-tick”.

We need to extract this tick to navigate to each stock seamlessly.

Using BeautifulSoup and urllib, I have extracted both the tick and the name of each stock and stored them in a Pandas DataFrame in alphabetical order.

https://medium.com/media/9d53280bac7af7524ccc4d95b4b4d1b2/href

https://medium.com/media/110e192e4078594bed34813f259d469c/href
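The tick extraction described above can be sketched roughly as follows. This is not the original get_ticks() code: the sample markup, the parse_ticks name, and the assumption that the tick is the part of the link after the last hyphen are all illustrative, and the HTML is parsed from an inline string so the example runs without a network request.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the stock listing page.
SAMPLE_HTML = """
<ul>
  <li><a href="/stocks/zeta-industries-ZETA">Zeta Industries</a></li>
  <li><a href="/stocks/acme-corp-ACME">Acme Corp</a></li>
  <li><a href="/about">About us</a></li>
</ul>
"""


def parse_ticks(html):
    """Return a DataFrame of stock names and ticks, sorted by name."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for link in soup.find_all("a", href=True):
        href = link["href"]
        if not href.startswith("/stocks/"):
            continue  # skip navigation and other non-stock links
        slug = href[len("/stocks/"):]
        # Assumption: the tick is whatever follows the last hyphen in the slug.
        tick = slug.rsplit("-", 1)[-1]
        rows.append({"name": link.get_text(strip=True), "tick": tick})
    frame = pd.DataFrame(rows, columns=["name", "tick"])
    return frame.sort_values("name").reset_index(drop=True)


tick_df = parse_ticks(SAMPLE_HTML)
print(tick_df)
```

On the live site, the HTML string would come from a urllib request instead of the inline sample, and the filter on the link prefix may need adjusting to the actual page structure.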

The output looks like this:

Now on to the news. Here we follow a pattern similar to get_ticks(). News related to each stock is located at “tickertape/stock-tick/news?checklist=basic&ref=stock-overview_overview-sections&type=news”. Again, using BeautifulSoup and urllib, we can extract this news.

https://medium.com/media/530f30521725df5af4275f1c19465db1/href

By running a for loop that replaces the stock-tick part with each element of the tick column of our DataFrame, we can extract the news for each stock and store it in a list called news.

https://medium.com/media/c212436d3f22067ee5f98987c54c405e/href
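As a rough illustration of that loop, the sketch below builds one news URL per tick from the query string quoted earlier and collects the headlines into a list of lists. BASE_URL is a placeholder host, and news_url, collect_news, and the fetch_headlines callable are hypothetical names; the actual scraping of each page is passed in as a function so the loop can be tried without touching the site.

```python
from urllib.parse import urlencode

# Placeholder host: substitute the real site root before use.
BASE_URL = "https://example.com"

# Query parameters taken from the news URL pattern quoted in the text.
NEWS_QUERY = {
    "checklist": "basic",
    "ref": "stock-overview_overview-sections",
    "type": "news",
}


def news_url(tick, base_url=BASE_URL):
    """Build the news page URL for one stock tick."""
    return f"{base_url}/{tick}/news?{urlencode(NEWS_QUERY)}"


def collect_news(ticks, fetch_headlines):
    """Return one list of headlines per tick, in the same order as ticks.

    fetch_headlines is any callable that takes a URL and returns a list of
    headline strings (for example a urllib + BeautifulSoup scraper).
    """
    news = []
    for tick in ticks:
        try:
            headlines = fetch_headlines(news_url(tick))
        except OSError:
            # Network failures (urllib's URLError is an OSError) leave the
            # stock with an empty list instead of stopping the whole run.
            headlines = []
        news.append(headlines)
    return news


# Demo with a stand-in fetcher that makes no network request.
demo_news = collect_news(["ACME", "ZETA"], lambda url: [f"headline from {url}"])
print(demo_news)
```

Keeping the result aligned with the order of the tick column makes it straightforward to attach per-stock values to the DataFrame later.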

Now that we have the news in the format we need, we can move on to the sentiment analysis part.

Sentiment Analysis of Financial News

For this project, I plan to use a pre-trained model known as FinBERT.

FinBERT is a model designed specifically for financial news and texts, and it is based on the BERT architecture. With the HuggingFace Transformers library, FinBERT becomes really easy to use.

For more details on BERT and FinBERT you can refer to my blog:

Financial News Sentiment Analysis using FinBERT

https://medium.com/media/235d1fbaca13018804a619a386b23336/href

Now that FinBERT has been loaded, we can start the analysis.

As FinBERT returns sentiment in a numeric format, we need to map the output to a more human-friendly format. For this, we will create a Python dictionary called labels.

https://medium.com/media/1c39c6ecbf2fd3cd55fcaa2caa78cf78/href

Now the good part. Here we use the tokenizer object to convert each headline into the tokenized input that the model expects, and then pass the tokenizer’s output to the finbert object for sentiment analysis.

https://medium.com/media/4abc8c3bfe734f416975b420565b7590/href
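A minimal sketch of this step, under stated assumptions, might look like the following. The order of the classes in labels is hypothetical and depends on the FinBERT checkpoint being used (the model's config.id2label shows the real order), and load_finbert and classify_headlines are illustrative names rather than the functions from the original code.

```python
import numpy as np

# Hypothetical id-to-label mapping: verify the order against the
# checkpoint's model.config.id2label before relying on it.
labels = {0: "neutral", 1: "positive", 2: "negative"}


def load_finbert(checkpoint):
    """Load a tokenizer and a sequence-classification model by name."""
    # Imported here so the helper below can be used without transformers.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    finbert = AutoModelForSequenceClassification.from_pretrained(checkpoint)
    finbert.eval()  # inference mode: disables dropout
    return tokenizer, finbert


def classify_headlines(headlines, tokenizer, finbert, labels=labels):
    """Return one sentiment label per headline."""
    if not headlines:
        return []  # nothing to classify for stocks without news
    # Pad and truncate so headlines of different lengths form one batch.
    inputs = tokenizer(headlines, padding=True, truncation=True, return_tensors="pt")
    logits = finbert(**inputs).logits.detach().numpy()
    # The highest logit in each row is the predicted class index.
    return [labels[int(i)] for i in np.argmax(logits, axis=1)]
```

With a real checkpoint, the call would be along the lines of `tokenizer, finbert = load_finbert(name)` followed by `classify_headlines(headlines, tokenizer, finbert)`, where name is the model identifier of whichever FinBERT variant is used.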

We do this for every stock in our list. The sentiment of each headline is stored in a list so that we can later group them into an aggregate sentiment for each stock. We store the output of the function in a list of lists called tot_val. Some lists are empty because the website has no news for that stock. For those stocks, I simply return neutral as the sentiment.

Next, we create an aggregate of the sentiments by combining the sentiments of all the headlines for each stock. For this purpose, I simply add 1 to the agg variable if the headline is positive and subtract 1 if the headline is negative. Based on the final value of the agg variable, I assign positive, negative, or neutral to the stock.

https://medium.com/media/b0f42fab537e0ab55a539a25555ee22e/href
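The counting scheme described above can be sketched as follows. The function name get_sent matches the text, but the body is an illustration, and the thresholds (above zero is positive, below zero is negative, exactly zero is neutral) are an assumption about how the final value of agg is interpreted.

```python
def get_sent(sentiments):
    """Collapse a list of headline sentiments into one label for the stock."""
    agg = 0
    for sentiment in sentiments:
        if sentiment == "positive":
            agg += 1
        elif sentiment == "negative":
            agg -= 1
        # neutral headlines leave agg unchanged
    # Assumed thresholds: the sign of agg decides the label.
    if agg > 0:
        return "positive"
    if agg < 0:
        return "negative"
    return "neutral"


# Made-up per-headline sentiments for three stocks.
tot_val = [
    ["positive", "positive", "negative"],
    ["negative", "neutral"],
    [],  # a stock with no news
]
stock_sentiments = [get_sent(values) for values in tot_val]
print(stock_sentiments)  # ['positive', 'negative', 'neutral']
```

An empty list leaves agg at zero, so stocks without news come out neutral, consistent with the handling described earlier.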

Finally, we pass our list of sentiments through get_sent() and obtain the aggregate sentiments. We store these sentiments in a list, which we then assign to a sentiment column of our original tick_df DataFrame.

https://medium.com/media/f9d53a2cc597e75cc6d3539b9f7743ac/href

The output looks like this:

Thus, we have successfully created a sentiment analyzer for the Indian stock market based on financial news.

  • This project is for educational purposes only and should not be used for making investment decisions.
  • Also, this blog is in no way sponsored by tickertape or any other entity mentioned in it.

The entire code is available on my GitHub profile:

GitHub – Raviraj2000/Realtime-Stock-News-Sentiment-Analyzer

Thanks for reading! 😄


Real-Time Stock News Sentiment Analyzer was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
