Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Read by thought-leaders and decision-makers around the world. Phone Number: +1-650-246-9381 Email: [email protected]
228 Park Avenue South New York, NY 10003 United States
Website: Publisher: https://towardsai.net/#publisher Diversity Policy: https://towardsai.net/about Ethics Policy: https://towardsai.net/about Masthead: https://towardsai.net/about
Name: Towards AI Legal Name: Towards AI, Inc. Description: Towards AI is the world's leading artificial intelligence (AI) and technology publication. Founders: Roberto Iriondo, , Job Title: Co-founder and Advisor Works for: Towards AI, Inc. Follow Roberto: X, LinkedIn, GitHub, Google Scholar, Towards AI Profile, Medium, ML@CMU, FreeCodeCamp, Crunchbase, Bloomberg, Roberto Iriondo, Generative AI Lab, Generative AI Lab Denis Piffaretti, Job Title: Co-founder Works for: Towards AI, Inc. Louie Peters, Job Title: Co-founder Works for: Towards AI, Inc. Louis-François Bouchard, Job Title: Co-founder Works for: Towards AI, Inc. Cover:
Towards AI Cover
Logo:
Towards AI Logo
Areas Served: Worldwide Alternate Name: Towards AI, Inc. Alternate Name: Towards AI Co. Alternate Name: towards ai Alternate Name: towardsai Alternate Name: towards.ai Alternate Name: tai Alternate Name: toward ai Alternate Name: toward.ai Alternate Name: Towards AI, Inc. Alternate Name: towardsai.net Alternate Name: pub.towardsai.net
5 stars – based on 497 reviews

Frequently Used, Contextual References

TODO: Remember to copy unique IDs whenever it needs used. i.e., URL: 304b2e42315e

Resources

Take our 85+ lesson From Beginner to Advanced LLM Developer Certification: From choosing a project to deploying a working product this is the most comprehensive and practical LLM course out there!

Publication

Sentiments Analysis of Financial News as an Indicator for Amazon Stock Price
Latest   Machine Learning

Sentiments Analysis of Financial News as an Indicator for Amazon Stock Price

Last Updated on July 21, 2023 by Editorial Team

Author(s): Jay M. Patel

Originally published on Towards AI.

Natural Language Processing

We will perform sentiments analysis using a News API for predicting Amazon (AMZN) stock price using Python

Sentiments analysis of news has become one of the most robust ways of generating buy/sell signals for stocks in all major developed and major emerging markets. The idea is simple, a cumulative sentiments score of the news articles mentioning a companies name, brand, stock ticker, etc. will serve as a great indicator for the next days’ closing stock price.

This only works with stocks that have high trading volumes and active news coverage across major outlets. Generally speaking, constituent stocks of major market indices such as NASDAQ, Dow Jones, or S&P 500 will all satisfy these criteria.

In this article, we will discuss the steps necessary for building such a sentiments analysis pipeline for amazon.com stock.

You will have to select which portions of the page you want to extract. Typically, people want to extract author names, dates, titles, and full text of the news article.

Plotting Amazon Stock Price

Before we get into actually getting sentiments data, let us first get stock market data for Amazon stock price (AMZN). I like to use Alphavantage’s API to get stock market data; it’s free to use but you will have to generate an API key.

import requests 
import json
from dateutil import parser
import requests
import json
test_url = 'https://www.alphavantage.co/query?function=TIME_SERIES_DAILY_ADJUSTED&symbol=AMZN&outputsize=full&apikey=' + API_KEY + '&datatype=csv'r = requests.get(url = test_url)
print("Status Code: ", r.status_code)
print("*"*20)
print(r.headers)
html_response = r.text
with open("amazon_stock.csv", "w") as outfile:
outfile.write(html_response)
from dateutil import parser
datetime_obj = lambda x: parser.parse(x)
df2 = pd.read_csv("amazon_stock.csv", parse_dates=['timestamp'], date_parser=datetime_obj)#df2 = df2[(df2["timestamp"] >= start_date) & (df2["timestamp"] <= end_date)]df2.head(1)
# Output
timestamp open high low close adjusted_close volume dividend_amount split_coefficient
0 2020-09-28 3148.85 3175.04 3117.1684 3174.05 3174.05 4224165 0.0 1.0

Next, we will plot this so that we can visually see the price movements. As you might already know, the stock has had a strong rally due to covid-19 pandemic.

import matplotlib.pyplot as plt 
import seaborn as sns
top = plt.subplot2grid((4,4), (0, 0), rowspan=3, colspan=4) top.plot(df2['timestamp'], df2['close'], label = 'Closing price') plt.title('Amazon Close Price')
plt.legend(loc=2) bottom = plt.subplot2grid((4,4), (3,0), rowspan=1, colspan=4)
bottom.bar(df2["timestamp"], df2["volume"])
plt.title('Amazon Daily Trading Volume')
plt.gcf().set_size_inches(12,8)
plt.subplots_adjust(hspace=0.75)

Scraping News Articles

Getting news articles in bulk in a structured format will take a lot of data wrangling if you intend to scrape more than one news source. A good place to start will be looking over our blog post on scraping CNN news articles and using it as a template to scrape title, content, date, author, etc. from other news websites.

Once you have the raw articles, you will have to load them into a full-text search database like Elasticsearch, Amazon Cloudsearch, Apache Solr, etc. and then search for the stock name or stock ticker of your choice to locate relevant articles.

We will instead use a free public API that lets us query for news articles from all major news sources within the past 7 days. Our latest and historical news API returns a maximum of 100 news articles per API call and you can paginate through the results if needed. You will need to signup with rapidapi to get a key but it’s free (no credit card required) and you get free API usage in basic tier.

import requests
import json
url = "https://latest-news1.p.rapidapi.com/"
payload = json.dumps({"domains": "","topic":"business","q": "amazon","qInTitle": "", "content":"", "page":"1", "author_only":""})
headers = {
'content-type': "application/json",
'x-rapidapi-key': "YOUR_RAPID_API_KEY",
'x-rapidapi-host': "latest-news1.p.rapidapi.com"
}
response = requests.request("POST", url, data=payload, headers=headers)
# after paginating through results
print("Total results found: ", response_dict["totalResults"])
clean_response_list = clean_response_list + response_dict["Article"]
len(clean_response_list)
#Output
fetching page 1
Total results found: 1263

Let’s load it up in pandas dataframe and use it later for sentiments analysis.

import numpy as np
import pandas as pd
df = pd.DataFrame(clean_response_list)
df.head()
#Output
author content description publishedAt source_name source_url title url urlToImage
0 [] Market Overview Tickers Articles Keywords Che... GameStop Corp (NYSE: GME) could be extending i... 2020-09-22 Benzinga benzinga.com Gamestop Corporation (NYSE:GME), Amazon.com, I... https://www.benzinga.com/news/20/09/17595730/c... https://cdn.benzinga.com/files/imagecache/og_i...
1 ["www.ETAuto.com"] Uber investors are pressuring CEO to revamp t... When Khosrowshahi assumed Uber’s helm from ous... 2020-09-22 ETAuto.com indiatimes.com Uber business strategy: Uber investors are pre... https://auto.economictimes.indiatimes.com/news... https://etimg.etb2bimg.com/thumb/msid-78248237...
2 [] Xiaomi recently launched the entry-level Redm... Xiaomi's new entry-level smartphone offers sol... 2020-09-22 News18 news18.com Redmi 9A Goes on Sale Today on Amazon and Mi.c... https://www.news18.com/news/tech/redmi-9a-goes... https://images.news18.com/ibnlive/uploads/2020...
3 [] Bolsonaro faces growing pressure to green Bra... Bolsonaro faces growing pressure to green Braz... 2020-09-22 France 24 france24.com Bolsonaro faces growing pressure to green Braz... https://www.france24.com/en/20200922-bolsonaro... https://s.france24.com/media/display/aac6e8c6-...
4 [] Market Overview Tickers Articles Keywords Che... GameStop Corp (NYSE: GME) could be extending i... 2020-09-22 Benzinga benzinga.com Gamestop Corporation (NYSE:GME), Amazon.com, I... https://www.benzinga.com/news/20/09/17595730/c... https://cdn.benzinga.com/files/imagecache/og_i...

Sentiment Analysis using NLTK’s Vader

There are numerous algorithms out there for calculating sentiments on text; with the best ones typically being the ones trained on neural networks such as BERT on large datasets. You can use it for an accurate analysis, however, for a rule of thumb study like this, we can simply use vader sentiments algorithm packaged with NLTK library. The sentiments score varies from -1 to 1 with a compound column giving the sentiments score.

import nltk
nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer
vader = SentimentIntensityAnalyzer()
scores = df['title'].apply(vader.polarity_scores).tolist()
scores[:5]
#Output
[{'compound': 0.1779, 'neg': 0.0, 'neu': 0.892, 'pos': 0.108},
{'compound': -0.3947, 'neg': 0.143, 'neu': 0.857, 'pos': 0.0},
{'compound': 0.1779, 'neg': 0.0, 'neu': 0.884, 'pos': 0.116},
{'compound': -0.128, 'neg': 0.185, 'neu': 0.672, 'pos': 0.143},
{'compound': 0.1779, 'neg': 0.0, 'neu': 0.892, 'pos': 0.108}]

Let us join this to our initially created dataframe.

scores_df = pd.DataFrame(scores)
# join it with main dataframe
df = df.join(scores_df, rsuffix='_right')
# Convert the date column from string to datetime
df['date'] = pd.to_datetime(df.publishedAt).dt.date
# Group by date and ticker columns from scored_news and calculate the mean
mean_scores = df.groupby(['date']).mean()
mean_scores
# Output
compound neg neu pos
date
2020-09-22 0.099078 0.079065 0.787065 0.133871
2020-09-23 0.084607 0.064409 0.812341 0.123260
2020-09-24 0.150295 0.045265 0.822676 0.132071
2020-09-25 0.061525 0.063262 0.838885 0.097852
2020-09-26 0.210888 0.044955 0.790333 0.164773
2020-09-27 0.257901 0.043290 0.751855 0.204855
2020-09-28 0.292509 0.020722 0.783089 0.196189

All we need to do now is plot this sentiments score as a bar chart and compare it with stock prices.

import matplotlib.pyplot as pltplt.rcParams['figure.figsize'] = [10, 6]mean_scores = mean_scores.unstack()# Get the cross-section of compound in the 'columns' axis
mean_scores = mean_scores.xs('compound').transpose()
# Plot a bar chart with pandas
mean_scores.plot(kind = 'bar')
plt.grid()
# Plotting a subset of amazon stock price for 7 days
# code block 1.2 (Cont.)
# Plotting stock and volumetop = plt.subplot2grid((4,4), (0, 0), rowspan=3, colspan=4)
top.plot(df2['timestamp'], df2['close'], label = 'Closing price')
plt.title('Amazon Close Price')
plt.legend(loc=2)
bottom = plt.subplot2grid((4,4), (3,0), rowspan=1, colspan=4)
bottom.bar(df2["timestamp"], df2["volume"])
plt.title('Amazon Daily Trading Volume')
plt.gcf().set_size_inches(12,8)
plt.subplots_adjust(hspace=0.75)

We can see quite immediately that positive news sentiments flow after 23 September led to the overall price increase. There are still a lot of limitations with this kind of simplistic analysis but it does show that sentiments analysis of news works pretty well.

The major limitations include:

  • When combining different news sources, we have to perform weighted averages on sentiments score by taking into account the overall domain authority and a total number of audiences of each source. For example, a negative article in the Wall Street Journal should be taken at a higher sentiments value than a negative article in a seeking alpha blog post.
  • We should always consider the stock price movement of a basket of securities in a particular sector to account for any sectoral or macroeconomic changes especially in these uncertain times of pandemic-induced market volatilities.

Originally published at http://www.specrom.com.

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.

Published via Towards AI

Feedback ↓