Analyzing The Presidential Debates
Data Science

Last Updated on January 6, 2023 by Editorial Team

Author(s): Lawrence Alaso Krukrubo

Exploring Sentiments, Key-Phrase-Extraction, and Inferences …

Image from rev.com

2020 has been one ‘hell-of-a-year’, and we’re in about the eleventh month.

It’s that time again for Americans to take to the polls.

If you’ve lived long enough, you recognize the patterns…

Each political side shades the other, scandals and leaks may pop up, shortcomings are magnified, critics make the news, promises are doled out ‘rather convincingly’, and there’s an overwhelming sense of ‘nationality and togetherness’ touted by both sides…

But for the most part, we’re not buying the BS! And often, we simply choose the ‘lesser of the two evils’, because candidly the one is not significantly better than the other.

So today, I’m going to analyze the presidential debates of President Trump and Vice-President Biden…

Disclaimer:

The entire analysis is done by the Author, using scientific methods that do not assume faultlessness. This is a personal project devoid of any political affiliations, sentiments or undertones. The inferences expressed from this scientific process are entirely the Author’s, based on the data.

Intro:

Trump and Biden faced off twice.

  1. The first debate was on September 29, 2020. It was moderated by Chris Wallace of Fox News.
  2. The second debate was originally scheduled for October 15th, but was cancelled due to Trump’s bout of COVID-19, and held a week later, after his ‘rather-theatrical-and-spectacular-recovery’. This debate was moderated by Kristen Welker of NBC News.

Trump offs mask and says don’t be afraid of the virus | image from The New Yorker

1. The Data:

After watching both debates, as a Data Professional, I got really curious about what I could learn from analyzing the responses of these two contestants.

It’s possible I may find something interesting from digging a little deeper into the way they answered questions bordering on the lives of millions of Americans…

That was my only motivation, ‘Curiosity’, so I set out looking for the data.

Luckily, I stumbled on rev.com, which had both debates up in full, so I employed my data skills and scraped them off the website into a Jupyter notebook. That was the easy part. The hard part was preparing the data for each specific format required by the different libraries and tools for my analysis.

I scraped the website with the method I defined below…
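
The original notebook cell is not reproduced here. As a rough sketch of the parsing step only, the snippet below turns already-downloaded transcript text into (speaker, utterance) pairs. It assumes each turn starts with a line shaped like "Speaker Name: (MM:SS)"; that layout, the function name `parse_transcript`, and the placeholder sample text are all assumptions for illustration, not the author's code or real debate quotes.

```python
import re
from typing import List, Tuple

# Assumed header format for each turn: "Speaker Name: (MM:SS)" or "(HH:MM:SS)".
SPEAKER_LINE = re.compile(
    r"^(?P<speaker>[^:\n]+):\s*\((?P<time>\d{1,2}:\d{2}(?::\d{2})?)\)\s*$"
)

def parse_transcript(raw_text: str) -> List[Tuple[str, str]]:
    """Return (speaker, utterance) pairs in speaking order."""
    turns = []
    speaker = None
    buffer = []
    for line in raw_text.splitlines():
        line = line.strip()
        match = SPEAKER_LINE.match(line)
        if match:
            # A new header closes the previous speaker's turn.
            if speaker is not None and buffer:
                turns.append((speaker, " ".join(buffer)))
            speaker = match.group("speaker").strip()
            buffer = []
        elif line and speaker is not None:
            buffer.append(line)
    if speaker is not None and buffer:
        turns.append((speaker, " ".join(buffer)))
    return turns

# Placeholder text, not taken from the real transcripts.
sample = """Moderator: (00:10)
Opening question.

Speaker A: (00:15)
First line.
Second line.

Speaker B: (00:31)
Reply."""

turns = parse_transcript(sample)
for name, text in turns:
    print(name, "->", text)
```

Keeping the turns as an ordered list makes it easy to count responses per speaker or to join one speaker's turns into a single corpus later.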

2. Gentlemen You Have Two Minutes:

If you watched the first debate, you’d have noticed it was a hard task for Chris to keep both men within the 2-minute limit. Trump made it particularly hard, and quite often, there were exchanges between Trump and Biden.

Let’s look at what the data says…

Trump dominates the debates…

Of the total responses during the first debate, Trump had 56% while Biden had 44%. It got worse for Joe during the second debate, as Trump’s share of the responses rose to 60%, leaving 40% to Joe.

Trump spoke 314 times in debate one and 193 times in debate two.
Biden spoke 250 times in debate one and 131 times in debate two.
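
The percentages follow directly from these counts. A minimal sketch of the arithmetic, assuming only the two candidates' turns are counted (moderator turns excluded) and using a hypothetical helper `response_share`:

```python
from collections import Counter

def response_share(speakers):
    """Map each speaker to (number of turns, rounded percentage of all turns)."""
    counts = Counter(speakers)
    total = sum(counts.values())
    return {name: (n, round(100 * n / total)) for name, n in counts.items()}

# Turn counts as reported in the article.
debate_one = ["Trump"] * 314 + ["Biden"] * 250
debate_two = ["Trump"] * 193 + ["Biden"] * 131

print(response_share(debate_one))  # {'Trump': (314, 56), 'Biden': (250, 44)}
print(response_share(debate_two))  # {'Trump': (193, 60), 'Biden': (131, 40)}
```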

Note to Self: Trump may not be the brightest, but he sure gets his voice heard…

2. Lexical-Diversity:

This simply means the cardinality or variety of words used in a conversation or document. In this case, it checks the number of unique words as a percentage of total words spoken by Trump and Biden.

The data shows that Joe Biden is more creative with his words. He’s lexically richer than Donald Trump, even though he consistently speaks fewer words than Trump.

Biden speaks 7,936 total words with 2,020 unique words and a lexical-diversity score of 25%
Trump speaks 9,209 total words with 1,894 unique words and a lexical-diversity score of 21%
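
As a small illustration of the definition, lexical diversity is the number of unique tokens divided by the total number of tokens. The helper name `lexical_diversity` is hypothetical; the two ratios at the bottom simply re-derive the scores from the totals reported above.

```python
def lexical_diversity(tokens):
    """Unique tokens divided by total tokens; 0.0 for an empty list."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

# Toy example: 3 unique words out of 4.
toy_score = lexical_diversity("a b a c".split())

# Re-deriving the reported scores from the reported word counts.
biden_score = 2020 / 7936
trump_score = 1894 / 9209
print(f"Biden: {biden_score:.0%}, Trump: {trump_score:.0%}")  # Biden: 25%, Trump: 21%
```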

Note to self: Biden may be few on words, but he’s got a heart of creativity…

Biden gets pretty creative with his words, can he match em with actions? | image_credit

3. TFIDF:

Term-Frequency-Inverse-Document-Frequency (TF-IDF) is one of the most widely used term-weighting schemes in text processing. It tells us the importance of certain words to a document in comparison to other documents.

Simply put, TF-IDF shows the relative importance of a word or words to a document, given a collection of documents.

So, in this case, I choose to lemmatize the words of Trump and Biden, rather than stemming them…

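from nltk.stem import WordNetLemmatizer

def lemmatize_words(word_list):
    # Reduce each word to its dictionary form (needs the NLTK WordNet corpus)
    lemma = WordNetLemmatizer()
    lemmatized = [lemma.lemmatize(i) for i in word_list]

    return lemmatized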

Then I tokenize the words, remove punctuation, and remove stopwords…
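
A minimal sketch of that cleaning step using only the standard library. The tiny stopword set and the function name `preprocess` are assumptions for illustration; a real run would more likely use a full stopword list such as the one shipped with NLTK.

```python
import re

# Deliberately small, hypothetical stopword list for the example.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "is", "it", "that", "we", "i", "you"}

def preprocess(text, stopwords=STOPWORDS):
    """Lowercase, keep alphabetic tokens (with inner apostrophes), drop stopwords."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return [tok for tok in tokens if tok not in stopwords]

cleaned = preprocess("We have to create jobs, and it's the economy!")
print(cleaned)  # ['have', 'create', 'jobs', "it's", 'economy']
```

Matching on letters only drops punctuation and digits in the same pass as tokenization.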

Then I build a simple TFIDF class to compute the TFIDF scores for both men.
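
The author's class is not shown, so the following is only one way such a class could look. It treats each speaker's tokens as one document and uses a smoothed inverse document frequency, log((1 + N) / (1 + df)) + 1, so that words used by both speakers still get a non-zero weight. The class name `SimpleTfidf`, its methods, and the smoothing choice are assumptions.

```python
import math
from collections import Counter

class SimpleTfidf:
    """TF-IDF over a small set of named token lists (one per speaker)."""

    def __init__(self, documents):
        # documents: dict mapping a name to a list of tokens
        self.term_counts = {name: Counter(tokens) for name, tokens in documents.items()}
        self.n_docs = len(documents)
        self.doc_freq = Counter()
        for counts in self.term_counts.values():
            # Each document contributes 1 per distinct term it contains.
            self.doc_freq.update(counts.keys())

    def idf(self, term):
        # Smoothed idf; equals 1.0 when the term appears in every document.
        return math.log((1 + self.n_docs) / (1 + self.doc_freq[term])) + 1.0

    def scores(self, name):
        counts = self.term_counts[name]
        total = sum(counts.values())
        return {term: (n / total) * self.idf(term) for term, n in counts.items()}

    def top_terms(self, name, k=10):
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(self.scores(name).items(), key=lambda kv: (-kv[1], kv[0]))
        return [term for term, _ in ranked[:k]]

# Toy documents, not the debate data.
model = SimpleTfidf({"a": ["jobs", "jobs", "tax"], "b": ["tax", "china"]})
print(model.top_terms("a", k=1), model.top_terms("b", k=1))
```

With only two documents, an unsmoothed idf would zero out every shared word, which is why the smoothed variant is used in this sketch.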

So let’s see the words peculiar to Donald Trump using a word-cloud…

Donald Trump’s top-10 TFIDF words

It’s pretty interesting, or “uninteresting”, that Trump’s top-10 TFIDF words include ‘ago’, ‘built’, and ‘Chris’, the moderator’s name, which fits with how hard he made the task for Chris. Others are ‘disaster’, ‘called’, ‘cage’, ‘nobody’…

Let’s see for Joe Biden…

Joe Biden’s top-10 TFIDF words

With words like ‘create’, ‘federal’, ‘serious’, ‘Americans’, ‘folk’, ‘situation’… it appears Biden put more effort into his debate than Team-Trump, in terms of structure and theme.

4. Some Questions Asked:

Image from Pixabay

We have to commend Chris Wallace and Kristen Welker for being great moderators during the debates.

In the first debate, Chris asked some interesting questions, some of which bordered on…

  • Supreme Court
  • Obama-Care
  • Economy
  • Race / Justice
  • Law Enforcement
  • Election Integrity
  • COVID

And during the second debate, Kristen held it down with questions on…

  • COVID
  • National-Security
  • America / American-Families
  • Minimum-Wage
  • Immigration
  • Race / Black-Lives-Matter
  • Leadership

5. Some Answers and Inferences:

image from John-Hain Pixabay

In this section, I shall analyze Trump’s and Biden’s responses to questions on three important topics:

  1. Jobs, Wages and Taxes
  2. Racism
  3. The US Economy

The analysis for this section is quite interesting, involving a few libraries and tools:

  • For Sentiments-Analysis: AzureML Text-Analytics-Client SDK for Python
  • For Key-Phrase Extraction: AzureML Text-Analytics-Client SDK for Python
  • For Parts-Of-Speech-Tagging: spaCy
  • For Visualization: Pywaffle, Matplotlib, Seaborn

After signing up on the Microsoft azureML portal and obtaining my key and endpoints, I created two methods for sentiments analysis and key-phrase extraction.

Next, I define the method for extracting the Parts-Of-Speech (POS) tags, using the spaCy library. This is really important in understanding how Trump and Biden often construct their sentences.
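
A minimal sketch of such a method, assuming a loaded spaCy pipeline is passed in as `nlp`. It relies on the token attributes `pos_`, `is_punct` and `is_space`; the function name `pos_tag_counts` and the choice of model in the comment are assumptions.

```python
from collections import Counter

# Loading a pipeline (not executed here; the model name is an assumption):
#   import spacy
#   nlp = spacy.load("en_core_web_sm")

def pos_tag_counts(nlp, text):
    """Count coarse part-of-speech tags, ignoring punctuation and whitespace."""
    doc = nlp(text)
    return Counter(
        token.pos_ for token in doc
        if not (token.is_punct or token.is_space)
    )
```

The resulting counts per tag are the kind of numbers that could drive the bubble sizes in a bubble plot.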

At this point, I’ve defined my work structure. Now I need a couple of helper functions to process the debates into the required formats and to find sentences that match my queries.

The first helper function is a search function: given query words like ‘jobs’ and ‘wages’, it searches through Trump’s and Biden’s corpus respectively, to extract sentences containing these query words…
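
One way such a search helper could look, as a sketch: split the corpus into sentences on end punctuation, then keep the sentences that contain any query word as a case-insensitive substring, so that ‘tax’ also matches ‘taxes’. The function names and the substring rule are assumptions.

```python
import re

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def search_corpus(corpus, query_words):
    """Return the sentences that contain at least one query word."""
    queries = [q.lower() for q in query_words]
    return [
        sentence for sentence in split_sentences(corpus)
        if any(q in sentence.lower() for q in queries)
    ]

# Toy corpus, not debate text.
toy_corpus = "We created jobs. The weather is nice. Taxes went down! Wages?"
matches = search_corpus(toy_corpus, ["job", "wage", "tax"])
print(matches)  # ['We created jobs.', 'Taxes went down!', 'Wages?']
```

Substring matching is simple but can over-match (for example ‘race’ inside ‘embrace’), so a stricter version might compare whole tokens or lemmas instead.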

The others are a function to convert the sentiments received from the AzureML client to a DataFrame, and another that applies the above methods together on a corpus to return a DataFrame with all sentiments and key-phrases intact, plus a dictionary of overall sentiment scores.
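
How the overall scores were aggregated is not stated. One simple possibility, sketched here with plain dictionaries instead of a DataFrame, is to average the per-sentence positive, negative and neutral scores and express them as percentages. The row shape and the function name `overall_sentiment` are hypothetical.

```python
LABELS = ("positive", "negative", "neutral")

def overall_sentiment(rows):
    """Average per-sentence scores into percentages, one per label."""
    if not rows:
        return {label: 0.0 for label in LABELS}
    n = len(rows)
    return {
        label: round(100 * sum(row[label] for row in rows) / n, 1)
        for label in LABELS
    }

# Toy per-sentence scores, not real service output.
toy_rows = [
    {"positive": 0.75, "negative": 0.25, "neutral": 0.0},
    {"positive": 0.25, "negative": 0.50, "neutral": 0.25},
]
print(overall_sentiment(toy_rows))  # {'positive': 50.0, 'negative': 37.5, 'neutral': 12.5}
```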

With just a couple of extra plotting functions, we’re good to go!

A. Trump and Biden on Jobs/Wages/Taxes:

Trump responds with 93 sentences with an overall sentiment score of 21% positive, 72% negative and 7% neutral.

Biden responds with 127 sentences with an overall sentiment score of 33% positive, 60.3% negative and 6.7% neutral.

Double Pie-Charts for Trump and Biden sentiments analysis on Jobs/Wages/Taxes.

In both Pie-charts above, we can see the huge red portions indicating negative sentiments.

A2: Note that in a debate, negative sentiments should never be taken at face value, but should be explored to understand the context. This can be done by exploring the sentences and key-phrases extracted. For example, Biden may start a sentence by criticizing Trump’s approach severely, in order to buttress his point. But doing so will cause the sentiments-analysis client to record that sentence as overly negative. Therefore, negative sentiments may only be taken at face value in a review/feedback session, where negativity may indicate dissatisfaction or unhappy customers.

Given A2 above, Trump’s sentiments score is still kinda unexpected…We would expect him to paint a good picture of the work he’s been doing if he believes he’s been doing good work. I mean, it’s expected for Biden to criticize Trump, but since Trump is the sitting President, in charge of the present Government, it’s expected that his responses be more positive.

Let’s see a word-cloud of Trump’s key-phrases on Jobs/Wages/Taxes

Donald Trump’s word-cloud on Jobs/Wages/Taxes

Trump talks about “country, job, tax, companies, taxes, depression”…

Let’s see a few of his positive-sentiment responses on Jobs/Taxes/Wages…

Trump talks about ‘helping small business by raising the minimum wage’, plus ‘being on the road to success’, amongst other things. He also dismisses as untrue the claim that he paid $750 in taxes, saying he paid millions in taxes. When challenged by Biden for exploiting the tax bill, he claimed the bill was passed by Biden and that it only gave “certain individuals” the privileges for depreciation and tax credits.

And for Trump’s negative-sentiment responses…

For the negatives, Trump talks about people dying, committing suicide and losing their jobs. He says there is depression, alcohol and drug use at a level nobody’s seen before, and that’s why he wants to open up the schools and the economy.

Let’s see the word-cloud of Biden’s key-phrases on Jobs/Wages/Taxes

Joe Biden’s word-cloud on Jobs/Wages/Taxes

Biden talks about “tax, job, people, millions, fact, economy, significant”…

Let’s see a few of Biden’s positive-sentiment replies on Jobs/Taxes/Wages…

Biden talks about creating millions of jobs and investing in 50,000 charging stations on highways so as to own the electric car market of the future. He talks about taking 4 million existing buildings and 2 million existing homes and retrofitting them so they don’t leak as much energy, saving hundreds of millions of barrels of oil in the process and creating millions of jobs…

On Biden’s negative-sentiment responses…

Here he criticizes the Trump administration, saying the people who have lost their jobs have been those on the front lines. He also says that almost half the states in America have seen a significant increase in COVID deaths because Trump rushed to open the economy…

Generally, Biden’s negative sentiment scores come from his criticism of Trump’s administration, which is expected. Trump’s negative sentiments are a mix of sour and unfriendly remarks aimed at Biden, Obama and Hillary Clinton… He called Hillary crooked and a disgrace.

Let’s see the Parts-Of-Speech tags for both Trump and Biden.

Bubble-Plot of Parts-Of-Speech Tags for Trump and Biden on Jobs/Wages/Taxes

Bigger bubbles represent the most frequent part-of-speech tags used.

B. Trump and Biden on Racism:

Trump never said the word ‘Racism’ during the debates. He called Biden a racist, though, and said people accuse him (Trump) of being a racist, but they’re wrong…

Trump responds with 47 sentences with an overall sentiment score of 10% positive, 87% negative and 3% neutral.

Biden responds with 89 sentences with an overall sentiment score of 27.5% positive, 67% negative and 5.5% neutral.

Double Pie-Charts for Trump and Biden sentiments analysis on Racism.

Trump’s sentences again appear overly negative at 87%, while Biden’s are negative at 67%.

Let’s see a word-cloud of Trump’s key-phrases used in describing Racism

Donald Trump’s word-cloud on Racism.

Trump uses terms like ‘people, person, horrible, country, china, black, racist, terrible…’

For some positive-sentiment responses from Trump…

And for some negative-sentiment responses from Trump…

Trump calls Biden a racist, calls Hillary Clinton crooked, and says the first time he heard about Black-Lives-Matter, they were chanting ‘pigs in a blanket’ and ‘fry them like bacon’ at the police. Trump says, ‘that’s a horrible thing’…

Then Trump goes on to say he’s the least racist person in the room and that he’s been taking care of Black colleges and universities.

Note to self: Trump finds it hard to address racism constructively. Often he thinks it’s about him; he doesn’t realize it’s about the entire American system.

Let’s see the Racism word-cloud for Joe Biden…

Joe Biden’s word-cloud on Racism

Here we have Biden using words like ‘people, president, character, racist, racism, suburbs’… to tackle racism.

Some of Biden’s positive-sentiments responses are…

On his positives, Biden talks about how most people don’t wanna hurt nobody and how he’s going to provide for economic opportunities, better education and better health-care…

And while whipping negative sentiments, Biden talks like this…

Biden reminds Trump that when George Floyd was killed, Trump asked the military to use tear-gas on peaceful protesters at the White House so that he could pose at the church with a Bible. Biden states there’s systemic racism in America, calls Trump a racist and reminds him that it’s not 1950 no more…

Note to self: As a Blackman, I’m happy that Biden openly agrees that there’s systemic racism in America… This assertion is the only true route to a solution.

Now, let’s see the Parts-Of-Speech tags for Trump and Biden on Racism…

Bubble-Plot of Parts-Of-Speech Tags for Trump and Biden on Racism

C. Trump and Biden on The US Economy:

Trump responds with 44 sentences with an overall sentiment score of 16% positive, 80% negative and 4% neutral.

Biden responds with 56 sentences with an overall sentiment score of 45% positive, 50% negative and 5% neutral.

Double Pie-Charts for Trump and Biden sentiments analysis on The US Economy.

And for the ‘third-time-running’, Trump seems overly negative with his responses on The US Economy…

Let’s see a word-cloud of Trump’s key-phrases about the Economy

Trump’s word-cloud on The US Economy

Trump uses terms like ‘greatest economy in history, country, china, administration, spike, massive, world…’

Let’s see some of Trump’s responses with positive-sentiments,

On his positives, Trump says that due to COVID he had to close ‘the greatest economy of the history of our country’, which by the way is being built again and is going up so fast. He ends by saying they had the lowest unemployment numbers before the pandemic.

Let’s see some of Trump’s responses with negative-sentiments,

Trump talks about the negative effect of closing down the economy because of the ‘China-plague’. He accuses Biden of planning to shut down the economy again. He says that if not for his efforts, 2.2 million Americans would have died from the virus and not the current 220k…

Let’s see the word-cloud of Biden’s key-phrases about the Economy.

Biden talks about ‘economy, jobs, fact, people, energy, covid, number, Putin…’

Let’s see some of his remarks with positive-sentiments about the Economy

Biden talks repeatedly about creating millions of new jobs by making sure the economy is being run, moved and motivated by clean energy. He talks specifically about curbing energy leaks and saving millions of barrels of oil, which leads to a significant number of new jobs.

On Biden’s negative-sentiment responses about the Economy…

In his negative-sentiment responses, Biden talks to the families who’ve lost loved ones to the pandemic. He challenges Trump that he can’t fix the economy unless he first fixes the pandemic. He mentions systemic racism affecting the US economy. He accuses Trump of mismanaging the economy, stating the Obama administration handed him a booming economy which he’s blown.

Finally, for this section, let’s see the bubble-plot of the Parts-Of-Speech tags for Trump and Biden on The US Economy.

6. Bayesian Inference:

So, our task here is to find the conditional probability (P) of Trump and Biden mentioning the words we care most about, given the debates.

We will build a Naive-Bayes classifier from scratch and use it to tell the conditional likelihood of Trump and Biden saying the words we care most about.

Bayes’ rule says that the conditional P of event A, given event B, is the conditional P of event B, given event A, multiplied by the marginal P of event A, all divided by the marginal P of event B (which is actually the total P of event B occurring at all).
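
Written out, the rule described in words above is the standard Bayes formula. The second and third lines apply it to this setting, with the speaker as event A and the list of words as event B; the product form is the usual naive (conditional independence) assumption and is stated here as an assumption about the setup, not as the author's exact computation.

```latex
\[
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
\]

\[
P(\text{speaker} \mid w_1, \dots, w_n)
  = \frac{P(w_1, \dots, w_n \mid \text{speaker})\, P(\text{speaker})}{P(w_1, \dots, w_n)}
\]

\[
P(w_1, \dots, w_n \mid \text{speaker}) \approx \prod_{i=1}^{n} P(w_i \mid \text{speaker})
\]
```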

First, let’s define the prior. This is simply the P of Trump and Biden participating in the debates. I say it’s 50% each.

p_trump_speech = 0.5
p_biden_speech = 0.5

Now, I get a list of some of the words we care about (some may be stemmed)

['job','wage','tax','raci','race','economy','drugs','covid',
'pandemic','vaccine','virus','health','care','dr','doc','citizen',
'america','black','african','white','latin','hispanic','asian',
'minorit','immigra']

Next, I define a function that computes the individual conditional P of Trump and Biden saying each word, given the debates. It returns a DataFrame with these intact.
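
The function itself is not shown. A rough sketch of one way to estimate these per-word probabilities: count how many of a speaker's tokens start with each stem, then divide by the speaker's token total with add-one (Laplace) smoothing so that an unused word never gets probability zero. The prefix matching, the smoothing, the plain-dict return value (instead of a DataFrame) and the name `word_likelihoods` are all assumptions.

```python
def word_likelihoods(tokens, stems, alpha=1.0):
    """Estimate P(stem | speaker) from one speaker's tokens with additive smoothing."""
    total = len(tokens)
    denominator = total + alpha * len(stems)
    likelihoods = {}
    for stem in stems:
        # Prefix match so a stem like 'raci' covers 'racism' and 'racist'.
        hits = sum(1 for tok in tokens if tok.startswith(stem))
        likelihoods[stem] = (hits + alpha) / denominator
    return likelihoods

# Toy tokens, not the debate data.
toy_tokens = ["jobs", "job", "taxes", "china"]
toy_probs = word_likelihoods(toy_tokens, ["job", "tax", "covid"])
print(toy_probs)
```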

So I get the DataFrame, scale it up uniformly by multiplying each value by some factor of 10, then normalize the values, and it looks like this…

Finally, I define a Bayes-Inference method for computing the conditional probability of Trump and Biden given these words.
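
For reference, the textbook Naive-Bayes posterior can be sketched as below: multiply each speaker's prior by the per-word likelihoods (done in log space to avoid underflow) and normalize across speakers. This is a generic sketch with hypothetical names and toy numbers; it skips the scaling and normalizing step described above, so it should not be expected to reproduce the article's figures.

```python
import math

def bayes_posterior(priors, likelihoods):
    """Return P(speaker | words) for each speaker under the naive assumption."""
    log_joint = {}
    for speaker, prior in priors.items():
        log_joint[speaker] = math.log(prior) + sum(
            math.log(p) for p in likelihoods[speaker].values()
        )
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_joint.values())
    unnormalized = {s: math.exp(v - peak) for s, v in log_joint.items()}
    total = sum(unnormalized.values())
    return {s: u / total for s, u in unnormalized.items()}

# Toy priors and likelihoods, not values from the debates.
toy_priors = {"trump": 0.5, "biden": 0.5}
toy_likelihoods = {
    "trump": {"job": 0.2, "tax": 0.1},
    "biden": {"job": 0.1, "tax": 0.1},
}
posterior = bayes_posterior(toy_priors, toy_likelihoods)
print(posterior)
```

With equal priors, the posterior is driven entirely by the product of word likelihoods, so in this toy case the first speaker ends up with two thirds of the probability mass.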

So I get 46.5% for Trump and 53.5% for Biden

Waffle chart showing Probability of Trump and Biden saying some of the words we care about…

So from these debates and given the topics we care about, who’s more likely to discuss them… hopefully, address them and proffer solutions? Bayes Rule says Biden is more likely, and the margin is tight: 53.5% - 46.5% = 7 percentage points in favor of Joe Biden…

This is by no means a prediction of the result of the election nor a means to influence voter decisions; it’s just my opinion inferred solely from the Presidential debates.

But of course, we know there’s more to life, to America, than just two debates.

God Bless America, God Bless Africa, God Bless The World…

Cheers!!

About Me:

Lawrence is a Data Specialist at Tech Layer, passionate about fair and explainable AI and Data Science. I believe that sharing knowledge and experiences is the best way to learn. I hold both the Data Science Professional and Advanced Data Science Professional certifications from IBM and the IBM Data Science Explainability badge. I have conducted several projects using ML and DL libraries, and I love to code up my functions as much as possible. Finally, I never stop learning and experimenting and yes, I have written several highly recommended articles.

Feel free to find me on:

Github

Linkedin

Twitter


Analyzing The Presidential Debates was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
