Starbucks Sales Analysis – Part 1

Last Updated on December 28, 2021 by Editorial Team

Author(s): Abhishek Jana

Originally published on Towards AI.

Data Analysis

An in-depth look at Starbucks sales data!

Every dataset tells a story! As part of Udacity’s Data Science Nanodegree program, I was fortunate enough to get a look at Starbucks’ sales data. In this capstone project, I was free to analyze the data in my own way. In this blog, I explain what I did.

Dataset Overview

The data was created to study the following:

  • How different promotional offers influence people’s purchase decisions.
  • There are three types of offers: BOGO (buy one, get one), discount, and informational. I wanted to see the influence of each type on purchases.
  • Finally, I wanted to see how the offers influence particular groups of people.

There are 3 files in the dataset:

profile.json

Rewards program users (17000 users x 5 fields)

  • gender: (categorical) M, F, O, or null
  • age: (numeric) missing value encoded as 118
  • id: (string/hash) id of each user
  • became_member_on: (date) format YYYYMMDD
  • income: (numeric)
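
A minimal sketch of how the two encoded fields above could be made explicit: the 118 sentinel in age and the YYYYMMDD value in became_member_on. The helper name `clean_profile` and the two sample records are made up for illustration; it also assumes became_member_on arrives as an integer or a string.

```python
from datetime import date, datetime

MISSING_AGE = 118  # sentinel the dataset uses for a missing age


def clean_profile(record):
    """Return a copy of one profile record with missing values made explicit."""
    cleaned = dict(record)
    # Treat the 118 sentinel as "unknown" instead of a real age.
    if cleaned["age"] == MISSING_AGE:
        cleaned["age"] = None
    # became_member_on uses the YYYYMMDD layout, e.g. 20170715.
    cleaned["became_member_on"] = datetime.strptime(
        str(cleaned["became_member_on"]), "%Y%m%d"
    ).date()
    return cleaned


# Two made-up records in the shape described above (not real data).
sample = [
    {"gender": "F", "age": 55, "id": "u1",
     "became_member_on": 20170715, "income": 112000.0},
    {"gender": None, "age": 118, "id": "u2",
     "became_member_on": 20180426, "income": None},
]
profiles = [clean_profile(r) for r in sample]
```

Keeping the unknown age as `None` rather than 118 stops it from distorting averages later on.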

portfolio.json

Offers sent during the 30-day test period (10 offers x 6 fields)

  • reward: (numeric) money awarded for the amount spent
  • channels: (list) web, email, mobile, social
  • difficulty: (numeric) money required to be spent to receive a reward
  • duration: (numeric) time for the offer to be open, in days
  • offer_type: (string) BOGO, discount, informational
  • id: (string/hash) id of the offers

transcript.json

Event log (306648 events x 4 fields)

  • person: (string/hash)
  • event: (string) offer received, offer viewed, transaction, offer completed
  • value: (dictionary) different values depending on event type
      • offer id: (string/hash) not associated with any “transaction”
      • amount: (numeric) money spent in “transaction”
      • reward: (numeric) money gained from “offer completed”
  • time: (numeric) hours after the start of the test
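
Because the value field is a dictionary whose keys depend on the event type, it helps to flatten it before any analysis. The sketch below shows one way to do that. The function name `flatten_event`, the output field `offer_id`, and the sample events are hypothetical, and the key spelling "offer id" follows the field list above, so it should be checked against the raw file.

```python
def flatten_event(event):
    """Split the `value` dictionary of one transcript event into flat fields."""
    value = event.get("value", {})
    return {
        "person": event["person"],
        "event": event["event"],
        "time": event["time"],
        # Only offer events carry an offer id; transactions do not.
        "offer_id": value.get("offer id"),
        # Only transactions carry an amount.
        "amount": value.get("amount"),
        # Only "offer completed" events carry a reward.
        "reward": value.get("reward"),
    }


# Made-up events in the shape described above (not real data).
sample_events = [
    {"person": "u1", "event": "offer received",
     "value": {"offer id": "o1"}, "time": 0},
    {"person": "u1", "event": "transaction",
     "value": {"amount": 12.5}, "time": 6},
]
flat = [flatten_event(e) for e in sample_events]
```

Fields that do not apply to an event come out as `None`, so every flattened row has the same set of keys.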

Problem Statement

There are three main questions I attempted to answer.

  1. What is the spending pattern based on offer type and demographics?
  2. How to recommend coupons/offers to current customers based on their spending pattern?
  3. How to recommend coupons/offers to new customers?

Data Analysis

From the portfolio.json file, I found out that there are 10 offers of 3 different types: BOGO, Discount, Informational.

BOGO: In a buy-one-get-one offer, a user needs to spend a threshold amount to get a reward equal to that threshold.

Discount: In this offer, a user needs to spend a certain amount to get a discount.

Informational: This type of offer has no discount or minimum amount to spend.

To redeem the offers, one has to spend 0, 5, 7, 10, or 20 dollars.

The profile.json data holds information on 17000 unique people. It has some null values: looking at the data, we can say that some people did not disclose their gender, age, or income. That’s why the gender and income columns have the same number of null values, and the corresponding rows have 118 as the age.
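
The claim that missing gender, missing income, and age 118 go together can be checked directly. This is a small sketch of such a check on made-up records; the function name `missing_demographics` is hypothetical and the sample data is not taken from the real file.

```python
def missing_demographics(profiles):
    """Count records missing gender, missing income, and carrying age 118."""
    no_gender = {p["id"] for p in profiles if p["gender"] is None}
    no_income = {p["id"] for p in profiles if p["income"] is None}
    age_118 = {p["id"] for p in profiles if p["age"] == 118}
    return {
        "no_gender": len(no_gender),
        "no_income": len(no_income),
        "age_118": len(age_118),
        # True when the three sets contain exactly the same users.
        "same_users": no_gender == no_income == age_118,
    }


# Made-up records: one user with no demographic data, two with full data.
sample_profiles = [
    {"id": "u1", "gender": "F", "age": 55, "income": 112000.0},
    {"id": "u2", "gender": None, "age": 118, "income": None},
    {"id": "u3", "gender": "M", "age": 68, "income": 70000.0},
]
summary = missing_demographics(sample_profiles)
```

Comparing the sets of user ids, and not only the counts, confirms that it is the same users who are missing all three values.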

Distribution of the profile data

Fig 1. Left: distribution of average age vs gender; Right: distribution of age and income data

In the profile data, the mean age is about the same across genders.

As we can see, the age data is nearly Gaussian (slightly right-skewed) with 118 as an outlier, whereas the income data is right-skewed.

The transcript.json data has the transaction details of the 17000 unique people. 4 types of events are registered: transaction, offer received, offer viewed, and offer completed.

The value column has either the offer id or the amount of transaction.

Data Preprocessing

The first question is: What is the spending pattern based on offer type and demographics? To answer it, I rearranged the data files and worked through a few smaller questions.

The sub-questions are:

  • What are the popular offers?
  • How are offers utilized among different genders?
  • How do transactions vary with gender, age, and income?

Firstly, I merged the portfolio.json, profile.json, and transcript.json files to add the demographic information and offer information for better visualization. So my new dataset had the following columns:

'person', 'event', 'value', 'time', 'gender', 'age', 'income', 'date'.

Also, I changed the ‘null’ gender to ‘Unknown’ to make it a new feature.
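
The merge and the ‘Unknown’ relabeling described above can be sketched in plain Python as a lookup join on the user id. This is an illustration under assumptions, not the code from the project: the function name and sample records are made up, and the date column is left out because its derivation is not described here.

```python
def merge_events_with_profiles(events, profiles):
    """Attach gender, age, and income to each event via person == profile id."""
    by_id = {p["id"]: p for p in profiles}
    merged = []
    for event in events:
        profile = by_id.get(event["person"], {})
        merged.append({
            **event,
            # A missing gender becomes its own "Unknown" category.
            "gender": profile.get("gender") or "Unknown",
            "age": profile.get("age"),
            "income": profile.get("income"),
        })
    return merged


# Made-up data in the shape of the files described earlier.
profiles = [
    {"id": "u1", "gender": "F", "age": 55, "income": 112000.0},
    {"id": "u2", "gender": None, "age": 118, "income": None},
]
events = [
    {"person": "u1", "event": "transaction", "value": {"amount": 9.5}, "time": 6},
    {"person": "u2", "event": "offer received", "value": {"offer id": "o1"}, "time": 0},
]
merged = merge_events_with_profiles(events, profiles)
```

Building the `by_id` dictionary once keeps the join to a single pass over the events, which matters with a log of roughly 300,000 rows.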

Let’s recap the columns for better understanding:

  • person(category): 17000 unique users.
  • event(category): 4 unique categories: offer completed, offer received, offer viewed, and transaction.
  • value(category/numeric): when event = ‘transaction’, value is numeric, otherwise categoric with offer id as categories.
  • time(numeric): 0 is the start of the experiment.
  • gender(category): 4 unique categories: Male, Female, Other, and Unknown.
  • age(numeric): numeric column with 118 being unknown or outlier.
  • income(numeric): numeric column with some null values corresponding to 118 age.
  • date: date of the transaction.

What are the popular types of offers?

We can plot what percentage of the distributed offers were BOGO, Discount, and Informational, and then find out what percentage of the offers were received, viewed, and completed.

To do so, I separated the offer data from transaction data (event = ‘transaction’).
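
A rough sketch of that counting step, assuming each offer event already carries its offer id in a hypothetical `offer_id` field and that `offer_types` maps each offer id from portfolio.json to its offer_type. The function names and sample data are illustrative only.

```python
from collections import Counter


def offer_event_counts(events, offer_types):
    """Count offer events per (offer_type, event), skipping transactions."""
    counts = Counter()
    for e in events:
        if e["event"] == "transaction":
            continue
        counts[(offer_types[e["offer_id"]], e["event"])] += 1
    return counts


def received_share(counts):
    """Percentage of all received offers that belong to each offer type."""
    received = {t: n for (t, ev), n in counts.items() if ev == "offer received"}
    total = sum(received.values())
    return {t: 100 * n / total for t, n in received.items()}


# Made-up offers and events (not real data).
offer_types = {"o1": "bogo", "o2": "discount", "o3": "informational"}
events = [
    {"event": "offer received", "offer_id": "o1"},
    {"event": "offer viewed", "offer_id": "o1"},
    {"event": "offer received", "offer_id": "o2"},
    {"event": "offer received", "offer_id": "o2"},
    {"event": "offer completed", "offer_id": "o2"},
    {"event": "offer received", "offer_id": "o3"},
    {"event": "transaction", "offer_id": None},
]
counts = offer_event_counts(events, offer_types)
shares = received_share(counts)
```

The same `counts` object can be reused to compare viewed and completed totals per offer type.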

Fig 2. Percentage of offers received vs offer type

We can see that the informational offers don’t need to be completed. Although BOGO and Discount offers were distributed evenly:

  • BOGO offers were viewed more than Discount offers.
  • But Discount offers were completed more.

So, Discount offers were more popular in terms of completion.

How are offers utilized among different genders?

Since there is no offer completion for an ‘informational’ offer, we can ignore the rows containing ‘informational’ offers to find out the relation between offers viewed and offers completed.

Fig 3. Offer type vs gender

From the ‘Average offer received by gender’ plot, we see that the average number of offers received per person is nearly the same for every gender.

The ‘distribution of offers by Gender’ plot shows, for each gender, the percentage of received offers that were viewed and the percentage of received offers that were completed.

We see that:

  • Customers in the Other group viewed the most offers.
  • Male customers viewed the fewest offers.
  • Female customers completed the most offers.
  • The Unknown group completed the fewest offers.

We can say that, given an offer, the chance of redeeming it is higher among the Female and Other genders!
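
The viewed and completed percentages per gender could be computed along these lines. The sketch assumes the events already carry a gender column and that transactions and informational offers have been filtered out beforehand; the function name and the sample counts are made up.

```python
from collections import defaultdict


def offer_rates_by_gender(events):
    """Viewed and completed offers as a percentage of received, per gender."""
    counts = defaultdict(lambda: defaultdict(int))
    for e in events:
        counts[e["gender"]][e["event"]] += 1
    rates = {}
    for gender, c in counts.items():
        received = c["offer received"]
        if received == 0:
            continue  # no received offers, so no rate to report
        rates[gender] = {
            "viewed_pct": 100 * c["offer viewed"] / received,
            "completed_pct": 100 * c["offer completed"] / received,
        }
    return rates


# Made-up events (not real data): F receives 2, M receives 4.
events = (
    [{"gender": "F", "event": "offer received"}] * 2
    + [{"gender": "F", "event": "offer viewed"}] * 2
    + [{"gender": "F", "event": "offer completed"}] * 1
    + [{"gender": "M", "event": "offer received"}] * 4
    + [{"gender": "M", "event": "offer viewed"}] * 1
    + [{"gender": "M", "event": "offer completed"}] * 1
)
rates = offer_rates_by_gender(events)
```

Dividing by offers received, rather than comparing raw counts, keeps groups of different sizes comparable.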

How do transactions vary with gender, age, and income?

From the transaction data, let’s try to find out how gender, age, and income relate to the average transaction amount.

Fig 4. Dependence of average spending on age, gender, and income

We can see the expected trend: mean expenditure increases with age and income.

  • In the gender plot, we see women tend to spend the most, and the group with no demographic data (Unknown gender) tends to spend the least.
  • There’s a positive correlation between age and average spending.
  • People spend more with higher income.
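
One way to produce the averages behind these observations is to group the transaction amounts by a key and take the mean of each group. The sketch below is an assumption-laden illustration: the helper names, the decade-wide age bands, and the sample transactions are all made up, and rows with unknown age are assumed to have been removed first.

```python
from collections import defaultdict
from statistics import mean


def average_amount_by(transactions, key):
    """Mean transaction amount per group, where `key` maps a row to its group."""
    groups = defaultdict(list)
    for t in transactions:
        groups[key(t)].append(t["amount"])
    return {k: mean(v) for k, v in groups.items()}


def age_band(t):
    """Label a transaction with a decade-wide age band, e.g. 40-49."""
    low = (t["age"] // 10) * 10
    return f"{low}-{low + 9}"


# Made-up transactions with demographics attached (not real data).
transactions = [
    {"gender": "F", "age": 45, "amount": 20.0},
    {"gender": "F", "age": 41, "amount": 10.0},
    {"gender": "M", "age": 23, "amount": 6.0},
]
by_gender = average_amount_by(transactions, key=lambda t: t["gender"])
by_age = average_amount_by(transactions, key=age_band)
```

Income can be handled the same way by passing a key function that maps each row to an income band.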

Conclusion

So, in conclusion, what is the spending pattern based on offer type and demographics?

A possible answer is:

  • Although BOGO offers were viewed more, Discount offers were more popular in terms of completion.
  • Given an offer, the chance of redeeming it is higher among the Female and Other genders!
  • Women tend to spend the most.
  • Spending increases with age and income.

In part 2 of this blog, I will explain:

  • How to recommend coupons/offers to current customers based on their spending pattern?
  • How to recommend coupons/offers to new customers?

A link to part 2 of this blog can be found here.

The GitHub repository of this project can be found here.


Starbucks Sales Analysis – Part 1 was originally published in Towards AI on Medium.
