
What should I cook for dinner?

Last Updated on January 14, 2024 by Editorial Team

Author(s): Renu Gehring

Originally published on Towards AI.

Can recommender systems help?

Photo by Malte Helmhold on Unsplash

It is a weeknight, and the witching hour is approaching. Kids are hungry. You are in your kitchen, brainstorming what you should make for dinner. You pop open your fridge and wearily inspect its ingredients. No inspiration. You shuffle to your pantry and notice the mismatched cans of beans, pickled beets, and chocolate frosting. Still no inspiration. “Honey,” your helpful spouse calls out, “I have set the table for dinner.”

It is 2 PM on a Saturday, time for your weekly grocery trip while the kids are at a playdate. It has been a busy week at work, where you fought and put out several fires. Your 3-year-old has been biting his preschool classmates, and you are handling it, barely. Now it is time to plan and shop for next week’s meals, but you are without any ideas other than the frozen pizza you have been serving the past two evenings.

What to cook for dinner? A complex, highly mathematical problem in which you are maximizing an objective function subject to many constraints: your kids’ allergies and preferences, what is in stock, what needs to be purchased, how much preparation and cooking time you have, and how good a home chef you are.
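That constrained choice can be sketched in a few lines of Python. The recipes, ratings, and time budget below are made up for illustration, not drawn from my actual data:

```python
# hypothetical recipes: (name, family rating, total minutes, kid_safe)
recipes = [
    ('Mac and Cheese', 9, 40, True),
    ('Beef Stir Fry', 8, 30, True),
    ('Spicy Curry', 10, 60, False),  # too spicy for the kids
]

time_budget = 45  # minutes available tonight

# maximize rating subject to time and kid-preference constraints
feasible = [r for r in recipes if r[2] <= time_budget and r[3]]
best = max(feasible, key=lambda r: r[1])
print(best[0])  # 'Mac and Cheese'
```

The curry scores highest, but the constraints rule it out, which is exactly why a pure "best-rated recipe" lookup is not enough.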

As a data scientist, I am well versed in optimization problems, and one evening, when I was grappling with what I should cook for dinner, I had a burst of inspiration. I recalled that in 2006, Netflix offered a $1 million prize for improving its recommender system. Perhaps a recommender system would help me with my daily problem. I had data available since I record my recipes and their key ingredients in a spreadsheet. I ordered delivery from the family’s favorite restaurant, and while I waited for dinner to arrive, I perused articles on recommender systems online.

Recommender systems come in all flavors and complexities, with the basic ideas listed below. For a more comprehensive overview, read Different Approaches For Building Recommender Systems Using Python.

  1. If A and B are similar users, they may like the same items.
  2. If items X and Y are similar, you can recommend item X to users who have purchased Y.
  3. Combining (1) and (2) to arrive at user–item similarities. This is a bit tricky data-wise because you need to tackle many-to-many relationships in your data: User A can be similar to User B but also to User C, and item X can be similar to both item Y and item Z.
  4. Once you figure out your objective and your data, you can translate the user–item similarity problem into a machine learning or even a deep learning exercise. Your job is to classify users and items into similar buckets so that you can match them to their most similar counterparts.
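The user-similarity idea in (1) is often captured with cosine similarity between rating vectors. Here is a minimal sketch; the users and rating vectors are invented for illustration:

```python
import math

# hypothetical ratings for four dinners: [pasta, tacos, curry, soup]
user_a = [5, 3, 0, 1]
user_b = [4, 3, 0, 1]
user_c = [0, 1, 5, 4]

def cosine_sim(u, v):
    '''cosine similarity: close to 1.0 means very similar tastes'''
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)

print(cosine_sim(user_a, user_b))  # high: A and B like the same dinners
print(cosine_sim(user_a, user_c))  # low: A and C disagree
```

Because A and B agree on almost every dinner, item Y purchased by B becomes a sensible recommendation for A.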

I wanted to build something simple for my dinner recommender system, so I was delighted to discover that Maggie@DataStoryTeller had an easy-to-build and effective recommender system. Thank you, Maggie! I modeled my recommender system for dinner based on her methodology. Maggie and everyone who is reading this, it works pretty well.

Before I show you my data, I want to introduce the concept of capturing similarity using mathematics. Both you and I know that elbow noodles and corkscrew noodles are more similar to each other than they are to an apple. Our human brain is lightning quick. (It is the most sophisticated and accurate deep learning neural network built in human history.) In the time that it takes to pose this question, our brain has already created features for the three foods, trained a neural network to classify them, and determined the similarity between them. And we were blissfully unaware of that process while knowing the answer with certainty.
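A machine can mimic this by representing each food as a feature vector and measuring distances between the vectors. The features and numbers below are invented purely for illustration:

```python
import math

# hypothetical features: [starchiness, sweetness, shape curviness]
elbow_noodle = [0.9, 0.1, 0.6]
corkscrew_noodle = [0.9, 0.1, 0.8]
apple = [0.1, 0.8, 0.3]

def euclidean(u, v):
    '''straight-line distance: smaller means more similar'''
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

print(euclidean(elbow_noodle, corkscrew_noodle))  # small: the noodles cluster
print(euclidean(elbow_noodle, apple))             # large: the apple stands apart
```

This distance-between-feature-vectors view is exactly what the nearest-neighbor algorithm later in this article operates on.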

If we are to teach a machine how to copy what the human brain does, we have to translate the problem step by step. Here is my data.

Data created by author

I have my family’s dinners categorized by name, key ingredients, preparation and cooking time in minutes, and how my family rated them on a 0–10 scale. I also have other features captured like whether the recipe was more suitable for summer or winter or whether it was a weekday or weekend. Not shown are features such as whether the ingredients were procured through a regular grocery or a specialty store and whether they were stock items in my pantry.

Several algorithms came to mind, including XGBoost and even a neural network in Keras. But I discarded them after I considered my objective function, which at a minimum was to cook a dinner on a weeknight or weekend, depending on the available preparation and cooking time. This clearly translated into a nearest-neighbor exercise.

I used Google Colab because of its easy set-up and access to GPUs. By the way, %load_ext google.colab.data_table lets you view your data in a user-friendly way, with options to filter and sort your rows, a nice advantage over the head or tail methods in pandas.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.neighbors import NearestNeighbors
import warnings
warnings.filterwarnings("ignore")
%load_ext google.colab.data_table

The next bit of code does two things. First, it allows my notebook to read data on my Google Drive. Second, it converts my data into a dataframe.

from google.colab import auth
auth.authenticate_user()

import gspread
from google.auth import default
creds, _ = default()

gc = gspread.authorize(creds)
worksheet = gc.open('Dinner').sheet1

rows = worksheet.get_all_values()
df = pd.DataFrame.from_records(rows)

You can view the metadata with df.info() and some summary statistics with df.describe(). It is also a good idea to quality-check your data, for example for duplicates, which can be done with df.duplicated().sum()

From viewing my data, I realized that the header had been read as the first row (index 0 in Python). I fixed this by doing two things. First, I picked up the correct column names with df.columns = df.iloc[0]; iloc is a nifty way of identifying the first row (index 0). Second, I deleted the header row with df = df[1:], which keeps all records starting from the second row (index 1).
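As a sketch, here is that header fix end to end on a tiny made-up dataframe (the recipe names are placeholders, not my real spreadsheet):

```python
import pandas as pd

# toy frame where the real header was read in as data row 0
df = pd.DataFrame.from_records([
    ['Name', 'PrepTime'],       # header row, misread as data
    ['Mac and Cheese', '45'],
    ['Beef Stir Fry', '30'],
])

df.columns = df.iloc[0]   # promote row 0 to column names
df = df[1:]               # keep everything from row 1 onward
print(df)
```

After these two lines, the frame has proper column names and only the two real records.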

The following code can be ignored because I ended up not using it in my algorithm. If you are curious, I am parsing the Ingredients column into four columns called Ingredient1, Ingredient2, Ingredient3, and Ingredient4.

#we want the top 4 ingredients, called Ingredient1, Ingredient2, ...
split_data = df['Ingredients'].str.split(',', expand=True).iloc[:, :4]
split_data.columns = ['Ingredient1', 'Ingredient2', 'Ingredient3', 'Ingredient4']
split_data.head(10)

#concatenate the two dataframes
df = pd.concat([df, split_data], axis=1)

Now that I had my Ingredient1–Ingredient4 columns, I had to one-hot encode them. One-hot encoding means that you create numeric columns out of text columns, so if the Ingredient1 column reads noodles for row 20, you will have a column called noodles with the value 1 in row 20 and 0 otherwise.

One-hot encoding is necessary for machine learning because machines process numeric data, not text, so we have to find creative ways of representing text with numbers.

#one-hot encode the ingredient columns into dummy variables
ingredient1_d = pd.get_dummies(df['Ingredient1'].str.lower())
ingredient2_d = pd.get_dummies(df['Ingredient2'].str.lower())
ingredient3_d = pd.get_dummies(df['Ingredient3'].str.lower())
ingredient4_d = pd.get_dummies(df['Ingredient4'].str.lower())

df = pd.concat([df, ingredient1_d, ingredient2_d, ingredient3_d, ingredient4_d], axis=1)

The last step added many columns to my data, which now had 59 columns instead of the original 13! Here are a handful of the new columns, all either 1 or 0: eggplant, fish, kidney beans, meat, nacho chips, noodles, and many more.
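For anyone who wants to see that 0/1 structure concretely, here is the same pd.get_dummies call on a tiny made-up ingredient column:

```python
import pandas as pd

# a tiny hypothetical Ingredient1 column
df = pd.DataFrame({'Ingredient1': ['Noodles', 'fish', 'noodles']})

# lower-casing first so 'Noodles' and 'noodles' become one column
dummies = pd.get_dummies(df['Ingredient1'].str.lower())
print(dummies)
```

Each distinct ingredient becomes its own column, with a 1 (or True, in recent pandas versions) marking the rows where it appears.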

I discarded all these columns because I realized that I wanted to find nearest neighbors with respect to preparation and cooking time and season. I created my algorithm with the following code, where df2 is my cleaned-up dataframe and n_neighbors = 1 means that I only want one recommendation.

samples = df2[['PrepTime', 'CookingTime', 'Weekday', 'Winter']].copy()
neigh = NearestNeighbors(n_neighbors=1)
neigh.fit(samples)

I tweaked Maggie@DataStoryTeller’s function for my purpose.

def what_is_4dinner(df2, rec):
    '''print a recommendation of what to cook for dinner tonight'''

    print('You should cook')
    print('cook', df2.iloc[rec]['Name'])

    print('PrepTime:', df2.iloc[rec]['PrepTime'], ', CookingTime:', df2.iloc[rec]['CookingTime'],
          ', Weekday:', df2.iloc[rec]['Weekday'], ', Winter:', df2.iloc[rec]['Winter'])

And then I tested with my sample data. With PrepTime = 45, CookingTime = 60, Weekday = 0, Winter = 1, I called the above function:

rec = neigh.kneighbors([[PrepTime,CookingTime,Weekday,Winter]], return_distance=False)[0][0]
what_is_4dinner(df2, rec)

You should cook
cook Mac and Cheese

Good old Mac and Cheese. Guess what I served my family that cold wintery evening.

I had a lot of fun building this application even though it is a toy. The natural way to extend it is by collecting more data, perhaps on user preferences. But then I am no short-order cook!

I recently started volunteering in the kitchen of Amherst Survival Center, where Chef Alan makes daily meals for 400 people with donated materials from 10–15 different stores. Now that is an optimization problem! All the constraints that we home chefs have plus two more. (1) Instead of 4 or 5 people, he is cooking for 400! (2) He does not purchase his ingredients; they are donated the morning of, meaning that he plans out meals in minutes. The best recommender system is Chef Alan himself. Today, we had beef stir fry with bell peppers (tempeh for vegan patrons), glazed carrots, broccoli, rice, and green salad with homemade dressing. Case rested!

References:

Data Science Project Tutorial: How to Build a Recommender System to Suggest What to Wear on the Run

Deep Dive into Netflix’s Recommender System

Different Approaches For Building Recommender Systems Using Python


Published via Towards AI
