How to Define an AI Problem

Last Updated on August 26, 2023 by Editorial Team

Author(s): Jeff Holmes MS MSCS

Originally published on Towards AI.

A better way to ask an AI/ML question

Photo by Towfiqu barbhuiya on Unsplash

With more than 25 years of software engineering experience, I have answered a lot of questions from software developers who are getting started with artificial intelligence (AI) and machine learning (ML), so I thought I would share some tips on posting AI/ML questions on chat forums such as Slack and Discord.

Background

A common misconception among some users is that they can just “post” a question. However, chat forums are different in principle from online forums such as Stack Overflow. A chat forum tends to be more one-on-one in nature, so answering a question usually takes more time and effort. Thus, it is best to take a little time upfront to properly describe the problem, especially when sending a direct message (DM). Otherwise, you are likely to be given the wrong answer (which is quite common).

Be mindful of the background and experience of the users giving you advice. Many Discord users are high school and undergraduate college students with no AI/ML or software engineering experience. I list my credentials on my profile (full disclosure).

The first step in solving an AI/ML problem is to be able to describe and understand the problem in detail.

Overview

Here is an overview of my tips for describing an AI/ML problem [1]:

  1. Give some description of your background and experience.
  2. Describe the problem, including the category of ML problem.
  3. Describe the dataset in detail and be willing to share your dataset(s).
  4. Describe any data preparation and feature engineering steps that you have done.
  5. Describe any models that you have tried.
  6. Favor text and tables over plots and graphs.
  7. Avoid asking users to help debug your code.

Since some users access Discord using mobile devices, it is best to share your problem description and/or code snippets via GitHub Gist, Pastebin, etc., and share the link on the forum.

It is usually best to share files via DM or create a thread so that other users do not have to search the channels for your files and posts. Keep in mind that Discord channel content is unstructured, so it can be difficult to search channels to find your original post(s).

If someone volunteers to help you, it can be helpful to copy/paste your original post if you send them a DM.

If you are having coding issues, it is best to share a link to the code/algorithm source and say that you are having problems with the implementation rather than posting code snippets and asking “what is wrong with my code?”

Details

You should briefly provide the following (1–2 sentences per item) in your forum post:

1. Give some description of your background and experience.

It is best to let users know upfront whether you are a high school, university, or graduate student, a researcher, an experienced professional, etc.

Many Discord users do not realize that answering a question often takes a lot of research time. Therefore, it is only fair that you give a short description of your background.

I learned this lesson after spending considerable time trying to help beginners (unknown to me) who proceeded to solve the wrong problem using the wrong algorithm.

2. Describe the problem.

In a few sentences, describe the problem, including the type of ML problem if known (Numeric: classification, regression; Image: object classification, object detection, object recognition; Text: sentiment analysis, topic modeling, text generation; etc.).

This step may include a literature review. However, if you are able to find some articles solving the same problem, then that should work for now.

Part of problem formulation is deciding whether you are dealing with supervised, unsupervised, reinforcement learning, etc. [1].

What is the goal of the model? Classify, predict, detect, translate, etc.

What is the goal of the project? Research, engineering, commercial application, hobby, etc.

3. Describe the dataset in detail and be willing to share your dataset.

Describe the dataset, including the input features and target feature(s).

It is best to share summary statistics of the data, including counts of any discrete or categorical features, including the target feature.
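
As a rough illustration of the kind of summary worth posting, here is a minimal sketch that uses only the Python standard library. The toy rows, column names, and the summarize_numeric helper are hypothetical; with a real dataset, a library such as pandas would typically do this in one or two calls.

```python
from collections import Counter
import statistics

# Hypothetical toy dataset: each row is (age, income, label).
rows = [
    (25, 40000.0, "no"),
    (32, 52000.0, "yes"),
    (47, 61000.0, "no"),
    (51, 58000.0, "no"),
    (38, 75000.0, "yes"),
]

def summarize_numeric(values):
    """Return basic summary statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

ages = [r[0] for r in rows]
labels = [r[2] for r in rows]

age_summary = summarize_numeric(ages)
label_counts = Counter(labels)  # class balance of the target feature

print("age:", age_summary)
print("label counts:", dict(label_counts))
```

Pasting the printed text (rather than a screenshot) lets other users read the class balance and feature ranges at a glance.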

It is best to share the entire dataset (if you want someone to help you then you must be open and honest).

If, for some reason, you are unable to share the dataset, you need to clearly state this and why you cannot share the dataset.

Please note that Discord users are more than willing to donate their time to give free consulting advice, but it is unethical to ask vague questions in an effort to get free advice on a commercial or research project that you are being paid to do. If this is the case, you should state this fact up front and restate it whenever you ask for help (do not expect other Discord users to go data mining for your original post).

4. Describe any data preparation and feature engineering steps that you have done.

The steps and techniques for data preparation and cleaning will vary by dataset.

The most common steps are: fixing structural errors, handling missing or duplicate data, and filtering outliers.
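
The cleaning steps listed above might look roughly like the following sketch, written with the Python standard library only. The records are made up, dropping rows is just one way to handle missing values, and the 1.5 × IQR rule is one common convention for flagging outliers, not the only one.

```python
import statistics

# Hypothetical raw records: (id, value); None marks a missing value.
raw = [
    (1, 10.0), (2, 12.0), (2, 12.0), (3, None), (4, 11.0), (5, 13.0),
    (6, 250.0), (7, 10.0), (8, 11.0), (9, 12.0), (10, 13.0),
]

# 1. Remove exact duplicate records while preserving order.
deduped = list(dict.fromkeys(raw))

# 2. Drop records with missing values (imputation is another option).
complete = [r for r in deduped if r[1] is not None]

# 3. Filter outliers using the 1.5 * IQR rule.
values = [v for _, v in complete]
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
cleaned = [r for r in complete if low <= r[1] <= high]

print(f"raw={len(raw)} deduped={len(deduped)} "
      f"complete={len(complete)} cleaned={len(cleaned)}")
```

Reporting the row counts after each step, as the last line does, is a compact way to describe your data preparation in a forum post.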

Feature engineering can be used to help an algorithm and improve model performance, which includes creating new features, combining sparse classes, removing unused features, and adding dummy variables.
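
Two of these techniques, combining sparse classes and adding dummy variables, can be sketched in a few lines of plain Python. The feature values and the MIN_COUNT threshold below are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical categorical feature with a few rare ("sparse") classes.
colors = ["red", "blue", "red", "green", "blue", "red", "teal", "plum"]

# Combine classes seen fewer than MIN_COUNT times into a single "other" class.
MIN_COUNT = 2  # illustrative threshold, not a recommendation
counts = Counter(colors)
combined = [c if counts[c] >= MIN_COUNT else "other" for c in colors]

# Dummy variables (one-hot encoding): one 0/1 column per remaining class.
classes = sorted(set(combined))
one_hot = [[1 if c == cls else 0 for cls in classes] for c in combined]

print("columns:", classes)
for original, row in zip(colors, one_hot):
    print(f"{original:<6}{row}")
```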

Some machine learning algorithms perform much better if all of the variables are scaled to the same range using normalization or standardization techniques.
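
To make the difference between the two scaling techniques concrete, here is a minimal standard-library sketch. The function names and the sample values are assumptions made for this example.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]

# Hypothetical feature column.
incomes = [40000.0, 52000.0, 61000.0, 58000.0, 75000.0]

normalized = min_max_normalize(incomes)
standardized = standardize(incomes)

print("normalized:  ", [round(v, 3) for v in normalized])
print("standardized:", [round(v, 3) for v in standardized])
```

When describing your pipeline, it helps to say which of the two you used and whether the scaling parameters were computed on the training data only.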

5. Describe the models that you have tried (you should have tried at least one).

After performing data preparation and feature engineering, the first step should be to evaluate several baseline models to be used later for comparison. The final model(s) that you select should perform better on your dataset than the baseline models.
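
One of the simplest possible baselines for a classification problem is to always predict the most common training label. The sketch below assumes made-up labels and features, and the helper names are hypothetical; the point is only that any model you report should beat a number like this.

```python
from collections import Counter

def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical train/test split.
train_labels = ["no", "no", "yes", "no", "yes", "no"]
test_features = [[0.1], [0.7], [0.4], [0.9]]
test_labels = ["no", "yes", "no", "no"]

predict = majority_class_baseline(train_labels)
baseline_preds = [predict(x) for x in test_features]
baseline_acc = accuracy(test_labels, baseline_preds)

print(f"majority-class baseline accuracy: {baseline_acc:.2f}")
```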

6. Favor text and tables over plots and graphs.

It is best not to plot more than one metric on a graph, since libraries such as Matplotlib will automatically adjust the axes to better show the difference in values (for examples, see How to Diagnose Overfitting and Underfitting).

Here are some points to keep in mind when choosing between tables and graphs:

  • Tables are used to compare models and share summary statistics.
  • Plots and graphs can be used to visualize model results but should not be used to evaluate and compare algorithm results.
  • Plots and graphs are difficult to view on mobile devices (many users access Discord using mobile devices).
  • Plots and graphs can be misleading, whether intentional or not (see Lessons on How to Lie with Statistics).
  • Many libraries, such as Matplotlib, will automatically rescale the axes when possible to show the difference in values, which has pros and cons.

The best practice (and approach used by most ML tools) is to compute several performance metrics (see Machine Learning Error Metrics) rather than plotting graphs.

Figure 1: Error metrics for regression using Orange.
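
If you are not using a tool that reports such metrics for you, a few common regression metrics can be computed directly. The following is a minimal sketch with made-up values; the regression_metrics helper is hypothetical, and the choice of MAE, MSE, RMSE, and R2 is an assumption about which metrics are useful for your problem.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R^2 for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

# Hypothetical targets and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 10.0]

metrics = regression_metrics(y_true, y_pred)

# Print a plain-text table that is easy to paste into a chat message.
for name, value in metrics.items():
    print(f"{name:<6}{value:>10.4f}")
```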

The only graph that is commonly used in ML is a plot of train/val loss to spot-check the convergence of the model training process. Tables are usually used to compare models and share summary statistics. If you decide to present results as a graph, it is best to include a table of the actual values of the performance metrics.

Since AI/ML models are dynamic/stochastic in nature, you will get slightly different results each time you train and fit your model. Therefore, you should run the entire process (train and fit the model, then evaluate the model by computing the performance metrics) many times (say 10x). Finally, compute the average of the metric values. It can also help to compute summary statistics on the metrics such as mean, median, and standard deviation.

Figure 2: Average error metrics for 10 trials.
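
The repeated-trials procedure can be sketched as follows. Note that run_trial is a hypothetical placeholder that returns a simulated score; it is not a real training run, and in practice you would replace its body with your own train/evaluate cycle.

```python
import random
import statistics

def run_trial(seed):
    """Placeholder for one train/evaluate cycle; returns a simulated RMSE."""
    rng = random.Random(seed)  # a fixed seed per trial keeps runs reproducible
    return 0.60 + rng.gauss(0.0, 0.02)

N_TRIALS = 10
scores = [run_trial(seed) for seed in range(N_TRIALS)]

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "stdev": statistics.stdev(scores),
}

# Plain-text summary that can be pasted next to the per-trial values.
for name, value in summary.items():
    print(f"{name:<8}{value:>8.4f}")
```

Reporting the standard deviation alongside the mean shows whether a difference between two models is larger than the run-to-run variation.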

Here are some articles by J. Brownlee that show some ways to present ML results:

Multivariate Time Series Forecasting with LSTMs in Keras

How to Develop Multivariate Multi-Step Time Series Forecasting Models for Air Pollution

7. Avoid asking users to help debug code.

In general, the problem is usually not the algorithm implementation but the data preparation and feature engineering of your dataset.

If you find yourself mired in debugging code, this should be a red flag that you need to refactor or, more likely, choose a simpler model (Occam’s Razor).

References

[1] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 4th ed. Upper Saddle River, NJ: Prentice Hall, ISBN: 978-0-13-604259-4, 2021 (mostly Section 19.9).

[2] E. Alpaydin, Introduction to Machine Learning, 3rd ed., MIT Press, ISBN: 978-0262028189, 2014 (mostly Chapter 19).

Published via Towards AI
