Unlocking The Secrets: Exploring Generative vs. Discriminative Models in Machine Learning
Last Updated on November 16, 2024 by Editorial Team
Author(s): Claudio Giorgio Giancaterino
Originally published on Towards AI.
In Machine Learning, two families of models stand out: generative and discriminative models. They serve distinct goals and approach problems from different perspectives. Understanding the difference between these categories matters in practice, for instance when we're classifying emails as spam or not, predicting car insurance premiums, or generating realistic images. Looking at Google Trends, interest in generative models surged in 2023, after the release of ChatGPT, a chatbot built on Generative AI, accompanied by the spread of text-to-image generators such as Midjourney, DALL-E, and Stable Diffusion.
The evolution of Artificial Intelligence (AI) has come through the development of Machine Learning (ML). The field grew out of the idea, expressed by Arthur Samuel in 1959, that computers can learn patterns from data to perform specific tasks without being explicitly programmed. Tom Mitchell's 1997 definition is more complete, stating that a machine learning program's performance improves automatically through experience. This learning allows models and algorithms to perform tasks such as prediction, classification, and even the generation of new data. ML is now prevalent across industries, powering everything from recommendation engines to customer churn prediction.
What Are Generative and Discriminative Models?
Generative Models: in classification and regression tasks, generative models aim to learn the joint probability distribution P(X, Y), where X represents the features and Y represents the outcome. Because they learn the full distribution of the data, these models can generate new data points similar to the training data.
Discriminative Models: these models focus on learning the conditional probability distribution P(Y|X), directly modelling the boundary or relationship between features and the outcome. In classification tasks, they predict the label for a given set of features; in regression, they estimate a continuous value.
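The contrast between P(X, Y) and P(Y|X) can be made concrete on a toy labelled dataset with one binary feature. The sketch below (plain Python, simple counting; the data points are invented for illustration) estimates the joint distribution from counts, then derives the conditional from it:

```python
from collections import Counter

# Toy labelled data: (feature x, label y) pairs, invented for illustration.
data = [(0, "ham"), (0, "ham"), (1, "spam"), (1, "spam"), (1, "ham"), (0, "spam")]
n = len(data)

# Generative view: estimate the joint distribution P(X, Y) from counts.
joint = {pair: count / n for pair, count in Counter(data).items()}

# Discriminative view: the conditional P(Y | X) = P(X, Y) / P(X),
# which a discriminative model would estimate directly instead.
x_counts = Counter(x for x, _ in data)
conditional = {pair: joint[pair] * n / x_counts[pair[0]] for pair in joint}

print(joint[(1, "spam")])        # joint probability P(X=1, Y=spam)
print(conditional[(1, "spam")])  # conditional probability P(Y=spam | X=1)
```

The joint table carries strictly more information: it also tells us how likely each feature value is, which is exactly what lets a generative model sample new data points.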
Different Approaches to Modelling Problems
Generative models build a joint model of features and labels, allowing them to simulate or generate new data points. They capture the underlying distribution of the data and can create new instances similar to the training data. For instance, a generative model might learn how images of cats and dogs are formed based on features such as shape.
Conversely, discriminative models study how features relate to labels or target values, focusing on the separability or relationships within the given dataset. This makes them efficient for classification and regression tasks, as they do not need to understand the data distribution beyond what is necessary for good performance. For example, a discriminative model might learn to classify images as cats or dogs by finding a hyperplane that best separates the two classes in feature space.
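Finding a separating hyperplane is the discriminative idea in its simplest form. A minimal sketch, using the classic perceptron rule on a toy 2-D dataset (the points are invented and linearly separable; real problems need more robust methods such as logistic regression or SVMs):

```python
# Minimal perceptron: learns a separating hyperplane w·x + b = 0
# between two linearly separable classes (labels +1 / -1).
points = [((2.0, 1.0), 1), ((3.0, 2.5), 1), ((-1.0, -0.5), -1), ((-2.0, -1.5), -1)]

w, b = [0.0, 0.0], 0.0
for _ in range(20):                                  # a few passes suffice here
    for (x1, x2), y in points:
        if y * (w[0] * x1 + w[1] * x2 + b) <= 0:     # misclassified point
            w[0] += y * x1                           # nudge the hyperplane
            w[1] += y * x2                           # towards the point's class
            b += y

def predict(x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else -1

print(predict(2.5, 2.0))    # → 1
print(predict(-1.5, -1.0))  # → -1
```

Note that the model never estimates how likely any point is, only on which side of the boundary it falls, which is exactly the "no distribution beyond what is necessary" trade-off described above.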
A Glimpse into the Mathematics
When working on a classification task, the goal is to estimate the conditional probability P(Y|X).
Generative Models: Training a generative classifier involves obtaining the conditional probability P(Y|X) by estimating parameters for the prior probability P(Y) and the likelihood P(X|Y) from the training dataset, then applying Bayes' Theorem to calculate the posterior probability P(Y|X). Classic examples of generative models include Naïve Bayes, Hidden Markov Models, and Gaussian Mixture Models, up to modern Transformers and Diffusion Models.
Discriminative Models: Training a discriminative classifier involves estimating the conditional probability P(Y|X) by assuming a specific functional form for P(Y|X) and then estimating its parameters directly from the training dataset. Examples of discriminative models include Logistic Regression, Linear Regression, Support Vector Machines, and traditional Neural Networks.
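The generative recipe above — estimate the prior P(Y) and the likelihood P(X|Y), then apply Bayes' Theorem — can be sketched with a tiny Gaussian naïve Bayes classifier on one feature. The data and numbers are invented for illustration; a real implementation would use a library such as scikit-learn:

```python
import math

# Toy 1-D training data (invented): feature value x with class label y.
train = [(1.0, "A"), (1.2, "A"), (0.8, "A"), (3.0, "B"), (3.2, "B"), (2.8, "B")]
labels = {"A", "B"}

# Prior P(Y): relative frequency of each class.
prior = {y: sum(1 for _, lab in train if lab == y) / len(train) for y in labels}

# Likelihood P(X|Y): fit a Gaussian (mean, variance) per class.
def fit(y):
    xs = [x for x, lab in train if lab == y]
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

params = {y: fit(y) for y in labels}

def gaussian_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def posterior(x):
    # Bayes' Theorem: P(Y|X) ∝ P(X|Y) · P(Y), normalised over the classes.
    unnorm = {y: gaussian_pdf(x, *params[y]) * prior[y] for y in labels}
    z = sum(unnorm.values())
    return {y: p / z for y, p in unnorm.items()}

print(posterior(1.1))  # strongly favours class "A"
```

Because the model carries P(Y) and P(X|Y) explicitly, the same fitted parameters could also be used to sample new feature values per class, which a discriminative classifier cannot do.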
Strengths and Weaknesses
Generative Models:
- Strengths: generative models are powerful tools for understanding data generation processes and can be highly effective in scenarios where data synthesis is required.
- Weaknesses: generative models can be computationally intensive, and their complexities might lead to overfitting, particularly in high-dimensional spaces with limited data.
Discriminative Models:
- Strengths: discriminative models are generally easier to train. They can provide accurate predictions with less complexity and are robust against overfitting when regularized properly.
- Weaknesses: discriminative models cannot generate new data points and require large amounts of labelled data to perform well.
Real-World Applications and Actuarial Uses
Generative models are used for applications like image generation, text generation, language translation, and voice synthesis. In the actuarial field, they can help simulate potential future scenarios, offering a synthetic view of possible outcomes, and support the detection of fraudulent claims by identifying patterns in claims data that might indicate fraud.
Conversely, discriminative models are widely used in applications requiring predictive power, such as spam detection, credit scoring, risk assessment, and stock prediction. In the actuarial field, they can enhance pricing models or predict claims, providing precise estimations based on historical data.