How to Perform Quantization in Machine Learning (Math Explained!)
Last Updated on November 3, 2024 by Editorial Team
Author(s): Richard Warepam
Originally published on Towards AI.
Quantization is a MUST Step to Fine-Tune Large Language Models
Photo by ThisisEngineering on Unsplash
Suppose you are fitting a large number of books into a small suitcase. You can't take all of them, so you must decide which ones to bring and which to leave behind. This process of selecting and compressing data is quite similar to what we do in machine learning when we perform quantization.
Quantization is a technique for reducing the number of bits needed to represent data. Applied to a model's weights and activations, it shrinks the model's memory footprint, making it smaller and often faster to run.
In this article, we'll delve into the concept of quantization, its types, and how to perform it effectively.
· What is Quantization and Why is it Important?
  ∘ Why Quantization Matters
  ∘ You might be wondering, How?
· Types of Quantization
  ∘ Symmetric Quantization
  ∘ Asymmetric Quantization
· How to Perform Quantization
  ∘ Symmetric Quantization
  ∘ Asymmetric Quantization
· Conclusion
  ∘ Key Takeaways
Quantization in the context of machine learning refers to the process of mapping a large set of input values to a smaller set.
This is primarily done to reduce the computational and memory requirements of machine learning models, making them more efficient without significantly sacrificing accuracy.
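As a rough illustration of mapping a large set of values to a smaller one, the sketch below applies one common scheme, symmetric 8-bit quantization, to a handful of floats. The function names, the sample weights, and the choice of 8 bits are assumptions made for illustration; they are not taken from the original article.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers in [-qmax, qmax] using a single scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 when num_bits is 8
    max_abs = max(abs(v) for v in values)
    # One scale for the whole list; fall back to 1.0 if every value is zero.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical sample weights, chosen only for demonstration.
weights = [-1.27, -0.5, 0.0, 0.3, 1.27]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)

print("integer codes:", q)
print("scale:", scale)
print("restored:", restored)
```

Each restored value differs from the original by at most about half a quantization step (`scale / 2`), which is the accuracy given up in exchange for storing small integers instead of full-precision floats.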
Efficiency: Quantized…