The Rise of Vector Databases: Understanding Vector Search and RAG Pipeline
Author(s): Shwetha Acharya Originally published on Towards AI. What is a Vector? A vector is an object that possesses both magnitude and direction. It is represented as an array of numbers whose length defines its dimensionality. Here is an example of how vectors — …
Technical Post-Mortem of a Data Migration Event
Author(s): Vishnu Regimon Nair Originally published on Towards AI. Key Objectives of Data Migration. In this data-driven landscape, extracting the maximum value from data is crucial for success. As data volumes grow exponentially, organizations face considerable pressure to optimize …
The Architecture of Mistral’s Sparse Mixture of Experts (S〽️⭕E)
Author(s): JAIGANESAN Originally published on Towards AI. Exploring Feed Forward Networks, Gating Mechanism, Mixture of Experts (MoE), and Sparse Mixture of Experts (SMoE). Introduction: 🥳 In this article, we’ll dive deeper into the specifics of Mistral’s SMoE …
Unsupervised Clustering: Can We Identify Clusters in the Descriptions of Sounds in Music?
Author(s): Greg Postalian-Yrausquin Originally published on Towards AI. The data is tricky to work with: it is a list of Spotify songs, each assigned values that describe the sounds in it. At this point, the goal is to see if those descriptions …
How To Use Target Encoding in Machine Learning Credit Risk Models — Part 1
Author(s): Varun Nakra Originally published on Towards AI. Target encoding, also known as mean encoding or likelihood encoding, is a technique used to convert categorical variables into numerical values based on the target variable in supervised learning tasks. This method is particularly …
Web scraping & NLP
Author(s): Greg Postalian-Yrausquin Originally published on Towards AI. In this example, I extract data from a Wikipedia list of the highest-grossing movies, go into each of the links, and fetch the text of the movie’s article. Then I use BERTopic (which …
Using NLP (Doc2Vec) and Neural Networks (with Keras): Removing Hate Speech and Offensive Tweets
Author(s): Greg Postalian-Yrausquin Originally published on Towards AI. This is a great example of how more than one ML step can be used to achieve a goal. In this exercise, I will combine NLP (Doc2Vec) with binary classification to extract offensive and …
Perfect Answer to Deep Learning Interview Question — Why Not Quadratic Cost Function?
Author(s): Varun Nakra Originally published on Towards AI. One of the most common questions asked during deep learning interviews is: “Why can’t we use a quadratic cost function to train a Neural Network?” We will delve deep into the answer …
How Do Diffusion Models Work? Simple Explanation: No Mathematical Jargon, Promised!
Author(s): Suhaib Arshad Originally published on Towards AI. Background Knowledge: Essentially, there are three common types of generative models: Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and flow-based models. Although they have proven themselves as high-quality image-generating models, they fall short on …
Data Science Interview Question: Creating ROC & Precision-Recall Curves From Scratch
Author(s): Varun Nakra Originally published on Towards AI. This is a popular data science interview question that requires one to create ROC and similar curves from scratch, i.e., with no data on hand. For the purposes of this story, I …