Which Open-Source LLM Should You Choose in 2024?
Author(s): Dr. Leon Eversberg
Originally published on Towards AI.
LLMs are evolving at a rapid speed. Photo by Johannes Plenio on Unsplash
Since the 2017 paper “Attention Is All You Need” introduced the Transformer architecture, natural language processing (NLP) has seen tremendous growth. And with the release of ChatGPT in November 2022, large language models (LLMs) have captured everyone’s interest.
Do you want to use LLMs for your own use case but not pay for every prompt? This article will help you understand the current state of LLMs in 2024. It will also help you decide which open-source model to choose for your own use case.
Without going into too much detail, the original Transformer architecture is divided into two interconnected parts: an encoder on the left and a decoder on the right.
The encoder’s job is to encode an input word into a deep vector representation. The decoder’s job is to generate new words.
The originally published Transformer architecture by Vaswani et al. [1]
First, an input sentence must be tokenized; that is, words (strings) must be mapped to tokens (numbers). For example, the word “the” can be mapped to the token 342.
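To make the word-to-token mapping concrete, here is a minimal sketch in Python. It assumes a tiny, hypothetical word-level vocabulary; apart from 342 for "the" (taken from the example above), the IDs and the `tokenize` helper are made up for illustration. Real LLM tokenizers typically split text into subword units and use far larger vocabularies.

```python
# Toy word-level tokenizer. The vocabulary is hypothetical; only the ID 342
# for "the" comes from the example in the text.
vocab = {"<unk>": 0, "the": 342, "cat": 1012, "sat": 2077}


def tokenize(sentence: str) -> list[int]:
    """Map each lowercased, whitespace-separated word to its token ID.

    Words that are not in the vocabulary map to the "<unk>" token.
    """
    return [vocab.get(word, vocab["<unk>"]) for word in sentence.lower().split()]


print(tokenize("The cat sat"))  # [342, 1012, 2077]
```

Words outside the vocabulary fall back to the `<unk>` ID here, which is one simple way to handle unknown input.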
The tokens are then converted into high-dimensional embedding vectors. Words with similar meanings have embeddings that lie close to each other in this high-dimensional vector space…
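One common way to measure how "close" two embeddings are is cosine similarity. The sketch below is illustrative only: the three-dimensional vectors and the token IDs for "cat" and "dog" are invented by hand, whereas real models learn embeddings with hundreds or thousands of dimensions.

```python
import math

# Hypothetical 3-dimensional embeddings keyed by token ID. The values are
# hand-picked for illustration, not taken from any real model.
embeddings = {
    342: [0.9, 0.1, 0.0],   # "the"
    1012: [0.1, 0.8, 0.3],  # "cat"
    1013: [0.2, 0.7, 0.4],  # "dog"
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


cat_dog = cosine_similarity(embeddings[1012], embeddings[1013])
cat_the = cosine_similarity(embeddings[1012], embeddings[342])

# With these toy vectors, "cat" is closer to "dog" than to "the".
print(cat_dog > cat_the)  # True
```

A similarity near 1 means the vectors point in almost the same direction, while a value near 0 means they are nearly orthogonal.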