Is Mamba the End of ChatGPT As We Know It?
Author(s): Ignacio de Gregorio

The Great New Question

Two researchers have made the boldest claim in years: that we can throw the biggest algorithmic breakthrough of the 21st century out the window.

Their architecture, named Mamba, achieves what was once thought impossible: matching or beating the Transformer’s language modeling capabilities while being faster and a lot cheaper.

Everyone seems to be talking about it, so let’s uncover what Mamba is.

Since its release in 2017, the Transformer architecture has become the ‘de facto’ choice for natural language modeling (that is, for models that generate text).

ChatGPT, Gemini, Claude, you name it: all of them are based on this seminal architecture.

The pervasiveness of this architecture is such that the ‘T’ in ChatGPT stands for ‘Transformer’.

The Transformer is a sequence-to-sequence model: it takes a sequence as input, be that a text passage or a sequence of pixels in an image, and gives you another sequence, usually new text. Its secret sauce is…

