LAI #108: Building What Lasts in the Year Ahead

Last Updated on January 3, 2026 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

Good morning, AI enthusiasts, and happy new year 🎉

This is the first issue of the year, and it feels like a good moment to reset expectations and direction. We’re starting 2026 by looking ahead, not at launches or demos, but at what it will actually take to build AI systems that hold up in the real world: systems that are reliable, governable, and affordable to run.

We will also dive into how modern inference really works with a kernel-level breakdown of Paged Attention, unpack why teams are migrating from FAISS to Qdrant in production, and explore new efficiency frontiers with CALM autoencoders that move beyond token-by-token generation. You’ll also find a fresh perspective on vision systems through the Prism Hypothesis, and a back-to-basics walkthrough of building a neural network from scratch using only NumPy: a reminder that understanding the mechanics still pays dividends as systems grow more complex.

Our aim this year is simple: fewer generic takes, more work that helps you learn, build, and ship with confidence. If you’re here at the start of the year, you’re exactly who this newsletter is for.

Here’s to a focused, curious, and constructive year of building together.

— Louis-François Bouchard, Towards AI Co-founder & Head of Community

The AI Trends That Will Matter in 2026 (and How to Prepare for Them)

In this end-of-year reflection post, we break down what it will take to build systems that are reliable, governable, and affordable, and what that actually requires in practice. We move from the fundamentals that fail first (context and retrieval) to the constraints that decide whether anything ships (verification, governance, portability, and cost).

Read the complete article here!

Learn AI Together Community Section!

Featured Community post from the Discord

Tbinkiewinkie created Z-Image-Turbo-Local, a Dockerized AI image and video generation system running Z-Image-Turbo and WAN 2.2 locally on consumer hardware. It’s optimized to fit on a 12GB VRAM card and generates images within 3 seconds. Check it out on GitHub and support a fellow community member. If you have any feature ideas, share them in the thread!

AI poll of the week!

Happy New Year! It looks like most of you found us by searching or browsing; we love that you came here on purpose. If you’re reading this, you’re our kind of person.

First issue of the year means a fresh start. We’ll keep things hands-on and builder-friendly, but we want the roadmap to match your goals: fewer generic “AI tips,” more pieces that actually help you ship and learn.

Tell us your one AI resolution for 2026 and what would help you stick to it: topics you want covered, formats you prefer (short guides, deep dives, code labs, office hours), and any “first week” content you’d love to see.

Collaboration Opportunities

The Learn AI Together Discord community is full of collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too — we share cool opportunities every week!

1. Voovy_ is looking to collaborate with operators, closers, and network-driven partners who want to build products. They run an AI automation agency, and if you want to be a part of the project, connect with them in the thread!

2. Smith_le_retour is building a personalized LLM-based AI agent and wants to collaborate with developers who know Python, C++, and some tools. If this space interests you, reach out in the thread!

3. Falcon5338 is looking for partners to prepare for interviews and do mock interview sessions together. If you want to prepare together as well, contact them in the thread!

Meme of the week!

Meme shared by bin4ry_d3struct0r

TAI Curated Section

Article of the week

Paged Attention: Theoretically under the hood By Sai Saketh

This article offers a detailed technical examination of the kernel-level implementation of Paged Attention for transformer inference. It explains how the computation is managed by CUDA threads, covering the entire process from data loading to final output. It breaks down the efficient handling of query, key, and value data, highlighting techniques like memory coalescing for queries and block-based processing for the KV cache. It also describes the multi-stage reductions across threads and warps used to perform the query-key dot product, a numerically stable softmax, and the final weighted-value summation, providing insight into the system’s performance optimization.
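The core idea — gather KV blocks through an indirection table, then apply a numerically stable softmax and a weighted-value sum — can be sketched in NumPy. This is a toy, single-query model of the concept, not the article's CUDA kernel; `kv_blocks` and `block_table` are illustrative names for the physical KV pool and the per-sequence block mapping.

```python
import numpy as np

def paged_attention(q, kv_blocks, block_table):
    """Single-query attention over a paged KV cache (NumPy sketch).

    q           : (d,) query vector for one decode step
    kv_blocks   : (num_blocks, 2, block_size, d) physical KV pool
    block_table : ordered list of physical block ids for this sequence
    """
    d = q.shape[0]
    scores, values = [], []
    for blk in block_table:
        k, v = kv_blocks[blk, 0], kv_blocks[blk, 1]   # (block_size, d)
        scores.append(k @ q / np.sqrt(d))             # per-block QK^T
        values.append(v)
    s = np.concatenate(scores)
    v = np.concatenate(values)
    # numerically stable softmax: subtract the max before exponentiating
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ v                                      # weighted-value sum
```

The real kernel performs the max/sum reductions cooperatively across threads and warps; here NumPy's vectorized ops stand in for those reductions.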

Our must-read articles

1. How to Migrate from FAISS to Qdrant: A Real-World Guide Using the MS MARCO Passage Dataset By Sai Bhargav Rallapalli

Highlighting the operational challenges of using FAISS in production, such as its lack of an API and metadata filtering, this guide demonstrates a migration to Qdrant. Using the MS MARCO dataset as a case study, it details the process of exporting embeddings, creating a Qdrant collection, and performing a batch upload with metadata. A performance comparison reveals that a slight increase in latency is a worthwhile trade-off for Qdrant’s persistent storage, advanced filtering, and improved developer experience, ultimately eliminating fragile glue code for a more stable system.
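To make the metadata-filtering gap concrete, here is a toy, library-free NumPy sketch of what a payload-filtered vector query provides: candidates are filtered by payload before ranking by similarity, so results always satisfy the constraint. All names here are illustrative; this is not the FAISS or Qdrant API.

```python
import numpy as np

def filtered_search(query, vectors, payloads, must, top_k=3):
    """Toy vector search with metadata filtering (pure NumPy).

    Filter candidates by payload first, then rank survivors by
    inner-product similarity — the behavior plain FAISS lacks natively.
    """
    mask = np.array([all(p.get(k) == v for k, v in must.items())
                     for p in payloads])
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return []
    sims = vectors[idx] @ query            # inner-product similarity
    order = np.argsort(-sims)[:top_k]      # best matches first
    return [(int(idx[i]), float(sims[i])) for i in order]
```

With FAISS, this filtering typically has to live in glue code around the index; a vector database applies it inside the query itself.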

2. Beyond Token-by-Token: How CALM Autoencoders Are Redefining LLM Efficiency By Fabio Yáñez Romero

To address the inefficiency of token-by-token generation in language models, this piece details the Continuous Autoregressive Language Models (CALM) framework. It utilizes a variational autoencoder to compress multiple tokens into a single, dense latent vector. A language model can then be trained to predict this vector, allowing a decoder to reconstruct the full token sequence in a single forward pass, which reduces latency. It also covers key technical challenges, such as using KL clipping to prevent “latent collapse” and to ensure the autoencoder produces a robust, meaningful representation for downstream tasks.
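As a rough illustration of the KL-clipping idea, here is a NumPy sketch of the common "free bits" variant, which floors each latent dimension's KL term so the optimizer has no incentive to collapse that dimension to the prior. This is a generic sketch of the technique, not CALM's exact formulation, which may differ.

```python
import numpy as np

def clipped_kl(mu, logvar, free_bits=0.5):
    """Per-dimension KL(q || N(0, I)) with a 'free bits' floor.

    mu, logvar : (d,) parameters of the diagonal Gaussian posterior
    free_bits  : minimum KL charged per dimension; below this floor the
                 gradient is zero, so the dimension cannot be collapsed
                 toward the prior just to shrink the loss.
    """
    kl = 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar)  # per-dim KL
    return np.maximum(kl, free_bits).sum()
```

In training, this clipped term replaces the raw KL in the VAE objective, keeping every latent dimension informative for the downstream predictor.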

3. The Prism Hypothesis: Why AI Vision Systems Have Been Looking at the World Wrong By Kaushik Rajan

This study explores the “Prism Hypothesis” to resolve a long-standing trade-off in AI vision systems, where models typically excel at either semantic understanding or image reconstruction, but not both. The hypothesis posits that this is not a conflict of information types but of different frequency bands: low frequencies encode semantic meaning, while high frequencies capture fine visual details. The researchers developed a Unified Autoencoder (UAE) that spans the full spectrum. It aligns its low-frequency data with a semantic model for comprehension and leverages the full spectrum for reconstruction, thereby creating a single model that delivers both strong semantic performance and high-fidelity image generation.
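The frequency-band framing above is easy to demonstrate: a 2D FFT lets you split any image into a low-frequency band (coarse structure, where semantics live under the hypothesis) and a high-frequency band (fine detail). This is a minimal sketch of the decomposition only, not the UAE model; the cutoff `radius` is an arbitrary illustrative choice.

```python
import numpy as np

def split_bands(img, radius):
    """Split a grayscale image into low/high frequency bands via the FFT."""
    f = np.fft.fftshift(np.fft.fft2(img))          # center the spectrum
    h, w = img.shape
    yy, xx = np.mgrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)      # distance from DC term
    low_mask = dist <= radius
    low = np.fft.ifft2(np.fft.ifftshift(f * low_mask)).real
    high = np.fft.ifft2(np.fft.ifftshift(f * ~low_mask)).real
    return low, high   # bands partition the spectrum: low + high == img
```

Because the two masks partition the spectrum exactly, the bands sum back to the original image, which is what lets a single model span the full spectrum rather than choosing one band.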

4. Building a Neural Network From Scratch — Just Numpy By Sai Saketh

This article provides a guide to building a simple two-layer neural network for MNIST digit classification using only NumPy. It explains the fundamental processes, beginning with forward propagation, where pixel data is processed through weighted layers with ReLU and softmax activation functions to generate predictions. The explanation continues with backward propagation, detailing how prediction errors are calculated and distributed back through the network. Finally, it covers how gradient descent is used to update the network’s parameters, improving its accuracy over iterations. The piece serves as a foundational look at a neural network’s internal mechanics.
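The forward/backward/update loop the article walks through can be condensed into a small NumPy sketch. This is trained on toy 2D data rather than MNIST, with illustrative hyperparameters; it is a sketch of the same mechanics, not the article's code.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stable
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, hidden=16, lr=0.5, steps=500, seed=0):
    """Two-layer net (ReLU + softmax) trained by plain gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = y.max() + 1
    W1 = rng.normal(0, 0.1, (d, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, c)); b2 = np.zeros(c)
    Y = np.eye(c)[y]                               # one-hot targets
    for _ in range(steps):
        # forward propagation
        h = relu(X @ W1 + b1)
        p = softmax(h @ W2 + b2)
        # backward propagation (softmax + cross-entropy gradient)
        dz2 = (p - Y) / n
        dW2, db2 = h.T @ dz2, dz2.sum(0)
        dz1 = (dz2 @ W2.T) * (h > 0)               # ReLU gate
        dW1, db1 = X.T @ dz1, dz1.sum(0)
        # gradient descent update
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return lambda Xn: softmax(relu(Xn @ W1 + b1) @ W2 + b2).argmax(1)
```

Swapping the toy data for flattened MNIST pixels (and scaling the layer sizes) recovers the article's setup; the mechanics are identical.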

If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.

Join over 80,000 subscribers of the AI newsletter and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI


Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

15 engineers. 100,000+ students. Towards AI Academy teaches what actually survives production.

Start free — no commitment:

6-Day Agentic AI Engineering Email Guide — one practical lesson per day

Agents Architecture Cheatsheet — 3 years of architecture decisions in 6 pages

Our courses:

AI Engineering Certification — 90+ lessons from project selection to deployed product. The most comprehensive practical LLM course out there.

Agent Engineering Course — Hands-on with production agent architectures, memory, routing, and eval frameworks — built from real enterprise engagements.

AI for Work — Understand, evaluate, and apply AI for complex work tasks.

Note: Article content contains the views of the contributing authors and not Towards AI.