


LAI #80: Why LLMs Fail, Reinforcement Pre-Training, and Local Agents That Listen

Author(s): Towards AI Editorial Team

Originally published on Towards AI.


Good morning, AI enthusiasts,

This week’s issue explores why LLMs fail and what we can actually do about it. We’re starting with a look at their core weaknesses: why they struggle with consistency, how they approximate meaning, and what kinds of logic-layer tools can help address them.

In the curated section, we dive into Reinforcement Pre-Training, a Microsoft-backed approach to shift models from memorization to reasoning. You’ll also find a hands-on comparison of PPO, DPO, and GRPO for fine-tuning, a guide to building your own local voice assistant with LangGraph, and a detailed breakdown of how decision trees optimize splits using greedy algorithms.

Plus: a new open-source logic engine from the community, real-world collab threads, and a poll exploring the tipping point for open model adoption.

Let’s get into it.

What’s AI Weekly

LLMs have unique strengths and weaknesses, making them powerful building blocks in some areas and unreliable in others. Understanding these weaknesses is crucial so you know where you can use LLM tools safely and appropriately in your workflows, and which techniques can address their failure modes. LLMs are not plug-and-play geniuses; they often need extra work to be practical in real-world applications. So this week in What’s AI, I take a closer look at what these models actually “learn,” where they fail, and what we can do about it. Read the complete article here or watch the video on YouTube.

— Louis-François Bouchard, Towards AI Co-founder & Head of Community

Learn AI Together Community Section!

Featured Community post from the Discord

Psbigbig_71676 just open-sourced a project called WFGY (All Principles Return to One), a logic-level reasoning engine that improves how LLMs handle meaning, reducing contradictions and stabilizing outputs. It works as a plug-in layer for any LLM (GPT-2, 3, etc.) and enhances reasoning purely through language logic control. You can find the research paper here or check it out on GitHub. They are actively collecting edge cases and weird reasoning bugs, so if you have any suggestions or feedback, share them in the thread!

AI poll of the week!

Open models are gaining traction, but not decisively. 58% say they’re using open models frequently, but that still leaves 42% leaning proprietary. Open models might win on flexibility, transparency, and cost, but proprietary ones still dominate when the priority is out-of-the-box reliability.

If you’re using open models, what finally pushed you over the edge — cost, performance, flexibility, or something else? And if you’re still on closed models, what’s keeping you there? Tell me in the thread!

Collaboration Opportunities

The Learn AI Together Discord community is flooded with collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too — we share cool opportunities every week!

1. Quixy8330 is building a small team to create and sell AI automation systems to local businesses and is looking for motivated people who want to learn, build, and earn together. If you have a basic understanding of AI tools, reach out in the thread!

2. Safar4352 is building a project from scratch and needs a project partner. The project will be Python-based, so if you want to understand GenAI and LLMs in depth, connect with him in the thread!

3. Tranquil_dolphin_27432 wants to collaborate with someone who typically sees AI as their passion project. If this sounds like you, reach out in the thread!

Meme of the week!

Meme shared by phiter6008

TAI Curated Section

Article of the week

Reinforcement Pre-Training: Teaching AI to Think Instead of Memorize By MKWriteshere

The article details Reinforcement Pre-Training (RPT), a method from Microsoft Research that teaches language models to reason rather than memorize. With RPT, models generate a chain-of-thought justification before making a prediction and receive a reward for correctness. The research shows that a 14-billion-parameter RPT model can match the performance of a much larger 32-billion-parameter baseline model on key benchmarks. This suggests that focusing on reasoning can lead to more capable and efficient AI systems without relying on brute-force scaling.
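For intuition, here is a minimal, hypothetical sketch of the reward signal RPT relies on: sampled chain-of-thought rollouts are scored by whether their final next-token prediction matches the ground truth. All names here are illustrative rather than taken from the paper, and in full RPT these rewards would feed a policy-gradient update.

def rpt_reward(predicted_token: str, ground_truth_token: str) -> float:
    """Binary correctness reward for a single reasoning rollout."""
    return 1.0 if predicted_token == ground_truth_token else 0.0

def score_rollouts(rollouts, ground_truth_token):
    """Score sampled (chain_of_thought, prediction) rollouts.

    Each rollout pairs a reasoning trace with a next-token guess;
    only the guess is checked, so the reward stays cheap to compute.
    """
    return [rpt_reward(pred, ground_truth_token) for _, pred in rollouts]

# Hypothetical rollouts sampled for the same next-token target, "Paris"
rollouts = [
    ("The capital of France is well known ...", "Paris"),
    ("France borders Spain, so perhaps ...", "Madrid"),
    ("Step by step: France -> capital -> ...", "Paris"),
]
print(score_rollouts(rollouts, "Paris"))  # [1.0, 0.0, 1.0]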

Our must-read articles

1. The Core of Decision Tree Mechanics: Impurity, Gain, and Greedy Algorithms By Kuriko Iwai

This analysis breaks down the core mechanics of decision trees, focusing on how they determine optimal splits. It covers fundamental concepts like impurity measures (Gini Impurity and Entropy) and their corresponding gains. It compares three optimization approaches: Exact, Approximate, and Histogram-based greedy algorithms, illustrating their processes with a detailed walkthrough example and a practical Python simulation. The comparison highlights the trade-offs between computational speed and model precision, showing how different algorithms handle continuous and categorical features.
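As a quick illustration of those mechanics, here is a minimal sketch (illustrative, not code from the article) of an exact greedy split on one continuous feature: compute the parent's Gini impurity, then scan every candidate threshold and keep the one with the highest gain.

import numpy as np

def gini(labels: np.ndarray) -> float:
    """Gini impurity: 1 - sum(p_k^2) over class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(feature: np.ndarray, labels: np.ndarray):
    """Exact greedy search: try every midpoint between consecutive
    sorted feature values and keep the threshold with the most gain."""
    parent = gini(labels)
    order = np.argsort(feature)
    x, y = feature[order], labels[order]
    best_gain, best_thr = 0.0, None
    for i in range(1, len(x)):
        if x[i] == x[i - 1]:
            continue  # identical values cannot be separated
        thr = (x[i] + x[i - 1]) / 2
        left, right = y[:i], y[i:]
        w_left, w_right = len(left) / len(y), len(right) / len(y)
        gain = parent - (w_left * gini(left) + w_right * gini(right))
        if gain > best_gain:
            best_gain, best_thr = gain, thr
    return best_thr, best_gain

x = np.array([2.0, 3.0, 10.0, 11.0])
y = np.array([0, 0, 1, 1])
print(best_split(x, y))  # (6.5, 0.5): a perfect split removes all impurity

The approximate and histogram-based variants discussed in the article replace this exhaustive scan with binned candidate thresholds, which is exactly where the speed-versus-precision trade-off comes from.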

2. Mastering LLM Fine-Tuning: GRPO, PPO, and DPO Compared By Adi Insights and Innovations

This piece compares policy optimization methods for fine-tuning large language models, tracing their evolution from traditional reinforcement learning to preference-based techniques. It covers Proximal Policy Optimization (PPO), which relies on reward models, before moving to Direct Preference Optimization (DPO), which learns from paired preferences. The primary focus is on DeepSeek’s Group Relative Policy Optimization (GRPO), an advancement that processes groups of ranked responses. GRPO provides a more scalable and stable approach to align models with nuanced human feedback, removing the need for a separate value (critic) model by computing baselines directly from each group of sampled responses.
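Here is a minimal sketch of the idea that distinguishes GRPO (illustrative, not DeepSeek’s implementation): instead of a learned critic, each response’s advantage is its reward normalized against the statistics of its own sampled group.

import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages: normalize each response's reward by
    the mean and std of its own group, standing in for the learned
    value (critic) baseline that PPO would use."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Hypothetical rewards for four responses sampled from one prompt
group_rewards = np.array([0.2, 0.9, 0.4, 0.5])
print(grpo_advantages(group_rewards))
# Above-average responses get positive advantages; below-average, negative.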

3. Building a Local Background Voice Assistant with LangGraph Agent on Your PC By Murat Şimşek

The blog outlines a method for building a local, voice-activated background assistant designed to help users navigate complex software interfaces. The process is detailed through four main components: wake word detection with an audio classification model, speech-to-text transcription using OpenAI’s Whisper, voice synthesis via the lightweight Kokoro TTS model, and, at its core, a LangGraph agent that manages a stateful workflow, dynamically routing requests either to analyze a screenshot for visual context or to process a conversational query, with everything running locally for privacy.
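Below is a minimal sketch of the routing step at the agent’s core, assuming LangGraph’s StateGraph API; the wake word, Whisper, and Kokoro components are omitted, and the node bodies and keyword-based router are placeholders rather than the author’s code.

from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class AssistantState(TypedDict):
    query: str
    answer: str

def analyze_screenshot(state: AssistantState) -> AssistantState:
    # Placeholder: a vision model would inspect a screen capture here.
    return {"query": state["query"], "answer": "Described what is on screen."}

def answer_query(state: AssistantState) -> AssistantState:
    # Placeholder: a local LLM would answer the transcribed question here.
    return {"query": state["query"], "answer": "Answered conversationally."}

def route(state: AssistantState) -> str:
    # Toy heuristic router; the article's agent decides this dynamically.
    screen_words = ("screen", "window", "button", "menu")
    if any(w in state["query"].lower() for w in screen_words):
        return "screenshot"
    return "chat"

graph = StateGraph(AssistantState)
graph.add_node("screenshot", analyze_screenshot)
graph.add_node("chat", answer_query)
graph.add_conditional_edges(START, route, {"screenshot": "screenshot", "chat": "chat"})
graph.add_edge("screenshot", END)
graph.add_edge("chat", END)
app = graph.compile()

print(app.invoke({"query": "What does this button on my screen do?", "answer": ""}))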

If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.

Join over 80,000 data leaders and subscribers of our AI newsletter to keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI


Take our 90+ lesson From Beginner to Advanced LLM Developer Certification: from choosing a project to deploying a working product, this is the most comprehensive and practical LLM course out there!

Towards AI has published Building LLMs for Production—our 470+ page guide to mastering LLMs with practical projects and expert insights!


Discover Your Dream AI Career at Towards AI Jobs

Towards AI has built a jobs board tailored specifically to Machine Learning and Data Science jobs and skills. Our software searches for live AI jobs each hour, labels and categorizes them, and makes them easily searchable. Explore over 40,000 live jobs today with Towards AI Jobs!

Note: Content contains the views of the contributing authors and not Towards AI.