LAI #90: Research Agents, Model Selection, and Smarter Workflows

Last Updated on September 4, 2025 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

Good morning, AI enthusiasts! This week, we’re diving into how AI is evolving beyond quick answers. In What’s AI, we explore research agents: tools built to search, reason, and produce citation-backed reports in minutes. We’re also excited to share our latest O’Reilly Radar post on LLM system design and model selection, breaking down the trade-offs every AI engineer faces.

Beyond that, you’ll find hands-on work on teaching GPT-OSS multilingual reasoning, a look at the Model Context Protocol for scaling agent-tool interactions, a custom setup for AI-powered development in Cursor, and practical guides for human-in-the-loop workflows and evaluating RAG systems the right way.

Let’s get started!

What’s AI Weekly

This week in What’s AI, I look at a new wave of AI tools designed for deeper work: research agents. Unlike chatbots that give quick answers, these agents can search, reason, and pull together citation-backed reports in minutes. They’re built to handle multi-step queries and act more like junior researchers than assistants. Read the article to see how they work in practice, or watch the video for a quick overview.

— Louis-François Bouchard, Towards AI Co-founder & Head of Community

Learn AI Together Community Section!

Featured Post

We’re excited to share that our latest post, LLM System Design and Model Selection, is now live on O’Reilly Radar. This piece dives into the core decisions every AI engineer faces: how to select the right model, balance performance with cost, and design production-grade LLM systems that actually scale.

Whether you’re evaluating models for a new product or optimizing pipelines in production, this post gives you the practical criteria and mental models you need to make the right choices.

👉 Read the full article on O’Reilly

AI poll of the week!

OpenAI is leading this poll. That’s interesting, because if you ask “Which model produces the best images?”, Midjourney usually wins in community debates. So why is OpenAI ahead here? Likely because of integration and convenience.

This suggests that in AI, distribution and accessibility can matter as much as raw quality. So, do you think the “best” image generation model will ultimately win on quality, or will the one that’s easiest to access dominate the future of creative workflows? Tell me in the thread!

Collaboration Opportunities

The Learn AI Together Discord community is overflowing with collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too, since we share cool opportunities every week!

1. Efficientnet_99825 is looking for a collaborator to write a research paper, pick a Kaggle dataset to work on, build and fine-tune LLMs, and explore stock market, quant analysis, and time series projects. If this sounds interesting, connect with him in the thread!

2. Silentsentinel6943 is looking for a partner to help him grow his GitHub repo, MARM-Systems. He needs help with scaling. If you can help, reach out in the thread!

Meme of the week!

Meme shared by sysop1984_26148

TAI Curated Section

Article of the week

Teaching OpenAI’s GPT-OSS 20B Model Multilingual Reasoning Ability: A Hands-On Guide with RTX 4090 By Lorentz Yeung

The author outlines a process for fine-tuning OpenAI’s GPT-OSS 20B model to improve its multilingual reasoning abilities. Using a local RTX 4090 setup and the Unsloth library, the model was trained on the Multilingual-Thinking dataset. This training, completed in just 60 steps, shifted the model from its default English-language reasoning to chain-of-thought analysis in other languages, such as French. The outcome is a model capable of generating structured, language-specific reasoning in the Harmony format, showcasing an effective method for adapting large models for more diverse, global applications.

Our must-read articles

1. The Bridge to MCP: Scaling AI Tools with Gateways By Parth Saxena

As AI agents become more interconnected with various tools, managing their communication presents a significant scaling challenge. A discussion of the Model Context Protocol (MCP) shows how it offers a standardized solution for these interactions, enabling stateful dialogues superior to traditional APIs. To handle this complexity, MCP gateways serve as a centralized entry point for authentication, security, and logging. The piece highlights recent open-source projects, like the author’s Bridge MCP, which are building the foundational infrastructure to scale these systems and shape the future of agent-tool connectivity.

2. My Cursor Custom Mode Setup: Building the Perfect AI Development Toolkit By Mayank Bohra

To enhance AI-assisted development, the author outlines a method for creating a specialized toolkit within the Cursor code editor. This approach extends beyond generic AI chat by establishing custom modes that pair specific models, such as GPT-5, Claude 4, and Gemini 2.5, with fine-tuned system prompts tailored to particular tasks. The piece details seven distinct “expert” assistants, including a Code Architect for system design, a Bug Hunter for complex debugging, and a Performance Optimizer for algorithmic improvements. This strategy is designed to leverage the unique strengths of each AI to handle specialized cognitive workloads and improve workflow efficiency.

3. Human in the loop AI Workflows using Langgraph By Aayushi_Sharma

This article details the implementation of Human-in-the-Loop (HITL) workflows in AI using LangGraph. It explains how HITL provides essential control over autonomous agents by allowing users to pause execution, approve or reject actions, and modify the agent’s state in real time. This capability is presented as a method for building safer and more reliable AI systems. The piece includes a step-by-step guide for creating an agent that halts before executing a tool, requiring human confirmation to continue. This demonstrates how to effectively combine machine efficiency with necessary human judgment in complex AI workflows.
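To make the approval step concrete, here is a minimal, library-free sketch of a human-in-the-loop gate. It does not use LangGraph and is not the article's code: `PendingAction`, `AgentState`, `run_with_approval`, and the `send_email` tool are hypothetical names chosen for illustration. The idea is only that the agent's proposed tool call is held until a human-supplied `approve` callback says yes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PendingAction:
    """A tool call the agent wants to make, held until a human decides."""
    tool: str
    args: Dict[str, str]


@dataclass
class AgentState:
    """Hypothetical agent state: a log the human can inspect."""
    log: List[str] = field(default_factory=list)


def run_with_approval(
    action: PendingAction,
    tools: Dict[str, Callable[..., str]],
    approve: Callable[[PendingAction], bool],
    state: AgentState,
) -> AgentState:
    """Pause before the tool call; execute it only if `approve` returns True."""
    if not approve(action):
        state.log.append(f"rejected: {action.tool}")
        return state
    result = tools[action.tool](**action.args)
    state.log.append(f"executed: {action.tool} -> {result}")
    return state


def send_email(to: str) -> str:
    """Stand-in for a tool with side effects (assumption, not a real API)."""
    return f"email sent to {to}"


tools = {"send_email": send_email}
action = PendingAction(tool="send_email", args={"to": "team@example.com"})

# In an interactive session, `approve` could wrap input(); here it is a fixed policy.
approved = run_with_approval(action, tools, lambda a: True, AgentState())
rejected = run_with_approval(action, tools, lambda a: False, AgentState())
print(approved.log)
print(rejected.log)
```

Passing the decision in as a callback keeps the gate testable: the same function works with a console prompt, a web form, or an automated policy.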

4. From Prompts to RAG to RAGAs: Evaluating Retrieval-Augmented Generation Systems the Right Way By Edgar Bermudez

This piece addresses the limitations of Retrieval-Augmented Generation (RAG) systems, which often fail despite impressive demos. It introduces RAGAs, a framework designed to systematically evaluate these systems. The author explains how RAGAs provides measurable metrics — such as context precision, faithfulness, and answer correctness — to identify weaknesses in retrieval and generation. A practical code example demonstrates how to implement a RAG pipeline and apply RAGAs for evaluation. The text also outlines best practices for creating robust evaluation datasets, offering a structured approach for developing reliable, production-ready RAG applications, rather than relying on subjective assessments.
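As a rough illustration of what a metric like context precision measures, the sketch below scores a ranked list of retrieved chunks from hand-supplied relevance labels. It is a simplified, library-free approximation, not the RAGAs implementation or the article's code: the function name `context_precision` and the boolean labels are assumptions, and RAGAs obtains its relevance judgments and final scores in its own way.

```python
from typing import List


def context_precision(relevance: List[bool]) -> float:
    """Rank-aware precision over retrieved chunks.

    relevance[i] is True if the chunk at rank i + 1 was judged relevant.
    The score averages precision@k over the ranks that hold a relevant chunk,
    so relevant chunks that appear earlier in the ranking raise the score.
    """
    hits = 0
    total = 0.0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / k  # precision@k at this relevant rank
    return total / hits if hits else 0.0


# Relevant chunks at ranks 1 and 3: (1/1 + 2/3) / 2
early = context_precision([True, False, True])
# The only relevant chunk is at rank 2: (1/2) / 1
late = context_precision([False, True])
print(round(early, 3), round(late, 3))
```

A retriever that returns the same relevant chunks but ranks them lower gets a lower score, which is the kind of weakness a measurable metric exposes and a subjective read of the answer tends to miss.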

If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.


Note: Article content contains the views of the contributing authors and not Towards AI.
