LAI #97: Claude 4.5 Benchmarks, Function-Calling Fine-Tunes, and the Future of Model Alignment

Last Updated on October 18, 2025 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

Good morning, AI enthusiasts,

This week’s issue dives deep into how models are evolving across capability, specialization, and alignment. We examine Claude Sonnet 4.5: how it outperforms GPT-5 (Codex) and Gemini 2.5 Pro in reasoning, coding, and long-form tasks, and what that means for the next wave of professional AI systems.

From there, we shift to the practical side of engineering. You’ll find a complete guide to fine-tuning open-source models for function calling with Unsloth and Docker, a comprehensive comparison of time-series foundation models like Chronos and TimesFM, and an in-depth walkthrough of building your own MCP server from scratch. The issue closes with a grounded look at where fine-tuning and alignment techniques (SFT, RLHF, DPO, and RLAIF) are heading next.

Together, these pieces highlight how the AI stack is maturing, from bigger benchmarks to cleaner, more efficient alignment pipelines.

Let’s get into it.

What’s AI Weekly

Over the last week, I shared a few more ‘AI in a minute’ videos covering major developments in AI, such as the Qwen3-VL update. I also compared DGX Spark with the cloud, shared a few practical tips to ace your engineering interview, and more. Find all the videos here!

— Louis-François Bouchard, Towards AI Co-founder & Head of Community

Learn AI Together Community Section!

Featured Community post from the Discord

Elinaembedl_18370 has built Embedl Hub, a free developer platform that allows you to optimize, benchmark, and compare models by running them on devices in the cloud, so you do not need access to hardware yourself. It currently supports phones, devboards, and SoCs. If you are experimenting with on-device AI and want to understand how models perform on real edge devices, it’s worth checking out. Take a look here and support a fellow community member. If you have any questions or feedback, share them in the thread!

AI poll of the week!

The room is leaning towards a hybrid approach: using local rigs for fast iteration, privacy-sensitive work, and predictable loads, and bursting to the cloud for scale, experiments, and spiky training. A strong second camp sticks with cloud-first for velocity and managed ops, while only a few bet on all-local or classic on-prem. Which workloads stay local vs. go cloud, and what single trigger flips the choice (data gravity, latency target, egress cost, procurement/compliance, or team tooling)? Tell me in the thread and let’s discuss!

Collaboration Opportunities

The Learn AI Together Discord community is overflowing with collaboration opportunities. If you are excited to dive into applied AI, want a study partner, or even want to find a partner for your passion project, join the collaboration channel! Keep an eye on this section, too: we share cool opportunities every week!

1. Lokomatic is looking to collaborate with AI literacy/training experts focused on LLM literacy, DS literacy, GenAI Upskilling, and other learning areas. If this is your focus, connect with him in the thread!

2. Bavoyager is looking for a developer with experience in AI, video & audio transcription. If you are interested in something similar, check out the project details in the thread!

Meme of the week!

Meme shared by ghost_in_the_machine

TAI Curated Section

Article of the week

Why Claude Sonnet 4.5 Is So Much Better Than GPT-5(Codex) And Gemini 2.5 Pro — Here is The Result By Gao Dalie (高達烈)

Anthropic has released Claude Sonnet 4.5, an AI model showing significant improvements in coding, reasoning, and long-duration tasks. Benchmark results indicate superior performance in software development and computer operation tasks. The model also demonstrates enhanced expertise in specialized fields like finance and law. In direct comparisons for a technical writing task, Sonnet 4.5 provided a more detailed and developer-focused output than GPT-5 (Codex) and Gemini 2.5 Pro. It offers a 200K token context window and extended autonomous operation, positioning it as a capable tool for complex, multi-step professional workflows.

Our must-read articles

1. Time Series Foundation Models: A Comprehensive Comparison By Rashmi

Time series forecasting is shifting from traditional statistical methods to pre-trained foundation models. This analysis compares the leading options, detailing their architectures and optimal use cases. It covers Amazon’s Chronos for its strong zero-shot performance and Google’s TimesFM for production efficiency. It also examines Salesforce’s Moirai for handling complex multivariate data, the lightweight TTM from IBM for resource-constrained systems, the API-based TimeGPT, and the open-source Lag-Llama for probabilistic forecasting.

2. Fine-Tuning Open Source Models for Function Calling: A Complete Guide with Unsloth and Docker By Sharath Kumar

Leveraging Unsloth within a Docker environment, this piece demonstrates how to fine-tune the Llama 3.1 8B model for function calling. It utilizes the Hermes Function Calling V1 dataset and the Low-Rank Adaptation (LoRA) technique for efficient training. It provides a complete walkthrough, covering dataset preparation, model configuration, and the execution of the training process. Finally, it details the necessary steps for evaluating, testing, and exporting the resulting model for integration into various applications, offering a practical approach for developers to enhance open-source models with tool-use capabilities.

3. Advanced MCP Server Engineering — Part 1: “Walking Skeleton” By Kenneth Kasuba

This article provides a comprehensive, end-to-end guide for building a Model Context Protocol (MCP) server, starting with a “walking skeleton.” It details scaffolding the project with `mcp-forge`, testing locally via stdio and HTTP with `mcptools`, and setting up a secure Git repository. The process continues with deployment to FastMCP Cloud, including API key authentication. It concludes by demonstrating how to interact with the live server using a Python client from a notebook and integrating a large language model to orchestrate tool calls, offering a complete development-to-deployment workflow for engineers.

4. Fine-Tuning and Aligning Large Language Models: A Guide to SFT, RLHF, and What Comes Next By M

This article reviews the evolution of alignment techniques for large language models, starting with Supervised Fine-Tuning (SFT) to teach models instruction-following. It then explains Reinforcement Learning from Human Feedback (RLHF), a method that refines models using human preference data, noting both its effectiveness and its complexity. A significant focus is placed on the shift toward simpler, more efficient alternatives like Direct Preference Optimization (DPO), which achieves similar results without a separate reward model. The overview also covers recent advancements like RLAIF, which uses AI for feedback, and ORPO for single-stage training, outlining the current alignment landscape.
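
To make the DPO point concrete, here is a minimal sketch of the per-example DPO objective in plain Python. It is not taken from the featured article: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions. The inputs are the summed log-probabilities of a chosen and a rejected response under the model being trained and under a frozen reference model, which is why no separate reward model is needed.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    All names and the beta default are hypothetical, chosen for illustration.
    """
    # How much more (in log-probability) the policy likes each response
    # than the frozen reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(z)) == log(1 + exp(-z)); branch for numerical stability.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


if __name__ == "__main__":
    # Policy identical to the reference: no preference has been learned yet.
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
    # Policy has shifted probability toward the chosen response.
    print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

When the policy matches the reference, the loss equals log 2; it falls as the policy shifts probability toward the chosen response relative to the reference, and rises if it shifts toward the rejected one.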

If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.

Published via Towards AI


Note: Content contains the views of the contributing authors and not Towards AI.