
Beyond Vision Language Action (VLA) Models: Moving Toward Agentic Skills for Zero-Error Physical AI

Author(s): Telekinesis AI

Originally published on Towards AI.

Vision Language Action (VLA) models are the hottest topic in Physical AI right now.

If you work in robotics or computer vision, your feed is packed with it: massive funding rounds for companies building the “ChatGPT of robotics”, top research conferences publishing large corpora of work on new models, and best of all, the laundry-folding robot demos.

The promise is absolutely magical: a single end-to-end model that autonomously learns and masters any task, delivering true general-purpose intelligence to robots.

The Harsh Reality of Manufacturing

Our team visits factories regularly. These sites are often in remote areas, almost devoid of signs of life. For those who haven’t seen one, it’s easy to imagine an automotive OEM production line, with conveyor belts stretching endlessly and robots working in perfect sync.


But the reality is that less than 10% of factories look like that. Most, especially those run by small and medium-sized enterprises (SMEs), resemble large warehouses filled with individual workstations, oily floors, and the damp, metallic smell of machine oil.

Physical AI’s Last Mile: The Massive Untapped Market of Industrial SMEs

Small and medium-sized enterprises (SMEs) are the backbone of the global $14.8 trillion manufacturing market. These companies are currently struggling to hire labor because the work is repetitive and mind-numbing.

So the key question is: why haven’t robots already been deployed in SMEs?

The answer lies in a fundamental difference in production logic. Large OEMs operate in high-volume environments, producing the same part hundreds of thousands or millions of times. SMEs, on the other hand, operate under High-Mix Low-Volume (HMLV) manufacturing: a single factory may produce 50 to 100 distinct products, each differing in geometry, size, and handling requirements. SMEs receive weekly orders specifying which products to deliver, so the product being produced changes over almost daily.

What differentiates this environment from academic benchmarks or lab demos is the Zero-Error Threshold. At high volumes, even a 98% success rate is insufficient. If a robot miscalculates a grasp and drops a part into a CNC machine’s spindle, it doesn’t just “fail the task”; it breaks a $200,000 piece of equipment and halts production for days.
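To make the 98% figure concrete, a quick back-of-the-envelope calculation (the 1,000 parts/day throughput is an illustrative assumption, not a figure from the article) shows why per-task success rates that sound impressive in a lab are unacceptable on a production line:

```python
# Illustrative numbers: a line handling 1,000 parts per day with a
# 98% per-grasp success rate still fails roughly 20 times every day.
parts_per_day = 1_000
success_rate = 0.98

expected_failures = parts_per_day * (1 - success_rate)
print(expected_failures)  # ~20 failures per day

# The probability of a fully error-free day is vanishingly small.
p_zero_error_day = success_rate ** parts_per_day
print(f"{p_zero_error_day:.2e}")  # on the order of 1e-9
```

A single one of those ~20 daily failures can be the one that lands in the spindle.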

The challenge of Physical AI in manufacturing is therefore to combine high generalizability with high precision.

So how do we truly bring Physical AI into manufacturing?

Moving Beyond VLAs: The Rise of Agentic Skills in Physical AI

A dichotomy is emerging in robotics: classical robotics is deterministic but brittle, while VLA models are generalizable but probabilistic.

What if we could fuse the best of these two worlds? The answer is right in front of our eyes: the history of LLMs.

If we study the evolution of LLMs from chatbots to AI agents, we quickly see that the true value of LLMs emerges when we introduce tools. In AI agents, the LLM is responsible for orchestrating the tools: the more powerful the modular toolset, the better the agent.

We can achieve the same in robotics with Agentic Skills. We first build a large toolset, called the Skill Library, which consists of robust methods for perception and robotics. Subsequently, an LLM/VLM, called an Agent, orchestrates these Skills. In this architecture, the Agent selects the right Skill for the job from the robust Skill Library:

  • Perception Skills: 6D pose estimation, point-cloud segmentation, and anomaly detection.
  • Manipulation Skills: Force-sensitive insertion, compliant grasping, and high-speed trajectory following.
  • VLAs as Skills: Even a general-purpose VLA can be a “tool” used for creative or non-standard tasks like “clear the workspace of debris.”
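The orchestration pattern above can be sketched in a few lines of Python. Everything here is illustrative (the class names, skill names, and hard-coded plan are inventions for this sketch, not the Telekinesis API): a real system would back the Agent with an LLM/VLM and the Skills with real perception and control code.

```python
# Minimal sketch of the Agent-orchestrates-Skills pattern.
# All names are illustrative, not any real library's API.
from typing import Callable, Dict, List


class SkillLibrary:
    """Registry mapping skill names to callable implementations."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self._skills[name] = fn

    def get(self, name: str) -> Callable[[str], str]:
        return self._skills[name]


# Deterministic, high-precision "classical" skills (stubs here).
def pose_estimation(part: str) -> str:
    return f"6D pose of {part}"


def peg_in_hole(part: str) -> str:
    return f"inserted {part} with sub-mm accuracy"


library = SkillLibrary()
library.register("estimate_pose", pose_estimation)
library.register("insert", peg_in_hole)


class Agent:
    """Stand-in for an LLM/VLM planner that picks skills for a task."""

    def __init__(self, library: SkillLibrary) -> None:
        self.library = library

    def plan(self, task: str) -> List[str]:
        # A real agent would prompt an LLM/VLM; here the plan is hard-coded.
        return ["estimate_pose", "insert"]

    def run(self, task: str, part: str) -> List[str]:
        return [self.library.get(name)(part) for name in self.plan(task)]


agent = Agent(library)
log = agent.run("insert bearing into housing", "bearing")
print(log)  # ['6D pose of bearing', 'inserted bearing with sub-mm accuracy']
```

The key design point is the separation of concerns: the Agent only chooses *which* skill runs; each skill owns *how* its step is executed.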

By modularizing Physical AI, we solve the Zero-Error requirement that plagues end-to-end models:

  1. High-Precision Execution: The Agent decides what to do, but the classical “Skill” (e.g., a peg-in-hole insertion algorithm) ensures it is done with sub-millimeter accuracy.
  2. Explainability: If a robot fails, the system can report exactly which “Skill” failed and why, rather than being a “black box” that simply stopped working.
  3. Rapid Changeover: To handle a new product, you don’t need to retrain the entire model. You simply update the Skill sequence or provide the Agent with a prompt for the part’s geometry.
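Points 2 and 3 can be sketched together: if each product's skill sequence is plain configuration, a changeover is a data edit rather than a retrain, and a failure is attributable to a named skill. All product and skill names below are hypothetical examples for this sketch.

```python
# Sketch: product changeover as a config change, and failures
# attributed to a named skill. All names here are hypothetical.
from typing import Callable, Dict, List

# Skill sequences per product are plain data, not model weights.
PRODUCT_RECIPES: Dict[str, List[str]] = {
    "gear_housing": ["segment_pointcloud", "compliant_grasp", "place_in_fixture"],
    "shaft_v2": ["estimate_pose", "force_insert"],
}


def run_recipe(product: str, skills: Dict[str, Callable[[], None]]) -> None:
    """Execute the product's skill sequence, reporting exactly which skill fails."""
    for step, name in enumerate(PRODUCT_RECIPES[product]):
        try:
            skills[name]()
        except Exception as exc:
            # Explainability: the failure names a specific skill,
            # rather than a black-box model that simply stopped working.
            raise RuntimeError(f"step {step}: skill '{name}' failed: {exc}") from exc


# Rapid changeover: supporting a new product is a config edit,
# no retraining of any end-to-end model.
PRODUCT_RECIPES["bracket_x"] = ["estimate_pose", "compliant_grasp"]
```

When a grasp slips mid-sequence, the operator sees "skill 'compliant_grasp' failed" with the underlying cause attached, not an opaque halt.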

The Telekinesis Agentic Skill Library

Documentation: docs.telekinesis.ai

The Telekinesis Agentic Skill Library is a concrete example of how this agentic architecture can be implemented in practice. It is a Python library designed to help teams build Physical AI systems by combining robust robotics algorithms with LLM/VLM based reasoning.

At its core, it provides two layers.

  • Skills: a broad set of algorithms for perception, motion planning, and control.
  • Physical AI Agents: LLM/VLM agents for task planning across industrial, mobile, and humanoid robots.

The library is intended for robotics, computer vision, and research teams.

Skills are organized into Skill modules. Each module covers one area of robotics, such as 3D perception, 2D perception, robot control, or Physical AI Agents. The Skills are hosted in the cloud and can be accessed easily through the Python library. The vision is to enable developers to contribute new Skills to this shared common library.
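As a general illustration of the "cloud-hosted skills behind a Python library" idea, a remote skill can be wrapped as a local callable. To be clear, this is not the Telekinesis API (see docs.telekinesis.ai for the real interface); the endpoint layout and payload shape below are invented for the sketch.

```python
# Generic sketch of wrapping a cloud-hosted skill as a Python callable.
# NOT the Telekinesis API: the /skills/<name> endpoint and JSON payload
# shape are assumptions made up for this illustration.
import json
from urllib import request


class CloudSkill:
    """Wraps a remote skill endpoint as a plain Python callable."""

    def __init__(self, base_url: str, name: str) -> None:
        self.url = f"{base_url}/skills/{name}"

    def __call__(self, **payload):
        req = request.Request(
            self.url,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with request.urlopen(req) as resp:
            return json.load(resp)


# Usage against a hypothetical endpoint:
# pose = CloudSkill("https://api.example.com", "pose_6d")
# result = pose(image_url="https://example.com/part.png")
```

The appeal of this shape is that an Agent's planner never needs to know whether a skill runs locally or in the cloud; both look like ordinary function calls.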

Contribute a Skill to the Telekinesis Community

Our bigger vision is to build a vibrant community of contributors who help grow the Physical AI Skill ecosystem.

We want you to join us. Maybe you’re a researcher who just published a paper and built some code you’re proud of. Maybe you’re a hobbyist tinkering with robots in your garage. Maybe you’re an engineer tackling tough automation challenges every day. Whatever your background, if you have a Skill, whether it’s a perception module, a motion planner, or a clever robot controller, we want to see it.

The idea is simple: release your Skill, let others use it, improve it, and see it deployed in real-world systems. Your work could go from a lab or workshop into factories, helping robots do things that were previously too dangerous, repetitive, or precise for humans.

If you’re curious and want to explore, the documentation at docs.telekinesis.ai is the place to start.


Published via Towards AI


Note: Article content contains the views of the contributing authors and not Towards AI.