
Beyond Vision Language Action (VLA) Models: Moving Toward Agentic Skills for Zero-Error Physical AI

Author(s): Telekinesis AI

Originally published on Towards AI.

Vision Language Action (VLA) models are the hottest topic in Physical AI right now.

If you work in robotics or computer vision, your feed is packed with it: massive funding rounds for companies building the “ChatGPT of robotics”, top research conferences publishing large corpora of work on new models, and best of all, the laundry-folding robot demos.

The promise is absolutely magical: a single end-to-end model that autonomously learns and masters any task, delivering true general-purpose intelligence to robots.

The Harsh Reality of Manufacturing

Our team visits factories regularly. These sites are often in remote areas, almost devoid of signs of life. For those who haven’t seen one, it’s easy to imagine an automotive OEM production line, with conveyor belts stretching endlessly and robots working in perfect sync.


But the reality is that less than 10% of factories look like that. Most, especially those run by small and medium-sized enterprises (SMEs), resemble large warehouses filled with individual workstations, oily floors, and the damp, metallic smell of machine oil.

Physical AI’s Last Mile: The Massive Untapped Market of Industrial SMEs

Small and medium-sized enterprises (SMEs) are the backbone of the global $14.8 trillion manufacturing market. These companies are currently struggling to hire labor because the work is repetitive and mind-numbing.

So the key question is: why haven’t robots already been deployed in SMEs?

The answer lies in a fundamental difference in production logic. Large OEMs operate in high-volume environments, producing the same part hundreds of thousands or millions of times. SMEs, on the other hand, operate under High-Mix Low-Volume (HMLV) manufacturing: a single factory may produce 50 to 100 distinct products, each differing in geometry, size, and handling requirements. SMEs receive weekly orders specifying which products to deliver, so the product being produced changes over almost daily.

What differentiates this environment from academic benchmarks or lab demos is the Zero-Error Threshold. At high volumes, even a 98% success rate is insufficient. If a robot miscalculates a grasp and drops a part into a CNC machine’s spindle, it doesn’t just “fail the task”; it breaks a $200,000 piece of equipment and halts production for days.
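To make the 98% figure concrete, a quick back-of-the-envelope calculation (the 1,000 parts/day throughput is an illustrative assumption, not a figure from the article) shows why per-task success rates that sound impressive in a lab are unacceptable on a production line:

```python
# Illustrative numbers: a line handling 1,000 parts per day with a
# 98% per-grasp success rate still fails roughly 20 times every day.
parts_per_day = 1_000
success_rate = 0.98

expected_failures = parts_per_day * (1 - success_rate)
print(expected_failures)  # ~20 failures per day

# The probability of a fully error-free day is vanishingly small.
p_zero_error_day = success_rate ** parts_per_day
print(f"{p_zero_error_day:.2e}")  # on the order of 1e-9
```

A single one of those ~20 daily failures can be the one that lands in the spindle.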

The challenge of Physical AI in manufacturing is therefore to combine high generalizability with high precision.

So how do we truly bring Physical AI into manufacturing?

Moving Beyond VLAs: The Rise of Agentic Skills in Physical AI

A dichotomy is emerging in robotics: classical robotics is deterministic but brittle, while VLA models are generalizable but probabilistic.

What if we could fuse the best of these two worlds? The answer is right in front of our eyes: the history of LLMs.

If we study the evolution of LLMs from chatbots to AI agents, we quickly see that the true value of LLMs emerges when we introduce tools. In AI agents, the LLM is responsible for orchestrating the tools: the more powerful the modular toolset, the better the agent.

We can achieve the same in robotics with Agentic Skills. We first build a large toolset, called the Skill Library, which consists of robust methods for perception and robotics. Subsequently, an LLM/VLM, called an Agent, orchestrates these Skills. In this architecture, the Agent selects the right Skill for the job from the robust Skill Library:

  • Perception Skills: 6D pose estimation, point-cloud segmentation, and anomaly detection.
  • Manipulation Skills: Force-sensitive insertion, compliant grasping, and high-speed trajectory following.
  • VLAs as Skills: Even a general-purpose VLA can be a “tool” used for creative or non-standard tasks like “clear the workspace of debris.”
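The orchestration pattern above can be sketched in a few lines of Python. Everything here is illustrative (the class names, skill names, and hard-coded plan are inventions for this sketch, not the Telekinesis API): a real system would back the Agent with an LLM/VLM and the Skills with real perception and control code.

```python
# Minimal sketch of the Agent-orchestrates-Skills pattern.
# All names are illustrative, not any real library's API.
from typing import Callable, Dict, List


class SkillLibrary:
    """Registry mapping skill names to callable implementations."""

    def __init__(self) -> None:
        self._skills: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, fn: Callable[[str], str]) -> None:
        self._skills[name] = fn

    def get(self, name: str) -> Callable[[str], str]:
        return self._skills[name]


# Deterministic, high-precision "classical" skills (stubs here).
def pose_estimation(part: str) -> str:
    return f"6D pose of {part}"


def peg_in_hole(part: str) -> str:
    return f"inserted {part} with sub-mm accuracy"


library = SkillLibrary()
library.register("estimate_pose", pose_estimation)
library.register("insert", peg_in_hole)


class Agent:
    """Stand-in for an LLM/VLM planner that picks skills for a task."""

    def __init__(self, library: SkillLibrary) -> None:
        self.library = library

    def plan(self, task: str) -> List[str]:
        # A real agent would prompt an LLM/VLM; here the plan is hard-coded.
        return ["estimate_pose", "insert"]

    def run(self, task: str, part: str) -> List[str]:
        return [self.library.get(name)(part) for name in self.plan(task)]


agent = Agent(library)
log = agent.run("insert bearing into housing", "bearing")
print(log)  # ['6D pose of bearing', 'inserted bearing with sub-mm accuracy']
```

The key design point is the separation of concerns: the Agent only chooses *which* skill runs; each skill owns *how* its step is executed.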

By modularizing Physical AI, we solve the Zero-Error requirement that plagues end-to-end models:

  1. High-Precision Execution: The Agent decides what to do, but the classical “Skill” (e.g., a peg-in-hole insertion algorithm) ensures it is done with sub-millimeter accuracy.
  2. Explainability: If a robot fails, the system can report exactly which “Skill” failed and why, rather than being a “black box” that simply stopped working.
  3. Rapid Changeover: To handle a new product, you don’t need to retrain the entire model. You simply update the Skill sequence or provide the Agent with a prompt for the part’s geometry.
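Points 2 and 3 can be sketched together: if each product's skill sequence is plain configuration, a changeover is a data edit rather than a retrain, and a failure is attributable to a named skill. All product and skill names below are hypothetical examples for this sketch.

```python
# Sketch: product changeover as a config change, and failures
# attributed to a named skill. All names here are hypothetical.
from typing import Callable, Dict, List

# Skill sequences per product are plain data, not model weights.
PRODUCT_RECIPES: Dict[str, List[str]] = {
    "gear_housing": ["segment_pointcloud", "compliant_grasp", "place_in_fixture"],
    "shaft_v2": ["estimate_pose", "force_insert"],
}


def run_recipe(product: str, skills: Dict[str, Callable[[], None]]) -> None:
    """Execute the product's skill sequence, reporting exactly which skill fails."""
    for step, name in enumerate(PRODUCT_RECIPES[product]):
        try:
            skills[name]()
        except Exception as exc:
            # Explainability: the failure names a specific skill,
            # rather than a black-box model that simply stopped working.
            raise RuntimeError(f"step {step}: skill '{name}' failed: {exc}") from exc


# Rapid changeover: supporting a new product is a config edit,
# no retraining of any end-to-end model.
PRODUCT_RECIPES["bracket_x"] = ["estimate_pose", "compliant_grasp"]
```

When a grasp slips mid-sequence, the operator sees "skill 'compliant_grasp' failed" with the underlying cause attached, not an opaque halt.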

The Telekinesis Agentic Skill Library

Documentation: docs.telekinesis.ai

The Telekinesis Agentic Skill Library is a concrete example of how this agentic architecture can be implemented in practice. It is a Python library designed to help teams build Physical AI systems by combining robust robotics algorithms with LLM/VLM based reasoning.

At its core, it provides two layers.

  • Skills: a broad set of algorithms for perception, motion planning, and control.
  • Physical AI Agents: LLM/VLM agents for task planning across industrial, mobile, and humanoid robots.

The library is intended for robotics, computer vision, and research teams.

Skills are organized into Skill modules. Each module covers one area of robotics, such as 3D perception, 2D perception, robot control, or Physical AI Agents. The Skills are hosted in the cloud and can be accessed easily through the Python library. The vision is to enable developers to contribute new Skills to this shared common library.
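As a general illustration of the "cloud-hosted skills behind a Python library" idea, a remote skill can be wrapped as a local callable. To be clear, this is not the Telekinesis API (see docs.telekinesis.ai for the real interface); the endpoint layout and payload shape below are invented for the sketch.

```python
# Generic sketch of wrapping a cloud-hosted skill as a Python callable.
# NOT the Telekinesis API: the /skills/<name> endpoint and JSON payload
# shape are assumptions made up for this illustration.
import json
from urllib import request


class CloudSkill:
    """Wraps a remote skill endpoint as a plain Python callable."""

    def __init__(self, base_url: str, name: str) -> None:
        self.url = f"{base_url}/skills/{name}"

    def __call__(self, **payload):
        req = request.Request(
            self.url,
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        with request.urlopen(req) as resp:
            return json.load(resp)


# Usage against a hypothetical endpoint:
# pose = CloudSkill("https://api.example.com", "pose_6d")
# result = pose(image_url="https://example.com/part.png")
```

The appeal of this shape is that an Agent's planner never needs to know whether a skill runs locally or in the cloud; both look like ordinary function calls.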

Contribute a Skill to the Telekinesis Community

Our bigger vision is to build a vibrant community of contributors who help grow the Physical AI Skill ecosystem.

We want you to join us. Maybe you’re a researcher who just published a paper and built some code you’re proud of. Maybe you’re a hobbyist tinkering with robots in your garage. Maybe you’re an engineer tackling tough automation challenges every day. Whatever your background, if you have a Skill, whether it’s a perception module, a motion planner, or a clever robot controller, we want to see it.

The idea is simple: release your Skill, let others use it, improve it, and see it deployed in real-world systems. Your work could go from a lab or workshop into factories, helping robots do things that were previously too dangerous, repetitive, or precise for humans.

If you’re curious and want to explore, the documentation at docs.telekinesis.ai is the place to start.


Published via Towards AI


Note: Article content contains the views of the contributing authors and not Towards AI.