LLM & AI Agent Applications with LangChain and LangGraph — Part 19: Guardrails (Safety Barriers for LLMs)
Last Updated on January 3, 2026 by Editorial Team
Author(s): Michal Zarnecki
Originally published on Towards AI.

Hi! In this chapter we’ll move to another topic that is just as practical — and in many real applications, absolutely critical: Guardrails, a safety-barrier system for language models.
Guardrails are simply a set of rules and validators that check whether the answer generated by the model matches our requirements. Thanks to them we can immediately catch errors, weird formats, or responses that are simply unusable inside an application.
In practice, guardrails are our first line of defense against the unpredictability of LLMs.
Why do we need guardrails?
Language models are powerful, but they have one fundamental property: they are stochastic, meaning their outputs contain randomness. The same prompt can produce different answers on different runs.
So even if you ask for a specific format, the model might still:
- add an unnecessary comment,
- break the structure,
- or return something completely unexpected.
For example:
- You ask for pure JSON, and the model adds a sentence before it: “Here is the answer in JSON: …”
- You want exactly three tags, and the model returns three tags plus a sentence: “Here are the generated tags: …”
- You expect Python code, and you get a mix of Python and Markdown commentary.
In situations like this, guardrails are priceless. They validate the output automatically and can:
- reject an invalid result,
- raise an error,
- or force the model to regenerate the answer.
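The regenerate-on-failure pattern can be sketched as a plain retry loop. A minimal illustration follows; `generate` here is a hypothetical stand-in for your actual model call, and the JSON check plays the role of the guardrail:

```python
import json

def generate(prompt: str) -> str:
    # hypothetical stand-in for an LLM call; returns raw model text
    return '{"tags": ["python", "llm", "guardrails"]}'

def guarded_generate(prompt: str, max_attempts: int = 3) -> dict:
    """Retry until the output parses as valid JSON, then return it."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate(prompt)
        try:
            return json.loads(raw)  # guardrail: output must be valid JSON
        except json.JSONDecodeError as err:
            last_error = err  # invalid output -> regenerate
    raise ValueError(f"No valid JSON after {max_attempts} attempts: {last_error}")

print(guarded_generate("Return three tags as JSON."))
```

The same skeleton works for any validator: swap the `json.loads` call for a regex check, a schema check, or one of the evaluators shown below.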
Types of guardrails in LangChain
Here are a few guardrails you’ll typically build with LangChain’s evaluators:
1) JSON Format Validator
Checks whether the output is valid JSON.
This is the most commonly used one, because JSON is the default data exchange format in most applications.
from langchain_classic.evaluation import JsonValidityEvaluator
evaluator = JsonValidityEvaluator()
# print(evaluator.evaluate_strings(prediction='{"x": 1}')) # correct
print(evaluator.evaluate_strings(prediction='{x: 1}')) # incorrect
output:
{'score': 0, 'reasoning': 'Expecting property name enclosed in double quotes: line 1 column 2 (char 1)'}
2) JSON Equality Validator
Checks whether two JSON strings are equal after parsing (the order of keys does not matter).
from langchain_classic.evaluation import JsonEqualityEvaluator
evaluator = JsonEqualityEvaluator()
print(evaluator.evaluate_strings(
    prediction='{"a":1,"b":[2,3]}',
    reference='{"b":[2,3],"a":2}',
))
output:
{'score': False}
3) Fallback Messages Validator
This validator detects responses like:
“I’m sorry, but I can’t help with that.”
Such answers can appear when the model decides the topic is inappropriate, or when it simply doesn’t understand the prompt. In many applications, especially business chatbots, you can’t let this happen silently, so you need to catch it. One way to stay resilient is to configure a backup model that answers whenever the primary call fails:
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
load_dotenv()
primary = ChatOpenAI(model="gpt-4o-miniS", max_retries=0)  # intentionally invalid model name, forces the fallback
backup = ChatOpenAI(model="gpt-3.5-turbo")
chain = primary.with_fallbacks([backup])
print(chain.invoke("Describe Python in 1 sentence."))
output:
content='Python is a versatile and user-friendly programming language known for its simplicity and readability.' additional_kwargs={
'refusal': None
} response_metadata={
'token_usage': {
'completion_tokens': 16,
'prompt_tokens': 14,
'total_tokens': 30,
'completion_tokens_details': {
'accepted_prediction_tokens': 0,
'audio_tokens': 0,
'reasoning_tokens': 0,
'rejected_prediction_tokens': 0
},
'prompt_tokens_details': {
'audio_tokens': 0,
'cached_tokens': 0
}
},
'model_provider': 'openai',
'model_name': 'gpt-3.5-turbo-0125',
'system_fingerprint': None,
'id': 'chatcmpl-CZxXq6wdZu8Rwff4AfMJDNDh7ABEn',
'service_tier': 'default',
'finish_reason': 'stop',
'logprobs': None
} id='lc_run--7c27ffd5-179c-43a8-ba8d-b5e8407f2529-0' usage_metadata={
'input_tokens': 14,
'output_tokens': 16,
'total_tokens': 30,
'input_token_details': {
'audio': 0,
'cache_read': 0
},
'output_token_details': {
'audio': 0,
'reasoning': 0
}
}
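Fallbacks cover failed calls, but a refusal arrives as a perfectly normal response, so it needs its own check. A simple heuristic can be written by hand; the phrase list below is an assumption you should tune for your own domain and model:

```python
REFUSAL_MARKERS = (
    "i'm sorry, but i can't",
    "i cannot help with",
    "as an ai language model",
)

def is_refusal(text: str) -> bool:
    """Heuristic guardrail: flag answers that look like model refusals."""
    lowered = text.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

print(is_refusal("I'm sorry, but I can't help with that."))  # True
print(is_refusal("Python is a programming language."))       # False
```

When `is_refusal` returns True, you can route the prompt to the backup model or return a controlled message instead of the raw refusal.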
4) Regex Pattern Validator
With regular expressions you can check whether the output matches a specific expected pattern.
This is extremely powerful when you have strict requirements — for example for phone numbers, email addresses, postal codes, IDs, invoice numbers, and so on.
from langchain_classic.evaluation import RegexMatchStringEvaluator
evaluator = RegexMatchStringEvaluator()
result = evaluator.evaluate_strings(
    prediction="Order ID: ABC-1234",
    reference=r"^Order ID: [A-Z]{3}-\d{4}$",
)
print(result['score'])
# retry loop: regenerate and re-validate until the pattern matches (max 3 attempts)
attempts = 3
while result['score'] < 1.0 and attempts > 0:
    attempts -= 1
    print('run model once more')
    # regenerate `prediction` with the LLM here, then re-evaluate:
    # result = evaluator.evaluate_strings(prediction=new_prediction, reference=pattern)
output:
1
5) Token Limit Validator
A guardrail that ensures the output doesn’t exceed a defined number of tokens.
This matters because overly long answers can increase costs and sometimes even break application logic.
The example below tracks and prunes the conversation history to a token limit, so the prompt never exceeds the model context:
import json
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_core.messages.utils import trim_messages, count_tokens_approximately
from langchain_openai import ChatOpenAI
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="(long conversation history here / many messages...)"),
]
trimmed = trim_messages(
    messages,
    strategy="last",
    token_counter=count_tokens_approximately,
    max_tokens=256,
    start_on="human",
    include_system=True,
)
llm = ChatOpenAI(model="gpt-4o-mini")
print(json.dumps(llm.invoke(trimmed).response_metadata, indent=4))
output:
{
"token_usage": {
"completion_tokens": 51,
"prompt_tokens": 25,
"total_tokens": 76,
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
},
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
}
},
"model_provider": "openai",
"model_name": "gpt-4o-mini-2024-07-18",
"system_fingerprint": "fp_560af6e559",
"id": "chatcmpl-CZxXrZEP6NDARAHV3gEdZqJBFF6PQ",
"service_tier": "default",
"finish_reason": "stop",
"logprobs": null
}
6) Word Limit Validator
Works similarly, but counts words instead of tokens.
Useful for tasks like generating summaries of a fixed length.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
limit = 25
prompt = f"Write a summary in MAX {limit} words: What is machine learning?"
resp = llm.invoke(prompt).content
if len(resp.split()) > limit:
    # quick fix: ask the model to shorten the answer to the limit
    resp = llm.invoke(f"Shorten this to max {limit} words, without any additions:\n\n{resp}").content
print(resp)
output:
Machine learning is a subset of artificial intelligence that enables systems to learn and improve from data without explicit programming.
How guardrails work in practice
Let’s imagine you’re building a system that generates financial reports.
- You enable JSON Format Validator to ensure the output can be parsed by your application.
- You add a Token Limit Validator so the report isn’t longer than, say, 1000 tokens.
- You include a Regex Pattern Validator to verify that numeric values are returned as numbers, not written out as words.
With this setup, you gain confidence that every response will be not only correct in content, but also usable and safe to process.
And when you combine guardrails with evaluators, you get a complete quality control system — so LLM-based applications are not only intelligent, but also stable and predictable.
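The financial-report setup above can be sketched as a single validation pipeline. This is a minimal illustration, not a production implementation; the field name `revenue`, the word-based length check, and the limits are assumptions chosen for the example:

```python
import json
import re

MAX_WORDS = 1000  # stand-in for the token limit, counted in words for simplicity

def validate_report(raw: str) -> dict:
    """Run all guardrails in order and return the parsed report."""
    # 1) JSON format guardrail: the output must parse
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as err:
        raise ValueError(f"Invalid JSON: {err}")
    # 2) length guardrail: reject overly long reports
    if len(raw.split()) > MAX_WORDS:
        raise ValueError("Report exceeds length limit")
    # 3) regex guardrail: 'revenue' must be numeric, not spelled out
    if not re.fullmatch(r"\d+(\.\d+)?", str(report.get("revenue", ""))):
        raise ValueError("'revenue' must be a number")
    return report

print(validate_report('{"revenue": "1250.50", "currency": "USD"}'))
```

Any failing check raises immediately, so the caller can decide whether to reject the output or send it back to the model for regeneration.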
That’s all for this part dedicated to guardrails. In the next article in this series we will implement Retrieval-Augmented Generation (RAG) to generate answers grounded in source documents.
see next chapter
see previous chapter
see the full code from this article in the GitHub repository
Published via Towards AI
Note: Article content contains the views of the contributing authors and not Towards AI.