LLM & AI Agent Applications with LangChain and LangGraph — Part 14: 5 Rules of Effective Prompt Engineering
Last Updated on January 2, 2026 by Editorial Team
Author(s): Michalzarnecki
Originally published on Towards AI.

Hi! This time we’ll focus on something that, in practice, often decides whether your work with large language models will be impressive… or disappointing: prompt engineering.
Even the best model — and even the most well-designed workflow in LangChain — won’t meet expectations if we don’t tell it precisely what we want. It’s the same as working with another person on your team: you need to explain the task clearly to minimize the risk of mistakes.
And there’s one more thing: if you define expectations and your definition of done, you can usually expect better results. If you attach examples, you set a pattern — and the model will tend to follow it.
Based on the experience of engineers and researchers, we can point to five principles that come up again and again. They’re universal rules you can apply to almost any task — whether you’re generating code, analyzing data, or creating content.

1) Clear instructions
Treat the model like a very capable teammate who knows nothing about your project. The more useful context you provide, the better the output tends to be.

Or another one:

See the difference? In the second version the model knows its role, the criteria it should apply, and how to present the output.
Let’s also look at a code sample for this rule.
First, import the required libraries and load your API key from .env:
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
load_dotenv()
# Default model
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)
Run example prompts:
# Bad prompt — vague, no role or expectations
bad_prompt = "Write a function in Python."
print("=== Bad prompt ===")
print(llm.invoke(bad_prompt).content)
# Good prompt — clearly defined role and expectations
good_prompt = """You are an expert Python programmer.
Write a Python function that takes a list of integers
and returns a new list containing only even numbers.
Add a unit test in pytest."""
print("\n=== Correct prompt ===")
print(llm.invoke(good_prompt).content)
output:
=== Bad prompt ===
Sure! What kind of function would you like to create in Python? Please provide some details about its purpose, input parameters, and expected output so I can assist you better.
=== Correct prompt ===
Certainly! Below is a Python function that takes a list of integers and returns a new list containing only the even numbers. I've also included a unit test using the `pytest` framework.
### Python Function
```python
def filter_even_numbers(numbers):
    """Return a new list containing only the even numbers from the input list."""
    return [num for num in numbers if num % 2 == 0]
```
### Unit Test Using Pytest
To test the function using `pytest`, you can create a separate test file. Here’s how you would write the unit test:
```python
import pytest
from your_module import filter_even_numbers # Replace 'your_module' with the name of your Python file without the .py extension
def test_filter_even_numbers():
    assert filter_even_numbers([1, 2, 3, 4, 5, 6]) == [2, 4, 6]
    assert filter_even_numbers([0, 1, 2, 3]) == [0, 2]
    assert filter_even_numbers([-2, -1, 0, 1, 2]) == [-2, 0, 2]
    assert filter_even_numbers([]) == []
    assert filter_even_numbers([1, 3, 5]) == []
```
### How to Run the Test
1. Save your function in a Python file, e.g., `even_filter.py`.
2. Save your test in another file, e.g., `test_even_filter.py`.
3. Open a terminal and navigate to the directory where both files are located.
4. Run the tests using the command:
```bash
pytest test_even_filter.py
```
This will execute the tests and let you know if the function behaves as expected. Remember to replace `'your_module'` in the import statement with the actual name of your Python file that contains the `filter_even_numbers` function.
2) Show examples
Models respond extremely well to analogies and learning by comparison. When you include examples, you typically use one of three approaches:
- zero-shot (no examples)
- one-shot (one example)
- few-shot (a few examples)
Zero-shot example:

→ Output: a list in a weird format — sometimes numbered, sometimes with extra commentary.
One-shot example:

→ Output: clean, usable tags.
Few-shot example:
Add a few examples with different texts and tags. The model starts to understand the context better and returns data in a more stable format.

In practice: the more complex the task, the more powerful examples become — especially if you also show the model how to reason step by step.
Let’s see the results by running the code:
# Task: Generate tags based on the content of the company's website
# Zero-shot
zero_shot = """Provide tags describing the company based on the page text:
Lego produces toys for children."""
print("=== Zero-shot ===")
print(llm.invoke(zero_shot).content)
# One-shot
one_shot = """Provide up to three tags that describe the company based on the page text.
Example:
Text: Lego manufactures building blocks for children.
Tags: toys, building blocks, children
Now:
Text: Nike manufactures athletic clothing and shoes.
Tags:"""
print("\n=== One-shot ===")
print(llm.invoke(one_shot).content)
# Few-shot
few_shot = """Provide up to three tags describing the company based on the page text.
Example 1:
Text: Lego manufactures building blocks for children.
Tags: toys, building blocks, children
Example 2:
Text: Nike manufactures sportswear and shoes.
Tags: sports, clothing, footwear
Now:
Text: Tesla manufactures electric cars and energy storage systems.
Tags:"""
print("\n=== Few-shot ===")
print(llm.invoke(few_shot).content)
output:
=== Zero-shot ===
Here are some suggested tags describing the Lego company based on the given text:
1. #Lego
2. #Toys
3. #ForChildren
4. #ToyManufacturer
5. #Creativity
6. #Play
7. #Education
8. #ChildDevelopment
9. #BuildingBlocks
10. #CreativeToys
=== One-shot ===
athletic clothing, shoes, sports
=== Few-shot ===
electric vehicles, energy storage, technology
In the next example, we instruct the model to use a specific writing style taken from the example slogans:
prompt = """Create a list of 5 advertising slogans for a coffee shop.
Use the style from the example slogans below:
"We'll get you coffee on the table!"
"Coffee fast, cheap, and to the point!"
"""
print(llm.invoke(prompt).content)
output 1:
1. "Sip, savor, and start your day right!"
2. "Brewed for you, happiness in every cup!"
3. "Where every cup tells a story!"
4. "Your daily grind, made divine!"
5. "Coffee that fuels your passion!"
prompt = """Create a list of 5 advertising slogans for your coffee shop.
Adopt the style from the example slogans below:
"Our coffee is like a stroll in the Garden of Eden on a sunny, joyful day."
"Drinking our coffee will make you feel as if an angel is dancing on your tongue."
"""
print(llm.invoke(prompt).content)
output 2:
1. "Sip our coffee and let your senses waltz through a blissful sunrise in a serene meadow."
2. "Our brews wrap you in a warm hug from the universe, igniting your spirit with every sip."
3. "Indulge in our coffee, where each cup is a ticket to a dreamy escape beneath a starlit sky."
4. "Taste our coffee and experience the sweet symphony of flavors orchestrating a celebration on your palate."
5. "With every sip of our coffee, feel the sun rise in your heart, filling your day with radiant joy."
3) Define the output format
Few things are more frustrating than getting results in the wrong format — when you expect JSON and you receive an essay… or an extra follow-up question.

Or for code review:

This way the output is immediately ready for automated processing — loading into an app, running validation, or feeding into tests.
Let’s also jump to the code for this example:
from typing import List
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
# Output schema definition (Pydantic)
class CityGuide(BaseModel):
    city: str = Field(..., description="The city covered by the guide")
    summary: str = Field(..., description="A short description of the city (2–3 sentences)")
    must_do: List[str] = Field(..., description="A list of 3–5 must-do activities")

# LLM + structured output
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
structured_llm = llm.with_structured_output(CityGuide)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a travel expert. Answer concisely."),
    ("user", "Create a quick guide to {city} for a {days}-day visit.")
])

chain = prompt | structured_llm

class GuideRequest(BaseModel):
    city: str = Field(min_length=2)
    days: int = Field(ge=1, le=7)
req = GuideRequest(city="Wroclaw", days=2)
result: CityGuide = chain.invoke(req.model_dump())
print(result)
print(result.model_dump_json(indent=2))
output:
city='Wrocław' summary='Wrocław, a vibrant city in western Poland, is known for its stunning architecture, rich history, and lively cultural scene. The city is famous for its picturesque Market Square, charming canals, and the unique dwarf statues scattered throughout the streets.' must_do=['Explore the Market Square and admire the Gothic-style Town Hall.', 'Visit the Centennial Hall, a UNESCO World Heritage site.', 'Take a stroll along the Odra River and enjoy the scenic views.', 'Discover the Wrocław Cathedral on Ostrów Tumski island.', 'Search for the Wrocław dwarfs scattered around the city.']
{
  "city": "Wrocław",
  "summary": "Wrocław, a vibrant city in western Poland, is known for its stunning architecture, rich history, and lively cultural scene. The city is famous for its picturesque Market Square, charming canals, and the unique dwarf statues scattered throughout the streets.",
  "must_do": [
    "Explore the Market Square and admire the Gothic-style Town Hall.",
    "Visit the Centennial Hall, a UNESCO World Heritage site.",
    "Take a stroll along the Odra River and enjoy the scenic views.",
    "Discover the Wrocław Cathedral on Ostrów Tumski island.",
    "Search for the Wrocław dwarfs scattered around the city."
  ]
}
4) Break complex tasks into smaller parts
Models are great at simple questions, but they’re less reliable when you pack everything into one huge request.

Now the model can answer each step accurately and completely — and you stay in full control of the process.
Code example:
# Bad prompt — everything at once
bad_prompt = """Prepare a three-day sightseeing plan for Poznań with a budget of 300 euros,
including attractions, restaurants, transport, and maps."""
print("=== Bad prompt ===")
print(llm.invoke(bad_prompt).content[:600], "...")
# Good prompt — step by step
good_step1 = "List the most important cultural attractions in Poznań with opening hours."
step1 = llm.invoke(good_step1).content
print("\n=== Correct prompt — step 1 ===")
print(step1[:600], "...")
good_step2 = f"Based on this list, create a sightseeing plan for 3 days, max. 4 attractions per day. Attractions: {step1}"
step2 = llm.invoke(good_step2).content
print("\n=== Correct prompt — step 2 ===")
print(step2[:600], "...")
output:
=== Bad prompt ===
Here's a three-day sightseeing plan for Poznań, Poland, with a budget of 300 euros. This plan includes attractions, restaurants, transport, and maps.
### Day 1: Explore the Old Town
**Morning:**
- **Breakfast:** Start your day with breakfast at **Café La Ru** (approx. €5).
- **Attraction:** Visit the **Old Market Square** (Stary Rynek) to see the colorful townhouses and the famous **Town Hall** (free entry).
- **Attraction:** Watch the **Goats of Poznań** at noon (free).
**Lunch:**
- **Restaurant:** Enjoy lunch at **Ratuszova** (approx. €10).
**Afternoon:**
- **Attraction:** Visit the **Na ...
=== Correct prompt — step 1 ===
Poznań, a vibrant city in Poland, is rich in history and culture. Here are some of the most important cultural attractions along with their typical opening hours. Please note that hours may vary, so it's always a good idea to check the official websites or contact the venues directly for the most current information.
1. **Old Market Square (Stary Rynek)**
- **Description**: The heart of Poznań, featuring colorful townhouses, the Renaissance-style Town Hall, and numerous cafes and shops.
- **Opening Hours**: Open year-round, accessible at all times.
2. **Poznań Town Hall (Ratusz)**
- ...
=== Correct prompt — step 2 ===
Here’s a 3-day sightseeing plan for Poznań, focusing on a maximum of 4 attractions per day to ensure a relaxed and enjoyable experience.
### Day 1: Exploring the Heart of Poznań
1. **Old Market Square (Stary Rynek)**
- **Description**: Start your day at the vibrant heart of Poznań, where you can admire the colorful townhouses and enjoy a coffee at one of the many cafes.
- **Time**: Morning (around 9:00 AM)
2. **Poznań Town Hall (Ratusz)**
- **Description**: Visit the historic Town Hall, where you can see the mechanical goats at noon and explore the Museum of the History of the City ...
5) Don’t blindly trust the output — always test and verify
Even if the model sounds confident, it can still be wrong.
For example, after generating a Python function, verify that:
- the code runs,
- it works on sample inputs,
- it handles edge cases,
- it follows security best practices.
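For the filter_even_numbers function generated back in Rule 1 (repeated here so the snippet is self-contained), that checklist can be turned into executable assertions:

```python
def filter_even_numbers(numbers):
    """Return a new list containing only the even numbers from the input list."""
    return [num for num in numbers if num % 2 == 0]

# The code runs and works on a sample input.
assert filter_even_numbers([1, 2, 3, 4, 5, 6]) == [2, 4, 6]

# Edge cases: empty list, negatives, zero, no evens at all.
assert filter_even_numbers([]) == []
assert filter_even_numbers([-2, -1, 0]) == [-2, 0]
assert filter_even_numbers([1, 3, 5]) == []

print("all checks passed")
```

The point is not the specific assertions but the habit: generated code goes through the same checks as code written by a colleague.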
If the model generates a database query (like SQL), always test it first on a small dataset and check the execution plan — for example with EXPLAIN ANALYZE.
Remember: AI is an assistant. Final responsibility for quality, safety, and correctness is still on your side.
A final example: generating vehicle control commands
Let’s take a real-world example: generating a command to control a vehicle.
We send an image from the vehicle’s front camera to a multimodal model, i.e. one that accepts not only text but also inputs such as images. The input is an image; as output we want a single word indicating the command.



What can we verify?
- If we have a strictly defined list of allowed outputs, validate whether the model’s answer is on that list.
- If outputs are dynamic, validate the type — for example: is the returned value numeric or text?
- Validate the format too — for example: word count, presence/absence of extra text, etc.
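For the strict-list case, a minimal validation sketch; the allowed command set and the normalization rules here are assumptions for illustration:

```python
# Hypothetical set of allowed vehicle commands.
ALLOWED_COMMANDS = {"forward", "left", "right", "stop"}

def validate_command(raw_output: str) -> str:
    """Check that the model returned exactly one allowed command word."""
    cleaned = raw_output.strip().lower().rstrip(".")
    # Format check: the answer must be a single word, with no extra text.
    if len(cleaned.split()) != 1:
        raise ValueError(f"Expected a single word, got: {raw_output!r}")
    # Allow-list check: the word must be one of the defined commands.
    if cleaned not in ALLOWED_COMMANDS:
        raise ValueError(f"Unknown command: {cleaned!r}")
    return cleaned

print(validate_command("Stop."))  # normalized to: stop
```

A gate like this between the model and the actuator means a malformed or unexpected answer raises an error instead of silently steering the vehicle.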

I’ll show more verification rules later, in the chapter about evaluation and so-called guardrails. Here is some sample code for embedding-distance evaluation:
from langchain_classic.evaluation import load_evaluator
from dotenv import load_dotenv
load_dotenv()
evaluator = load_evaluator("embedding_distance", embeddings_model="openai")
result = evaluator.evaluate_strings(
    prediction="Capital of Poland is Warsaw",
    reference="something very different"
)
print(result)
output:
{'score': 0.27325426536671193}
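Under the hood, the evaluator embeds both strings and, by default, returns the cosine distance (1 minus cosine similarity), so lower scores mean more similar texts. A toy sketch of that formula with made-up vectors in place of real embeddings:

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity: 0.0 for identical direction, up to 2.0 for opposite."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical vectors -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors -> 1.0
```

This makes the score above readable: roughly 0.27 means the two sentences are fairly far apart in embedding space, as expected for unrelated texts.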
Summary: five principles of effective prompt engineering
- Clear instructions — tell the model what role it plays and what it should do.
- Provide examples — show exactly what you expect.
- Define the output format — impose structure so the result is usable.
- Break the task into steps — split complex problems into smaller parts.
- Verify the results — test and validate outputs just like you would with human-written code.
That’s all in this chapter dedicated to effective prompt engineering.
In the next chapters we will focus on different evaluation techniques for LLM output.
see next chapter
see previous chapter
see the full code from this article in the GitHub repository
Published via Towards AI
Note: Article content contains the views of the contributing authors and not Towards AI.