Going Beyond Zero/Few-Shot: Chain of Thought Prompting for Complex LLM Tasks
Last Updated on April 7, 2024 by Editorial Team
Author(s): Abhinav Kimothi
Originally published on Towards AI.
It is quite astonishing how Large Language Models, or LLMs (GPT, Claude, Gemini, etc.), have captured the world's imagination. They are a powerful technology that can tackle a variety of natural language tasks.
LLMs are machine learning models that have learned from massive datasets of human-generated content, finding statistical patterns to replicate human-like abilities.
Foundation models, also known as base models, have been trained on trillions of words for weeks or months using extensive computing power. These models have billions of parameters, which represent their memory and enable sophisticated tasks.
Interacting with LLMs differs from traditional programming paradigms. Instead of formal code syntax, you provide natural-language "prompts" to the models.
Getting the most out of LLMs requires carefully crafted prompts, the instructions given to the LLM to guide its output. The technique of giving instructions to an LLM to attain a desired outcome is termed "Prompt Engineering" and has quickly become an essential skill for anyone working with LLMs.
The goal of Prompt Engineering is to construct the prompts to elicit accurate, relevant, and coherent responses from LLMs by providing the right context, examples, and instructions.
Prompting Techniques
While prompt engineering might seem like a simple exercise in English writing, there are techniques that have been demonstrated to extract the best out of LLMs.
While we will talk about a few of these, our focus will be on one novel approach called Chain of Thought (CoT) prompting.
X-Shot Learning
Zero Shot Learning
The ability of an LLM to respond to the instruction in the prompt without any examples is called "Zero-Shot Learning".
One Shot Learning
When a single example is provided to the LLM to illustrate the desired outcome, it's called "One-Shot Learning".
Few Shot Learning
When more than one example is provided to the LLM to better illustrate the desired outcome, it's called "Few-Shot Learning".
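These three settings differ only in how many worked examples the prompt contains. The sketch below assembles plain prompt strings in Python; the sentiment task, labels, and example texts are illustrative, not from the original text:

```python
# Sketch: building zero-, one-, and few-shot prompts as plain strings.
# The resulting string can be sent to any LLM client.

def build_prompt(instruction: str, examples: list[tuple[str, str]], query: str) -> str:
    """Assemble a prompt from an instruction, optional examples, and a query."""
    parts = [instruction]
    for text, label in examples:  # zero examples -> zero-shot prompt
        parts.append(f"Text: {text}\nSentiment: {label}")
    parts.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(parts)

instruction = "Classify the sentiment of the text as Positive or Negative."

zero_shot = build_prompt(instruction, [], "The battery life is superb.")
few_shot = build_prompt(
    instruction,
    [("I loved the film.", "Positive"), ("The service was slow.", "Negative")],
    "The battery life is superb.",
)
print(few_shot)
```

One-shot learning is simply the middle case: a single `(text, label)` pair in the `examples` list.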
Chain-of-Thought (CoT) Prompting
Introducing intermediate "reasoning" steps improves the performance of LLMs on tasks that require complex reasoning, such as arithmetic, common-sense, and symbolic reasoning.
In their paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models", Wei et al. demonstrated that when a few worked examples with intermediate reasoning steps are provided, LLMs naturally produce similar step-by-step reasoning in their own answers.
In this technique, a few logical reasoning steps are added to the prompt as examples for the LLM to understand how to arrive at the desired outcome.
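As an illustration, a few-shot CoT prompt might look like the following. The arithmetic exemplar mirrors the style popularized by Wei et al.; `call_llm` in the comment is a hypothetical client function, not a real API:

```python
# A few-shot Chain-of-Thought prompt: the exemplar shows the reasoning steps,
# not just the final answer, so the model imitates the step-by-step pattern.

cot_prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more,
how many apples do they have?
A:"""

# response = call_llm(cot_prompt)  # hypothetical: the model continues with its own chain
print(cot_prompt)
```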
Zero-Shot Chain-of-Thought
Another idea, "Zero-Shot CoT", was introduced by Kojima et al. (2022): instead of adding worked examples as in Few-Shot CoT, we simply append "Let's think step by step" to the prompt.
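A minimal sketch of this trigger-phrase approach (the question is illustrative):

```python
# Zero-Shot CoT: no exemplars, just an appended trigger phrase
# that nudges the model into producing intermediate reasoning.

def zero_shot_cot(question: str) -> str:
    return f"Q: {question}\nA: Let's think step by step."

prompt = zero_shot_cot(
    "A juggler has 16 balls. Half are golf balls, and half of the "
    "golf balls are blue. How many blue golf balls are there?"
)
print(prompt)
```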
Automatic Chain-of-Thought (Auto-CoT)
As we saw, CoT prompting involves manually creating examples for the LLM, which introduces subjectivity. To reduce this subjectivity, Zhang et al. (2022) introduced Auto-CoT. There are two stages involved in Auto-CoT:
Stage A: Create clusters from a dataset of diverse questions
Stage B: Select one question from each cluster and generate its reasoning chain using Zero-Shot-CoT with simple heuristics
The questions with their reasoning chains in these demonstrations are then used as examples for new questions.
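The two stages can be sketched as follows. This is a toy approximation: the actual Auto-CoT implementation clusters Sentence-BERT embeddings with k-means and applies selection heuristics, whereas here stdlib string similarity stands in, and `generate_chain` is a stub for a real Zero-Shot-CoT call to an LLM:

```python
from difflib import SequenceMatcher

def cluster_questions(questions, threshold=0.6):
    """Stage A (toy): greedily group questions by surface similarity."""
    clusters = []
    for q in questions:
        for c in clusters:
            if SequenceMatcher(None, q, c[0]).ratio() >= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])  # no similar cluster found -> start a new one
    return clusters

def generate_chain(question):
    """Stage B stub: would call the LLM with 'Let's think step by step.'"""
    return f"Q: {question}\nA: Let's think step by step. <model reasoning here>"

questions = [
    "How many apples are left if I eat 3 of 10?",
    "How many apples are left if I eat 4 of 12?",
    "What is the capital of France?",
]
# one representative per cluster (shortest question, a simple heuristic)
demos = [generate_chain(min(c, key=len)) for c in cluster_questions(questions)]
print(len(demos))
```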
The official implementation is available on GitHub at amazon-science/auto-cot ("Automatic Chain of Thought Prompting in Large Language Models").
Benefits of Chain of Thought Prompting
Chain of thought provides several advantages over regular prompting:
- Breaks down multi-step problems into simpler components to enable more efficient solving
- Provides transparency into models' reasoning for interpretability
- Applicable across diverse reasoning tasks like math, commonsense, and symbolic manipulation
- Easily integrates into existing models via prompting, without requiring any architectural change
- Makes models' thought processes relatable to facilitate human-AI collaboration
- Adapts the complexity of the reasoning chain to task difficulty for broad applicability
- Enables error identification by exposing models' step-by-step reasoning logic
- Teaches generalizable structured problem-solving strategies transferable across tasks
Limitations of CoT
However, the chain-of-thought approach to prompting also has a few limitations.
Task Complexity
Chain of Thought prompting offers minimal additional value over standard prompting for tasks that lack multi-step reasoning requirements or cannot be easily decomposed. Its benefits are best realized for problems requiring sequential logic or intermediate explanatory steps.
Prompt Quality
The technique depends heavily on prompt quality to steer models through reasoning chains. Crafting prompts that provide effective stepwise guidance demands care and can prove difficult for complex domains necessitating expert knowledge.
Scalability
While Auto-CoT tries to automate the creation of reasoning chains, producing them remains a complex and labor-intensive process. As the number of tasks grows, the manual effort needed to create or verify the reasoning chains grows with it.
Model Size
Chain of Thought reasoning works well only on very large models, roughly those with more than 100 billion parameters; on smaller models its effectiveness drops. Conversely, how well CoT holds up as model sizes increase further remains to be seen.
Some Advanced Prompting Techniques
While chain-of-thought prompting improves LLM performance on complex reasoning tasks, many other techniques have since emerged, some of which outperform CoT on a variety of tasks.
Self Consistency
While CoT follows a single reasoning chain, Self-Consistency samples multiple diverse reasoning paths and uses their respective generations to arrive at the most consistent answer.
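A minimal sketch of the sampling-and-voting idea; `sample_paths` is a stub standing in for repeated high-temperature CoT calls to an LLM:

```python
from collections import Counter

def sample_paths(prompt: str, n: int = 5):
    """Stub: pretend the model produced n reasoning chains ending in an answer.
    A real implementation would call the LLM n times with temperature > 0."""
    return [
        "5 + 6 = 11. The answer is 11.",
        "2 * 3 = 6; 5 + 6 = 11. The answer is 11.",
        "5 + 2 = 7. The answer is 7.",  # one faulty chain
        "6 + 5 = 11. The answer is 11.",
        "5 + 6 = 11. The answer is 11.",
    ]

def self_consistent_answer(prompt: str) -> str:
    # keep only each path's final answer, then take the majority vote
    answers = [p.rsplit("The answer is ", 1)[-1].strip(". ")
               for p in sample_paths(prompt)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistent_answer("Roger has 5 balls ..."))  # prints 11
```

The faulty chain is outvoted by the four consistent ones, which is the core intuition behind the technique.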
Generated Knowledge Prompting
This technique explores the idea of prompt-based knowledge generation by dynamically constructing relevant knowledge chains, leveraging modelsβ latent knowledge to strengthen reasoning.
Tree of Thoughts Prompting
This technique maintains an explorable tree structure of coherent intermediate thought steps aimed at solving problems.
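One way to picture this is a small beam search over partial reasoning steps; `propose` and `score` below are stubs for the LLM's generation and self-evaluation calls:

```python
# Minimal breadth-first Tree-of-Thoughts sketch: expand each partial "thought"
# into candidate next steps, score them, and keep the best few per level.

def propose(thought: str) -> list[str]:
    """Stub: the LLM would propose candidate next reasoning steps."""
    return [thought + " -> step A", thought + " -> step B"]

def score(thought: str) -> float:
    """Stub: the LLM would rate how promising a partial solution looks."""
    return thought.count("A")  # toy heuristic: prefer branches containing 'A'

def tree_of_thoughts(root: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [root]
    for _ in range(depth):
        candidates = [t for thought in frontier for t in propose(thought)]
        frontier = sorted(candidates, key=score, reverse=True)[:beam]
    return frontier[0]  # highest-scoring complete chain of thoughts

print(tree_of_thoughts("problem"))
```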
Automatic Reasoning and Tool-use (ART)
ART framework automatically interleaves model generations with tool use for complex reasoning tasks. ART leverages demonstrations to decompose problems and integrate tools without task-specific scripting.
Automatic Prompt Engineer (APE)
The APE framework automatically generates and selects optimal instructions to guide models. It leverages a large language model to synthesize candidate prompt solutions for a task based on output demonstrations.
Active Prompt
Active-Prompt improves Chain-of-thought methods by dynamically adapting Language Models to task-specific prompts through a process involving query, uncertainty analysis, human annotation, and enhanced inference.
ReAct Prompting
ReAct integrates LLMs for concurrent reasoning traces and task-specific actions, improving performance by interacting with external tools for information retrieval. When combined with CoT, it optimally utilizes internal knowledge and external information, enhancing the interpretability and trustworthiness of LLMs.
Recursive Prompting
Recursive prompting breaks down complex problems into sub-problems, solving them sequentially using prompts. This method aids compositional generalization in tasks like math problems or question answering, with the model building on solutions from previous steps.
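A sketch of the decompose-then-solve loop; `decompose` and `answer` are stubs for two separate LLM calls:

```python
# Recursive prompting sketch: split a problem into ordered sub-questions,
# answer them one by one, and feed earlier answers into later prompts.

def decompose(problem: str) -> list[str]:
    """Stub: the LLM would split the problem into ordered sub-questions."""
    return ["How many cans were bought?", "How many balls per can?",
            "How many balls in total?"]

def answer(sub_question: str, context: list[str]) -> str:
    """Stub: the LLM answers using previously solved sub-problems as context."""
    return f"[answer to '{sub_question}' given {len(context)} prior answers]"

def solve_recursively(problem: str) -> str:
    solved = []
    for sub in decompose(problem):
        solved.append(answer(sub, solved))  # each step sees earlier solutions
    return solved[-1]  # the final sub-answer resolves the original problem

final = solve_recursively("Roger has 5 balls and buys 2 cans of 3 ...")
print(final)
```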
Prompt engineering has rapidly evolved into a critical discipline for unlocking the full potential of large language models. The field of prompt engineering is still in its infancy, and there is significant room for further innovation and refinement. As LLMs continue to grow in size and capability, new prompting techniques will likely emerge to harness their ever-expanding knowledge and reasoning abilities.
If you're interested in the generative AI space, please read my e-books.
Generative AI with Large Language Models (Coursera Course Notes)
Generative AI with Large Language Models. The arrival of the transformers architecture in 2017, following the publication…
abhinavkimothi.gumroad.com
Retrieval Augmented Generation: A Simple Introduction
How to make a ChatGPT or a Bard for your own data? The answer is in creating an organisation "knowledge brain" and use…
abhinavkimothi.gumroad.com
Generative AI Terminology: An evolving taxonomy to get you started with Generative Artificial…
In the realm of Generative AI, newcomers may find themselves daunted by the technical terminology. To alleviate this…
abhinavkimothi.gumroad.com
Let's connect on LinkedIn -> https://www.linkedin.com/in/abhinav-kimothi/
If you'd like to talk to me, please feel free to book a slot -> https://topmate.io/abhinav_kimothi
Please read my other blogs on Medium
Getting Started with OpenAI: The Lingua Franca of AI
A step-by-step guide to accessing and using the Chat Completions API provided by OpenAI to generate text. Features like…
pub.towardsai.net
Generative AI Terminology: An evolving taxonomy to get you started
Being new to the world of Generative AI, one can feel a little overwhelmed by the jargon. I've been asked many times…
pub.towardsai.net
Progression of Retrieval Augmented Generation (RAG) Systems
The advancements in the LLM space have been mind-boggling. However, when it comes to using LLMs in real scenarios, we…
pub.towardsai.net
Gradient Descent and the Melody of Optimization Algorithms
If you work in the field of artificial intelligence, Gradient Descent is one of the first terms you'll hear. It is the…
pub.towardsai.net
Published via Towards AI