A Few Examples Where GPT-4o Fails to Think/Reason!!
Author(s): Ganesh Bajaj
Originally published on Towards AI.
Large language models (LLMs) are like students who memorized a lot but don't always understand. They're great at giving solutions based on what they've learned, but sometimes they struggle with actual thinking.
Simply put, LLMs still fail where thinking is involved. If we reason with a model or guide it, it will probably reach the correct solution, but on the first attempt it sometimes fails. This makes LLMs less consistent and only as trustworthy as our own knowledge and logical thinking.
This is to be expected, given that an LLM is, after all, only as good as the data it was trained on. The responses we receive are merely the result of the training we have put it through.
Below are a few examples I came across and tried on GPT-4o myself:
ChatGPT-4o result: image illustrated by author.
Here, at first sight we can see that the orange dot in the right figure is larger, but the LLM fails!!! We can reason with the model and guide it to the correct response, but that makes it less trustworthy and inconsistent when developing GenAI applications. In GenAI solutions, a critique step is a must to guide the model to the correct response; a rough sketch of one follows below.
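As a minimal sketch of what such a critique step could look like in an application, the snippet below chains two calls through the official OpenAI Python client: one to get GPT-4o's initial answer, and one to make the model check that answer before it is returned. The prompts, the example question, and the `ask_with_critique` helper are illustrative assumptions, not code from this article.

```python
# Minimal sketch of an answer-then-critique loop, assuming the official
# `openai` Python client. Prompts, question, and helper name are illustrative.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def ask_with_critique(question: str) -> str:
    """Ask GPT-4o a question, then run a second critique pass over its answer."""
    # First pass: get the model's initial answer.
    first = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    answer = first.choices[0].message.content

    # Second pass: ask the model to check the answer step by step and
    # correct it if the reasoning does not hold.
    critique_prompt = (
        f"Question: {question}\n"
        f"Proposed answer: {answer}\n"
        "Check the reasoning step by step. If the answer is wrong, "
        "reply only with the corrected answer; otherwise repeat it."
    )
    second = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": critique_prompt}],
    )
    return second.choices[0].message.content


if __name__ == "__main__":
    # Illustrative question only; any prompt the model tends to fumble works.
    print(ask_with_critique("Is 9.11 greater than 9.9?"))
```

A second pass like this does not guarantee a correct result, but it automates the "reason with the model and guide it" step described above instead of relying on a human to catch the mistake.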