

GPT-4: A New Era of AI

Last Updated on August 1, 2023 by Editorial Team

Author(s): Jan Werth

Originally published on Towards AI.

A Summary of “Sparks of Artificial General Intelligence”

Photo by Dawid Zawiła on Unsplash


In this article, I would like to give you a short overview of the most important findings and GPT-4 insights from the “Sparks of Artificial General Intelligence” paper (Bubeck et al., 2023). What are the main improvements compared to ChatGPT? Why will GPT-4 and its successors change our daily lives drastically?

A group of Microsoft researchers had a head start of around five months to experiment with GPT-4. They report some amazing findings, which I want to summarize here quickly. Thanks to Sébastien Bubeck et al. for the great work.

Table of Contents

Seven Main Findings
– Output Images from Text-Prompts
– Produce Quality Code
– Math
– Ability to Use Tools
– Interacting With the World
– Understanding Humans
– Weakness & Social Influences

Seven Main Findings

1 — Output Images from Text-Prompts

GPT-4 can draw images from a text prompt by writing drawing code such as SVG or TikZ. The images are not perfect, but they can be polished with, e.g., Stable Diffusion.

If you know the results from Stable Diffusion, this might not sound very impressive. Here, however, image generation is integrated into the conversation, which makes it more dynamic. Like any intense discussion, a chat profits from a quick sketch on the whiteboard. Further, this can serve as a great primer for Stable Diffusion: you first get an image in which things are placed where you need them, and only later is it graphically enhanced.

2 — Produce Quality Code

GPT-4 is able to pass mock technical interviews on LeetCode. The authors suggest that GPT-4 could be hired as a software engineer in its current state.

They tested different difficulty levels and compared the results to human performance. For the comparison, they excluded all human users who did not answer a single question correctly, so that the baseline reflects at least a basic knowledge of coding. The results are astonishing, especially in the hard bracket (see table below).

LeetCode comparison of GPT-4 vs. humans [2023 Bubeck et al.]

Besides plain code, GPT-4 can also produce running 3D games with a zero-shot approach. The games are not trivial and work right from the get-go. ChatGPT (GPT-3.5) cannot do this.

3 — Math

Besides being able to use a calculator as a tool, GPT-4 was tested on problems from the International Mathematical Olympiad (IMO) and could produce proofs for them. Those problems are not simple calculations; they require a higher level of understanding and creativity to solve.

I would not even try to solve those problems. This is a remarkable step toward general problem-solving capabilities. Being able to solve non-trivial problems by following a non-linear strategy is a key capability for the autonomous “thinking” system we are all longing for (not the world-domination type, but rather the do-the-dishes-for-me type).

4 — Ability to Use Tools

A very important finding is the ability to use tools correctly without demonstrations and with only limited instructions. The example they use is a calculator: GPT-4 simply uses it to solve simple arithmetic problems. ChatGPT (GPT-3.5), by contrast, could not use any tools and would produce the answer by estimating the next token, in this case a number. GPT-4 increases the reliability of its answers by using the right tool for the right job. On top of that, the model can combine multiple tools to solve more complex problems, such as penetration testing of a security system.

Further, GPT-4 can use tools, understand their output, and use the results to advance toward a solution. As an example, it uses APIs to access calendars, browses the internet for information, communicates the findings via email, sets an appointment, and finally informs all participants about the appointment in an email.

The assistant we have all been waiting for so desperately is knocking on the door.
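To make the calculator idea a bit more concrete, here is a minimal Python sketch of how a harness around a language model could hand arithmetic to a tool instead of letting the model guess the digits. Everything in it is my own assumption for illustration and not the setup used in the paper: the `CALC[...]` marker, the function names `calculator` and `resolve_tool_calls`, and the example sentence are all hypothetical.

```python
import ast
import operator
import re

# Arithmetic operators the toy calculator tool accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculator(expression: str):
    """Safely evaluate plain arithmetic (numbers, + - * /, parentheses)."""

    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Names, function calls, etc. are rejected on purpose.
        raise ValueError(f"unsupported expression: {expression!r}")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical marker that the model is assumed to emit when it wants the tool.
CALC_PATTERN = re.compile(r"CALC\[(.*?)\]")


def resolve_tool_calls(model_output: str) -> str:
    """Replace every CALC[...] request in the text with the tool's result."""
    return CALC_PATTERN.sub(lambda m: str(calculator(m.group(1))), model_output)


# Assumed model output: the text is the model's, the number comes from the tool.
draft = "The total is CALC[(12 + 30) * 2] units."
print(resolve_tool_calls(draft))
```

The point of the design is the division of labor: the model only decides when a tool is needed and what to ask it, while the exact result comes from ordinary code.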

5 — Interacting With the World

Also interesting: GPT-4 can produce map visualizations from a text prompt and navigate through the described environment without getting lost. This could be relevant for autonomous navigation in the near future. A nice example would be cleaning robots that receive audio commands, translate them into text prompts, and build their internal map based on them. This map would then be explored and optimized without any further input. We might not notice a large difference in the improved orientation skills. However, this is also an important step toward autonomously acting assistants.

Left: The true map and exploration path of GPT-4. Right: The map that GPT-4 generates.
[2023 Bubeck et al.]
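As a rough illustration of what "building an internal map from text" can mean, the following Python sketch turns a log of moves into a room graph and then searches it for a route. It is only a sketch under my own assumptions: the room names, the log format, and the functions `build_map` and `shortest_route` are hypothetical and not taken from the paper.

```python
from collections import deque

# Hypothetical exploration log: (room, direction taken, room reached).
LOG = [
    ("hall", "north", "kitchen"),
    ("kitchen", "east", "pantry"),
    ("hall", "east", "living room"),
]

OPPOSITE = {"north": "south", "south": "north", "east": "west", "west": "east"}


def build_map(log):
    """Build a room graph from observed moves, assuming doors work both ways."""
    rooms = {}
    for src, direction, dst in log:
        rooms.setdefault(src, {})[direction] = dst
        rooms.setdefault(dst, {})[OPPOSITE[direction]] = src
    return rooms


def shortest_route(rooms, start, goal):
    """Breadth-first search; returns the list of directions, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        room, moves = queue.popleft()
        if room == goal:
            return moves
        for direction, nxt in rooms[room].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, moves + [direction]))
    return None


house = build_map(LOG)
print(shortest_route(house, "living room", "pantry"))
```

The route from the living room to the pantry was never walked directly during exploration; it only falls out of the map, which is the kind of orientation skill this section is about.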

6 — Understanding Humans

This is a really remarkable section of the paper. The researchers tested several models on “theory of mind”. Theory of mind describes the ability to understand what drives another person's behavior. It is important for understanding the frame of a situation based on the feelings, beliefs, desires, intentions, and knowledge of the people involved. It goes even further and includes understanding how the beliefs and emotions of third and fourth parties influence the situation.

GPT-4 passes advanced theory of mind tests that are used to assess how well a subject understands the perspective of its counterpart. In contrast to ChatGPT, GPT-4 can even identify the root causes of a misunderstanding in a realistic situation or discussion. Asked about a possible way out of a stuck conversation, it offered several sensible solutions.

While not fully tested in all regards, GPT-4 passes not only tests used on children but also more complex, nuanced, and realistic theory of mind scenarios. This further increases the capability of the model to explain its reasoning. And as theory of mind is in part connected with empathy, this is, most importantly, another huge step toward AGI.

7 — Weakness & Social Influences

GPT-4 is brilliant at incremental tasks such as math problems, where one solution leads to the next problem. However, GPT-4 is an autoregressive model: it generates its output token by token, and each token depends only on what came before. This means that the model does not know in advance what it will produce later. This might explain its weakness at generating complex music compositions: for a good composition, you must know what the parts of your composition are, so that you can build up toward them and interconnect them properly. GPT-4 cannot work backward.
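What "autoregressive" means can be shown with a tiny Python sketch. The probability table and the function names below are made up for illustration; a real model like GPT-4 conditions on the whole prefix with a neural network, while this toy only looks at the last token. The loop, however, has the same shape: pick the next token from what exists so far, append it, and never go back.

```python
# Hypothetical toy "model": probabilities for the next token.
# Assumption for brevity: only the last token of the prefix is used.
NEXT_TOKEN = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"melody": 0.7, "end": 0.3},
    "a": {"melody": 1.0},
    "melody": {"rises": 0.8, "</s>": 0.2},
    "rises": {"</s>": 1.0},
    "end": {"</s>": 1.0},
}


def next_token_distribution(prefix):
    """Return next-token probabilities given everything generated so far."""
    return NEXT_TOKEN[prefix[-1]]


def generate(max_tokens=10):
    """Greedy autoregressive decoding: one token at a time, left to right."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = next_token_distribution(tokens)
        token = max(dist, key=dist.get)  # most likely continuation
        if token == "</s>":
            break
        # Once appended, a token is never revised; later steps build on it.
        tokens.append(token)
    return tokens[1:]


print(generate())
```

Nothing in the loop can plan for a later part of the output and then write toward it, which is the limitation described above.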

Another problem is misinformation. GPT-4 is capable of producing fake news and propaganda on a large scale using images, links, tools, and so on. This might have huge implications for the future use of GPT-4 as a manipulation or scamming tool.


For me, using tools and interacting as a personal assistant is the most exciting finding, as it will impact our daily lives most directly. However, understanding the human counterpart is the biggest step toward next-level AI in general.

What a time to be alive! WOW!
