GPT-5 is now Trademarked by OpenAI: What Does That Say About the Future of ChatGPT?
Last Updated on August 12, 2023 by Editorial Team
Author(s): Aditya Anil
Originally published on Towards AI.
What is it hinting at? ChatGPT-5?
I. The Trademark of GPT-5
In a 2014 BBC interview, Stephen Hawking said the following:
The development of full artificial intelligence could spell the end of the human race.
The state of AI in 2014 was different from today. AI was picking up interest in the corporate world. That year, Google bought DeepMind, a machine learning startup, for over $600 million. A year later, DeepMind created AlphaGo, which went on to beat Fan Hui, the European Go champion. Facebook, meanwhile, was building a system that could predict whether two pictures showed the same person.
Deep learning was in a golden age. A year later, in December 2015, a small startup named OpenAI was founded. And now, not even a decade later, after what feels like a century of advancements in AI, OpenAI has filed a trademark application for "GPT-5" with the United States Patent and Trademark Office (USPTO) on July 14.
This move by OpenAI sparked plenty of speculation. Many say it hints at the development of a new version of their language model, the successor to GPT-4.
The news surfaced in a Twitter/X post by trademark attorney Josh Gerben on July 31.
The trademarking of GPT-5 came as a surprise for many of us.
What is it hinting at?
II. OpenAI's Code Interpreter: A Stealthy Launch Bridging GPT-4.5 and GPT-5?
Not too long ago, OpenAI released ChatGPT's newest feature: Code Interpreter. It was by far the most impressive addition to GPT-4. With Code Interpreter, you can now run Python programs inside ChatGPT and upload, and even download, files. It can even work with images to some extent.
In a podcast on Latent Space (July 11), Simon Willison, Alex Volkov, Aravind Srinivas, and Alex Graveley argued that Code Interpreter is actually GPT-4.5. Of course, OpenAI hasn't confirmed whether it is GPT-4.5 or not. But this isn't new; we saw similar behavior when OpenAI quietly released GPT-3.5.
This time, however, OpenAI might simply have skipped announcing GPT-4.5, in keeping with Sam Altman's (CEO of OpenAI) statement about adhering to the spirit of the six-month pause letter.
When asked about the viral open letter urging a six-month pause in AI development, Sam said the following:
"There are parts of the thrust that I really agree with… We spent more than six months after we finished training GPT-4 before we released it, so taking the time to really study the safety of the model, to really try to understand what's going on and mitigate as much as you can is important."
In the same conversation, Sam's comment on GPT-5 development was:
"[OpenAI] are not, won't for some time [develop new versions of GPT], so in that sense [the six-month pause] was sort of silly."
This talk was held at MIT in March of this year. You can watch the short clip here.
Based on this, many of us became convinced that a GPT-5 release anytime soon was unlikely. The long gap between finishing training and releasing GPT-4 suggested that work on GPT-5 had not even started yet.
At least, that was what was expected.
However, OpenAI trademarking GPT-5 changes the picture. Could OpenAI already be developing GPT-5? Or is it a marketing tactic to hype up AGI, a hypothetical AI that can perform any intellectual task a human can?
Squinting a little, we can find clues in the trademark application itself.
III. Trademarking Tomorrow: GPT-5's Odyssey into the Multimodal Frontier
Going into a bit more detail, the GPT-5 trademark application refers to "downloadable computer programs and computer software related to language models." This means the trademark covers the "programs" and "computer software" related to LLMs.
GPT-5 could actually be an LLM that upcoming iterations of GPT-4 could utilize.
Additionally, the main crux of the hint lies in the rest of the application's description. It covers software for the artificial production of human speech and text, language processing, and machine learning, as well as software for voice and speech recognition, converting audio files to text, and more.
Does that sound familiar? A chatbot that, apart from generating responses, can work with images, voice, speech, and so on?
Ha! The multimodality of GPT.
Multimodality refers to the ability to work with more than one type of input: images, text, audio, and so on. People anticipated the release of GPT-4 with "the future is here" placards all over the internet. The anticipation rose further when we learned that GPT-4 could "presumably" work with images in the near future. During the GPT-4 demo livestream four months ago, we saw many impressive capabilities of the model, including the ability to interpret memes and images, describe the various elements of an image, and so on.
Greg Brockman, president and co-founder of OpenAI, demonstrated how he created a website using GPT-4: he fed it a photo of an idea sketched in his notebook, and GPT-4 generated the code for the website. That was pretty impressive. We were convinced that the future was indeed near.
But how near is it? As of now, the closest multimodal experience I have had is with Bing Chat, which runs on GPT-4. You can, in theory, search online using images and get results based on them. However, Bing still feels rough and needs work. An experiment by Roboflow tested how good this multimodal feature of Bing really is.
Here are some noteworthy findings from the report:
"…The model was subpar at counting the number of people that were present in an image. Surprisingly, asking the model for a simple structured format (in the form of a JSON) worked much better than most other prompts. With that said, Bing could not extract exact locations or bounding boxes, either producing fabricated bounding boxes or no answer at all…"
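Roboflow's observation about structured prompts is easy to picture. As a sketch (the prompt wording and the model reply below are hypothetical illustrations, not Roboflow's actual data), asking for JSON makes the answer trivially machine-readable:

```python
import json

# Hypothetical prompt asking a multimodal chat model for structured output
# instead of free-form prose.
prompt = (
    "Count the people in this image. "
    'Respond only with JSON of the form {"person_count": <int>}.'
)

# Hypothetical model reply; a structured format is easy to parse and check.
reply = '{"person_count": 4}'
parsed = json.loads(reply)
print(parsed["person_count"])  # prints 4
```

A free-text reply like "I can see roughly four people" would need fragile string parsing, which is likely why the structured prompts performed better in the experiment.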
Roboflow summarized the strengths and weaknesses:
One strength of the underlying Bing Chat model is its ability to recognize qualitative characteristics, such as the context and nuances of a situation, in a given image…
And
There are notable limitations to how Bing's new features can be used, especially in use cases where quantitative data is important.
Certainly, you cannot use it to build a website, as shown in Brockman's demo, which makes Bing only "nearly multimodal" at best. I fed it some memes myself, and it couldn't explain the humour in them the way the livestream demo did. Either this feature needs refinement, or my taste in memes is just bad. In my case, both are equally likely (I'm not a big fan of memes).
Right now, only Bing Search, based on GPT-4, lets you search using images. But the responses are not up to the mark, it seems.
In the case of ChatGPT, especially GPT-4, you can loosely associate multimodality with the Code Interpreter. It lets you work with documents and images along with the power of ChatGPT. A document or an image is indeed a "new input" that differs from text, which does place GPT-4 under multimodality. So it would be wrong to say that GPT-4 isn't multimodal yet.
Code Interpreter gives a taste of multimodality. It sets expectations for ChatGPT's future capabilities.
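To make that concrete, the kind of task Code Interpreter handles is ordinary Python run against an uploaded file. Here is a self-contained sketch, with made-up data standing in for an uploaded CSV:

```python
import csv
import io
import statistics

# Made-up CSV contents standing in for a file a user might upload to ChatGPT.
data = "city,temp\nDelhi,34\nMumbai,30\nChennai,33\n"

# Load the rows and compute a simple summary, the sort of on-demand
# analysis Code Interpreter performs on uploaded files.
rows = list(csv.DictReader(io.StringIO(data)))
temps = [int(row["temp"]) for row in rows]
print(f"{len(rows)} rows, mean temp {statistics.mean(temps):.1f}")
```

The point is that the model writes and executes this kind of code itself, then hands back the result (or a downloadable file), which is what makes the feature feel like a step beyond plain text chat.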
Hello readers! I hope you are enjoying this article. It is part of Creative Block, my weekly newsletter on tech and AI.
If you'd like to read more content like this, head over to Creative Block.
Judging from the phrase "artificial production of human speech and text" in the trademark, GPT-5, if it is ever released, would likely be heavily multimodal: a ChatGPT that works with text, of course, but also with images, speech, documents, and so on.
So, does that mean a GPT-5 release is around the corner? Not really, if we believe Sam. Saying GPT-5 will be released anytime soon would contradict Sam Altman's own statement: back in April, he confirmed that the company wasn't working on GPT-5.
If that's true, trademarking GPT-5 seems to be about securing the rights to the next iteration of GPT models in advance. This would keep other companies at bay and reduce "competition." GPT-5 may or may not be the AGI many anticipate, and experts seem to suggest that AGI isn't yet possible.
However, there is another way to see this trademark move: through the lens of hype and hope, something OpenAI seems to have mastered early on.
IV. Hype, Hope, and Dreams of AGI
In a blog post, Sam declared that his company's artificial general intelligence (AGI) will benefit humanity and "has the potential to give everyone incredible new capabilities."
But we are nowhere near AGI. Is it even possible? We don't know.
The "experienced experts" believe we are far from AGI. Meanwhile, the "AI doomers" believe we are close to it. And the "AI influencers" don't care either way, as long as there is content to post. All these people hold varied opinions on the future of AI, but one thread ties them together: somewhere underneath, they are all rowing in the stream of hype. Some row against it, some drift with it. And OpenAI seems to direct the flow.
Reporter Karen Hao, who wrote an extensive report on OpenAI's company culture in 2020, suggests that OpenAI's internal culture has come to reflect less a commitment to safe, research-driven AI and more a drive to get ahead of everyone else, in effect accusing the company of fueling the "AI hype cycle."
Here's an excerpt from the post.
But OpenAI's media campaign with GPT-2 also followed a well-established pattern that has made the broader AI community leery. Over the years…splashy research announcements have been repeatedly accused of fueling the AI hype cycle…critics have also accused the lab of talking up its results to the point of mischaracterization. For these reasons, many in the field have tended to keep OpenAI at arm's length.
Karen Hao, in "The messy, secretive reality behind OpenAI's bid to save the world" | MIT Technology Review
But let's assume the hype and rumors are true: OpenAI is building GPT-5 in its secret dungeon.
They claim GPT-5 will be so impressive that it will make humans question whether ChatGPT has reached AGI. The future is here, once again.
Going by the narrative and the hype, GPT-5 (or ChatGPT-5) would bring the following to the table:
- Multimodal capabilities: GPT-4 can already handle image and text inputs, and that's a good start. But there is still scope for audio and video inputs. Companies like Google and Meta have already demonstrated various text-to-speech and text-to-music tools, and Google experimented with multimodal AI while developing the PaLM 2 language model. These capabilities are still fragmented, though. If the rumors are true, the next ChatGPT would be a culmination of all these multimodal features: an all-in-one ChatGPT, if possible. And of course, the competition in generative AI pushes OpenAI and other AI companies to build something close to AGI. That's the expectation of this hype-driven AI race.
- Improved accuracy: While it seems impossible to remove hallucination entirely (the tendency of AI to make up facts), we have seen improvements in newer GPT versions. According to OpenAI, GPT-4 is 60% less likely to make things up. Successive AI models try to be more accurate than their predecessors; we saw this from GPT-3 to GPT-4, Llama to Llama 2, and even Claude to Claude 2, where accuracy noticeably improved. A future version of GPT might expand its training dataset to fix inaccuracies. However, that would make it resource-heavy; even the current ChatGPT reportedly costs $700,000 per day to run. Without a better way to make the model more accurate yet less resource-demanding, GPT-5 would remain far off.
- Artificial general intelligence (AGI): This is the final destination every AI research company is heading toward. Whether it is achievable is still under debate, but it is reasonable to say that AGI is unattainable anytime soon. AGI, in theory, is an AI that can do anything on its own, but how to approach it practically is where the roadblock lies. Computers are not out in the world; to do tasks for humans, they need to interact with the environment. How to go about that? Nobody quite knows, but the answers seem to lie at the intersection of neuroscience and deep learning. If GPT-5 achieves AGI, which is highly speculative, it would be yet another milestone, not just for AI but for the whole field of technology. Conjuring a living, thinking mind out of algorithms would undoubtedly be marvelous.
V. Forging the AGI dream
As I write this, the GPT-5 trademark application is awaiting examination. But whenever things like this grab headlines, they spark a lot of curiosity, as well as speculation, in the AI community. The crowd always splits in two: those who view it skeptically, and those who view it optimistically. One class believes in the facts of yesterday; the other believes in the hopes of tomorrow. Nonetheless, both are equally important, especially when it comes to governing AI.
With tighter regulations and laws, the likes of the EU AI Act and the US AI Bill, it is getting harder for AI companies to rush out breakthroughs. But are such strict measures justified? I believe they are.
If you look at the pace of development in AI over the past few years, the growth has been exponential.
But the safety concerns arising from growing corporate competition are worrying. OpenAI became a for-profit company, and investors started to pour money into any company turning "AI-powered," making the AI race intensely competitive.
Progress alone isn't enough. We need safe progress: safe progress in the development of NLP, multimodality, and artificial general intelligence.
But pushing for trademarks, whether to protect intellectual property or as a marketing strategy to create hype and anticipation, doesn't lower competition. It increases it.
That said, if GPT-5 lives up to our expectations, it will undoubtedly be a game-changer in the field of AI yet again. But only if it becomes something close to AGI, if not complete AGI.
Yet even in our wildest dreams, if we do get to AGI, safety and regulation have to be the priority. Otherwise, our pursuit of AGI in the AI race, in Hawking's words, could spell the end of the human race.
AGI in the wild can do wonders, including wonders of destruction.
Are you interested in keeping up with the latest events in tech, science, and AI?
Then you won't want to miss my free weekly newsletter on Substack, where I share insights, news, and analysis on all things tech and AI.
creativeblock.substack.com
Published via Towards AI