OpenAI JSON Mode vs Functions

Last Updated on March 13, 2024 by Editorial Team

Author(s): João Lages

Originally published on Towards AI.

When working with the OpenAI API, you’ll encounter two primary methods for obtaining structured output responses from GPT models: JSON mode and Function calling.

Both are powerful tools, but understanding when to use each can significantly enhance your workflow. Let’s delve into the differences and optimal use cases for each approach.

JSON mode

When JSON mode is enabled, the model exclusively generates outputs formatted as valid JSON strings. However, to ensure the desired JSON structure, it’s crucial to explicitly specify it somewhere within the prompt, like in the example below.

import openai

# System prompt that spells out the exact JSON structure we expect back
system_prompt = """You are a helpful assistant designed to output this JSON format:
```
{
    "answer": "<your answer to the user's question>"
}
```
"""

response = openai.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={"type": "json_object"},  # enables JSON mode
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Who won the world series in 2020?"},
    ],
)

print(response.choices[0].message.content)
# '{"answer": "The Los Angeles Dodgers won the World Series in 2020."}'

Note that OpenAI gives no guarantee that the output text will follow your specified JSON format; it only guarantees a valid string that can be parsed as JSON.

In JSON mode, the model is forced to output only tokens that conform to the grammar of a valid JSON object. This technique, known as constrained decoding, has proven effective in various applications, including restricting a language model to generate only valid SQL queries.

The only occasion when the output may not be valid JSON is when response.choices[0].finish_reason == "length", so pay attention to that!
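As a minimal sketch of how you might guard your parsing (the helper name here is my own, not part of the OpenAI library):

import json

def parse_json_mode_response(response):
    """Safely parse a JSON-mode response, checking finish_reason first."""
    choice = response.choices[0]
    if choice.finish_reason == "length":
        # The model ran out of tokens, so the JSON is likely truncated
        raise ValueError("Response was cut off; the JSON may be incomplete")
    return json.loads(choice.message.content)

# parsed = parse_json_mode_response(response)
# print(parsed["answer"])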

Function calling

GPT models can also be given access to a set of functions that they can choose to call, instead of producing a free-form text response.

If a tool_choice is provided, the model will be constrained to use that specified function. Otherwise, it will automatically choose when to call function(s) and when to write a normal text response.

At the time of writing, it is only possible to force the LLM to use one specific function; we can't provide several functions and force the model to pick one of them (it may still reply with free-form text).

import openai

# Tool/function definitions in the OpenAI tools schema
functions = [
    {
        "type": "function",
        "function": {
            "name": "get_world_series_winner",
            "description": "Get the world series winner in a given year",
            "parameters": {
                "type": "object",
                "properties": {
                    "year": {"type": "integer"}
                },
                "required": ["year"],
            },
        },
    }
]

response = openai.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    tools=functions,
    # force the model to call this specific function
    tool_choice={"type": "function", "function": {"name": "get_world_series_winner"}},
    messages=[
        {"role": "user", "content": "Who won the world series in 2020?"}
    ]
)

print(response.choices[0].message.content)
# None

print(response.choices[0].message.tool_calls)
# [ChatCompletionMessageToolCall(id='...', function=Function(arguments='{"year": 2020}', name='get_world_series_winner'), type='function')]
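To actually act on that response, you parse the tool call's JSON arguments and dispatch to your own code. A minimal sketch, where the local get_world_series_winner implementation is hypothetical:

import json

def get_world_series_winner(year: int) -> str:
    # Hypothetical local implementation backing the declared function
    winners = {2020: "Los Angeles Dodgers"}
    return winners.get(year, "unknown")

tool_call = response.choices[0].message.tool_calls[0]
if tool_call.function.name == "get_world_series_winner":
    args = json.loads(tool_call.function.arguments)  # '{"year": 2020}' -> {"year": 2020}
    result = get_world_series_winner(**args)
    print(result)  # Los Angeles Dodgers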

But how is this accomplished? GPT language models only take text as input and only output text, so how are they also producing function calls?

In reality, Function calling is just a disguised JSON mode.

The main difference is that OpenAI already takes care of the prompt engineering and the parsing for us:

  • Our provided functions are inserted into the prompt automatically.
  • The LLM response is a valid JSON string that is automatically parsed into ChatCompletionMessageToolCall objects.

In fact, the example above can be refactored to be done in JSON mode too:

import openai

# In plain JSON mode we have to describe both the expected JSON format
# and the available functions ourselves, inside the system prompt
system_prompt = """You are a helpful assistant designed to output this JSON format:
```
{
    "functions": [
        {
            "name": "<Function name>",
            "arguments": {
                "<argument name>": <argument value>
            }
        }
    ]
}
```

You can use the following functions:
[
    {
        "type": "function",
        "function": {
            "name": "get_world_series_winner",
            "description": "Get the world series winner in a given year",
            "parameters": {
                "type": "object",
                "properties": {
                    "year": {"type": "integer"}
                },
                "required": ["year"],
            },
        },
    }
]
"""

response = openai.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Who won the world series in 2020?"}
    ]
)

print(response.choices[0].message.content)
# '{"functions": [{"name": "get_world_series_winner", "arguments": {"year": 2020}}]}'

print(response.choices[0].message.tool_calls)
# None

which is essentially what happens when OpenAI processes your request. The prompt is not modified exactly as in the example above, but the functions are certainly inserted into the prompt in some form (remember that the LLM only receives text as input, so there is no other way).

JSON mode and Function calling can also be used together:

import openai

system_prompt = """You are a helpful assistant designed to output this JSON format:
```
{
    "functions": [
        {
            "name": "<Function name>",
            "arguments": {
                "<argument name>": "<argument value>"
            }
        }
    ]
}
```
"""

functions = [
    {
        "type": "function",
        "function": {
            "name": "get_world_series_winner",
            "description": "Get the world series winner in a given year",
            "parameters": {
                "type": "object",
                "properties": {
                    "year": {"type": "integer"}
                },
                "required": ["year"],
            },
        },
    }
]

response = openai.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={"type": "json_object"},
    tool_choice="none",  # the model won't call any tool, so it replies with text (here, constrained to JSON)
    tools=functions,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Who won the world series in 2020?"}
    ]
)

print(response.choices[0].message.content)
# '{"functions": [{"name": "get_world_series_winner", "arguments": {"year": 2020}}]}'

print(response.choices[0].message.tool_calls)
# None

and you don't even need to worry about adding the functions to the prompt yourself; OpenAI does it for you.

When to use each

Function calling only works with one particular JSON structure, the one used to call the specified functions with arguments. If we provide more than one function, we can't force the LLM to use a function (it can still output free-form text), but we can increase our chances by requesting it in the prompt.

In contrast, JSON mode is a more flexible capability that truly forces the LLM to always output a valid JSON string, but the JSON structure itself is arbitrary.

Nonetheless, the LLM can hallucinate in both approaches:

  • In Function calling, the LLM may ignore our instructions and output free-form text instead of calling a function. It can also hallucinate argument names and values.
  • In JSON mode, the LLM always produces JSON, but the specified format may not be respected (see the validation sketch after this list).
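One common mitigation is to validate the parsed output before trusting it. A minimal sketch, assuming the single-key "answer" format from the first example:

import json

EXPECTED_KEYS = {"answer"}  # the structure we asked for in the system prompt

def validate_json_mode_output(content: str) -> dict:
    """Parse the model output and check it matches the structure we asked for."""
    parsed = json.loads(content)  # JSON mode guarantees this parses (unless truncated)
    if not isinstance(parsed, dict) or set(parsed.keys()) != EXPECTED_KEYS:
        # The model produced valid JSON but ignored our requested structure
        raise ValueError(f"Unexpected JSON structure: {parsed}")
    return parsed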

Summing up, if your use case can be framed as Function calling, it is the recommended option: OpenAI automatically optimizes your prompt for the specified functions, the models were trained on this prompt format (so responses tend to be better and hallucinations less frequent), and the response comes already parsed into ChatCompletionMessageToolCall objects, which is nice.

In the future, we hope that OpenAI provides more control over language model outputs by allowing us to write custom constrained decoding algorithms.
