Image Function Calls Made Easy with Claudetools
Last Updated on April 22, 2024 by Editorial Team
Author(s): Vatsal Saglani
Originally published on Towards AI.
Unlock the power of the Claude 3 models to seamlessly convert images into actionable structured outputs.
Image generated using ChatGPT
Up until now OpenAI models were best in class for generating structured JSON outputs and function calling. But very recently Anthropic released their Claude 3 family of models. The models in this family are very good at reasoning, coding, and structured data generation.
As these models can generate correct structured JSON output and on top of that as theyβve good reasoning skills we can use them for function calling use case. Recently, I wrote a small python package β claudetools β that help with function calling using the Claude 3 family of models.
You can visit the following blog to know more about Claudetools.
Claudetools: The Secret Sauce for Supercharging Claude 3 with GPT-4 Powers
pub.towardsai.net
P.S.: You can directly use Claudetools as a drop-in replacement for function calling with OpenAI model with some very minor updates.
All the models in the Claude 3 family have vision capabilities. This opens up exciting multimodal interaction possibilities. The vision capabilities are on par with GPT-4-Vision model and even beats GPT-4-Vision on some benchmarks as shown in the following table.
Image from the Anthropic blog
Because of these models sophisticated vision capabilities they can process a wide variety of visual formats, including photos, charts, graphs, and technical diagrams.
As… Read the full blog for free on Medium.
Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aΒ sponsor.
Published via Towards AI