

Creating a Smart Home AI Assistant

Last Updated on May 12, 2024 by Editorial Team

Author(s): Michael K

Originally published on Towards AI.

Source: Image generated by the author (using Adobe Generative AI)

The hardware AI assistants released recently have been making a splash in the news, which gave me a lot of inspiration around the concept of an 'action model' and how powerful one could be. It also made me curious about how hard it would be to give a large language model access to my smart home API, because coding an entire assistant is totally easier than just opening a tab to my dashboard.

In this article, using Python and a few open-source tools, we'll create an assistant that can perform almost any action we desire. We'll also explore how this works under the hood, and how we can use some extra tools to make debugging these agents a cakewalk.

Wrestling LLM Responses

I've previously written an article about prompt engineering, which remains the most powerful technique we have as end users of these models. Tool use is a supercharged version of prompt engineering that lets a model do more than just generate text.

For example, we could give the model the ability to search Wikipedia, look up customer information for a support request, or send an email; the sky is truly the limit, aside from your programming ability, of course. Combined with tool use, we can also have the LLM generate structured output, allowing it to reliably return a formatted response.

Without these tools, the model's response can vary wildly or be heavily influenced by the context provided. This often distracts the model from the requested format or, depending on the context, produces erroneous results. The model's random seed, as well as its temperature (its willingness to generate more varied responses), can be controlled; however, this is far from perfect.

Creating the Solution

To manage the dependencies for the project, I'll be using Poetry, which we can initialize like so:
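A sketch of that setup (the project name here is just a placeholder):

```shell
# Scaffold a new project; Poetry generates pyproject.toml and a package directory
poetry new smart-home-assistant
cd smart-home-assistant
```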

Poetry will create all of the boilerplate we need to get started, so the next step is to define any additional dependencies we have. Let's go ahead and add those now:
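Assuming the stack used in this article (package names as published on PyPI around the time of writing; versions unpinned):

```shell
# phidata for the assistant framework, ollama for the local model client,
# pydantic for structured output schemas
poetry add phidata ollama pydantic
```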

Ollama

I'll be using Ollama to handle communicating with the model; however, Phidata supports numerous LLM integrations, so you could swap out Ollama for whichever works best for you. Getting Ollama set up takes only a few steps:
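On Linux that looks roughly like the following (macOS and Windows have installers on ollama.com; the model name is one example of several that work with tools):

```shell
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Download a model to use with the assistant
ollama pull llama3
# Start the server, if it is not already running as a service
ollama serve
```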

Other than Meta's Llama 3, I've had great success with Mistral's 7B model and Microsoft's Wizard LM2 when using tools. As more modern models are released, tool use will likely become better supported.

Creating the Assistant

Phidata lets us structure and format the LLM's response using Pydantic objects, giving us a reliable method to extract information from the response in a programmatic fashion. For example, if we wanted to create an assistant that only answered math questions:
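A sketch of how this can look; the schema and field names are my own, and the Phidata wiring (commented out) uses the import paths and `output_model` parameter from its documentation around this article's publication, so verify them against the current docs:

```python
from pydantic import BaseModel, Field

class MathAnswer(BaseModel):
    """Schema the assistant's reply must follow."""
    explanation: str = Field(..., description="Step-by-step working")
    answer: float = Field(..., description="The final numeric answer")

# Hypothetical wiring with Phidata and Ollama:
# from phi.assistant import Assistant
# from phi.llm.ollama import Ollama
#
# assistant = Assistant(
#     llm=Ollama(model="llama3"),
#     description="You are a math tutor. Only answer math questions.",
#     output_model=MathAnswer,
# )
# result = assistant.run("What is 12 * 7?")  # returns a MathAnswer instance
```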

This is incredibly useful for instances where you get complex responses from the model. If we take a look at the prompt Phidata generated, we can see how it gets the model to play nice:

Through prompt engineering, Phidata massages the model's response into exactly the shape we need, including or omitting fields as required. For example, if we ask a question without an apparent answer:

Based on my previous experience with Phidata in a few projects, it's vital to give the model every possible option in the schema, or it can trigger a validation error. In the math example above, if you do not tell Pydantic that the answer field can also be None, the model will pad the response with verbose context instead of just returning None:
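A sketch of the fix, using illustrative field names: declare the field optional with a None default, so the schema itself permits "no answer":

```python
from typing import Optional
from pydantic import BaseModel, Field

class MathAnswer(BaseModel):
    explanation: str = Field(..., description="Step-by-step working")
    # Permitting None gives the model a valid "no answer" escape hatch,
    # instead of forcing it to stuff prose into a required field:
    answer: Optional[float] = Field(
        None, description="The numeric answer, or null if the question is not math"
    )
```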

Assistant Tool Use

Much as with people, giving the LLM tools to perform actions makes it more efficient, accurate, and useful in the long run. Phidata comes with a bunch of awesome built-in tools, but we can also create our own, giving the model access to databases, APIs, or even local binaries if we desire.

Let's give the Assistant access to the internal API for my house, so it can tell us the temperature in a few locations around the house:
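A sketch of such a tool, with a mock mode so it runs without a real smart-home API behind it; the endpoint URL and response shape are hypothetical, and the Phidata registration is shown in a comment:

```python
import json
import random
import urllib.request

def get_temperature(room: str, mock: bool = True) -> str:
    """Return the current temperature for a room in the house."""
    if mock:
        # Mock mode: fabricate a plausible reading so the tool can be
        # exercised without a real smart-home API.
        return f"The temperature in the {room} is {random.randint(18, 24)}°C"
    # Real mode: hypothetical internal endpoint -- adjust to your own API.
    url = f"http://homeassistant.local/api/rooms/{room}/temperature"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return f"The temperature in the {room} is {data['celsius']}°C"

# The plain function is then handed to the assistant, e.g. with Phidata:
# assistant = Assistant(llm=Ollama(model="llama3"), tools=[get_temperature])
```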

Phidata does all of the heavy lifting for us by parsing the response from the model, calling the correct function, and finally returning the response. I've included a mock feature so you can test it out without having an API of your own.

API Creation

To interact with our assistant, we'll use FastAPI to create a light REST API that handles incoming requests and runs the assistant code for us. Another option would be a queue system; however, since traffic is low, this should work fine for our use case.

First, let's install the dependencies we'll need for the API:
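With Poetry, that is the following (the [standard] extra pulls in FastAPI's command-line utility; logfire is optional but used later in this article):

```shell
poetry add "fastapi[standard]" logfire
```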

Then, we can define our base application:

I'm setting up Logfire here, which is optional, but it greatly increases our visibility, and we don't have to spelunk through a mountain of logs. Most of the libraries used in this project already have integrations with Logfire, allowing us to extract as much information as possible in the fewest lines of code.

Testing

To run the server, we can use the fastapi utility that gets linked after we install the library:
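Assuming the application lives in main.py:

```shell
# Start a development server with auto-reload on http://localhost:8000
fastapi dev main.py
```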

By default, FastAPI uses port 8000, so we'll use that to send a test prompt:
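For example, with curl (the /ask path and JSON shape depend on how you defined your route; these are illustrative):

```shell
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is the temperature in the kitchen?"}'
```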

Logfire

If you enabled Logfire, you can follow the chain of actions and see the arguments and values for each step:

Source: Image by the author

The timing chart to the right is also great for understanding where a request might be getting stuck, so we know where to investigate further. Also, since I plan to eventually try this with a physical device, being able to go back and investigate a weird response is a lifesaver.

Next Steps

The only part missing now is the actual hardware, so my next project is to take an extra ESP32 I have lying around and see how much work it'll be to do speech-to-text conversion, as well as give our helpful assistant a voice.

If you would like the finished code, check out the repository linked below for the full example.

Published via Towards AI
