Top 5 Learnings From a Year With LLMs in a Business Environment
Last Updated on March 26, 2024 by Editorial Team
Author(s): Pawel Rzeszucinski, PhD
Originally published on Towards AI.
Introduction
Last Saturday marked exactly one year since OpenAI released public access to their models. From day one, my team and I at Team Internet Group were eager to explore this technology and start testing it in a business environment. Since then, we've had a few successful implementations. In this post, I will walk you through what we learned from experimenting with LLMs. As a case study, I'll discuss the solution we implemented for one of our products, Internet.bs (.bs being the country code of the Bahamas).
This solution has changed the way people search for domain names online.
Essentially, customers can input, in plain language, the type of business they are looking to start or the kind of website they need. For example, a customer can input "a shop with clothes for dogs" as a prompt. The solution then generates interesting domain names that match that description, such as barkywardrobe.com, caninechicboutique.com, or fetchfashionista.com.
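As a rough illustration of how such a feature can be wired up, here is a minimal Python sketch of the two pieces involved: building the prompt sent to the model, and parsing domain candidates out of its free-text reply. The function names and prompt wording are my own illustrative choices, not the actual Internet.bs implementation, and no API call is made here.

```python
import re

def build_prompt(description: str, n: int = 5) -> str:
    """Build an LLM prompt asking for brandable domain names (hypothetical template)."""
    return (
        f"Suggest {n} short, brandable .com domain names for the following "
        f'business idea: "{description}". Return one domain per line.'
    )

def parse_suggestions(llm_output: str) -> list[str]:
    """Extract well-formed .com domain names from the model's free-text reply."""
    pattern = re.compile(r"\b[a-z0-9-]{3,63}\.com\b", re.IGNORECASE)
    return [m.group(0).lower() for m in pattern.finditer(llm_output)]

# Example with a canned model reply (in reality this text would come from the LLM):
reply = "1. barkywardrobe.com\n2. caninechicboutique.com\n3. fetchfashionista.com"
print(parse_suggestions(reply))
# → ['barkywardrobe.com', 'caninechicboutique.com', 'fetchfashionista.com']
```

Parsing defensively like this matters because the model's reply format can drift (numbered lists, extra commentary), so the extraction step should not assume a rigid structure.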
I like this implementation for several reasons. Firstly, it was our first LLM-powered implementation on production, so there's always a special feeling about that. Secondly, and more importantly, it really opened our eyes to what is possible with LLMs. Obviously, that was one year ago, and here we are one year later, where the opportunities seem so much greater.
A project like the one I just described would, in the pre-LLM era, have taken around three to four data scientists doing pure natural language processing research for nine to twelve months. We would then have needed one data engineer and one full-stack developer for another two to three months to make the solution reliable and scalable. Worst of all, the outcome of such a lengthy and costly initiative wouldn't be known until late into the project; because we would essentially be doing fundamental NLP research, the risk was high from the beginning. Once the API became available in the LLM era, it took us one month from the start of the research to full productionization, as a side project. We needed just one data scientist to design and analyze A/B tests, plus one data engineer and one full-stack developer to productize it. Best of all, we knew from the very beginning that the risk of this initiative was very low, because playing with the LLMs beforehand had shown us exactly what we would get.
Learnings
So, let me share the five main learnings we acquired from working with LLMs in a business environment over the past year:
1. Hyper-acceleration of innovation potential: The case I just described is a perfect example of what I mean. Since we got access to tools like LLMs, it's no longer acceptable to follow the old process of spending weeks deciding on THE project to work on and then months on implementation. An organization should now work on multiple ideas in parallel, testing them to see what sticks. The ones that don't stick can be dropped quickly; after all, you've only spent a week or two on the PoC. The ones that do stick provide a basis for something great, built on very fast iteration.
2. "Ask for forgiveness, rather than permission": The problem is that organizations are not ready for such a shift and acceleration in innovation. This was evident in our case study. As a team, we were too eager, and believed in the solution too much, to follow the old processes, i.e., getting approval from many stakeholders just to see the project drop to the bottom of the backlog. We made the bold decision to work on this project quietly, informing only the critically important stakeholders. Once in production, we showcased the massive gain in conversion ratio and were prepared for the unpleasant discussions with some colleagues that followed. However, a good corporate culture can translate these uncomfortable discussions into process improvements: we now have a dedicated route for quick proof-of-concept tests. If you're thinking about deploying your first LLM project in an organization and anticipate resistance, consider that sometimes it's better to ask for forgiveness than permission, but only sometimes; don't abuse this approach. I wrote more on the subject here.
3. "Wait calculation": Wait calculation means trying to predict how technology will evolve and improve in the near future when deciding which projects to engage with. The concept originally comes from a thought experiment about space travel: launch now, or wait for faster ships to be invented? (Check this Substack post by Ethan Mollick for more.) Applied to the early days of ChatGPT, one might have considered it perfect for creating custom tutorials for a new business. However, every tutorial needs good visuals, which would have required connections to external services and custom-made imagery involving human input; the scope of the project grows. Yet a few months later, ChatGPT integrated with DALL·E, allowing custom images to be generated on the spot. So, before engaging with initiatives that require a lot of custom work right now, consider how the technology might change, because new features of your favorite tools may soon cover these needs.
4. LLMs require LLM Ops: While LLMs let us put things into production really fast, this also tempts organizations to rush projects without proper care. We've seen news stories like the Chevrolet chatbot suggesting competitors' cars, or a DPD chatbot that started swearing at customers when prompted in certain ways. OWASP, an online community of cybersecurity professionals, publishes top-10 lists of threats for different technologies, and last year they included LLMs for the first time, highlighting the parts of the product life cycle where bad actors can leverage tools and techniques for malicious purposes; you can learn about prompt injection or model denial of service, for instance. We have to stop thinking about LLMs as fancy toys and start treating them as proper tools. As more organizations and people use these technologies, there's more incentive for bad actors to exploit backdoors and inefficiencies. LLM Ops should cover not just classic model deployment but also monitoring, performance tuning, quality assurance and testing, security and compliance, and data management and privacy.
5. Models don't just hallucinate: While it's well known that models tend to hallucinate, what recently happened to me was different: a model suddenly changed the subject in the middle of a paragraph and started discussing something totally unrelated to the original topic. For instance, while discussing AI concepts like "human in the loop" and "human on the loop" with Copilot, the model abruptly shifted to talking about nationalism and the autonomy of different nations, and then, as if nothing had happened, returned to discussing AI. Such behaviour is referred to as "digression". These sudden changes of subject, as abrupt as they seem, remind us that both hallucinations and digressions matter when productizing solutions for customers. There is still a need for guardrails and human-in-the-loop operators to ensure we release things that are safe and controllable. You can read more about the subject, including the example I mentioned, here.
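To make the guardrails idea from the last two points concrete, here is a minimal output-validation sketch in Python: generated domain suggestions are checked against a syntactic pattern and a blocklist before they ever reach a customer. The rule set and names are hypothetical simplifications of my own, not our production pipeline, which would also need monitoring, logging, and human review for the cases that automated checks cannot catch.

```python
import re

# Terms we never want to appear in a customer-facing suggestion (illustrative).
BLOCKLIST = {"competitor", "scam"}

# A valid label: starts and ends with a letter/digit, hyphens allowed inside.
DOMAIN_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{1,61}[a-z0-9])?\.com$")

def is_safe_suggestion(domain: str) -> bool:
    """Reject malformed domains and anything containing blocked terms."""
    d = domain.strip().lower()
    if not DOMAIN_RE.match(d):
        return False
    return not any(term in d for term in BLOCKLIST)

def filter_output(suggestions: list[str]) -> list[str]:
    """Keep only model suggestions that pass every guardrail check."""
    return [d for d in suggestions if is_safe_suggestion(d)]

print(filter_output(["barkywardrobe.com", "competitor-deals.com", "bad domain.com"]))
# → ['barkywardrobe.com']
```

The key design choice is that validation sits between the model and the user: the LLM's raw output is treated as untrusted input, the same way you would treat text typed by a stranger.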
Summary
By sharing these insights, I hope to help others understand both the potential and the challenges of integrating LLMs into their business processes. The journey with LLMs in the business environment has been both exciting and enlightening, showing us the immense potential for innovation and the need for careful, considered implementation.