TAI #124; Search GPT, Coding Assistant adoption, Towards AI Academy launch, and more!

Last Updated on November 5, 2024 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louie

This week, we saw many more incremental model updates in the LLM space, together with further evidence of LLM coding assistants gaining traction. Google’s CEO Sundar Pichai revealed that more than a quarter of new code at Google is now generated by AI, though each piece is reviewed by engineers before implementation. Microsoft’s GitHub Copilot is also enhancing its LLM-powered coding toolkit and expanding beyond its OpenAI dependency. It is now integrating models like Claude 3.5 and Gemini 1.5 Pro to give developers more choice in LLMs, alongside a new no-code tool, GitHub Spark, which lets users develop micro apps.

We also saw several new LLMs and new consumer and developer features released. OpenAI’s new RAG-powered SearchGPT web search feature in ChatGPT is a big improvement on its initial web search offering. The product now works more like the successful Perplexity.ai, but unlike Perplexity, OpenAI does not offer API access to this feature. Meta is reportedly also developing a similar LLM and RAG web search product. Meanwhile, Cohere’s new multimodal Embed 3 model enables more accurate and nuanced retrieval for image, multilingual, and noisy datasets. Improved embedding models like this should help to further improve these RAG web search products as well as the customized LLM pipeline products we teach at Towards AI.

On the other hand, Anthropic disappointed with the pricing it released for Haiku 3.5 this week. After the company revealed great benchmark scores last week, pricing was surprisingly announced to be 4x higher than Haiku 3.0 and significantly more expensive than OpenAI’s and Google’s lower-tier models. It is not clear whether this is due to a larger model size with higher inference costs, or whether Anthropic is simply constrained in compute capacity and still prioritizing its flagship Sonnet 3.5 model. We wouldn’t be surprised to see a price decrease as inference capacity ramps up.

Why should you care?

The relentless pace of feature and model releases in the LLM space is rapidly increasing their capabilities and bringing them close to making a huge impact across the economy. LLM coding assistants are, to some extent, far ahead of other applications in boosting productivity and changing people’s workflows. This is partly because a large amount of work has to go into building on top of foundation LLMs to customize them to a particular application and to increase reliability and ease of use in that domain. This work is needed to improve performance and ensure the LLM pipeline really boosts productivity or unlocks new product features. It requires learning the new LLM Developer technical skill stack we teach at Towards AI. But to become a great LLM developer and build a product that actually gets adopted, you also need to learn many new non-technical skills, including an intuition for how to bring expertise from your target market into your product development. This expertise should inform your prompt design, agent pipeline design, dataset collection and curation, fine-tuning datasets, and evaluation datasets.

It is difficult to work out how to adapt your LLM pipeline to the nuances of the data and user demands in a new and unfamiliar industry niche. For many, this is easiest to do well within the software industry, given developers’ pre-existing understanding of the industry and the problems developers face. It is not surprising, therefore, that software is where we are seeing some of the most successful LLM products so far.

We expect LLM products to perform best, and to have a chance of huge-scale adoption, the more they have been customized to a specific industry niche. This will require millions of LLM developers to build on top of foundation LLMs to develop these products. Prompt engineering, GPTs, and no-code agent builder platforms alone just don’t provide the level of flexibility needed to deliver the very best LLM product for a specific application or company. Towards AI is focused on teaching this new generation of LLM Developers, and very soon, we are going to release an extremely in-depth ~90-lesson practical full-stack “LLM Developer” conversion course. Together with instructor support in our Discord, we hope this will help many more software developers and machine learning engineers gain this new LLM Developer skillset. The course progresses from choosing your project idea through data collection and curation, LLM fundamentals, prompting, RAG, fine-tuning, and agents, all the way to deployment. We will also teach you some of the new non-technical skills and tips along the way, all while you build a single advanced LLM project, which we will review and certify at the end.

This new course is already available for pre-order on our new Towards AI Academy course platform, where we have also released a new version of our ebook (more about this below!).

— Louie Peters — Towards AI Co-founder and CEO

🎉 Great news! Building LLMs for Production (second edition) is now available as an e-book at an exclusive price on Towards AI Academy!

For the first time, you can access this guide to designing, deploying, and scaling language models directly through our platform — and at a price lower than on Amazon!

Building LLMs for Production is for anyone who wants to build LLM products that can serve real use cases today. It explores various methods to adapt “foundational” LLMs to specific tasks with enhanced accuracy, reliability, and scalability. It tackles the lack of reliability of “out of the box” LLMs by teaching the AI developer tech stack of the future: Prompting, Fine-Tuning, RAG, and Tools Use.

Get Building LLMs for Production on Towards AI Academy and explore all the other resources available to support your AI journey!

We will soon launch our new Towards AI Academy course platform more broadly with a series of extremely in-depth practical LLM courses, so stay tuned! These courses will progress beyond the skills you learn in the book by building a much more advanced LLM project, bringing in more non-technical skills and considerations, and providing instructor support. We will also review and certify your own working advanced LLM project at the end, which could be the foundation of a new business, a new tool or product at your company, or a portfolio project for finding a job in the LLM industry.

P.S. If you already have the first edition, you’re eligible for an additional discount for this second edition of the book (post-September 2024) — just reach out to louis@towardsai.net to upgrade affordably!

Hottest News

1. OpenAI Introduced ChatGPT Search

OpenAI has introduced ChatGPT search, which brings real-time information into conversations for paid subscribers. Through partnerships with data providers, the feature adds real-time updates on sports, stocks, and news to the chatbot, positioning it as a competitor to major search engines like Google and Bing. Web search is integrated into ChatGPT’s existing interface. ChatGPT decides when to tap into web results based on the query, though users can also trigger web searches manually.

2. GitHub Spark Lets You Build Web Apps in Plain English

GitHub has unveiled GitHub Spark, an experimental tool from GitHub Next labs, at the GitHub Universe conference. Spark enables users to create web apps using natural language and edit the underlying code, focusing on developing “micro apps” and exploring software development through conversational interfaces. Spark also allows users to choose which large language model they want to use.

3. More Than a Quarter of New Code at Google Is Generated by AI

“More than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers,” CEO Sundar Pichai said on the company’s third quarter 2024 earnings call. AI is helping Google make money as well. Alphabet reported $88.3 billion in revenue for the quarter, with Google Services (which includes Search) revenue of $76.5 billion, up 13 percent year-over-year, and Google Cloud (which includes its AI infrastructure products for other companies) revenue of $11.4 billion, up 35 percent year-over-year.

4. OpenAI Expands Realtime API With New Voices and Cuts Prices for Developers

OpenAI has updated its Realtime API, which is currently in beta. The update adds new voices for speech-to-speech applications and cuts costs associated with cached prompts. Beta users of the Realtime API now have five new voices they can use to build their applications. OpenAI showcased three of the new voices, Ash, Verse, and the British-sounding Ballad, in a post on X.

5. Cohere Releases Multimodal Embed 3

Cohere has introduced Embed 3, a multimodal embedding model integrating text and image data to enhance search capabilities. It excels in accuracy and performance, efficiently handling multilingual and noisy data for complex data retrieval.
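
To make the retrieval idea concrete, here is a minimal sketch of embedding-based search using cosine similarity. It does not call Cohere's API: the file names and the tiny 3-dimensional vectors are hypothetical stand-ins for what an embedding model would return, and the point is only that text and image items can be ranked against a query once they share one vector space.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, corpus, k=2):
    """Return the k corpus ids most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical embeddings (assumption): real models return far longer vectors.
corpus = {
    "product_photo.png": [0.9, 0.1, 0.0],
    "manual_fr.txt": [0.1, 0.9, 0.1],
    "sales_chart.png": [0.7, 0.2, 0.6],
}
query = [1.0, 0.0, 0.1]  # hypothetical embedding of the user's search text

results = top_k(query, corpus, k=2)
print(results)
```

In a production pipeline, the brute-force loop would typically be replaced by a vector database or an approximate nearest-neighbor index, but the ranking principle is the same.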

6. Anthropic’s Claude AI Chatbot Now Has a Desktop App

Claude.ai has launched a new analysis tool that allows Claude to execute JavaScript code for data processing and real-time insights. This feature enhances the platform’s ability to perform complex math and data analysis, offering precise and actionable insights for various teams, including marketing, sales, and engineering.

7. Hugging Face Releases Compact LLMs SmolLM2

Hugging Face released SmolLM2, a family of compact language models available in three sizes: 135M, 360M, and 1.7B parameters. They can solve many tasks while being lightweight enough to run on-device.

Five 5-minute reads/videos to keep you learning

1. Evaluating Feature Steering: A Case Study in Mitigating Social Biases

This article shares the findings from a quantitative experiment to understand what feature steering can and can’t do. It focuses on 29 features related to social biases to better understand how useful feature steering may be for mitigating social biases in LLMs. The article also lists limitations, lessons learned, and possible future directions.

2. Why Building in AI Is Nothing Like Making Conventional Software

Building with AI requires us to break our habits and approach building differently. AI products bring unique risks, and if you don’t understand them, you’re bound to make mistakes. This essay will help you understand how building in AI differs from building in conventional software.

3. OpenAI’s O-1 and Inference-Time Scaling Laws

The article explores OpenAI’s o-1 model, which enhances reasoning in LLMs using a “chain of thought” approach and inference-time scaling laws. Trained with reinforcement learning, the model improves with increased computational time during inference, shifting focus from pre-training to inference, potentially reducing costs and enabling more effective problem-solving.

4. I Own My LLM Chat History, and So Should You

The article argues for user ownership of chat histories with large language models, emphasizing the interchangeability of providers like OpenAI and Google. It highlights the advantages of locally storing conversations for enhanced privacy, accessibility, and analysis.

5. A Primer on Using Google’s Gemini API To Improve Your Photography

This blog will walk you through building a Photo Critique and Enhancement App using Google’s Gemini-1.5-Flash-8B API and Streamlit. It also highlights the essentials of Gemini API inferencing. By the end, you will have built an app that critiques and helps you improve your photos.

Repositories & Tools

NotebookLlama is an open-source tutorial series that guides users in creating a PDF to Podcast workflow using Text-to-Speech models.
Docling parses documents and exports them to the desired format.
Screenshot to Code converts screenshots, mockups, and Figma designs into clean, functional code.
OpenHands is a platform for software development agents that can modify code, run commands, browse the web, call APIs, and more.

Top Papers of The Week

1. Mixture of Parrots: Experts Improve Memorization More Than Reasoning

This paper explores the trade-offs between Mixture-of-Experts (MoE) models and standard dense transformers. The authors demonstrate that MoEs can effectively leverage additional experts to improve performance on memory-intensive tasks like fact retrieval, but find diminishing returns on reasoning tasks like mathematical problem-solving or graph analysis.

2. Distinguishing Ignorance from Error in LLM Hallucinations

This paper distinguishes between two types of LLM hallucinations: those that occur when the model lacks knowledge (HK-) versus when it hallucinates despite having the correct knowledge (HK+). The researchers developed a method called WACK to systematically capture HK+ across models, using techniques like “bad shots” (showing incorrect examples) and “Alice-Bob” (using subtle persuasion) to induce HK+ hallucinations. They found that hallucination types leave distinct signatures in the models’ internal states, different models hallucinate in unique ways even with shared knowledge, and that detecting hallucinations works better when using model-specific datasets rather than generic ones.

3. A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

The paper presents a method to improve LLM training efficiency using a smaller language model (SLM) to provide soft labels and select valuable training examples. This approach transfers predictive capabilities to the LLM, reducing training time. Empirical results demonstrate enhanced pre-training of a 2.8B parameter LLM using a 1.5B parameter SLM on the Pile dataset.
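
As a rough illustration of what training against soft labels can look like, the sketch below blends the usual cross-entropy on the true next token with a cross-entropy against the smaller model's predicted distribution. This is a generic knowledge-distillation-style loss, not necessarily the paper's exact objective; the function names and the mixing weight `alpha` are assumptions made for this example, and the example-selection step is not shown.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target_probs, predicted_probs):
    """Cross-entropy H(target, predicted); zero-weight terms are skipped."""
    return -sum(t * math.log(p)
                for t, p in zip(target_probs, predicted_probs) if t > 0)

def blended_loss(llm_logits, slm_logits, true_index, alpha=0.5):
    """Mix the hard-label loss with a soft-label loss from the small model.

    alpha is a hypothetical mixing weight: 0 uses only the true token,
    1 uses only the small model's distribution.
    """
    llm_probs = softmax(llm_logits)
    soft_targets = softmax(slm_logits)  # soft labels from the small model
    one_hot = [1.0 if i == true_index else 0.0
               for i in range(len(llm_logits))]
    hard = cross_entropy(one_hot, llm_probs)
    soft = cross_entropy(soft_targets, llm_probs)
    return (1 - alpha) * hard + alpha * soft

# Toy 3-token vocabulary: the large model is still uniform, the small
# model already prefers token 0, which is also the true next token.
loss = blended_loss([0.0, 0.0, 0.0], [2.0, 1.0, 0.0], true_index=0, alpha=0.3)
print(round(loss, 4))
```

The soft term gives the large model a signal about plausible alternatives to the true token, which is the intuition behind using a smaller model as a teacher early in training.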

4. OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

OpenWebVoyager is an open-source framework for developing multimodal web agents using imitation learning. These agents improve iteratively by exploring the web, collecting feedback, and optimizing actions based on successful trajectories, enhancing their real-world web navigation capabilities. Experimental results demonstrate the agents’ continuous improvement and robust performance across various tests.

5. Retrieval-Augmented Diffusion Models for Time Series Forecasting

The Retrieval-Augmented Time series Diffusion model (RATD) improves time series forecasting by using an embedding-based retrieval process to select relevant historical data, which aids in the denoising phase of the diffusion model, addressing the instability of existing models.

Quick Links

1. Meta AI has announced the open-source release of MobileLLM, a set of language models optimized for mobile devices, with model checkpoints and code now accessible on Hugging Face. However, it is only available under a Creative Commons 4.0 non-commercial license, meaning enterprises can’t use it in commercial products.

2. Google’s “Grounding with Google Search” feature now integrates live search data directly into its Gemini 1.5 models, allowing developers to build AI applications that provide more accurate, up-to-date responses.

3. Patronus AI launched what it calls the first self-serve platform to detect and prevent AI failures in real time. The system’s cornerstone is Lynx, a hallucination detection model that outperforms GPT-4 by 8.3% in detecting medical inaccuracies.

Who’s Hiring in AI

Google Cloud GenAI Developer @Accenture (Multiple Locations, USA)

GenAI Software Engineer @RELX INC (Farringdon, United Kingdom)

Software Engineer, AI Tools @Salesforce (Palo Alto, CA, USA)

AI Developer @Insight Global (Chicago, IL, USA)

Software Engineering Intern @MicroStrategy (USA)

Senior Software Engineer @Ocrolus Inc. (USA/Remote)

Data Science Internship Opportunities @Microsoft Corporation (Multiple locations)

Interested in sharing a job opportunity here? Contact sponsors@towardsai.net.

Think a friend would enjoy this too? Share the newsletter and let them join the conversation.

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI
