This AI newsletter is all you need #85

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louie

This week, attention was on the emerging competition for OpenAI’s GPT-4 and GPT Store in the form of Google’s Gemini Ultra and Hugging Face’s Hugging Chat Assistants, respectively. Meta also made headlines with the latest and largest update to its code generation model, Code Llama 70B.

Google announced Gemini in December last year. Gemini Pro and Nano were made available immediately, while Gemini Ultra is set to release on Wednesday. Gemini Ultra will include multimodal capabilities, enhanced coding features, and the ability to analyze documents, files, and data. Google’s technical paper shows Ultra outperforming GPT-4 on 7 of 8 text benchmarks. It is the first model to surpass human experts on MMLU (massive multitask language understanding), although that result uses CoT@32 (chain-of-thought prompting with 32 samples) rather than the standard 5-shot setup, which makes the comparison with GPT-4 less direct. Additionally, Ultra surpasses GPT-4V (vision) on all ten image-interpretation benchmarks. We are excited to see Gemini Ultra tested in the wild and to find out whether it lives up to its benchmark performance! In other Gemini news, Google brought more Gemini Pro features to Bard and added image generation powered by the latest Imagen 2 model.

While Google sticks to OpenAI’s playbook of closed-source models released via API, there were advances in open-source LLMs this week. Hugging Face released Hugging Chat Assistants, its open-source competitor to the GPT Store. With 4,000 Assistants to view/customize and prompts to improve your own Assistant, it already offers a substantial catalog. Hugging Chat Assistants can be powered by various open-source LLMs, including Mistral’s Mixtral or Meta’s Llama 2. We also saw open-source LLM news from Mistral (the leaked “miqu-1-70b”) and Meta (Code Llama 70B). Code Llama 70B scored 53 percent accuracy on the HumanEval benchmark, beating GPT-3.5’s 48.1 percent and moving closer to the 67 percent OpenAI reported for GPT-4.

Why should you care?

We think competition is important, and individuals and companies should have alternatives to OpenAI. GPT-4 has remained largely unchallenged as the leading LLM for almost a year now. Ultra is the first model Google claims outperforms GPT-4, but there was much disappointment last year when it was announced without being released for external use and testing. Gemini Ultra could provide some much-needed competition to GPT-4, but its performance in practical applications remains uncertain. We also think it is important to see competition to OpenAI’s GPT Store for releasing LLM-based apps and prompts. Hugging Chat Assistants shows how fast the open-source community can catch up to closed-source rivals, and it joins Poe AI as a GPT Store alternative.

– Louie Peters — Towards AI Co-founder and CEO

Hottest News

1. Introducing Code Llama, a State-of-the-Art Large Language Model for Coding

Meta has launched Code Llama 70B, a code generation model that moves closer to GPT-4 on coding benchmarks. It comes in three variations: the base model, a Python-specific version, and an ‘Instruct’ version for following natural language instructions. All editions are free for research and commercial use.

2. Hugging Face Launches Open-Source AI Assistant Maker To Rival OpenAI’s Custom GPTs

Hugging Face announced Hugging Chat Assistants, which let users create and customize their own AI chatbots for free. The service is similar to OpenAI’s custom GPT Builder but without the associated costs. Hugging Chat Assistants can be powered by various open-source LLMs, such as Mistral’s Mixtral or Meta’s Llama 2.

3. Mistral Confirms New Open Source AI Model Nearing GPT-4 Performance

Mistral has confirmed that “miqu-1-70b”, a large language model uploaded to Hugging Face that performs close to GPT-4, is a leaked quantized version of one of its own models.

4. Sam Altman Says GPT-5 Will Be ‘Okay’

OpenAI CEO Sam Altman adopts a cautious tone when discussing AI, recently describing the anticipated GPT-5 as merely “okay” at Davos. This balanced approach suggests a strategic shift towards tempered communication.

5. Google Bard Gets Image Generation and a More Capable Gemini Pro To Take On ChatGPT

Google has updated Bard with image generation powered by its Imagen 2 model and with an improved version of its Gemini Pro language model that supports over 40 languages. These updates allow Bard to generate images from text descriptions and bring it closer to ChatGPT’s performance.

Five 5-minute reads/videos to keep you learning

1. Open-Source LLMs As LangChain Agents

Open-source LLMs like Mixtral now perform well enough to serve as the central reasoning component of intelligent agents, surpassing GPT-3.5 on benchmarks. This article explains the inner workings of ReAct agents and shows how to build them using the ChatHuggingFace class integrated into LangChain.

2. Introducing the Enterprise Scenarios Leaderboard for Real-World Use Cases

The Enterprise Scenarios Leaderboard, developed by the Patronus team in partnership with Hugging Face, is a new benchmarking tool designed to assess language model performance across six business-oriented tasks. The tasks are finance, legal issues, creative writing, customer support, toxicity detection, and handling of personally identifiable information (PII), all chosen to reflect enterprise requirements.

3. From Neural Networks to Transformers: The Evolution of Machine Learning

To get to LLMs, there are several layers to peel back, starting with the basics of AI and machine learning. This is a foundational article on the evolution of machine learning, covering everything from neural networks to transformers. It primarily focuses on the applications of transformer-based models, with a quick insight into the future.

4. How To Detect Poisoned Data in Machine Learning Datasets

Almost anyone can poison a machine learning dataset to alter the behavior and output of the models trained on it. This article discusses what data poisoning is and why it matters. It also covers key concepts such as data poisoning techniques, detection efforts, and prevention strategies.

5. The Promise and Challenges of Crypto + AI Applications

The intersection of AI and blockchain has the potential to revolutionize various systems, with AI poised to enhance blockchain’s efficiency and reliability. This post will classify different ways that crypto + AI could intersect, as well as the prospects and challenges of each category.

Repositories & Tools

1. RAGs is a Streamlit app that uses natural language to create a RAG pipeline from a data source.

2. Nomic Embed is an open embedding model with performance similar to OpenAI’s text-embedding-3-small.

3. LLMs-from-scratch is a repository of hands-on resources covering the foundational knowledge needed to build LLMs.

4. RawDog is a CLI assistant that responds by generating and auto-executing a Python script.

5. Zerve is a unified developer space for data science and AI teams to explore, collaborate, and build.

Top Papers of The Week

1. OLMo: Accelerating the Science of Language Models

OLMo is the first entirely open-source LLM whose release includes not only the model weights and inference code but also the training data, training code, and evaluation code. This allows researchers and developers to work with fully open models and advance the science of language models collectively.

2. Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Researchers have developed a method to enhance LLM training by using a smaller instruction-tuned LLM to paraphrase web scrapes, creating a cleaner, structured dataset. This approach has been shown to accelerate pre-training, reduce computational costs, and improve performance, achieving a 3x speed increase, 10% perplexity reduction, and better zero-shot learning capabilities on various tasks.

3. MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Researchers have introduced MoE-LLaVA, an open-source, sparse Large Vision-Language Model (LVLM) that uses a mixture of experts (MoE) to keep computational costs constant despite a substantial increase in parameters. Its routers activate only the top-k experts for each token and leave the rest inactive, which keeps inference efficient and cost-effective.

4. Corrective Retrieval Augmented Generation

CRAG introduces a retrieval evaluator that assesses the quality of retrieved documents and triggers tailored retrieval actions based on that assessment. It employs web search and optimized knowledge utilization for automatic self-correction. CRAG improves RAG’s performance across diverse datasets, with a reported accuracy gain of 36.6% on one of them.

5. MMBench: Is Your Multi-modal Model an All-around Player?

MMBench is a novel multi-modality benchmark. Its evaluation pipeline combines a meticulously curated dataset with a novel CircularEval strategy, and it uses ChatGPT to convert free-form model predictions into predefined answer choices.
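
To make the CircularEval idea concrete, here is a minimal Python sketch. It assumes the strategy works as the MMBench paper describes it: a multiple-choice question is asked once per rotation of its answer options, and the model is credited only if it is right every time. The function names and the two toy models are hypothetical stand-ins, not part of the MMBench codebase.

```python
from typing import Callable, List, Sequence

# A "model" here is any callable that takes a question and a list of
# choices and returns the index of the choice it picks (an assumption
# made for this sketch, not the MMBench interface).
Model = Callable[[str, List[str]], int]


def circular_shifts(choices: Sequence[str]) -> List[List[str]]:
    """Return every rotation of the choice list (N rotations for N choices)."""
    n = len(choices)
    return [list(choices[i:]) + list(choices[:i]) for i in range(n)]


def circular_eval(model: Model, question: str,
                  choices: Sequence[str], answer: str) -> bool:
    """Credit the model only if it picks `answer` under every rotation."""
    for shifted in circular_shifts(choices):
        picked = model(question, shifted)
        if shifted[picked] != answer:
            return False  # one miss fails the whole question
    return True


# Toy model that always picks the first option, whatever it says.
def always_first(question: str, choices: List[str]) -> int:
    return 0


# Toy model that finds the correct option wherever it is placed.
def knows_answer(question: str, choices: List[str]) -> int:
    return choices.index("Paris")


if __name__ == "__main__":
    q, opts = "Capital of France?", ["Paris", "Rome", "Oslo"]
    print(circular_eval(always_first, q, opts, "Paris"))
    print(circular_eval(knows_answer, q, opts, "Paris"))
```

The position-biased toy model gets the unrotated question right by luck but fails once the options are shifted, which is the kind of guessing this strategy is meant to filter out.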

Quick Links

1. Amazon announces Rufus, a new generative AI-powered conversational shopping assistant. It aims to simplify product discovery, comparison, and recommendations by leveraging Amazon’s extensive product catalog and a wealth of web-based information.

2. Volkswagen has set up its global AI lab to function as a competence center and incubator, concentrating on generating proofs of concept for automotive innovations and incorporating AI advancements into Volkswagen’s vehicles.

3. The Browser Company is integrating an AI agent into the Arc browser to surf the web and return results without using search engines. The company said only some of these features use LLMs, but they all work to “bring the internet to you.”

4. AI2 introduced OLMo, a 7-billion-parameter model that outperforms Llama 2 in generative tasks. It ships with comprehensive training data, code, and over 500 checkpoints per model, all under the Apache 2.0 License.

Who’s Hiring in AI

Senior Deep Learning Scientist, LLM Retrieval Augmented Generation @NVIDIA (US/Remote)

Data Science Intern (6 months) @GoTo Group (Singapore/Internship)

Research Associate: Decoding @Terray Therapeutics (Remote)

Technical Support Engineer — EMEA @Nozomi Networks (London, UK)

Full-stack Data Scientist @Supernova Technology (Remote/Freelance)

Staff Data Scientist, AI @Evisort (US/Canada/Remote)

Generative AI and ML Engineer, Lead @boozallen (Bethesda, MD, USA)

Interested in sharing a job opportunity here? Contact sponsors@towardsai.net.

If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!

Think a friend would enjoy this too? Share the newsletter and let them join the conversation.

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI
