This AI newsletter is all you need #90
Last Updated on March 13, 2024 by Editorial Team
Author(s): Towards AI Editorial Team
Originally published on Towards AI.
What happened this week in AI by Louie
This week, we saw a continued rapid pace of new LLM releases alongside accelerating development of robotics foundation models.
Cohere released its 35 billion parameter model "Command-R," its first partially open model, while Elon Musk also announced that xAI will open-source its Grok LLM this week. These are the latest developments in a trend towards hybrid model release strategies after Mistral added a new closed model to its previously fully open-source model portfolio. Google also recently released its first fully open LLM series, Gemma. We also had a model update from Inflection, which is pursuing a somewhat unique LLM strategy. Inflection-2.5 approaches GPT-4 performance on benchmarks but stands out with its narrow focus on powering a high-EQ personal chatbot, "Pi," which the company claims now has 6 million monthly active users.
The world of robotics has traditionally been dominated by extensive human-coded routines and modeling. While machine learning use has grown for perception, it is only more recently that end-to-end ML robotics has been considered. Two weeks ago, Figure AI raised $675m from investors, including Microsoft and Nvidia, to accelerate the development of its humanoid robots, while also announcing a partnership with OpenAI. Tesla has also put a lot of focus into developing its humanoid robot program over the past year. This week, we saw Covariant AI release its RFM-1 robotics foundation model, which aims to eventually give robots human-like reasoning capabilities. RFM-1 is an 8-billion parameter transformer trained on text, images, videos, robot actions, and sensor readings.
Why should you care?
Language- and coding-centered AI applications have been the focus since the launch of ChatGPT and the wave of progress in LLMs. However, we think many of these recent advances, not to mention the massive expansion in available AI training compute, can have a transferable impact on advancing physical robots. We are excited to see an accelerated pace of progress, but it is currently hard to predict how far we are from solutions that can be scaled commercially.
– Louie Peters, Towards AI Co-founder and CEO
This issue is brought to you thanks to AIport:
Did you know that only a little over half of all AI nations develop their own generative models?
Our colleagues from the AIport online community have put together Volume 1 of the Global GenAI Landscape 2024. This GenAI research encompasses four times more nations than other similar editions. The landscape boasts 128 generative models from 107 international companies across six continents.
Highlights:
- Of the 62 countries analyzed, only 35 develop GenAI solutions in-house.
- 10% of companies worldwide have achieved multimodality in their GenAI models.
- A total of 11 companies have developed more than one type of GenAI model.
Check out the full landscape here and subscribe to AIport for more exclusive AI content!
Hottest News
1. Inflection AI Upgrades Its Chatbot Pi to Near GPT-4 Performance
Inflection has launched its latest model, Inflection-2.5, upgrading its chatbot, Pi, with capabilities that challenge LLMs like GPT-4. Notably, Inflection-2.5 achieves competitive performance in AI tasks, particularly in coding and math, while reportedly using only 40% of the compute that GPT-4 required for training. Inflection-2.5 is available to all Pi's users on iOS, Android, or the new desktop app.
2. OpenAI Announces New Board Members, Reinstates CEO Sam Altman
OpenAI announced that Altman will rejoin the company's board of directors several months after losing his seat and being pushed out as CEO. The three new members joining the board are Sue Desmond-Hellmann, former CEO of the Bill and Melinda Gates Foundation; Nicole Seligman, ex-Sony Entertainment president; and Fidji Simo, CEO of Instacart. This brings OpenAI's board to eight people.
3. Midjourney Accuses Stability AI of Image Theft, Bans Its Employees
It is claimed that employees from Stability AI infiltrated Midjourney's database and stole all prompt and image pairs, causing a 24-hour outage. In response, Midjourney reportedly banned all Stable Diffusion developers from its services. At the moment, the situation is still unfolding, and there is limited information.
4. Cloudflare Announces Firewall for AI
Cloudflare is developing a "Firewall for AI," a protection layer that can be deployed in front of large language models (LLMs) to identify abuses before they reach the models. The firewall will comprise a set of tools to detect vulnerabilities and provide visibility to model owners. It will also include products already part of Cloudflare's Web Application Firewall (WAF), such as Rate Limiting and Sensitive Data Detection.
5. Elon Musk Says xAI Will Open-Source Grok This Week
Elon Musk's AI startup xAI will open-source Grok, released last year, with features including access to "real-time" information and views undeterred by "politically correct" norms. Musk didn't elaborate on what aspects of Grok he planned to open-source.
Five 5-minute reads/videos to keep you learning
1. A Practical Guide to RAG Pipeline Evaluation (Part 1: Retrieval)
This blog analyzes LLMs such as GPT-4 as relevance judges for retrieval systems. It shows that while they determine context relevance reasonably well, with 79% accuracy on binary relevance, they struggle with low recall and with complicated queries that have multiple relevant contexts. It covers the current state of RAG development, LLM-based vs. deterministic metrics, and how to use metrics to optimize retrieval.
2. You Can Now Train a 70b Language Model at Home
Answer AI is introducing an open-source system leveraging FSDP and QLoRA that enables training a 70 billion parameter language model on just two 24GB GPUs. This article explains the concept, how it works, and how to use it.
3. How Generative AI Can Augment Human Creativity
Enterprises struggle to capitalize on the potential of generative AI because of challenges like evaluation overload, lack of domain expertise, producing a comprehensive solution, and more. This article demonstrates how this technology can help organizations overcome these challenges by augmenting the creativity of employees and customers.
4. Should We Even Care About Using LLMs To Query Enterprise Data?
This blog provides a speculative glimpse of how a single workflow might adapt to incorporate natural language processing and understanding. It hypothesizes that these natural language questions power the decision-making systems for LLM systems themselves.
5. How Selective Forgetting Can Help AI Learn Better
A team of computer scientists has created a nimbler, more flexible type of machine learning model, which must periodically forget what it knows. This article explains how this could reveal more about how these programs understand language.
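As a small illustration of the deterministic retrieval metrics mentioned in the RAG evaluation guide above, here is a minimal sketch of precision and recall at k. It is not taken from the linked blog: the function name, document IDs, and ground-truth labels are all hypothetical, and it assumes you already have a set of known-relevant document IDs for each query.

```python
def precision_recall_at_k(retrieved, relevant, k):
    """Compute precision@k and recall@k for one query.

    retrieved: ranked list of document IDs returned by the retriever.
    relevant:  set of document IDs labeled relevant (assumed ground truth).
    """
    top_k = retrieved[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    # Precision: share of returned documents that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of relevant documents that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example data for a single query.
retrieved = ["d3", "d7", "d1", "d9"]
relevant = {"d1", "d2", "d3"}

precision, recall = precision_recall_at_k(retrieved, relevant, k=4)
print(f"precision@4={precision:.2f} recall@4={recall:.2f}")
```

Low recall, the weakness the guide highlights, shows up here whenever relevant documents such as the hypothetical "d2" never make it into the top k.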
Repositories & Tools
- DSPy is a framework for algorithmically optimizing LM prompts and weights, primarily when LMs are used one or more times within a pipeline.
- Daytona is an open-source development environment manager that automates the setup of coding environments.
- Quivr is a personal productivity assistant (RAG) that allows users to chat with documents and apps using Langchain, GPT 3.5 / 4 turbo, Private, Anthropic, VertexAI, Ollama, LLMs, and Groq.
- Spyx is an innovative Spiking Neural Networks (SNNs) simulation and optimization library. It optimizes the training of expansive models of multi-billion parameters.
- OS-Copilot is a self-improving embodied conversational agent integrated into the operating system to automate daily tasks.
Top Papers of The Week
1. Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Chatbot Arena is an open platform for evaluating LLMs based on human preferences, using simple pairwise comparisons collected through crowdsourcing. It has gathered over 240,000 user votes, and the authors' analysis indicates that the crowdsourced questions are sufficiently varied and that the votes agree well with expert raters, supporting the trustworthiness of its results.
2. Resonance RoPE: Improving Context Length Generalization of Large Language Models
The study presents Resonance RoPE, a solution to improve the ability of Transformers with Rotary Position Embedding (RoPE) to handle longer sequence lengths than those seen during training (train-short-test-long scenarios). This is achieved by enhancing RoPE for out-of-distribution positions to improve model performance on longer sequences without extra computational costs.
3. Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation
This paper introduces Bonito, an open-source model for conditional task generation. Bonito was explicitly created to convert unannotated text into task-specific training datasets for instruction tuning. It is trained on a new large-scale dataset with 1.65M examples created by remixing existing instruction-tuning datasets into meta-templates.
4. ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
Recent research has identified a vulnerability in LLMs, where ASCII art can be used to conduct jailbreak attacks by exploiting their weaknesses in interpreting non-semantic prompts. The ViTC benchmark has been developed to test LLMs' abilities against these challenges, revealing that even advanced models such as GPT-3.5, GPT-4, Gemini, Claude, and Llama2 are susceptible.
5. Large Language Models Surpass Human Experts in Predicting Neuroscience Results
Synthesizing decades of research for scientific discoveries outstrips human information processing capacities. This paper presents BrainBench, a benchmark for predicting neuroscience results. LLMs surpass experts in predicting experimental outcomes. BrainGPT, an LLM tuned on the neuroscience literature, performed better yet.
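To make the idea of ranking models from user votes (as in the Chatbot Arena paper above) more concrete, here is a minimal sketch that turns a list of head-to-head votes into ratings with a standard Elo update. This is an illustrative stand-in, not the paper's exact statistical method; the model names, vote list, starting rating of 1000, and K-factor of 32 are all assumptions.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(ratings, winner, loser, k=32):
    """Apply one pairwise vote in place: the winner gains what the loser loses."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)
    ratings[winner] += delta
    ratings[loser] -= delta
    return ratings


# Hypothetical models and votes, each vote recorded as (winner, loser).
ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_b"),
    ("model_b", "model_a"),
]

for winner, loser in votes:
    update_elo(ratings, winner, loser)

# Print the resulting leaderboard, highest rating first.
for name, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{name}: {rating:.1f}")
```

One design caveat: sequential Elo updates depend on the order in which votes arrive, which is one reason a leaderboard might prefer fitting all votes jointly instead.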
Quick Links
1. Hugging Face is launching a new robotics project under former Tesla staff scientist Remi Cadene. The move signals a major departure and ambitious expansion for Hugging Face, which has primarily focused on software, not hardware, until now.
2. Cognizant, a leading IT provider, has launched a dedicated AI lab that promises to galvanize the commercialization of AI technologies. The initiative is a response to the growing demand for AI solutions.
3. Google is tackling spammy, low-quality content on Search. Google is updating its Search algorithm to demote low-quality, automated content and elevate more valuable, trustworthy websites in search rankings.
4. Towards AI explores the Top 13 AI Call Center Software for 2024, including their features, pros, cons, pricing, and user/expert opinions.
Who's Hiring in AI
Senior Machine Learning Engineer – LLM @Recursion (London, UK/Hybrid)
Gen AI – Engineer @Capco (Milan, Italy)
DevOps Engineer – II @Netomi (Gurugram, India)
Senior ML Infrastructure Engineer – Supercompute @Cohere (Vancouver, WA, Canada: United Kingdom)
Senior Python/Machine Learning Developer @FullStack Labs (Latin America/Remote)
VP, Data Engineering @Mozilla (US/Remote)
Interested in sharing a job opportunity here? Contact [email protected].
If you are preparing your next machine learning interview, don't hesitate to check out our leading interview preparation website, confetti!
Think a friend would enjoy this too? Share the newsletter and let them join the conversation.
Join over 80,000 subscribers and thousands of data leaders on the AI newsletter to keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI