This AI newsletter is all you need #38
Last Updated on July 25, 2023 by Editorial Team
Author(s): Towards AI Editorial Team
Originally published on Towards AI.
What happened this week in AI by Louis
This week, AI continues to thrive under competitive pressure, and excitement is building for the arrival of GPT-4. Tech giants, generative AI startups, and even individuals are competing to be at the forefront of the AI landscape by providing viable alternatives to ChatGPT. The generative AI space has also become an increased focus for venture capital, and the latest example is Stability AI, which is reportedly in talks to raise funds at a $4 billion valuation. While many generative AI startups can be bootstrapped and built on the back of API access to models such as GPT or cheap fine-tuning, larger raises are still needed to build out GPU clusters for training or to pay for cloud training and inference.
As we have been tracking in this newsletter, there has been great progress towards open-source alternatives to models such as ChatGPT in the last few weeks, as well as signs of increased flexibility in access to these models and alternatives to OpenAI. We were particularly excited to see a project from Stanford that fine-tuned Meta's 7B LLaMA model (whose weights recently leaked) to produce an open-source model called Alpaca. They generated 52k instruction-following examples with OpenAI's text-davinci-003, then ran supervised fine-tuning for three hours on eight 80GB A100s (costing just ~$100). The resulting model can also run on a single GPU, albeit slowly, and it performs similarly to ChatGPT on many tasks. We see huge potential for an increased pace of breakthroughs in NLP and transformers now that access to these models is available outside of large tech companies and at affordable fine-tuning costs.
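For readers who want a feel for what this recipe involves, here is a minimal sketch of the supervised fine-tuning step using HuggingFace transformers. The checkpoint and dataset names, prompt template, and hyperparameters are illustrative assumptions, not Stanford's exact configuration.

```python
# Sketch of Alpaca-style supervised instruction tuning. Model/dataset names
# and hyperparameters are illustrative; the actual recipe differs in details.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "decapoda-research/llama-7b-hf"  # assumed LLaMA checkpoint name
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# The 52k machine-generated instruction-following examples.
data = load_dataset("tatsu-lab/alpaca", split="train")

def to_example(row):
    # Concatenate instruction, optional input, and target output into a
    # single causal-LM training sequence.
    text = (f"### Instruction:\n{row['instruction']}\n\n"
            f"### Input:\n{row['input']}\n\n"
            f"### Response:\n{row['output']}")
    return tokenizer(text, truncation=True, max_length=512)

tokenized = data.map(to_example, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="alpaca-sft", num_train_epochs=3,
                           per_device_train_batch_size=4, bf16=True),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The notable point is how ordinary this step is: given the machine-generated instruction data, the fine-tuning itself is a standard supervised language-modeling run, which is why it fits in a few GPU-hours.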
This week, AI21 Labs launched Jurassic-2 and Task-Specific APIs, which is another positive step towards competitive access to LLMs via API and increased transparency. Jurassic-2 is the next generation of AI21's foundation models, including significant quality improvements and new capabilities.
As we prepare for a future where rapid AI progress leads to transformative AI systems, it is crucial to prioritize supporting AI safety research. OpenAI and Anthropic, among others, have been vocal about the importance of AI safety. Anthropic believes that empirically grounded safety research will have the most relevance and impact, and it acknowledges that a major reason for its existence as an organization is the necessity of conducting safety research on "frontier" AI systems. While new open-source access to these models is promising for the pace of progress and for reducing the concentration of power within a few large companies, it also provides increasing flexibility for misuse of these models. It is hard to know which approach will better limit harm: OpenAI's policy of gating access behind an API with checks and balances in place, or more open access. Either way, it seems we are already going down both paths at once.
Hottest News
1. GPT-4 is coming next week! At a hybrid information event entitled "AI in Focus – Digital Kickoff" on 9 March 2023, four Microsoft Germany employees presented large language models (LLMs) like the GPT series as a disruptive force for companies and for their Azure-OpenAI offering.
2. According to reports, Stability AI, the parent company of Stable Diffusion, an AI tool for creating digital images, is looking to raise funds at a valuation of approximately $4 billion. However, a final decision has not been made regarding the financing, and the valuation is subject to change.
3. Google's Universal Speech Model (USM): State-of-the-art speech AI for 100+ languages
Google recently shared its Universal Speech Model (USM), which it claims to be a critical first step towards supporting 1,000 languages. USM is a family of speech models with 2B parameters trained on 12 million hours of speech and 28 billion sentences of text, spanning 300+ languages.
4. Anthropic's core views on AI safety: when, why, what, and how
Anthropic recently shared why it anticipates rapid progress in AI and large impacts from the technology, and why this has led to its concerns about AI safety. The company emphasizes the urgency of supporting AI safety research, which should be done by a wide range of public and private actors.
5. MuAViC: The first audio-video speech translation benchmark
Meta AI has released MuAViC (Multilingual Audio-Visual Corpus), the first benchmark designed to enable the use of audio-visual learning for highly accurate speech translation. MuAViC will also be used to train Meta's AV-HuBERT model to translate speech in challenging, noisy environments.
Three 5-minute reads/videos to keep you learning
1. HuggingFace has recently released the integration of trl with peft, which aims to make Large Language Model (LLM) fine-tuning with Reinforcement Learning more accessible. This library is designed to simplify the RL step and provide more flexibility (see the sketch after this list).
2. The State of Competitive Machine Learning
This article summarizes the state of the competitive landscape by analyzing over 200 competitions that took place in 2022. Additionally, it delves into the analysis of 67 winning solutions to identify the best strategies for winning at competitive ML.
3. Emergent Abilities of Large Language Models
This article explores the concept of βemergenceβ in general before delving into its application to Large Language Models. It also discusses the underlying reasons for these emergent abilities and their implications.
4. The Waluigi Effect (mega-post)
This article presents a mechanistic explanation of the Waluigi Effect and other bizarre "semiotic" phenomena that arise within large language models, such as GPT-3/3.5/4 and their variants (ChatGPT, Sydney, etc.). It proposes a novel idea of "flattery and dialog" in prompt engineering.
5. Using AI to turn the Web into a database
This article presents a promising approach to implementing the semantic web using powerful Large Language Models (LLMs) combined with knowledge bases. It also proposes the concept of "Semantic Web Agents" that can navigate the web and perform tasks on behalf of users.
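Following up on the trl + peft item above, here is a minimal sketch of the pattern that integration enables: a PPO update on a causal LM whose only trainable weights are a LoRA adapter and a value head over a frozen 8-bit base model. The checkpoint name is a small demo model, the constant reward is a stand-in for a real reward model, and exact argument names may differ across trl versions.

```python
# Sketch of RLHF-style PPO fine-tuning with trl + peft. The base weights
# load in 8-bit and stay frozen; only the LoRA adapter and the value head
# train, which is what lets this fit on a single consumer GPU.
import torch
from peft import LoraConfig
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "edbeeching/gpt-neo-125M-imdb"  # small demo checkpoint
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         bias="none", task_type="CAUSAL_LM")

model = AutoModelForCausalLMWithValueHead.from_pretrained(
    model_name, load_in_8bit=True, peft_config=lora_config)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

ppo_trainer = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1),
                         model, tokenizer=tokenizer)

query = tokenizer("This movie was", return_tensors="pt").input_ids[0]
generation = ppo_trainer.generate(query, max_new_tokens=16)
response = generation.squeeze()[len(query):]  # keep only the new tokens

# A real setup would score the response with a reward model; a constant
# reward stands in here just to show the PPO update call.
stats = ppo_trainer.step([query], [response], [torch.tensor(1.0)])
```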
Papers & Repositories
1. The paper presents an experiment with a pre-trained LLM (PaLM) and a pre-trained vision model (ViT). These models are combined with new learnable weights in a larger neural network to solve tasks that involve language, vision, and planning (a minimal sketch of this pattern follows the list below).
2. Visual ChatGPT connects ChatGPT with a series of Visual Foundation Models, enabling users to send and receive images while chatting. The goal is to build an AI that can handle various tasks by combining the general interface of ChatGPT with the domain expertise of foundation models.
3. Large Language Models Encode Clinical Knowledge
MultiMedQA is a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries.
4. Prismer: A Vision-Language Model with An Ensemble of Experts
Prismer is a data- and parameter-efficient vision-language model that leverages an ensemble of domain experts. Experimental results demonstrate that Prismer achieves competitive performance with current state-of-the-art models on fine-tuned and few-shot learning tasks while requiring up to two orders of magnitude less training data.
5. Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
The Cascade of Foundation models (CaFo) is a framework that combines various pre-training paradigms, including CLIP, DINO, DALL-E, and GPT-3, to improve few-shot learning. CaFo uses a "Prompt, Generate, then Cache" approach to leverage the strengths of each pre-training method and achieve state-of-the-art performance in few-shot classification.
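As a companion to the first paper above, here is a minimal sketch of the general pattern it describes: a frozen vision encoder and a frozen language model joined by a small set of newly initialized, learnable weights. PaLM and the paper's large ViT are not publicly available, so small open models stand in here.

```python
# Sketch of gluing a frozen vision encoder to a frozen LLM with a single
# learnable projection. Open stand-ins (ViT-base, GPT-2) replace the
# paper's PaLM and large ViT, which are not public.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer, ViTModel

vit = ViTModel.from_pretrained("google/vit-base-patch16-224")
llm = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

for p in vit.parameters():
    p.requires_grad = False  # vision encoder stays frozen
for p in llm.parameters():
    p.requires_grad = False  # language model stays frozen

# The only trainable weights: project ViT patch features into the LLM's
# token-embedding space so image patches act like "soft tokens".
project = nn.Linear(vit.config.hidden_size, llm.config.n_embd)

def forward(pixel_values, prompt):
    patches = vit(pixel_values).last_hidden_state        # (1, P, 768)
    image_tokens = project(patches)                      # (1, P, n_embd)
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    text_tokens = llm.transformer.wte(ids)               # (1, T, n_embd)
    inputs = torch.cat([image_tokens, text_tokens], dim=1)
    return llm(inputs_embeds=inputs).logits

logits = forward(torch.randn(1, 3, 224, 224), "Describe the scene:")
```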
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Announcing the What's AI Podcast!
Louis Bouchard has launched a new project aimed at demystifying the various roles in the AI industry and discussing interesting AI topics with expert guests. The podcast, available on Spotify and Apple Podcasts, features interviews with industry experts. The latest episode features Chris Deotte, a Quadruple Kaggle Grandmaster at NVIDIA, who discusses topics such as crafting a strong data science resume, achieving grandmaster status on Kaggle, working at NVIDIA, and approaches to current data science challenges. Right now, listeners have the opportunity to participate in an NVIDIA RTX 4080 giveaway directly from the podcast! Check it out here.
Meme of the week!
Meme shared by dimkiriakos#2286
Featured Community post from the Discord
AdriBen#5135 shared a paper titled "A Scalable, Interpretable, Verifiable & Differentiable Logic Gate Convolutional Neural Network Architecture From Truth Tables", a DCNN design that could be suitable for security applications such as formal verification, rule-based modeling, fairness, and trustworthy machine learning. The paper introduces a new definition of the elementary convolution operator as a tractable Boolean function, which enables computing the complete distribution of the neural network before production. Read it here and support a fellow community member. Share your feedback and questions in the thread here!
AI poll of the week!
Join the discussion on Discord.
TAI Curated section
Article of the week
How Can Hardcoded Rules Overperform ML? by Ivan Reznikov
Although machine learning has its advantages in problem-solving, it is not always the best solution. In certain areas, such as those where interpretability, robustness, and transparency are critical, rule-based systems may even outperform machine learning. This article discusses the use cases of hybrid systems and the benefits of integrating them into an ML pipeline, examining practical examples from industries such as healthcare, finance, and supply chain management (a toy illustration of the hybrid pattern follows below).
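To make the hybrid idea concrete, here is a toy illustration (not from the article) of a rule layer taking precedence over an ML fallback; the rules, data, and model are all placeholders.

```python
# Toy hybrid pipeline: transparent, auditable rules fire first, and an ML
# model handles everything the rules do not cover.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 2)), rng.integers(0, 2, 200)
model = LogisticRegression().fit(X, y)  # placeholder ML fallback

def predict(x):
    # Rule layer: exact domain knowledge takes priority.
    if x[0] > 3.0:    # e.g., a regulatory threshold
        return 1
    if x[1] < -3.0:   # e.g., a known-impossible case
        return 0
    # ML fallback for the ambiguous middle ground.
    return int(model.predict([x])[0])

print(predict([4.0, 0.0]), predict([0.1, -0.2]))
```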
Our must-read articles
The Impact of 5G Technology on IoT & Smart Cities by Deepankar Varma
PCA: Bioinformatician's Favorite Tool Can Be Misleading by Salvatore Raieli
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Job offers
Expert Data Scientist @Impact (Remote)
Data Engineer V @ID.me (Remote)
Associate Data Scientist @Freenome (Remote)
Data Scientist @Deep Genomics (Remote)
Machine Learning Scientist @Convergent Research (Remote)
Senior Full Stack Engineer @ClosedLoop (Remote)
Senior ML Engineer @SuperAnnotate (Yerevan, Armenia)
Senior Data Engineer – Analytics @ASAPP (Bangalore, India/Hybrid)
Interested in sharing a job opportunity here? Contact [email protected].
If you are preparing your next machine learning interview, don't hesitate to check out our leading interview preparation website, Confetti!
Join thousands of data leaders on the AI newsletter. With over 80,000 subscribers, it keeps you up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI