This AI newsletter is all you need #54
Last Updated on July 15, 2023 by Editorial Team
Author(s): Towards AI Editorial Team
Originally published on Towards AI.
What happened this week in AI by Louie
This week we were excited to read Demis Hassabis discussing DeepMind's upcoming Gemini large language model. Historically, DeepMind has dedicated most of its efforts to reinforcement learning (RL) and remained relatively quiet on large language models (LLMs). Nevertheless, DeepMind was behind the Chinchilla paper, which has since become a benchmark for training LLMs, and also introduced Sparrow in 2022. There have been many changes at DeepMind this year, including its recent merger with Google Brain to combine Google's AI efforts, and we have been looking forward to updates. In a recent development, DeepMind has unveiled its latest project, named Gemini, an upcoming competitor to ChatGPT.
DeepMind's previous emphasis on reinforcement learning (AlphaGo) has proven highly advantageous, as the RLHF (Reinforcement Learning from Human Feedback) approach served as the secret ingredient behind the impressive performance of chat agents such as ChatGPT. According to Demis Hassabis, the CEO of DeepMind, Gemini combines the capabilities of AlphaGo-type systems with the profound language understanding found in LLMs. The model is still in development and is expected to remain so for another few months. As Hassabis noted, the project's completion could require an investment ranging from tens to hundreds of millions of dollars.
We are thrilled to witness the continuous innovation at the forefront of LLMs, such as GPT-4, and other companies' efforts to push the boundaries of these models' capabilities even further. Moreover, there is growing anticipation of a new wave of breakthroughs in LLMs, particularly with a huge increase in training compute available to leading AI companies after recent deliveries of Nvidia's H100 Tensor Core GPUs. We are interested to see whether increased compute budgets for these models will be used primarily for more complex training steps and architectures, larger training data sets, or increased model parameters.
– Louie Peters, Towards AI Co-founder and CEO
Hottest News
1. DeepMind Is Developing Gemini, a New Chatbot To Rival ChatGPT
DeepMind is developing a new chatbot named Gemini, intended to compete with, and potentially surpass, OpenAI's ChatGPT. By leveraging the achievements of AlphaGo and the language capabilities of LLMs, DeepMind aims to establish dominance in the generative AI market.
2. ElevenLabs Introduces Its Voice Library
ElevenLabs has recently introduced Voice Library, a community platform integrated with a multilingual model that facilitates the development of lifelike synthetic voices with consistent primary speech characteristics for commercial applications. The Voice Design tool allows users to customize age, gender, and accent to generate unique and natural-sounding voices.
3. MosaicML Agrees to Join Databricks to Power Generative AI for All
MosaicML, a startup dedicated to democratizing large-scale neural network training and inference, has announced a collaboration with Databricks in a $1.3B deal. The partnership aims to drive advancements in generative AI software expertise and extend customer reach while enhancing engineering capabilities.
4. Advancing Innovation With Open-Source AI: Hugging Face CEO Testifies Before the US Congress
The CEO of Hugging Face, Clément Delangue, recently testified before the US Congress on open-source AI. In his testimony, he highlighted the importance of open-source AI in advancing innovation, promoting fair competition, and ensuring responsible development. Delangue emphasized that open-source principles democratize AI and foster a more inclusive and collaborative future in the field.
5. Adobe Indemnity Clause Designed To Ease Enterprise Fears About AI-Generated Art
Adobe offers an indemnity clause to address copyright concerns surrounding their generative AI tool Firefly. By training the model on legal content and promising to cover any copyright claims, Adobe aims to ease enterprise users' worries and ensure the legality and safety of AI-generated artwork.
Five 5-minute reads/videos to keep you learning
1. Google Announces the First Machine Unlearning Challenge
Google has announced the first Machine Unlearning Challenge, a collaboration between academic and industrial researchers. This new area of machine learning focuses on eliminating the impact of certain training examples from a model to protect privacy rights. The challenge, held on Kaggle, aims to evaluate unlearning models' forgetting quality and model utility, providing insights for improvement.
2. The Rise Of The AI Engineer
The article delves into the emergence of AI engineering as a specialized field and outlines the necessary skills for success in this domain. It emphasizes the importance of possessing a solid comprehension of machine learning algorithms, data manipulation, and programming languages, as well as the ability to bridge the gap between research and implementation to create practical AI solutions.
3. The Generative AI Revolution: Exploring the Current Landscape
This article provides an overview of the present state of generative AI, emphasizing its capacity to generate coherent text, images, and code. It discusses notable models such as the Transformer, the GPT family, PaLM, Chinchilla, Megatron-Turing, and the LLaMA models. The post also explores the potential impact of generative AI in various fields, including animation, gaming, art, movies, and architecture.
4. AI and the Automation of Work
ChatGPT and generative AI are poised to revolutionize the way we work. However, how distinct is this transformation compared to previous waves of automation spanning the last two centuries? Furthermore, what implications does it hold for employment? This article explores paradigm-breaking technologies from the past and endeavors to envision the future impact of AI on the nature of work.
5. What Is Langchain and Why Should I Care as a Developer?
LangChain is experiencing remarkable growth as one of the fastest-growing open-source projects in history. This post spotlights how LangChain empowers developers to undertake incredible projects, offering a high-level overview of its capabilities. The author also shares a personal account of their experimentation journey with the framework.
Papers & Repositories
1. The Curse of Recursion: Training on Generated Data Makes Models Forget
This paper explores the potential consequences for GPT-{n} as LLMs increasingly contribute a significant portion of the language available online. It reveals that utilizing model-generated content during training leads to irreversible defects in the resultant models, resulting in the disappearance of the tails of the original content distribution. If the output is not curated diligently, one may encounter a phenomenon known as "model collapse."
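The mechanism can be illustrated with a toy simulation (our own sketch, not the paper's experimental setup): repeatedly "train" a model, here just the empirical token distribution of a corpus, on the previous generation's sampled output. Because resampling can lose tokens but never invent new ones, the rare tokens in the tail of the original distribution progressively disappear.

```python
import random

random.seed(0)

# Generation 0: "real" data. The token "rare" is a low-probability
# tail event (2% of the corpus).
corpus = ["common"] * 490 + ["rare"] * 10

support_sizes = []
for generation in range(300):
    support_sizes.append(len(set(corpus)))
    # "Train" a toy model on the corpus (its empirical distribution)
    # and generate the next generation's corpus from that model.
    corpus = random.choices(corpus, k=len(corpus))

# Sampling from an empirical distribution can only keep or lose
# tokens, never add them, so the support shrinks monotonically.
# Once the rare token drops out, it is gone for every later
# generation: the tail of the original distribution has collapsed.
print("distinct tokens at generation 0:", support_sizes[0])
print("distinct tokens at generation 299:", support_sizes[-1])
```

This is only a caricature of the paper's setting (real LLM training is far richer), but it captures why the defects are irreversible: once tail mass is lost from the training data, no later generation can recover it.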
2. BradyFU/Awesome-Multimodal-Large-Language-Models
The repository features a curated collection of papers and datasets on Multimodal Large Language Models (MLLMs). It offers valuable insights into various aspects such as multimodal instruction tuning, in-context learning, chain-of-thought, and LLM-aided visual reasoning.
3. Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language
The paper introduces LENS (Language Models ENhanced to See), a modular approach aimed at addressing computer vision problems by harnessing the capabilities of large language models (LLMs). The system employs a language model to engage in reasoning over outputs from a collection of independent and highly descriptive vision modules, which collectively offer comprehensive information about an image.
4. A More Efficient Way to Train a CLIP Model
Expanding on the recent work of CLIPA, which introduces an inverse scaling law for CLIP training, this paper introduces CLIPA-v2. CLIPA-v2 enhances the efficiency of image-text matching models, such as CLIP, by utilizing shorter sequences. It achieves a zero-shot ImageNet accuracy of 81.1% while requiring only $10,000 in resources.
5. End-to-end Autonomous Driving: Challenges and Frontiers
This survey offers a comprehensive analysis of over 250 papers, covering various aspects such as motivation, roadmap, methodology, challenges, and future trends in end-to-end autonomous driving. It delves into several critical challenges, including multi-modality, interpretability, causal confusion, robustness, and world models, among others.
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Weekly AI Podcast
In this week's episode of the "What's AI" podcast, Louis Bouchard interviews Petar Veličković, a research scientist at DeepMind and an affiliate lecturer at Cambridge. Petar shares insights on the value of a Ph.D., emphasizing its role as a gateway to research and the opportunities it provides for building connections and adaptability. He also highlights the evolving landscape of AI research, underscoring the importance of diverse backgrounds and contributions. The interview provides valuable perspectives on academia versus industry, the role of a research scientist, working at DeepMind, teaching, and the significance of curiosity in driving impactful research. Tune in on YouTube, Spotify, or Apple Podcasts if you are interested in AI research!
Upcoming Community Events
The Learn AI Together Discord community hosts weekly AI seminars to help the community learn from industry experts, ask questions, and get a deeper insight into the latest research in AI. Join us for free, interactive video sessions hosted live on Discord weekly by attending our upcoming events.
1. Uncertainty Quantification With Conformal Prediction
In critical domains like medical diagnoses and safety-critical systems, quantifying prediction uncertainty in machine learning is crucial. Conformal prediction offers a robust framework for this purpose. It allows the quantification of uncertainty for any machine learning model as a post-processing layer, without requiring model refitting. Join us for an upcoming talk where we delve into the applications of conformal prediction. Attendees are encouraged to familiarize themselves with the insights shared on the MLBoost YouTube channel before the event.
Join the event here and discover how conformal prediction enhances reliable decision-making by measuring uncertainty beyond traditional point predictions.
Date & Time: 7th July 2023, 10:00 am EST
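For a preview of the idea, the basic split conformal recipe can be sketched in a few lines (a toy example with an illustrative least-squares model; the data and names below are our own assumptions, not taken from the talk):

```python
import math
import random
import statistics

random.seed(42)

# Toy data: y = 2x + noise. Conformal prediction wraps ANY point
# predictor as a post-processing layer, with no model refitting.
def make_data(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + random.gauss(0, 1) for x in xs]
    return xs, ys

train_x, train_y = make_data(200)
cal_x, cal_y = make_data(200)  # held-out calibration split

# Fit a least-squares line on the training split (the "model").
mean_x = statistics.fmean(train_x)
mean_y = statistics.fmean(train_y)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x

def predict(x):
    return slope * x + intercept

# Split conformal: nonconformity scores are absolute residuals on
# the calibration set; take their ceil((n + 1) * (1 - alpha))-th
# smallest value as the conformal quantile.
alpha = 0.1  # target 90% coverage
scores = sorted(abs(y - predict(x)) for x, y in zip(cal_x, cal_y))
k = math.ceil((len(scores) + 1) * (1 - alpha))
q = scores[k - 1]

# Prediction interval for a new point: [prediction - q, prediction + q].
# For exchangeable data, marginal coverage is at least 1 - alpha.
test_x, test_y = make_data(1000)
covered = sum(predict(x) - q <= y <= predict(x) + q
              for x, y in zip(test_x, test_y))
print(f"empirical coverage: {covered / len(test_y):.3f}")
```

Note that the residual quantile is computed on a held-out calibration set, which is exactly why the guarantee holds without refitting the underlying model.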
2. Reading Group: Segment Anything Model (SAM)
Learn AI Together's weekly reading group presents informative presentations and discussions on the latest advancements in AI. This (free) event offers a wonderful opportunity to learn, ask questions, and engage with community members. This week, the focus will be on reviewing the "Segment Anything" paper, a recent publication from Meta research. You can access the paper here and join the discussion here.
Date & Time: 8th July 2023, 10:00 pm EST
Add our Google calendar to see all our free AI events!
Meme of the week!
Meme shared by neon8052
Featured Community post from the Discord
Akshitireddy has developed an open-source project called Interactive LLM Powered NPCs, which revolutionizes the way users interact with non-player characters (NPCs) in games. This project enables users to engage in immersive conversations with NPCs, speaking through their microphone, hearing the NPCs' voices, and witnessing realistic facial animations. The project focuses on enhancing the gaming experience in previously released titles such as Cyberpunk 2077, Assassin's Creed, GTA 5, and other popular open-world games. Check it out on GitHub and support a fellow community member. Share your feedback and questions in the thread here!
AI poll of the week!
Join the discussion on Discord.
TAI Curated section
Article of the week
Train and Deploy Custom Object Detection Models Without a Single Line Of Code by Peter van Lunteren
Object detection is a computer vision technique used to identify specific objects within images. While numerous online tutorials cover object detection, none of them offer an automated method that eliminates the need for coding. In this tutorial, the author introduces EcoAssist, an open-source application hosted on GitHub, which simplifies object detection and makes it highly accessible to users.
Our must-read articles
Gorilla: Everything You Need to Know by Muhammad Arham
Google's Latest AI Model Enables Virtual Try-On of Clothes With Unchanged Details and Flexible Poses by Shen Huang
Meet vLLM: UC Berkeley's Open Source Framework for Super Fast and Cheap LLM Serving by Jesus Rodriguez
10 Sklearn Treasure Features Overlooked By 99% of Online Courses by Bex T.
If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Job offers
Research Engineer (Prototyping) GitHub Next @GitHub (Remote)
Product Engineer, PropTech @Picket Homes (Nashville, TN, USA)
Conversational AI Designer @Cresta (Remote)
Forward Deployed Engineer @Cohere (Remote)
Machine Learning Researcher @Shiru (Remote)
Data Engineer Mid Level @pulseData (Remote)
Machine Learning Engineer @Acentra Health (Remote)
Interested in sharing a job opportunity here? Contact [email protected].
If you are preparing for your next machine learning interview, don't hesitate to check out our leading interview preparation website, Confetti!
Join over 80,000 data leaders and subscribers on the AI newsletter and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI