This AI newsletter is all you need #50

Last Updated on July 25, 2023 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louie

The open-source movement continues at pace this week with Falcon, a new family of state-of-the-art language models, ascending to the top of Hugging Face’s leaderboard.

Falcon-40B, developed by the Technology Innovation Institute in Abu Dhabi, represents the first “truly open” model, boasting capabilities that rival many existing closed-source models. The Falcon family consists of two base models: Falcon-40B and Falcon-7B. The 40B-parameter model currently leads the Open LLM Leaderboard, while the 7B model excels in its weight class. Falcon-7B and Falcon-40B were trained on 1.5 trillion and 1 trillion tokens, respectively, in line with modern models optimized for inference. The exceptional quality of the Falcon models is attributed to their training data, more than 80% of which comes from RefinedWeb, a new, extensive web dataset built on CommonCrawl. Another noteworthy feature of the Falcon models is multi-query attention, which shares a single key and value projection across all attention heads, cutting the memory cost of inference.
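For readers curious about that last detail, here is a minimal PyTorch sketch of multi-query attention, where queries keep separate heads but a single key/value projection is shared across them; the module name, shapes, and sizes are illustrative and not Falcon’s actual implementation.

```python
# Minimal sketch of multi-query attention (MQA): per-head queries,
# one shared key/value head. Illustrative only, not Falcon's code.
import torch
from torch import nn

class MultiQueryAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)            # separate query per head
        self.kv_proj = nn.Linear(d_model, 2 * self.d_head)   # single shared key/value
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)  # (b, h, t, d)
        k, v = self.kv_proj(x).split(self.d_head, dim=-1)                          # (b, t, d) each
        k, v = k.unsqueeze(1), v.unsqueeze(1)   # broadcast one KV head over all query heads
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out_proj(out)

x = torch.randn(2, 16, 512)
print(MultiQueryAttention(d_model=512, n_heads=8)(x).shape)  # torch.Size([2, 16, 512])
```

Because the key/value tensors no longer scale with the number of heads, the KV cache kept around during generation is much smaller, which is a large part of why this design helps inference.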

Falcon introduces an exciting new chapter in the open-source development of large language models, making them accessible for commercial applications. This news is truly fantastic for practitioners, enthusiasts, and industries alike, as it unlocks a multitude of exciting use cases.

– Louie Peters — Towards AI Co-founder and CEO

Hottest News

1. OpenAI’s plans according to Sam Altman

OpenAI’s future plans include prioritizing the development of a more affordable and efficient GPT-4, which will feature extended context windows and incorporate multimodality. However, the current progress is hindered by GPU shortages, preventing the immediate creation of models significantly larger in scale. Nevertheless, OpenAI intends to expand the availability of the finetuning API to encompass the latest models. Please note that the official article has been removed from public access at OpenAI’s request.

2. Nvidia demo about speaking to AI game characters

NVIDIA recently presented an impressive demonstration of conversational AI applied to game characters, showcasing its remarkable power to enhance realism and engage players. This innovation provides game developers with a valuable tool to elevate their games’ storytelling capabilities and overall player engagement.

3. Lawyer cites fake cases invented by ChatGPT, judge is not amused

In a notable incident, a lawyer cited fake cases generated by ChatGPT in a legal filing, underscoring the importance of verifying the accuracy and legitimacy of AI-generated content. Although generative AI technologies like ChatGPT have the potential to be useful in the legal sector, their imperfections can create ethical and legal problems if outputs are not diligently reviewed and scrutinized.

4. Why Nvidia is suddenly one of the most valuable companies in the world

NVIDIA’s GPUs have emerged as a critical element in AI development, sending the company’s market value soaring to an impressive $939.3 billion. Given the enormous computing demands of AI applications, numerous companies are buying thousands of NVIDIA’s high-priced chips.

5. JPMorgan Trained AI to Interpret the Federal Reserve’s Intent

According to a report by Bloomberg, JPMorgan Chase has used a ChatGPT-based language model to evaluate statements from the Federal Reserve and predict whether the central bank intends to raise or lower interest rates. Accurate signals of this kind could help investors adjust their strategies ahead of policy changes.

5-minute reads/videos to keep you learning

1. A Very Gentle Introduction to Large Language Models without the Hype

This article serves as an introduction to Large Language Models (LLMs) and does not assume any technical or mathematical background. Its purpose is to offer a glimpse into the inner workings of AI systems such as ChatGPT, explaining the fundamental concepts behind them. It provides a clear understanding of LLMs like ChatGPT and outlines what can and cannot be expected from them.

2. Barkour: Benchmarking animal-level agility with quadruped robots

Google has developed a quadruped agility benchmark called Barkour, taking inspiration from dog agility competitions. The benchmark encourages efficient, controllable, and versatile locomotion controllers for quadruped robots and uses a new policy trained with a student-teacher framework to ensure greater robustness, versatility, and dynamism.

3. Ten Years of AI in Review

This article revisits the pivotal advancements that have brought AI to its present state. It offers a comprehensive overview of the remarkable progress that has made AI a household name, appealing to both seasoned AI practitioners and passionate enthusiasts of the field.

4. Combining Text-to-SQL with Semantic Search for Retrieval Augmented Generation

This article showcases a powerful new query engine (SQLAutoVectorQueryEngine) in LlamaIndex that can leverage both a SQL database and a vector store to answer complex natural language queries over a combination of structured and unstructured data, providing a comprehensive solution for AI professionals.

5. Improving mathematical reasoning with process supervision

Researchers have discovered that process supervision can greatly improve the mathematical problem-solving abilities of AI models. By rewarding each correct step of reasoning rather than only the final answer, the model achieved remarkable state-of-the-art performance, surpassing what outcome supervision alone delivers. This approach not only improves performance but also aligns more closely with human thinking patterns.
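To make the distinction concrete, below is a small, hypothetical sketch contrasting the two labeling schemes; the solution steps, labels, and scoring function are invented for illustration and are not taken from the underlying research.

```python
# Hypothetical example: outcome vs. process supervision labels for one
# model-generated math solution. All values are made up for illustration.
solution_steps = [
    "48 / 2 = 24",    # step 1
    "24 * 3 = 72",    # step 2
    "72 + 10 = 82",   # step 3
]

# Outcome supervision: one label for the entire solution,
# based only on whether the final answer is correct.
outcome_label = 1

# Process supervision: a label per reasoning step, so a reward model
# learns *where* a chain of thought goes wrong, not just whether it did.
process_labels = [1, 1, 1]

def solution_score(step_probs):
    """Score a full solution from per-step correctness probabilities
    (a simple product, one common aggregation choice)."""
    score = 1.0
    for p in step_probs:
        score *= p
    return score

print(solution_score([0.98, 0.95, 0.97]))  # ~0.90
```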

Papers & Repositories

1. Macaw-LLM

Macaw-LLM is a multi-modal language modeling framework that seamlessly integrates images, videos, audio, and text. It achieves this by aligning multi-modal data, encoding it with CLIP and Whisper, and feeding the outputs to LLaMA for efficient learning. It supports multiple languages and can be expanded with the inclusion of additional models.

2. Controllable Text-to-Image Generation with GPT-4

This research introduces Control-GPT, a technique that leverages programmatic sketches generated by GPT-4 to guide diffusion-based text-to-image pipelines. Incorporating these sketches as references enhances the ability of the pipelines to follow instructions. This approach significantly improves spatial reasoning and enhances controllability.

3. An Open-Ended Embodied Agent with Large Language Models

Voyager is a new LLM-powered embodied lifelong-learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. It combines an automatic curriculum, a growing skill library, and code as the action space, enabling more compositional actions.

4. A Ph.D. Student’s Perspective on Research in NLP

This document compiles 14 NLP research directions that are ripe for exploration, reflecting the views of a diverse group of Ph.D. students in an academic research lab. It serves as a valuable guide for students and researchers looking for exciting areas to explore in NLP.

5. The Rise of ArtGPT-4 and its Artistic Vision

Recent advancements in AI have led to the creation of ArtGPT-4, a new language and image comprehension model designed specifically for artistic images. Despite its smaller size and lower data requirements, it surpasses other models in artistic image understanding and creation.

Enjoy these papers and news summaries? Get a daily recap in your inbox!

The Learn AI Together Community section!

Weekly AI Podcast

This week’s episode of the “What’s AI” podcast features Felix Tao, CEO of Mindverse AI, who spent years as a researcher at Facebook and Alibaba, mostly working on language applications and AI. In this interview, Felix provides valuable insights into the evolution of AI, the advancements in large language models, and the delicate balance between research and practical applications. Tune in on YouTube, Spotify, or Apple Podcasts!

Meme of the week!

Meme shared by AgressiveDisco#4516

Featured Community post from the Discord

Chartistic#5022 has developed a free-to-use, no-code AI workflow builder powered by leading AI models. With Craitive Studio, users can effortlessly chain LLMs, input prompts, and customized outputs into workflows. This opens up a wide range of applications, such as comparing outputs from different generative image models, extracting content and captions from images, leveraging ChatGPT to generate image prompts, creating blog posts and accompanying visuals, generating variations for optimal ad visuals, and more. Check out Craitive Studio here. Share your feedback here and support a fellow community member!

AI poll of the week!

Join the discussion on Discord.

TAI Curated section

Article of the week

Mastering Sentiment Analysis with Python using the Attention Mechanism by The AI Quant

This article delves into the fascinating realm of sentiment analysis and provides a comprehensive guide on building a custom sentiment analysis model using Python. The attention mechanism plays a crucial role in this process. Leveraging the powerful Keras library, you will gain insights into training a deep-learning model capable of accurately interpreting emotions.
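As a taste of the approach, here is a minimal Keras sketch of an attention-based sentiment classifier; the layer choices and hyperparameters are illustrative assumptions, not the exact model built in the article.

```python
# Minimal Keras sketch: self-attention over token embeddings, pooled into
# a binary sentiment prediction. Sizes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, seq_len, embed_dim = 20_000, 200, 64

inputs = layers.Input(shape=(seq_len,), dtype="int32")
x = layers.Embedding(vocab_size, embed_dim)(inputs)
attn = layers.MultiHeadAttention(num_heads=4, key_dim=embed_dim)(x, x)  # tokens attend to each other
x = layers.LayerNormalization()(x + attn)     # residual connection around the attention block
x = layers.GlobalAveragePooling1D()(x)        # pool token representations into one vector
x = layers.Dense(32, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)  # positive vs. negative sentiment

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

Trained on padded integer token sequences (for example, the IMDB reviews dataset that ships with Keras), a model along these lines learns which tokens the attention layer should weight most when judging sentiment.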

Our must-read articles

The Real Cost vs. Benefit Dilemma of AI in Business by John Adeojo

SaMD? We need AaMD! The Algorithm as A Medical Device by Dr. Mandar Karhade, MD. Ph.D.

If you are interested in publishing with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.

Job offers

Side Hustle Expert — AI Tools @Fud (Remote)

Senior Software Quality Engineer @Minitab (Remote)

Lead Data Engineer @Insite AI (Remote)

AI/NLP Researcher @Verneek (New York, USA)

Senior Data Engineer @Computronics Solutions (Lausanne, Switzerland)

Software Engineer, AI/ML @LRDTech (Singapore)

Deep Learning Engineer in Artificial Intelligence Start-up @Gemmo (Remote)

Interested in sharing a job opportunity here? Contact [email protected].

If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!

Join over 80,000 subscribers and data leaders on the AI newsletter to keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI
