
This AI newsletter is all you need #35

Last Updated on July 25, 2023 by Editorial Team

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louis

While ChatGPT continues to create excitement in AI, this week we saw lots of active discussion and argument around the model’s inner workings and why it is so successful at producing meaningful text. Opinion is divided on just how smart ChatGPT is, how it works, and how significant its release is to the future of AI. We find both sides of these arguments valuable. Different perspectives and thoughtful new ways of describing the workings of large language models and transformers such as ChatGPT can help continue to improve these models and address their shortcomings.

An article in The New Yorker described ChatGPT as a “Blurry JPEG of the Web” because of its ability to retain much of the information on the Web. However, it’s important to note that if you’re looking for an exact sequence of bits, you won’t find it. Instead, all you will ever get is an approximation, or worse, a hallucination (ChatGPT confidently giving you false information it invented). Unlike a blurry JPEG, ChatGPT is revolutionary because the approximation is presented as grammatical text, which ChatGPT excels at creating. Though it may not be perfect, the text produced by ChatGPT is usually acceptable. In essence, you’re still looking at a blurry JPEG, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.

Stephen Wolfram’s article provides another interesting and very in-depth take on how ChatGPT works. He runs through the surprising contrast between the relatively basic concept, simple neural-network elements, and basic operations behind ChatGPT and its remarkable results, even though it doesn’t always “globally make sense” because it is just saying things that “sound right” based on its training set. Wolfram also discusses the similarities and differences between ChatGPT’s underlying neural-net structure and the human brain, as well as some of its shortcomings, such as its lack of computational capability. Overall, he concludes that ChatGPT is a great example of the remarkable things large numbers of simple computational elements can do, and that it provides great impetus to better understand human language and the processes of thinking behind it.

While ChatGPT has made significant strides in making AI more accessible, the next frontier in the field could be building trust in AI, especially trustworthy AI that does not fail confidently and can explain its results in ways humans can understand and judge. Demonstrating the reliability and trustworthiness of an AI system could become a competitive advantage that outweighs having the largest or fastest repository of answers.

Hottest News

1. Microsoft to demo its new ChatGPT-like AI in Word, PowerPoint, and Outlook soon

Microsoft plans to bring its new ChatGPT-like AI to core productivity apps such as Word, PowerPoint, and Outlook in the coming weeks by integrating OpenAI’s language AI technology and its Prometheus model.

2. ChatGPT Burns Millions Every Day. Can Computer Scientists Make AI One Million Times More Efficient?

Training and running a large language model like ChatGPT is expensive. While our brains are a million times more efficient than the GPUs, CPUs, and memory that make up ChatGPT’s cloud hardware, neuromorphic computing researchers are working to bring what big server farms in the cloud can do today to small devices, far more simply and cheaply. Their approach mirrors the way the brain works, blending hardware, software, and algorithms in an intertwined way.

3. Audiobook Narrators Fear Apple Used Their Voices to Train AI

After facing backlash, Spotify has paused an arrangement that allowed Apple to use some audiobook files for training machine learning models. The dispute arose after several narrators learned of a clause in contracts between authors and Findaway Voices, a leading audiobook distributor, which gave Apple the right to “use audiobook files for machine learning training and models.”

4. Honest Lying: Why Scaling Generative AI Responsibly is Not a Technology Dilemma in as Much as a People Problem

In recent instances, AI models have confidently provided responses containing incorrect historical, scientific, or physical information. The expected widespread adoption of large language models may increase the production of false or erroneous memories without any intent to deceive, which the ClinMed journal defines as “confabulating” or “honest lying.”

5. How should AI systems behave, and who should decide?

OpenAI shares information on how ChatGPT’s behavior is shaped, as well as their plans for improving its behavior, allowing for more user customization, and increasing public input into the decision-making process in these areas.

Five 5-minute reads/videos to keep you learning

1. How Not to Test GPT-3

Doing psychology on large language models is harder than you might think. A recent study on theory of mind by a Stanford business professor has been one of the biggest stories in the AI world recently. This article explores whether GPT-3 has really mastered theory of mind (ToM). The truth, however, is that GPT often fails on problems involving it.

2. How to find a job in Generative AI, and what is it like?

In this video, Louis Bouchard dives into a conversation with Or Gorodissky, VP of R&D at D-ID, about how to find a job in Generative AI and what the day-to-day is like. The interview covers a wide range of topics related to Generative AI, such as the ideal education for getting into this field, the job interview process, and what it’s like to work at a startup in such a fast-evolving industry.

3. PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware

PEFT approaches achieve performance comparable to full fine-tuning while training only a small fraction of a model’s parameters. HuggingFace has introduced a PEFT library that provides the latest parameter-efficient fine-tuning techniques, seamlessly integrated with HuggingFace Transformers and Accelerate. This enables the use of the most popular and performant models from Transformers, coupled with the simplicity and scalability of Accelerate.
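
To give a sense of how little code this takes, here is a minimal sketch of LoRA fine-tuning with the peft library; the model name and hyperparameters below are illustrative choices, not taken from the announcement.

```python
# A minimal sketch of LoRA with the peft library (pip install peft transformers).
# Model name and hyperparameters are illustrative, not from the announcement.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-large")

# Wrap the frozen base model so that only small low-rank adapter
# matrices (rank r) receive gradients during fine-tuning.
peft_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,             # rank of the low-rank update matrices
    lora_alpha=32,   # scaling applied to the adapter output
    lora_dropout=0.1,
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()
# Prints something like: trainable params: 2359296 || all params: 1231940608
```

The same wrapper pattern applies to the library’s other techniques, such as prefix tuning and prompt tuning: swap the config object and the rest of the Transformers training loop stays unchanged.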

4. Create consistent characters in Midjourney

In this Twitter thread, @nickfloats shares the best way he has found to create consistent characters in Midjourney. He provides a step-by-step guide for the process, including prompting techniques and setting up the entire workflow.

5. Could Stable Diffusion Solve a Gap in Medical Imaging Data?

The Stanford AIMI scholars have found a way to generate synthetic chest X-rays by fine-tuning the open-source Stable Diffusion foundation model. This breakthrough is promising, as it could lead to more extensive research, a better understanding of rare diseases, and even the development of new treatment protocols. This article provides a walk-through of the process.
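
To make the idea concrete, here is a hypothetical sketch of sampling from a domain-adapted Stable Diffusion checkpoint with the diffusers library; the model path and prompt are placeholders, not the Stanford AIMI artifacts.

```python
# Hypothetical sketch: sampling synthetic chest X-rays from a fine-tuned
# Stable Diffusion checkpoint via diffusers. The model path is a placeholder.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/finetuned-chest-xray-sd",  # placeholder for a domain-adapted checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe("frontal chest X-ray, mild cardiomegaly, no effusion").images[0]
image.save("synthetic_cxr.png")
```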

Papers & Repositories

1. LangChain: Building applications with LLMs through composability

The LangChain library aims to assist in the development of applications that can leverage the power of LLMs with other sources of computation or knowledge. It is designed to help with six main areas: LLMs and Prompts, Chains, Data-Augmented Generation, Agents, Memory, and Evaluation.
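
As a small example of that composability, here is a sketch chaining a PromptTemplate to an LLM via LLMChain; it assumes the OpenAI integration with an OPENAI_API_KEY in the environment, and the prompt text is illustrative.

```python
# A minimal sketch of LangChain composability: a PromptTemplate piped
# into an LLM via LLMChain. Assumes `pip install langchain openai` and
# an OPENAI_API_KEY in the environment; the prompt is illustrative.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="List three open research problems in {topic}, one per line.",
)

# The chain composes the prompt and the model into one reusable unit.
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
print(chain.run("data-augmented generation"))
```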

2. Describe, Explain, Plan, and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

This paper investigates the planning problem in Minecraft, an open-ended and challenging environment for developing multi-task embodied agents. The DEPS (Describe, Explain, Plan, and Select) approach offers better error correction through feedback during long-horizon planning and selects among candidate sub-goals based on estimated proximity via a learnable module called the Goal Selector. The experiments produce the first multi-task agent capable of accomplishing over 70 Minecraft tasks, nearly doubling overall performance.
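
Schematically, the loop reads something like the pseudocode below; every function and object name here is hypothetical, standing in for the paper’s actual components.

```python
# Schematic pseudocode of a describe-explain-plan-select loop in the
# spirit of DEPS. All function and object names are hypothetical.
def deps_episode(task, env, llm, goal_selector, max_steps=100):
    plan = llm.plan(task)  # initial LLM-generated sequence of sub-goals
    for _ in range(max_steps):
        # Select: rank remaining sub-goals by estimated proximity.
        goal = goal_selector.pick(plan, env.state())
        success = env.execute(goal)
        if env.task_complete(task):
            return True
        if not success:
            # Describe the current situation, Explain the failure,
            # then Plan again with that feedback in the prompt.
            description = llm.describe(env.state())
            explanation = llm.explain(goal, description)
            plan = llm.replan(task, plan, explanation)
    return False
```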

3. Level Generation Through Large Language Models

The paper explores using LLMs to create levels for the game Sokoban and concludes that LLMs are capable of generating such levels. It also reveals that LLM performance is significantly related to the size of the dataset. Additionally, the paper presents initial tests on controlling LLM-based level generators and discusses potential research areas.

4. Augmented Language Models: A Survey

This survey reviews the works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. After reviewing the current advances in ALMs, the work concludes that this new research direction has the potential to address common limitations of traditional LMs, such as interpretability, consistency, and scalability issues.

5. Google Research, 2022 & beyond: Algorithmic advances

This article is part of a series of posts covering different research areas at Google. It highlights Google’s progress in scaling up machine learning (ML) solutions, ensuring privacy in ML, developing market algorithms, and advancing the algorithmic foundations of large-scale ML deployment.

Enjoy these papers and news summaries? Get a daily recap in your inbox!

The Learn AI Together Community section!

Upcoming Community Events

The Learn AI Together Discord community hosts weekly AI seminars to help the community learn from industry experts, ask questions, and get a deeper insight into the latest research in AI. Join us for free, interactive video sessions hosted live on Discord weekly by attending our upcoming events.

1. Graph Neural Networks (NN Architecture Seminar #7.1)

This week’s session in the (free) nine-part Neural Networks Architectures series will be led by Pablo Duboue (DrDub) and will focus on Graph Neural Networks. This is the first half of the 7th lecture. During this session, he will explore topics such as graph processing architectures, local vs. global approaches, GNNs, DGCNNs, GCNs, and MPNNs. Find the link to the seminar here or add it to your calendar here.

Date & Time: 21st February, 11 pm EST

2. Graph Neural Networks (NN Architecture Seminar #7.2)

This is the second half of the 7th lecture in the (free) nine-part Neural Networks Architectures series, led by Pablo Duboue (DrDub) and focused on Graph Neural Networks. Find the link to the seminar here or add it to your calendar here.

Date & Time: 23rd February, 6:30 pm EST

If you missed the first part of the series, find last week’s event recordings here.

3. LAIT’s Reading Group

Learn AI Together’s weekly reading group offers informative presentations and discussions on the latest developments in AI. It is a great (free) event to learn, ask questions, and interact with community members. Join the upcoming reading group discussion here.

Date & Time: 25th February, 10 pm EST

Add our Google calendar to see all our free AI events!

Meme of the week!

Meme shared by neuralink#7014

Featured Community post from the Discord

Daemonz#2594 has created an open-source tool for exploring GitHub data using GPT-powered querying. The tool lets users ask questions in natural language, generates the corresponding SQL queries, and presents the results visually, effectively letting you chat with 5 billion rows of GitHub data. Check out the tool here and support a fellow community member. Share your feedback in the thread here.
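
The pattern behind such tools is simple to sketch; this hypothetical example (not the tool’s actual code) shows the idea with the OpenAI completions API and a made-up table schema.

```python
# Hypothetical sketch of GPT-powered querying: prompt a model with a
# table schema and a natural-language question, get SQL back. This is
# the general pattern, not the featured tool's actual code.
import openai  # assumes OPENAI_API_KEY is set in the environment

schema = "github_events(actor TEXT, repo TEXT, event_type TEXT, created_at TIMESTAMP)"
question = "Which repositories gained the most stars in January 2023?"

prompt = (
    f"Given the table {schema}, write a single SQL query that answers:\n"
    f"{question}\nSQL:"
)
resp = openai.Completion.create(
    model="text-davinci-003", prompt=prompt, temperature=0, max_tokens=200
)
print(resp["choices"][0]["text"].strip())
```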

AI poll of the week!

Join the discussion on Discord.

TAI Curated section

Article of the week

Traffic Forecasting: The Power of Graph Convolutional Networks on Time Series by Barak Or, PhD

The Graph Convolutional Network (GCN) represents a groundbreaking development in deep learning, demonstrating its versatility and potential for addressing real-world problems. Traffic prediction is a critical issue in transportation, and the capacity to apply GCN algorithms for this purpose holds immense promise and has the potential to significantly impact the transportation industry.
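
To give a flavor of the core operation, here is a minimal sketch of a single GCN layer in plain PyTorch, applied to a toy road-sensor graph; all shapes and names are illustrative, not taken from the article.

```python
# A minimal sketch of one GCN layer (symmetric-normalized aggregation),
# applied to a toy road-sensor graph. Shapes and names are illustrative.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Normalize adjacency with self-loops: D^{-1/2} (A + I) D^{-1/2}
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        a_norm = d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)
        # Aggregate neighbor features, then apply a learned transform.
        return torch.relu(self.linear(a_norm @ x))

# Toy usage: 50 sensors, each with 12 recent speed readings, mapped to
# 32-dimensional embeddings over a random symmetric road graph.
x = torch.randn(50, 12)
adj = (torch.rand(50, 50) > 0.9).float()
adj = ((adj + adj.T) > 0).float()
print(GCNLayer(12, 32)(x, adj).shape)  # torch.Size([50, 32])
```

For traffic forecasting, layers like this are typically stacked with a temporal model over the sequence of sensor readings, so the network learns both road-network structure and time dynamics.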

Our must-read articles

How to Use Hugging Face Pipelines? by Tirendaz AI

Understanding Machine Learning Performance Metrics by Pranay Rishith

Creating our first optimized DCGAN by Pere Martra

If you want to publish with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.

Job offers

Senior Data Scientist @Link Financial Technologies, Inc (Remote)

Senior ML Operations Engineer @Overjet (Remote)

Senior Full Stack Engineer @Labelbox (Remote)

AI/ML Wireless Communications Engineer @Anduril Industries (Costa Mesa, CA, USA)

Senior Software Engineer @CCRi (Charlottesville, VA, USA)

ML Research Scientist @Genesis Therapeutics (Burlingame, CA, USA)

Junior — Mid-level Data Scientist @Anagenex (Boston, MA, USA)

Interested in sharing a job opportunity here? Contact [email protected].

If you are preparing your next machine learning interview, don’t hesitate to check out our leading interview preparation website, confetti!

Join over 80,000 subscribers and thousands of data leaders on the AI newsletter to keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI
