
TAI #116; Rise of the Protein Foundation Model; Comparing AlphaProteo, Chai-1, HelixFold3, and AlphaFold-3.

Author(s): Towards AI Editorial Team

Originally published on Towards AI.

What happened this week in AI by Louie

There were plenty of LLM developments again this week, including a new Agent coding product from Replit, Anthropic's new Enterprise plan with a 500K-token Claude context window, a new DeepSeek 2.5 model, and a $1bn seed round raised by Ilya Sutskever for his new startup. There has also been plenty of progress in AI outside LLMs, and one of our favorite application areas is foundation models for protein, biomolecule, and drug design. This week was very busy on this front, with the release of DeepMind's AlphaProteo protein binder model and Chai Discovery's Chai-1 multimodal structure prediction model. This followed Baidu's HelixFold3 model last week, which attempted an open-source replication of AlphaFold-3 (itself released in May).

Chai Discovery came out of stealth with a $30m seed round and the launch of its Chai-1 model. It stands out as a highly versatile multi-modal model, excelling in various structure prediction tasks. It offers model weights and code for non-commercial use, with commercial access provided through a web interface. AlphaProteo from Google DeepMind specifically targets the design of novel protein binders, crucial for drug discovery and research. It leverages AI to generate high-affinity binders with greater efficiency than traditional methods. HelixFold3, developed by the PaddleHelix team at Baidu, focuses on replicating the capabilities of AlphaFold-3 in an open-source manner, demonstrating comparable accuracy in predicting the structures of proteins, nucleic acids, and small molecule ligands.

Why should you care?

To fully understand human biology and design treatments for more diseases, we need to have a better understanding of proteins, protein binding, and interactions with other biomolecules. AI use in drug development or drug target identification has already shown very positive early results. It can already cut development time down to ~30 months from initiation to phase 1 trial (vs. 60 months for normal drugs), and a recent study measured an 80–90% phase 1 success rate for AI drugs (vs. 40–65% for normal drugs). However, there are still only ~70 AI drugs in clinical trials relative to many thousands overall, and none has yet passed phase 3.

These four new AI protein models showcase the rapid progress and diverse applications within protein structure prediction, which we expect to lead to faster progress in AI-assisted drug design. Each model offers unique strengths and addresses specific challenges in the field.

While these models alone won't find a new drug (separate models are needed to identify new drug targets, for example, more data needs to be collected for many areas of biology, and many lab experiments are still needed to verify and iterate on predictions), we think the emergence of these protein foundation models could be the catalyst for a "ChatGPT moment" for AI's use in drug design. AI tools are now much more accessible, and we hope many more biology "foundation" models will be developed and made available.

– Louie Peters, Towards AI Co-founder and CEO

In collaboration with Bright Data:

See how businesses are transforming their AI systems using cutting-edge, real-time, and ethically sourced web data.

Stay current with real-time insights: AI models thrive on real-time data from the web, enabling them to reflect the latest trends and deliver accurate, up-to-the-minute responses.

Access diverse perspectives across the web: Tap into rich, unstructured data from millions of online sources like news platforms, forums, and social media, providing AI with invaluable, varied content for deeper learning.

Boost LLM performance with real-world context: Large Language Models gain a competitive edge by using nuanced web data, improving their ability to understand context, tone, and intent for more refined output.

Scale data collection effortlessly: With Bright Data, businesses can scale data gathering operations based on their needs, seamlessly adapting to real-time changes in demand while fueling AI models with the latest information.

Ensure data collection is ethical and compliant: Navigate the complexities of data privacy regulations confidently. Bright Data ensures data is collected transparently, upholding legal standards and ethical guidelines.

🔗 Read more about how web data can elevate your AI systems today!

Hottest News

1. Replit Announced Replit Agent

Replit announced Replit Agent in early access. It is an AI-powered tool that assists users in building software projects. It can understand natural language prompts and help create applications from scratch, as well as automate tasks such as setting up a dev environment, installing packages, configuring databases, and more.

2. Chai Discovery Releases Chai-1

Chai-1 is a new multi-modal foundation model for molecular structure prediction. It enables unified prediction of proteins, small molecules, DNA, RNA, covalent modifications, and more.

3. Anthropic Introduces Claude for Enterprise

The Claude Enterprise plan helps organizations collaborate with Claude using their internal knowledge. It offers an expanded 500K context window, more usage capacity, and a native GitHub integration. It also includes security features like SSO, role-based permissions, and admin tooling.

4. Together AI Presents a New Approach for Distilling and Accelerating Hybrid Models

Together AI presents a case for distilling Transformer models into hybrid linear RNNs like Mamba. According to the research paper, this approach preserves the generative capabilities of Transformers while significantly enhancing their efficiency, making them more suitable for deployment. The distilled hybrid Mamba model (with half the attention layers replaced) achieves a score on the MT-Bench benchmark similar to the teacher model and slightly better on the AlpacaEval benchmark in both LC win rate and overall win rate. We see this as a very promising method for deploying leading LLMs more affordably, and it may join the inference optimization toolkit alongside quantization, pruning, and more traditional transformer-to-transformer distillation.

5. Ilya Sutskever’s Startup, Safe Superintelligence, Raises $1B

Ilya Sutskever, co-founder of OpenAI, has launched Safe Superintelligence (SSI), an AI startup focused on developing safe superintelligent AI systems. SSI recently raised $1 billion, valuing the company at $5 billion, with backing from top venture capital firms like Andreessen Horowitz and Sequoia Capital. The funds are earmarked for computing power and talent acquisition, with a core focus on foundational AI research and safety.

6. OpenAI Considers $2,000 Monthly Subscription Prices for New LLMs

OpenAI is reportedly considering a $2,000 monthly subscription for advanced LLMs like Strawberry and Orion, a significant increase from the current $20 ChatGPT Plus fee. Additionally, OpenAI is looking to streamline its corporate structure to attract investment, aiming for a valuation above $100 billion, while garnering interest from major investors like Apple and Microsoft. We are skeptical the $2,000 price point will materialize, aside from maybe large enterprise plans, given competition in the LLM space and progress on inference costs. However, we see a lot of potential for investing in significantly higher inference time compute to achieve new capabilities.

7. Intel CEO To Pitch Board on Plans To Shed Assets, Cut Costs

Intel CEO Pat Gelsinger is set to present a cost-cutting strategy involving asset sales and scaled-back capital expenditures, which may include halting a $32 billion German factory project. Engaging Morgan Stanley and Goldman Sachs, Intel aims to stabilize amidst financial difficulties and intense AI sector competition.

8. AlphaProteo Generates Novel Proteins for Biology and Health Research

AlphaProteo, a new AI tool by DeepMind, designs novel proteins with high binding affinity for targets like VEGF-A, enhancing drug development and biological research. Trained on extensive protein data, it often outperforms existing methods and optimized binders.

9. Elon Musk Claims He Just Activated the World’s Most Powerful AI Supercomputer

Elon Musk's xAI has launched Colossus, a major training cluster with 100,000 Nvidia H100 GPUs, which Musk claims makes it the world's most powerful AI training system. Colossus was built in 122 days in Memphis and is set to double in capacity soon; the launch comes amid a global GPU shortage, with rivals like Meta and Microsoft also competing for AI supremacy.

10. DeepSeek 2.5 Is Out

DeepSeek-AI has released DeepSeek-V2.5, a powerful Mixture of Experts (MoE) model with 236 billion total parameters, featuring 160 routed experts and 21 billion active parameters per token for optimized performance. The model excels in chat and coding tasks, with cutting-edge capabilities such as function calling, JSON output generation, and Fill-in-the-Middle (FIM) completion.

11. Salesforce Releases xLAM: A Family of Large Action Models for AI Agents

Salesforce has introduced two advanced AI models, xGen-Sales and xLAM, to help businesses increase automation and efficiency. These models are designed to set a new standard for AI-driven automation, particularly in sales and tasks that require triggering actions within software systems.

Seven 5-minute reads/videos to keep you learning

1. A Beginner’s Guide to Converting Numerical Data to Categorical: Binning and Binarization

This post explores two techniques: binning and binarization. These techniques are perfect for scenarios involving datasets such as Google Play Store data, where categories, like the number of app downloads, are more telling than raw numbers. The post provides easy-to-follow code examples for wrangling numerical data into meaningful categories.
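
As a minimal sketch of the idea (the bin edges and labels below are hypothetical download-count buckets, not taken from the linked post), binning maps a number into a labeled range, while binarization collapses it to 0/1 around a threshold:

```python
from bisect import bisect_right

def bin_value(x, edges, labels):
    """Binning: map a numeric value to a categorical label.

    The edges define half-open bins: x < edges[0] gets labels[0],
    edges[i-1] <= x < edges[i] gets labels[i], and so on.
    """
    return labels[bisect_right(edges, x)]

def binarize(x, threshold):
    """Binarization: collapse a number to 0/1 around a threshold."""
    return 1 if x > threshold else 0

# Hypothetical app-download buckets (Play Store-style counts).
edges = [1_000, 100_000, 10_000_000]
labels = ["niche", "popular", "hit", "viral"]

print(bin_value(500, edges, labels))        # niche
print(bin_value(5_000_000, edges, labels))  # hit
print(binarize(42, threshold=100))          # 0
```

Libraries like pandas (`pd.cut`) and scikit-learn (`KBinsDiscretizer`, `Binarizer`) provide the same transformations for whole columns.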

2. In Defense of the Small Language Model

The article highlights that smaller language models (SLMs) like Llama 3.1–8B can excel in specific tasks such as PII redaction using advanced prompt engineering and efficient fine-tuning. These SLMs offer a cost-effective alternative to larger models like GPT-4o, making them suitable for enterprises balancing performance with budget constraints.

3. The Missing Guide to the H100 GPU Market

The article analyzes the H100 GPU market, highlighting stable prices and flexible short-term rentals. It compares the H100 to the A100 GPU, noting falling prices and growing availability. It also outlines the costs of purchasing versus renting, recommending rentals for better scalability and operational efficiency in AI applications.

4. Top 10 Things We Learned About Educating Our Teams on Using AI Every Day

This article shares essential lessons about educating teams on the effective use of AI in their everyday tasks. It includes insights from Atlassian’s observation of 25 project managers and knowledge workers as they tested and implemented AI into their daily routines.

5. Introduction to GGML

ggml is a minimalist C/C++ machine learning library designed for efficient Transformer inference, offering benefits like easy compilation, small binary size, and optimized memory usage. While it supports various hardware platforms, it requires low-level programming skills and may not support all tensor operations across all backends.

6. The Power of Pipelines

Machine learning projects often require the execution of a sequence of data preprocessing steps followed by a learning algorithm. Managing these steps individually can be cumbersome and error-prone. This post will explore how pipelines automate critical aspects of machine learning workflows, such as data preprocessing, feature engineering, and the incorporation of machine learning algorithms.
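
A minimal sketch of the technique, assuming scikit-learn and synthetic data in place of a real dataset: a Pipeline chains the preprocessing steps and the learning algorithm so fitting and prediction happen in one call, and preprocessing statistics are learned from the training data only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Steps run in order: the scaler is fit on the training split only,
# avoiding leakage of test-set statistics into preprocessing.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipe.fit(X_train, y_train)
accuracy = pipe.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Because the pipeline is a single estimator, it can also be passed as-is to cross-validation or grid search, which is where most of the error-proneness of hand-managed steps disappears.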

7. Simplifying Data Preprocessing With ColumnTransformer in Python: A Step-by-Step Guide

This blog explores how to handle data without ColumnTransformer and how it can make data preprocessing tasks much easier. The author works with a dummy dataset to make the concepts more digestible.
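
As a minimal sketch (the toy columns below are hypothetical, not the author's dummy dataset), ColumnTransformer applies a different transformer to each group of columns in a single pass:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Hypothetical toy data: one numeric and one categorical column.
df = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "city": ["NY", "SF", "NY", "LA"],
})

# Scale the numeric column and ordinal-encode the categorical one;
# columns not listed are dropped by default (remainder="drop").
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age"]),
    ("cat", OrdinalEncoder(), ["city"]),
])
transformed = preprocess.fit_transform(df)
print(transformed.shape)  # (4, 2)
```

Without ColumnTransformer, each column group would need to be sliced, transformed, and re-concatenated by hand, which is exactly the bookkeeping the blog post sets out to eliminate.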

Repositories & Tools

1. Anthropic Quickstarts is a collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API.
2. Composio equips your AI agents & LLMs with 100+ high-quality integrations via function calling.
3. Laminar is a small library for building web application interfaces. It keeps the UI state in sync with the underlying application state.
4. CAR is a tool that creates and extracts archives without copying data.

Top Papers of The Week

1. LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

LongRecipe presents an efficient training method to extend LLMs’ context windows from 8k to 128k using innovative techniques that reduce computational resources by 85%, enabling this expansion on modest hardware.

2. LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture

LongLLaVA is a Multi-modal Large Language Model (MLLM) designed to process extensive image datasets efficiently. It integrates Mamba and Transformer blocks, focuses on temporal and spatial dependencies, and utilizes progressive training to handle up to a thousand images with high throughput, low memory consumption, and competitive benchmark performance.

3. Evaluating Environments Using Exploratory Agents

This paper uses an exploratory agent to provide feedback on the design of five engaging levels and five unengaging levels of a game. The study shows that the exploratory agent can clearly distinguish between engaging and unengaging levels. This can be an effective tool for assessing procedurally generated levels in terms of exploration.

4. Attention Heads of Large Language Models: A Survey

This survey aims to shed light on the internal reasoning processes of LLMs by concentrating on the interpretability and underlying mechanisms of attention heads. It distills the human thought process into a four-stage framework: Knowledge Recalling, In-Context Identification, Latent Reasoning, and Expression Preparation. Using this framework, it reviews existing research to identify and categorize the functions of specific attention heads.

5. FLUX that Plays Music

FluxMusic is an advanced text-to-music generation model that combines diffusion-based rectified flow Transformers with a latent VAE framework for mel-spectral analysis. Leveraging attention sequences and pre-trained text encoders improves semantic capture and input flexibility. The model surpasses existing methods based on both automatic metrics and human evaluations.

6. OLMoE: Open Mixture-of-Experts Language Models

OLMoE is a fully open language model leveraging sparse Mixture-of-Experts (MoE). It has 7 billion (B) parameters but uses only 1B per input token. It is pretrained on 5 trillion tokens and further adapted to create OLMoE-1B-7B-Instruct. The model outperforms all available models with similar active parameters, surpassing larger ones like Llama2–13B-Chat and DeepSeekMoE-16B.

Upcoming Events

The Towards AI team is attending the AI & Big Data Expo Europe, 1–2 October 2024

Event Highlights:

  • 6 Co-Located Events: A comprehensive exploration of AI and Big Data with six co-located events.
  • 7,000+ Attendees: Professionals, thought leaders, and enthusiasts from around the globe.
  • 200+ Speakers: Industry experts from Netflix, IKEA, The UN, Deloitte, Booking.com, and more to share their insights, experiences, and forecasts.
  • Thematic Tracks: Covering Enterprise AI, Machine Learning, Security, Ethical AI, Deep Learning, Data Ecosystems, NLP, and more.

Date & Location: 1–2 October 2024 at the RAI, Amsterdam.

Your in-person ticket will also grant you access to the co-located events exploring IoT Tech, Intelligent Automation, Cyber Security & Cloud, Unified Communications, Edge Computing, and Digital Transformation!

Let’s meet at the event!

Quick Links

1. At its annual fall product event, Apple unveils its latest generation of iPhones. While the new lineup has minimal hardware upgrades, the artificial intelligence-boosted iPhone 16 promises improvements in its Siri personal assistant as it rolls out new software.

2. In a recent post on Twitter (now X), music producer Timbaland endorsed Suno AI, an AI-powered music tool. The reactions to Timbaland's post are a mix of excitement and concern.

3. X permanently stopped Grok AI from using EU citizens’ tweets after Ireland’s Data Protection Commissioner (DPC) took it to court in Ireland. The platform had forcibly opted EU citizens into Grok’s training without asking permission.

Who’s Hiring in AI

Our Towards AI Jobs Search Platform is gaining momentum! We received 500,000 Google search impressions in August and are now listing 29,000 live AI jobs. Our LLM-enhanced pipeline continuously searches for jobs that meet our AI criteria and regularly removes expired jobs. We make it much easier to search for and filter by specific AI skills, and we also allow you to set email alerts for jobs with specific skills or from specific companies. All for free! We hope our platform will make it much quicker to find and apply for jobs that truly match your AI skills and experience.

Data Analyst Intern (6-month) @HoYoverse (Singapore)

AI2 Personalization Data Science Associate Program @TD Bank (Toronto, Canada)

Senior Software Engineer – Ads @Microsoft Corporation (Beijing, China)

AI Research Intern (Summer 2025) @RTX Corporation (Remote)

Sr Manager, Machine Learning @LinkedIn (NY, USA/Hybrid)

AI Tutor – Bilingual (Full-Time) @xAI (Remote)

Data Engineer, Analytics @Meta (Menlo Park, CA, USA)

Interested in sharing a job opportunity here? Contact [email protected].

Think a friend would enjoy this too? Share the newsletter and let them join the conversation.

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Feedback ↓