Microsoft Muse Can Design Video Games Based on Your Playing Style

Last Updated on February 28, 2025 by Editorial Team

Author(s): Jesus Rodriguez

Originally published on Towards AI.

Created Using Midjourney

Games have played a monumental role in the evolution of AI. From creating training environments to simulating real-world conditions, games are powerful catalysts for AI learning. A new field known as world action models is rapidly emerging to combine games and AI. Microsoft just released exciting research in this area: a model that can create games after watching human players.

Sounds crazy? Let’s discuss.

Muse, a generative AI model, marks a pivotal advancement in the convergence of artificial intelligence and video games. This model, developed by the Microsoft Research Game Intelligence and Teachable AI Experiences teams in collaboration with Xbox Game Studios’ Ninja Theory, introduces the first World and Human Action Model (WHAM), designed to generate game visuals and controller actions. Muse aims to support human creativity by generating complex gameplay sequences. This essay provides a technical overview of Muse, emphasizing its architecture, capabilities, and key innovations.

Architectural Overview

Muse employs a transformer-based generative model trained on extensive human gameplay data. The training data consists of visuals and controller actions from the Xbox game Bleeding Edge, and current instances are trained at a resolution of 300×180 pixels. The WHAM-1.6B instance of Muse was trained on over 1 billion images and controller actions, which corresponds to more than 7 years of continuous human gameplay. Muse relies on ethically sourced and responsibly used data, in compliance with user agreements and privacy standards.

Image Credit: Microsoft Research

Capabilities of Muse

  1. Gameplay Generation: Muse generates complex gameplay sequences that maintain consistency for several minutes. By prompting the model with an initial 10 frames (1 second) of human gameplay and corresponding controller actions, Muse predicts subsequent game evolution in “world model mode”.
  2. Consistency: Muse ensures that generated gameplay sequences respect the inherent dynamics of the game. The generated sequences align character movements with controller actions, prevent characters from traversing walls, and generally adhere to the game’s physics. Evaluation of consistency involves prompting the model with ground truth gameplay sequences and controller actions. The generated game visuals are then compared to the ground truth visuals using Fréchet Video Distance (FVD), a metric established in the video generation community.
  3. Diversity: Muse generates a range of gameplay variants from identical initial prompts, covering a spectrum of potential gameplay evolutions. This includes both behavioral diversity, such as varied camera movements and path navigation, and visual diversity, including different character hoverboards. Diversity is quantitatively assessed using the Wasserstein distance, which compares model-generated sequences to the diversity found in human gameplay recordings.
  4. Persistency: Muse integrates user modifications into the generated gameplay sequences. For example, if a character is added to an original game visual, Muse can “persist” the added character and generate plausible scenarios showing how the gameplay sequence evolves from that modified starting point.
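
The generation loop behind these capabilities can be sketched in a few lines. The example below is a toy illustration, not Microsoft's implementation: `toy_model`, `rollout`, and the integer stand-ins for frames and controller actions are all hypothetical names introduced here. It assumes the model samples one (frame, action) step at a time from a sliding 10-step context, and it shows how the same prompt can yield different continuations under different random seeds.

```python
import random
from typing import Callable, List, Tuple

Frame = int    # hypothetical stand-in for an encoded 300x180 game image
Action = int   # hypothetical stand-in for a discretized controller action
Step = Tuple[Frame, Action]


def toy_model(context: List[Step], rng: random.Random) -> Step:
    """Hypothetical stand-in for a trained model: sample the next step.

    The toy "physics" is that the next frame equals the previous frame
    plus the sampled action, so visuals always agree with the controls.
    """
    last_frame, _ = context[-1]
    action = rng.choice([-1, 0, 1])
    return last_frame + action, action


def rollout(
    prompt: List[Step],
    n_steps: int,
    seed: int,
    model: Callable[[List[Step], random.Random], Step] = toy_model,
    context_len: int = 10,
) -> List[Step]:
    """Autoregressively extend a prompt by n_steps generated steps."""
    rng = random.Random(seed)
    sequence = list(prompt)
    for _ in range(n_steps):
        # Only the most recent context_len steps are shown to the model.
        sequence.append(model(sequence[-context_len:], rng))
    return sequence[len(prompt):]


if __name__ == "__main__":
    # Ten prompt steps, mirroring the 1 second of gameplay used as a prompt.
    prompt = [(i, 1) for i in range(10)]
    for seed in range(3):
        generated = rollout(prompt, n_steps=50, seed=seed)
        print(f"seed={seed} final frame={generated[-1][0]}")
```

Fixing the seed makes a continuation reproducible, while varying it gives the spread of outcomes that the diversity capability describes. Editing the last prompt step before calling `rollout` is the toy analogue of persistency: generation simply continues from the modified starting point.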

Key Innovations of Muse

  1. World and Human Action Model (WHAM): Muse introduces WHAM, a generative AI model capable of generating both game visuals and controller actions, representing a novel approach to modeling video game environments and human interactions.
  2. Data-Driven Approach: Muse uses a substantial dataset of human gameplay data from Bleeding Edge, enabling the model to learn complex game dynamics and generate realistic gameplay sequences. The model was trained on more than 1 billion images and controller actions, corresponding to over 7 years of continuous human gameplay.
  3. Multidisciplinary Collaboration: The development of Muse involved machine learning researchers, game developers, and creatives, ensuring the model’s capabilities align with the needs of game creatives and ethical, responsible technology development. Input from game creators early in the process helped shape model capabilities.
  4. WHAM Demonstrator: The WHAM Demonstrator offers a visual interface for interacting with Muse, allowing users to load visuals as initial prompts and generate multiple potential continuations. Users can also adjust generated sequences using game controllers, facilitating iterative creative processes. The WHAM Demonstrator enables users to directly interface with the model, explore its creative potential, and test ideas.
  5. Evaluation Protocols: Muse’s development includes evaluation protocols for consistency, diversity, and persistency, facilitating systematic performance evaluation and providing insights for enhancing capabilities. Muse’s evaluation framework and user study insights allowed for the identification of key capabilities required by game creatives.

Evaluation of Muse

Muse’s evaluation focuses on consistency, diversity, and persistency.

  • Consistency: The model is prompted with ground truth gameplay sequences and controller actions, and the generated game visuals are compared to the ground truth visuals using Fréchet Video Distance (FVD).
Image Credit: Microsoft Research
  • Diversity: Assessed quantitatively using the Wasserstein distance, comparing model-generated sequences to human gameplay recordings.
Image Credit: Microsoft Research
  • Persistency: Demonstrated through modified gameplay sequences and observation of the model’s integration of newly introduced elements.
Image Credit: Microsoft Research
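
To make the two quantitative metrics more concrete, here is a minimal sketch using only the Python standard library. It is a simplification under stated assumptions, not the evaluation code used for Muse. FVD is normally computed on features from a pretrained video network with full covariance matrices; `frechet_distance_diag` (a hypothetical helper) assumes the features are already extracted and treats each feature dimension independently. `wasserstein_1d` (also hypothetical) handles only two equal-sized samples of a scalar summary, such as one number per gameplay sequence.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def frechet_distance_diag(
    feats_a: Sequence[Sequence[float]],
    feats_b: Sequence[Sequence[float]],
) -> float:
    """Squared Frechet distance between Gaussians fitted to two feature sets.

    Assumes diagonal covariances, in which case the distance reduces to
    sum((mu_a - mu_b)^2 + (sigma_a - sigma_b)^2) over feature dimensions.
    """
    dims = len(feats_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [row[d] for row in feats_a]
        col_b = [row[d] for row in feats_b]
        total += (mean(col_a) - mean(col_b)) ** 2
        total += (pstdev(col_a) - pstdev(col_b)) ** 2
    return total


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """1-Wasserstein distance between two equal-sized 1-D samples.

    For equal-sized samples the optimal transport plan pairs sorted
    values, so the distance is the mean absolute difference of the pairs.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch only supports equal-sized samples")
    gaps: List[float] = [abs(x - y) for x, y in zip(sorted(xs), sorted(ys))]
    return mean(gaps)
```

In both cases a lower value means the generated sequences are statistically closer to the reference: ground truth visuals for the Fréchet distance, and human gameplay recordings for the Wasserstein distance.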

Impact and Future Directions

Muse marks a significant advance in the use of AI for gameplay ideation. By open-sourcing weights and sample data and offering the WHAM Demonstrator executable, Microsoft promotes further exploration and development in this domain.

Conclusion

Muse, the first WHAM, showcases generative AI models’ potential in supporting gameplay ideation. Muse’s architecture, grounded in transformer networks and trained on extensive human gameplay data, enables the generation of consistent, diverse, and persistent gameplay sequences. The project’s multidisciplinary approach and rigorous evaluation protocols underscore its importance. By making Muse accessible to the community, Microsoft fosters innovation and enhances the understanding of generative AI in creating novel, AI-driven game experiences.
