Task Arithmetic for Model Editing

Author(s): Ayo Akinkugbe

Originally published on Towards AI.

Photo by charlesdeluvio on Unsplash

Introduction

In the 2004 film "Eternal Sunshine of the Spotless Mind," Clementine (played by Kate Winslet) and Joel (played by Jim Carrey) visit Lacuna Inc. to undergo a revolutionary procedure after a breakup: the selective erasure of painful memories. The process begins with a "memory mapping" phase, where the technicians Patrick (Elijah Wood), Stan (Mark Ruffalo), and Mary (Kirsten Dunst) precisely identify and map specific memories before surgically removing them from the patients' minds. What makes this fictional procedure so compelling isn't just its ability to delete unwanted memories, but its surgical precision: removing targeted experiences while leaving the rest of the person's memories intact.

This concept isn't just science fiction anymore, at least where large language models are concerned. In the world of machine learning, we face a remarkably similar challenge: how do we selectively modify what our models have "learned" without destroying everything else they know?

Consider this scenario: you've deployed a sophisticated language model for your e-commerce platform that excels at product categorization, customer service, and content generation. But then you discover it has learned some biased associations about certain product categories, or you need it to understand entirely new product types that didn't exist when you first trained it. Traditional approaches would leave you with three options:

  1. Retrain from scratch: expensive, time-consuming, and you lose all the valuable knowledge the model has accumulated.
  2. Fine-tune on new data: this risks catastrophic forgetting, where the model loses its previous capabilities.
  3. Live with the limitations: accept suboptimal performance rather than risk breaking what works.

Enter model editing, our real-world equivalent of Lacuna Inc.'s memory mapping technology.

What is Model Editing?

Model editing is the process of changing a model's behavior without retraining it from scratch. This includes:

  • Adding new capabilities (e.g., learning new product categories)
  • Removing unwanted behaviors (eliminating bias or harmful outputs)
  • Modifying existing skills and tasks (adjusting the confidence or style of responses)
  • Combining capabilities and models (merging specialized models into a single multi-task system)

Task arithmetic, introduced by Ilharco et al. in "Editing Models with Task Arithmetic," treats model capabilities as vectors that can be combined, scaled, and manipulated with mathematical precision. Just as the fictional technicians could map and manipulate specific memories, task arithmetic allows us to map learned behaviors as mathematical vectors that can be added, subtracted, scaled, and combined. Unlike Clementine and Joel's irreversible memory erasure, task arithmetic operations are reversible: you can always subtract what you've added or add back what you've removed. For instance:

  • Want to remove biased behavior? Subtract that task vector.
  • Need to add new capabilities? Add the corresponding task vectors.
  • Want to fine-tune the strength of a particular skill? Scale its vector up or down.
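
In code, these operations reduce to simple element-wise arithmetic on the model's weights. Below is a minimal sketch, assuming PyTorch-style state_dicts; the apply_task_vector helper and the alpha scaling coefficient are illustrative, not from the original paper:

def apply_task_vector(base_state_dict, task_vector, alpha=1.0):
    """
    Return new weights with a scaled task vector applied to the base weights.
    alpha = 1.0 adds the skill, alpha = -1.0 subtracts (negates) it,
    and intermediate values scale its strength up or down.
    """
    return {name: base_state_dict[name] + alpha * task_vector[name]
            for name in base_state_dict}

# Hypothetical usage: remove a biased behavior by negating its task vector
# edited = apply_task_vector(base_model.state_dict(), bias_vector, alpha=-1.0)
# base_model.load_state_dict(edited)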

This approach opens up exciting possibilities that were previously impractical or impossible: creating models that can be updated as surgically as editing a document, combining the best capabilities from multiple specialized models, and maintaining fine-grained control over model behavior in production systems.

This post explores how task arithmetic works, dives deep into implementation techniques, and examines practical applications.

Understanding Task Vectors

At its core, task arithmetic is surprisingly elegant yet intuitive. When we fine-tune a model on a specific task, we're essentially teaching it new patterns and behaviors. The difference between the fine-tuned model's parameters and the original model's parameters captures exactly what the model learned from that task.

Task vector = Fine-tuned model parameters − Base model parameters, or in symbols: τ_task = θ_fine-tuned − θ_base

Think of this as isolating the "memory" of a specific skill. Just as Lacuna Inc. could map Joel's memories of Clementine, we can map a model's learned behaviors into discrete, manipulable vectors.

The Linear Superposition Hypothesis

Task arithmetic works under the assumption that different capabilities in neural networks exist in a kind of linear superposition: they can be added together without destructive interference. This is similar to how different radio frequencies can coexist in the same space without canceling each other out in electromagnetism.
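
As a toy numerical illustration of this assumption (using made-up low-dimensional vectors, not real model weights): if two task vectors occupy disjoint sets of dimensions, their sum preserves each individual edit exactly.

import numpy as np

# Two task vectors living in disjoint "subspaces" of a 4-dimensional parameter space
task_a = np.array([2.0, 1.0, 0.0, 0.0])  # e.g., a sentiment skill in dims 0-1
task_b = np.array([0.0, 0.0, 1.5, 2.5])  # e.g., a spam-detection skill in dims 2-3

combined = task_a + task_b  # apply both edits at once

# Restricting the combined edit to each subspace recovers each task vector unchanged,
# i.e., the two edits do not interfere with one another
print(np.allclose(combined * [1, 1, 0, 0], task_a))  # True
print(np.allclose(combined * [0, 0, 1, 1], task_b))  # True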

A Pseudocode Example

For instance, if we intended to create a bi-task model that both classifies sentiment and detects spam using task arithmetic, we would:

  • Choose a base model
  • Train separate specialized models: in this case, a sentiment model and a spam model
  • Extract the task vectors by performing element-wise subtraction of the base model's parameters from each specialized model's parameters
  • Create a multi-task model by adding both vectors to the base model, as in the code below

def multi_task_arithmetic():
    """
    Example: build a model that can both classify sentiment AND detect spam.
    (load_pretrained_model, fine_tune, sentiment_data, and spam_data are
    placeholders for your own loading, training, and data utilities.)
    """
    base_model = load_pretrained_model("bert-base")

    # Train separate specialists from the same base weights
    sentiment_model = fine_tune(base_model, sentiment_data)
    spam_model = fine_tune(base_model, spam_data)

    # Extract task vectors: element-wise difference of each parameter tensor
    base_params = base_model.state_dict()
    sentiment_vector = {k: v - base_params[k] for k, v in sentiment_model.state_dict().items()}
    spam_vector = {k: v - base_params[k] for k, v in spam_model.state_dict().items()}

    # Create the multi-task model by adding both task vectors to the base weights
    multi_task_params = {k: base_params[k] + sentiment_vector[k] + spam_vector[k] for k in base_params}
    multi_task_model = load_pretrained_model("bert-base")
    multi_task_model.load_state_dict(multi_task_params)

    return multi_task_model

Recent research (Tam et al., 2023) suggests that neural networks learn different tasks in different subspaces of the parameter space. When these subspaces don't significantly overlap, we can add and subtract learned behaviors without interference. Below is a visual representation.

import numpy as np
import matplotlib.pyplot as plt

def visualize_task_spaces():
    """
    A conceptual visualization of how different tasks occupy different
    regions of parameter space
    """
    # Define the color palette
    colors = ['#4E79A7', '#F28E2B', '#E15759', '#76B7B2', '#59A14F']

    # Simulate parameter space (reduced to 2D for visualization)
    base_point = np.array([0, 0])

    # Different tasks learn in different directions
    task_a_vector = np.array([3, 1])  # Sentiment analysis
    task_b_vector = np.array([1, 3])  # Spam detection

    # Combined model is the sum of the task vectors
    combined = base_point + task_a_vector + task_b_vector

    plt.figure(figsize=(10, 8))

    plt.arrow(0, 0, task_a_vector[0], task_a_vector[1],
              head_width=0.2, head_length=0.2, fc=colors[0], ec=colors[0],
              label='Sentiment Task')

    plt.arrow(0, 0, task_b_vector[0], task_b_vector[1],
              head_width=0.2, head_length=0.2, fc=colors[1], ec=colors[1],
              label='Spam Detection Task')

    plt.arrow(0, 0, combined[0], combined[1],
              head_width=0.2, head_length=0.2, fc=colors[3], ec=colors[3],
              label='Combined Model')

    plt.scatter([0], [0], c='black', s=100, label='Base Model')

    plt.grid(True, alpha=0.3)
    plt.legend()
    plt.title('Task Vectors in Parameter Space')
    plt.xlabel('Parameter Dimension 1')
    plt.ylabel('Parameter Dimension 2')
    plt.show()

visualize_task_spaces()

Case Study: Knowledge Transfer Across Domains for Sentiment Classification

Let's say you have a good Amazon review sentiment classifier and want to classify Yelp reviews. To predict sentiment on Yelp (even if you don't have Yelp sentiment labels), take the Amazon sentiment vector and adjust it based on the difference in language modeling between Yelp and Amazon. This creates a new Yelp-specific sentiment vector:

τ_yelp,sentiment = τ_amazon,sentiment + (τ_yelp,lm − τ_amazon,lm)

This equation allows you to adapt sentiment knowledge from Amazon to Yelp without labeled Yelp data, just LM data. This is how task analogies work: a technique to generate new tasks by relating known ones using vector arithmetic, especially when no labels are available.

Step-by-Step Implementation

1. Get Language Modeling (LM) Vectors: You first train or fine-tune a model (like T5 or BERT) on language modeling objectives for both datasets:

  • Amazon LM vector τ_amazon,lm: train on Amazon text with an unsupervised objective (e.g., masked language modeling).
  • Yelp LM vector τ_yelp,lm: do the same for the Yelp text.

These give you task vectors representing how the model adapts to the language style and patterns of each dataset.

2. Subtract the Vectors:

τ_shift = τ_yelp,lm − τ_amazon,lm

This captures the domain shift: how Yelp differs from Amazon in language patterns.

3. Apply the Shift to a Known Task Vector: add the difference to the Amazon sentiment vector, which was trained with sentiment labels:

τ_yelp,sentiment = τ_amazon,sentiment + τ_shift = τ_amazon,sentiment + (τ_yelp,lm − τ_amazon,lm)

This gives you a new task vector you can use for sentiment analysis on Yelp, even without Yelp sentiment labels.

A quick recap on how task arithmetic is implemented:

  1. Choose your base model (e.g., T5-small or LLaMA).
  2. Fine-tune on Task A (e.g., sentiment analysis on Amazon).
  3. Fine-tune on Task B (e.g., LM on Yelp, or sentiment on Yelp).
  4. Compute task vectors by subtracting weights layer by layer.
  5. Apply the resulting vector to the base model or another model, as in the sketch below.
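
Putting the case study together, here is a minimal sketch of the task-analogy recipe, assuming all four checkpoints were fine-tuned from the same base architecture; the function name and the state_dict-based arithmetic are illustrative, not code from the original paper:

def task_analogy_yelp_sentiment(base_sd, amazon_sent_sd, amazon_lm_sd, yelp_lm_sd):
    """
    Build Yelp sentiment weights without Yelp sentiment labels via a task analogy:
    tau_yelp,sentiment = tau_amazon,sentiment + (tau_yelp,lm - tau_amazon,lm).
    Each argument is the state_dict of a model fine-tuned from the same base weights.
    """
    # Task vectors relative to the shared base weights
    tau_amazon_sent = {k: amazon_sent_sd[k] - base_sd[k] for k in base_sd}
    tau_amazon_lm = {k: amazon_lm_sd[k] - base_sd[k] for k in base_sd}
    tau_yelp_lm = {k: yelp_lm_sd[k] - base_sd[k] for k in base_sd}

    # Domain shift: how Yelp language modeling differs from Amazon language modeling
    tau_shift = {k: tau_yelp_lm[k] - tau_amazon_lm[k] for k in base_sd}

    # Analogy: shift the Amazon sentiment vector into the Yelp domain, then apply to base
    return {k: base_sd[k] + tau_amazon_sent[k] + tau_shift[k] for k in base_sd}

The returned weights can be loaded into a fresh copy of the base architecture with load_state_dict. In practice, the analogy vector is often scaled by a coefficient chosen on a small validation set before it is applied.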

Conclusion

Factual knowledge could also be treated as a task: fine-tune on a small factual correction, extract the vector, and apply it. In some cases, this can replace dedicated model-editing methods like ROME or MEMIT. However, the task arithmetic approach has its challenges: not all tasks are linearly composable, since the superposition hypothesis doesn't always hold in practice. Additionally, model instabilities can emerge when applying large parameter deltas across different architectures. The key to successful implementation lies in understanding these limitations.

References

  • Ilharco, G., Ribeiro, M. T., Wortsman, M., Gururangan, S., Schmidt, L., Hajishirzi, H., & Farhadi, A. (2022). Editing models with task arithmetic. arXiv preprint arXiv:2212.04089. https://arxiv.org/abs/2212.04089
  • Meng, K., Bau, D., Andonian, A., & Belinkov, Y. (2022). Locating and editing factual associations in GPT. arXiv preprint arXiv:2202.05262. https://arxiv.org/abs/2202.05262
  • Meng, K., Sharma, A. S., Andonian, A., Belinkov, Y., & Bau, D. (2022). Mass-editing memory in a transformer. arXiv preprint arXiv:2210.07229. https://arxiv.org/abs/2210.07229
  • Tam, D., Bansal, M., & Raffel, C. (2023). Merging by matching models in task parameter subspaces. arXiv preprint arXiv:2312.04339. https://arxiv.org/abs/2312.04339
