
Building Custom Text Classifiers with Mistral AI Classifier Factory: A Technical Guide

Last Updated on April 22, 2025 by Editorial Team

Author(s): Vivek Tiwari

Originally published on Towards AI.

Introduction

Mistral AI has introduced the Classifier Factory, a capability designed to empower developers and enterprises to create custom text classification models. Leveraging Mistral's efficient language models and fine-tuning methodologies, this tool provides a streamlined pathway for building classifiers tailored to specific needs, ranging from content moderation and intent detection to sentiment analysis and product categorization. The Classifier Factory is accessible both through the Mistral AI platform ("la plateforme") and its API, offering flexibility in integration.

This report provides a comprehensive technical guide to utilizing the Mistral AI Classifier Factory. It details the necessary setup, data preparation requirements, the step-by-step fine-tuning workflow, methods for leveraging the resulting custom models, and illustrative examples of potential use cases. The analysis synthesizes information from available Mistral AI documentation and related resources to offer a practical overview for users seeking to implement custom classification solutions. While specific code examples from dedicated cookbooks for intent, moderation, and product classification were inaccessible during this analysis, this guide focuses on the core principles and API interactions derived from the primary Classifier Factory documentation and general fine-tuning guidelines.

The core value proposition lies in enabling the creation of specialized models that go beyond the capabilities of general-purpose language models or pre-built APIs, allowing for nuanced classification aligned with unique business logic or domain requirements.

Getting Started: Setup and Authentication

Initiating work with the Mistral AI Classifier Factory involves a standard setup procedure common to many cloud-based AI services. This familiar workflow facilitates quicker adoption for developers experienced with similar platforms.

Account and API Key Generation:

The first step is to obtain access to the Mistral AI platform. This typically involves visiting the platform website, registering an account, and navigating to the API Keys section to generate a new key. This API key serves as the credential for authenticating all subsequent API requests. It is crucial to keep these keys secure and avoid sharing them or embedding them directly in code; regular rotation is also recommended as a security best practice.

Library Installation:

Interaction with the Mistral AI API is facilitated through client libraries. For Python development, the official mistralai library needs to be installed. This is typically done using pip:

pip install mistralai

Client Initialization:

Once the library is installed and an API key is obtained, the Mistral client can be initialized within the application code. The standard and recommended practice is to store the API key as an environment variable rather than hard-coding it. This approach enhances security and simplifies key management, particularly in production environments.

The Python client can be initialized as follows:

import os

from mistralai import Mistral

# Load the API key from an environment variable (recommended)
api_key = os.environ.get("MISTRAL_API_KEY")
if not api_key:
    raise ValueError("MISTRAL_API_KEY environment variable not set.")

client = Mistral(api_key=api_key)
print("Mistral client initialized successfully.")

This initialized client object will be used for all subsequent interactions with the Mistral API, including file uploads, job creation, and making predictions with the fine-tuned classifier. The emphasis on environment variables throughout the documentation and examples encourages production-ready security practices from the outset.

Data Preparation: The Foundation for Classification

The performance of any classifier built using the Mistral AI Classifier Factory is fundamentally dependent on the quality and structure of the training data provided. The platform mandates a specific format for data submission.

The JSONL Format Explained:

All training and validation data must be supplied in the JSON Lines (.jsonl) format. In this format, each line within the file constitutes a complete, valid JSON object. This structure is advantageous for handling large datasets as it allows for efficient streaming and line-by-line processing without needing to load the entire file into memory. Strict adherence to this format is essential for successful data upload and processing.

Structuring Data for Single-Target Classification:

For tasks where each input is assigned a single label from a predefined set (e.g., sentiment analysis, basic intent detection), the JSONL objects must follow a specific structure:

  • Raw Text Input: Each JSON object requires a "text" key containing the input string and a "labels" key. The "labels" value is itself a dictionary containing a single key-value pair: the key represents the name of the classification task (e.g., "sentiment"), and the value is the corresponding label for that input text (e.g., "positive").
{"text": "I love this product!", "labels": {"sentiment": "positive"}}
{"text": "The new policy is controversial.", "labels": {"sentiment": "neutral"}}
{"text": "I don't like the new design.", "labels": {"sentiment": "negative"}}

It is also possible to provide a list of labels as the value if multiple labels apply to the single target.

{"text": "This product is okay, maybe good.", "labels": 
{"sentiment": ["neutral", "positive"]}}
  • Chat/Conversational Input: When classifying conversational turns, the input text is replaced by a "messages" key. Its value is a list of message objects, each containing "role" (e.g., "user", "assistant") and "content" keys, mirroring the format used in chat completion APIs. The "labels" structure remains the same as for raw text. This distinct structure for chat suggests the underlying model may leverage conversational context, such as turn order or speaker roles, for more accurate classification in dialogue settings. This is further supported by the existence of a specific chat classification endpoint (v1/chat/classifications) and similar conversational handling in other Mistral APIs like Moderation.
{"messages": [{"role": "user", "content": "send email to mommy that i'll be going the party"}], 
"labels": {"intent": "email_sendemail"}}

Structuring Data for Multi-Target Classification:

The Classifier Factory also supports scenarios where an input needs to be classified according to multiple, independent criteria simultaneously (e.g., classifying a product review by sentiment and identifying mentioned product features). In this case, the "labels" dictionary within the JSON object contains multiple key-value pairs, one for each classification target.

  • Raw Text Input:
{"text": "I love this product! It's so easy to use.", 
"labels": {"sentiment": "positive", "mentions_ease_of_use": "yes"}}
  • Chat/Conversational Input:
{"messages":,
"labels": {"country": "France", "category": ["sweet-snacks", "plant-based-foods"]}}

This flexible structure within the mandatory JSONL format allows users to define a wide array of classification tasks, accommodating both simple and complex labeling schemes within a unified framework.

Data Quality and Formatting Best Practices:

While specific minimum data requirements are not detailed in the available documentation, general machine learning principles apply. Sufficient data volume is necessary for effective fine-tuning. Observations from related cookbook code snippets suggest practical considerations: for instance, filtering out labels with very few examples (e.g., < 200 samples) and potentially balancing the dataset by capping the number of samples per label (e.g., max 600 samples) might be beneficial preprocessing steps. This implies that users should consider standard data science techniques like handling class imbalance or removing rare categories before formatting the data as JSONL, as the platform likely does not perform such complex balancing automatically.
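
As a concrete illustration of that preprocessing, the sketch below filters out rare labels and caps over-represented ones before writing the JSONL file. It is a minimal sketch assuming single-target string labels; the 200/600 thresholds simply mirror the cookbook observations above, and all function names are illustrative.

import json
from collections import Counter, defaultdict

MIN_SAMPLES = 200  # Drop labels rarer than this (mirrors the cookbook observation)
MAX_SAMPLES = 600  # Cap samples per label to reduce class imbalance

def balance_dataset(examples, target="sentiment"):
    # examples: list of dicts like {"text": ..., "labels": {target: "label"}}
    counts = Counter(ex["labels"][target] for ex in examples)
    kept = defaultdict(list)
    for ex in examples:
        label = ex["labels"][target]
        if counts[label] >= MIN_SAMPLES and len(kept[label]) < MAX_SAMPLES:
            kept[label].append(ex)
    return [ex for bucket in kept.values() for ex in bucket]

def write_jsonl(examples, path):
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex) + "\n")

# write_jsonl(balance_dataset(raw_examples), "train.jsonl")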

Consistency in labeling is paramount. Ambiguous or incorrect labels will degrade the performance of the fine-tuned classifier. Finally, meticulous validation of the .jsonl file structure before uploading is crucial to prevent errors during the fine-tuning job creation process.
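
A lightweight structural check before uploading can catch most formatting problems. The sketch below assumes the single-target raw-text schema (a "text" and a "labels" key per line); for chat data, swap "text" for "messages".

import json

def validate_jsonl(path, required_keys=("text", "labels")):
    # Verify that every line is a complete, valid JSON object with the expected keys
    errors = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                errors.append(f"line {i}: empty line")
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append(f"line {i}: invalid JSON ({e})")
                continue
            missing = [k for k in required_keys if k not in obj]
            if missing:
                errors.append(f"line {i}: missing keys {missing}")
    return errors

# for problem in validate_jsonl("train.jsonl"):
#     print(problem)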

Fine-Tuning Your Classifier: Step-by-Step Workflow

Once the data is prepared in the correct JSONL format, the process of fine-tuning a custom classifier involves interacting with the Mistral AI API through a series of steps. This process is asynchronous, meaning a job is submitted and its progress is monitored over time.

Step 1: Uploading Training and Validation Data

The initial step is to upload the prepared .jsonl files to the Mistral platform. This makes the datasets accessible for the fine-tuning job. Each file upload requires specifying the purpose as "fine-tune".

try:
    # Upload training data
    with open("train.jsonl", "rb") as f_train:
        training_data_file = client.files.upload(
            file=("train.jsonl", f_train), purpose="fine-tune"
        )
    print(f"Uploaded training file ID: {training_data_file.id}")

    # Upload validation data (optional but highly recommended)
    with open("validation.jsonl", "rb") as f_val:
        validation_data_file = client.files.upload(
            file=("validation.jsonl", f_val), purpose="fine-tune"
        )
    print(f"Uploaded validation file ID: {validation_data_file.id}")
except FileNotFoundError as e:
    print(f"Error: {e}. Please ensure data files exist.")
except Exception as e:
    print(f"An error occurred during file upload: {e}")

The API responds to each successful upload with a unique file ID. These IDs are essential for referencing the datasets when creating the fine-tuning job. Using a separate validation dataset is strongly recommended for monitoring the model's performance on unseen data during training, which helps in tuning hyperparameters and preventing overfitting.

Step 2: Creating a Fine-Tuning Job

With the file IDs obtained, a fine-tuning job can be created. This involves specifying the model, job type, data files, and various configuration parameters.

# Assuming training_data_file and validation_data_file objects exist from Step 1
# and expose .id attributes
try:
    created_job = client.fine_tuning.jobs.create(
        model="ministral-3b-latest",  # Currently the designated model for the Classifier Factory
        job_type="classifier",  # Must be set to 'classifier'
        training_files=[training_data_file.id],  # List of one or more training file IDs
        validation_files=[validation_data_file.id],  # List of validation file IDs (optional)
        hyperparameters={
            "training_steps": 100,  # Example: adjust based on dataset size and validation metrics
            "learning_rate": 4e-5,  # Example (0.00004): requires tuning; common range 1e-5 to 5e-5
            # "weight_decay": 0.0,  # Optional regularization parameter; may have a default
        },
        auto_start=False,  # Default: start manually after validation. Set True to start automatically.
        # integrations=[...],  # Optional, e.g., a Weights & Biases integration (see the sketch below)
    )
    print(f"Created fine-tuning job with ID: {created_job.id}")
    print(f"Initial job status: {created_job.status}")
except Exception as e:
    print(f"An error occurred during job creation: {e}")

Parameter Deep Dive: The key parameters require careful consideration.

The availability of hyperparameters like training_steps and learning_rate provides control but also necessitates careful tuning based on validation performance, often requiring experimentation. The support for validation files and integrations like Weights & Biases highlights the platform's encouragement of rigorous monitoring during training.
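
As an illustration of wiring in such monitoring, the sketch below passes a Weights & Biases integration when creating the job. The exact field names for the integrations entries were not confirmed in the material consulted here; treat the dictionary shape as an assumption to verify against the current Mistral fine-tuning API reference.

import os

# Hypothetical sketch: the integration dictionary's field names ("project",
# "api_key") are assumptions modeled on Mistral's general fine-tuning docs.
wandb_integration = {
    "project": "classifier-factory-experiments",  # Assumed field: W&B project name
    "api_key": os.environ.get("WANDB_API_KEY"),   # Assumed field: W&B API key
}

created_job = client.fine_tuning.jobs.create(
    model="ministral-3b-latest",
    job_type="classifier",
    training_files=[training_data_file.id],
    validation_files=[validation_data_file.id],
    hyperparameters={"training_steps": 100, "learning_rate": 4e-5},
    integrations=[wandb_integration],  # Training metrics stream to W&B if accepted
    auto_start=False,
)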

Step 3: Managing and Monitoring the Job

Fine-tuning jobs progress through several states, such as VALIDATING, QUEUED, RUNNING, SUCCEEDED, FAILED, or CANCELLED.

  • Starting the Job (if auto_start=False): If the job was created with auto_start=False, it must be manually started after the platform validates the input files and configuration. This is typically done after checking that the job status indicates readiness (e.g., VALIDATED or QUEUED post-validation).
job_id = created_job.id

# Check job status before attempting to start
retrieved_job = client.fine_tuning.jobs.get(job_id=job_id)
print(f"Job status before starting: {retrieved_job.status}")

# Status might be VALIDATED or QUEUED after successful validation if auto_start=False
if retrieved_job.status in ("VALIDATED", "QUEUED"):  # Adjust based on observed behavior
    try:
        client.fine_tuning.jobs.start(job_id=job_id)
        print(f"Attempting to start job: {job_id}")
    except Exception as e:
        print(f"Failed to start job {job_id}: {e}")
else:
    print(f"Job cannot be started in status: {retrieved_job.status}")
  • Checking Job Status: Since fine-tuning can take time, periodically checking the job status is necessary. This is done using the job ID.
import time
import json
# from IPython.display import clear_output  # Optional: for use in Jupyter notebooks

job_id = created_job.id  # Or the ID of the job you want to monitor
print(f"Monitoring job: {job_id}")
while True:
    try:
        retrieved_job = client.fine_tuning.jobs.get(job_id=job_id)
        status = retrieved_job.status
        # clear_output(wait=True)  # Optional: clear previous output in notebooks
        print(f"Timestamp: {time.strftime('%Y-%m-%d %H:%M:%S')}, Job Status: {status}")
        # Optional: display more details, like metrics, if available via W&B or job events
        # print(json.dumps(retrieved_job.model_dump(), indent=2))
        if status in ("SUCCEEDED", "FAILED", "CANCELLED"):
            print(f"\nJob {job_id} finished with status: {status}")
            if status == "SUCCEEDED":
                print(f"Fine-tuned model ID: {retrieved_job.fine_tuned_model}")
            break  # Exit loop once the job reaches a terminal state
        time.sleep(60)  # Poll every 60 seconds (adjust interval as needed)
    except Exception as e:
        print(f"An error occurred while checking job status: {e}")
        # Implement retry logic or break if the error persists
        time.sleep(120)  # Wait longer after an error
  • Listing and Cancelling Jobs: The API also provides endpoints to list all fine-tuning jobs associated with the account and to cancel a job that is currently QUEUED or RUNNING.
try:
    list_jobs = client.fine_tuning.jobs.list()
    print("\nListing active/recent jobs:")
    for job in list_jobs.data:
        print(f"  ID: {job.id}, Status: {job.status}, Model: {job.fine_tuned_model or 'N/A'}")
except Exception as e:
    print(f"Failed to list jobs: {e}")

Step 4: Identifying Your Fine-tuned Model

Upon successful completion (status SUCCEEDED), the details retrieved for the job will include the identifier for the newly created fine-tuned model, typically under a key like fine_tuned_model. This model ID is the crucial output of the fine-tuning process and is required to use the custom classifier for predictions.
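
Using only the calls already shown above, the model ID can be pulled from the completed job like this:

job = client.fine_tuning.jobs.get(job_id=created_job.id)
if job.status == "SUCCEEDED":
    fine_tuned_model_id = job.fine_tuned_model
    print(f"Use this model ID for predictions: {fine_tuned_model_id}")
else:
    print(f"Job has not succeeded yet (status: {job.status}).")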

Leveraging Your Custom Classifier

Once a fine-tuning job completes successfully and the fine-tuned model ID is obtained, the custom classifier is ready for use. The process involves calling specific classification endpoints provided by the Mistral API.

Using the Classification Endpoints:

Mistral provides distinct endpoints for different types of classification inputs:

  • v1/classifications: Designed for classifying raw text inputs.
  • v1/chat/classifications: Designed for classifying conversational inputs (using the "messages" structure).

This separation reinforces the idea that the fine-tuning process optimizes the model differently based on whether it's trained on raw text or conversational data.
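
For integrations that bypass the Python client, the endpoints can presumably be called directly over HTTP. The sketch below is a hedged illustration: the path comes from the documentation above, but the request-body field names ("model", "input") are assumptions mirroring Mistral's Moderation API and should be verified against the API reference.

import os
import requests  # Third-party HTTP client (pip install requests)

# Hypothetical sketch: body field names are assumptions, not confirmed
# for custom classifiers; only the endpoint path is documented above.
response = requests.post(
    "https://api.mistral.ai/v1/classifications",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "<your-fine-tuned-model-id>",  # ID returned by the fine-tuning job
        "input": ["I love this product!"],      # Assumed field name
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())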

Making Predictions with Your Fine-tuned Model:

To get predictions, a request is sent to the appropriate endpoint, specifying the model parameter as the ID of the fine-tuned classifier obtained previously.

While the exact Python client methods are not explicitly confirmed in the available snippets for custom classifiers, they are likely analogous to other Mistral API functionalities like moderation (client.classifiers.moderate, client.classifiers.moderateChat). Assuming a similar pattern (client.classifiers.classify, client.classifiers.classify_chat), the usage would resemble the following:

  • Raw Text Classification Example (Hypothetical Client Method):
# Assuming 'retrieved_job' contains the successful job details from Section 4
import json

fine_tuned_model_id = retrieved_job.fine_tuned_model
if not fine_tuned_model_id:
    print("Error: Fine-tuned model ID not found.")
else:
    # Illustrative inputs, reusing examples from the data-preparation section
    texts_to_classify = [
        "I love this product!",
        "The new policy is controversial.",
    ]
    try:
        # Assuming the method is client.classifiers.classify
        response = client.classifiers.classify(
            model=fine_tuned_model_id,
            inputs=texts_to_classify,
        )
        print("\nClassification Results (Raw Text):")
        # .model_dump() is common in the pydantic models used by mistralai
        print(json.dumps(response.model_dump(), indent=2))
    except Exception as e:
        print(f"An error occurred during raw text classification: {e}")
  • Chat Classification Example (Hypothetical Client Method):
fine_tuned_model_id = retrieved_job.fine_tuned_model
if not fine_tuned_model_id:
    print("Error: Fine-tuned model ID not found.")
else:
    # Example: classify the intent of the last user message in context
    # (illustrative conversation, reusing an example from the data-preparation section)
    conversation_input = [
        {"role": "user", "content": "send email to mommy that i'll be going the party"}
    ]
    try:
        # Assuming the method is client.classifiers.classify_chat
        response = client.classifiers.classify_chat(
            model=fine_tuned_model_id,
            inputs=[conversation_input],  # The API likely expects a list of conversations
        )
        print("\nClassification Results (Chat):")
        print(json.dumps(response.model_dump(), indent=2))
    except Exception as e:
        print(f"An error occurred during chat classification: {e}")

The clear separation between the fine-tuning workflow (Sections 3 & 4) and the inference step using dedicated endpoints (Section 5) is a standard MLOps practice. It allows the trained model artifact, identified by its ID, to be deployed and used independently of the training infrastructure. Furthermore, while not explicitly confirmed for custom classifiers in the snippets, Mistral APIs often support batching (processing multiple inputs in one call), as seen in the Embeddings and Moderation APIs. It is highly probable the classification endpoints also support batching (as shown in the examples above via the inputs list) for improved efficiency, which is crucial for applications handling significant volumes of classification requests.
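
A simple client-side batching loop keeps individual request sizes manageable when classifying large volumes. This sketch reuses the hypothetical classify method from above; the batch size is an arbitrary illustrative choice.

def classify_in_batches(client, model_id, texts, batch_size=64):
    # Split the inputs into fixed-size batches and classify each batch in turn
    responses = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        responses.append(client.classifiers.classify(model=model_id, inputs=batch))
    return responses

# batched_responses = classify_in_batches(client, fine_tuned_model_id, texts_to_classify)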

Understanding and Interpreting the Output:

The specific structure of the response object is not detailed in the available documentation. However, based on standard classification API patterns, the output is expected to be a list of results, corresponding to the list of inputs provided in the request. Each result object would likely contain:

  • The predicted label(s) based on the fine-tuned model.
  • Potentially, confidence scores or probabilities associated with the predicted label(s) or even scores for all possible labels defined during training.

These outputs can then be integrated into downstream applications. For example, an intent classification result could route a user query to the appropriate service, a moderation score could trigger content filtering, or sentiment analysis results could be aggregated for market insights.
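
To make the interpretation concrete, the sketch below parses a response of the shape just described. The structure (a list of per-input results mapping each target to label scores) is an assumption, not a documented schema; dump a real response first and adapt the field names.

# Hypothetical response shape: every field name here is an assumption.
example_response = {
    "results": [
        {"sentiment": {"positive": 0.91, "neutral": 0.07, "negative": 0.02}},
    ]
}

for result in example_response["results"]:
    for target, scores in result.items():
        best_label = max(scores, key=scores.get)  # Pick the highest-scoring label
        print(f"{target}: {best_label} (score {scores[best_label]:.2f})")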

Illustrative Use Cases (Conceptual)


Given the limitations in accessing specific cookbook examples, this section provides conceptual illustrations of how the Classifier Factory workflow and data structures can be applied to common classification tasks. These examples demonstrate the customization power of the tool.

Building an Intent Classifier:

  • Goal: To categorize user requests or utterances into predefined intents, enabling applications like chatbots or command processors to understand user goals (e.g., calendar_set, play_music, email_sendemail).
  • Process: Follow the data preparation, upload, fine-tuning, and prediction steps outlined in Sections 3–5.

Data Structure: Utilize the single-target JSONL format. For classifying single user utterances, use the "text" key. For classifying turns within a dialogue, use the "messages" key. The "labels" dictionary would contain the intent name and the specific intent value, like {"intent": "calendar_set"}.

// Example training data line (text input)
{
  "text": "what's on my schedule for tomorrow morning",
  "labels": {"intent": "calendar_query"}
}

// Example training data line (chat input)
{"messages": [{"role": "user", "content": "..."}], "labels": {"intent": "play_music"}}

Building a Custom Moderation Classifier:

  • Goal: To classify text or conversational content according to custom moderation policies that might differ from or be more granular than those covered by Mistral's standard Moderation API. This could involve identifying specific types of prohibited content, enforcing brand safety guidelines, or detecting nuanced forms of harmful speech.
  • Process: Follow the standard workflow (Sections 3–5).
  • Data Structure: Depending on the complexity of the policies, this could use either single-target or multi-target JSONL format.
  • Single-target: {"labels": {"custom_policy_violation": "hate_speech"}}
  • Multi-target: {"labels": {"is_spam": "yes", "contains_pii": "no", "is_off_topic": "yes"}} Use the "text" or "messages" key as appropriate for the input type.
// Example training data line (multi-target, text input)
{
  "text": "Check out this amazing deal only available today! call 555-1234",
  "labels": {"is_spam": "yes", "contains_pii": "yes"}
}

Building a Product Classifier (e.g., Food Classification):

  • Goal: To categorize products based on multiple attributes simultaneously. For instance, classifying food items by country of origin, primary ingredients, and dietary categories (e.g., snacks, beverages, plant-based-foods).
  • Process: Follow the standard workflow (Sections 3–5).

Data Structure: Utilize the multi-target JSONL format. The "labels" dictionary would include a key-value pair for each attribute being classified.

// Example training data line
{
  "text": "A fermented milk drink, popular in Eastern Europe.",
  "labels": {"country": "Eastern Europe",
             "category": ["beverages", "dairies"], "dish_name": "Kefir"}
}

These conceptual examples highlight that the primary advantage of the Classifier Factory is its adaptability. Users define the classification task, the labels, and provide the domain-specific data, enabling the creation of highly specialized models. However, the success of these custom classifiers hinges directly on the quality, quantity, and relevance of the user-provided labeled data. The platform provides the fine-tuning mechanism, but the intelligence is derived from the data.

Conclusion: Key Takeaways and Recommendations

The Mistral AI Classifier Factory presents a powerful tool for developers seeking to build custom text classification models tailored to specific needs. By leveraging efficient underlying models like ministral-3b-latest and providing a structured, API-driven workflow, it enables the creation of specialized classifiers for diverse applications including moderation, intent detection, sentiment analysis, and more. Its flexibility in handling both single and multi-target classification, as well as raw text and conversational inputs, makes it adaptable to a wide range of use cases.

Key Strengths:

  • Customization: Enables fine-tuning for specific classification tasks and labels beyond generic models.
  • Efficiency: Utilizes Mistral's smaller, efficient models (ministral-3b-latest).
  • Flexibility: Supports single/multi-target classification and text/chat inputs.
  • Automation: Provides an API for programmatic control over the entire workflow.
  • Monitoring: Integrates with tools like Weights & Biases for tracking training progress.

However, harnessing the full potential of the Classifier Factory requires careful attention to several best practices. Success is not guaranteed merely by using the tool; it depends significantly on user input and configuration.

Recommendations for Success:

  1. Prioritize Data Quality: The foundation of any effective classifier is high-quality, consistently labeled training data. Ensure data accurately reflects the target task and is meticulously formatted according to the required JSONL structure.
  2. Ensure Data Sufficiency and Balance: Provide enough training examples for the model to learn effectively. Consider techniques to address class imbalance if certain labels are significantly rarer than others, as suggested by preprocessing steps in related examples.
  3. Utilize Validation and Monitoring: Always use a separate validation dataset and leverage monitoring tools (like the W&B integration) to track performance during training. This is crucial for tuning hyperparameters (training_steps, learning_rate) effectively and avoiding overfitting.
  4. Iterative Development: Start with smaller datasets or fewer training steps to establish a baseline. Evaluate the results and iteratively refine the data, labels, or hyperparameters based on performance on the validation set.
  5. Secure API Key Management: Adhere to security best practices by using environment variables or other secure methods for storing and accessing API keys.

The Classifier Factory empowers users by providing the infrastructure for custom model creation, but this empowerment comes with the responsibility of careful data curation and methodical fine-tuning. It represents a valuable component within the broader Mistral AI ecosystem, potentially complementing other services like embeddings for data analysis or text generation models for building complex AI applications. For further exploration, users should consult the official Mistral AI documentation and the Mistral Cookbook repository for related examples and integrations.
