
Optimizing Airport Gate Assignments Using Multi Objective Reinforcement Learning (MORL) — Part 1

Last Updated on September 27, 2024 by Editorial Team

Author(s): Ranjith Menon

Originally published on Towards AI.

RL is the Gate-way to Airport Efficiency (pun intended)

Ever found yourself sprinting across an airport to board a connecting flight? Or wandering endless corridors in search of your boarding gate? Efficient gate assignments can transform air travel experiences, making journeys smoother and more enjoyable. In this series, we dive into how airport gate assignments are traditionally managed and look at a novel approach using a multi-objective reinforcement learning (MORL) framework. This blog assumes familiarity with Reinforcement Learning (RL), Markov Decision Processes (MDPs), and Python.

Part 1 of this series defines the problem, provides an overview of the solution, and covers environment definition, features, and MDP formulation.

Photo by Rob Wilson on Unsplash

A Primer on Airport Gate Assignments

Traditionally, gate assignments at airports rely on resource management systems (RMS) combined with a human-in-the-loop approach. These systems, typically linear solvers, use a set of pre-defined rules and constraints. While they get the job done, they fail to adapt to new information or deviations, and they don't use historical data to detect inherent patterns that could impact gate assignments. This invariably inconveniences passengers and hampers airport operations. Creating next-gen systems requires a strategic framework that balances multiple objectives, such as enhancing customer experience and maximizing revenue. MORL offers a novel solution by optimizing airport gate assignments so that passengers navigate with ease while airports benefit from increased customer spending.

To train the RL agent effectively, we simulate an airport environment using a toy problem. The model features a mini-airport with 3 gates and 4 flights operating at 30-minute intervals. The mock schedule also includes a linked-flight scenario, illustrating how passengers with connections navigate between gates. A snapshot of this schedule is shown below.

Figure 1: Mock Schedule [Image by Author]
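
To make this concrete, here is a minimal sketch of how such a mock schedule could be represented in Python. The flight numbers, times, and linked-flight column are illustrative placeholders, not the exact values from Figure 1.

import pandas as pd

# Hypothetical mock schedule: 4 flights arriving at 30-minute intervals
# at a mini-airport with 3 gates (A1-A3). "linked_to" marks a connecting
# flight whose passengers must walk between the two assigned gates.
schedule = pd.DataFrame({
    "flight":    ["F1", "F2", "F3", "F4"],
    "arrival":   ["09:00", "09:30", "10:00", "10:30"],
    "departure": ["09:45", "10:15", "10:45", "11:15"],
    "linked_to": [None, None, "F1", None],  # F3 connects from F1
})
print(schedule)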

For non-connecting flights, walking time is calculated from an imaginary security gate. The goal is to ensure optimal walking time for passengers when assigning gates.

For the given schedule, a basic gate assignment plan checks available gates at each time step, evaluates the incoming flight, and assigns a gate accordingly before moving to the next time step. The image below illustrates how this transition works.

Figure 2: Sample Gate Assignment [Image by Author]
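
A minimal sketch of this availability-only baseline might look like the following. The gate names and occupancy bookkeeping are assumptions for illustration, not the actual RMS logic.

def assign_gates_greedy(schedule, gates=("A1", "A2", "A3")):
    """Baseline: give each arriving flight the first free gate, ignoring
    walking distance, amenities, and revenue entirely."""
    occupied_until = {g: None for g in gates}  # gate -> departure time
    assignments = {}
    for _, flight in schedule.sort_values("arrival").iterrows():
        for gate in gates:
            free_at = occupied_until[gate]
            if free_at is None or free_at <= flight["arrival"]:
                assignments[flight["flight"]] = gate
                occupied_until[gate] = flight["departure"]
                break
        else:
            assignments[flight["flight"]] = None  # no gate free this step
    return assignments

print(assign_gates_greedy(schedule))

Note that zero-padded "HH:MM" strings compare correctly in lexicographic order, which keeps the sketch free of datetime parsing.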

Introducing Optimized Allocations

As seen in the previous image, the allocation process is based solely on availability, as handled by RMS, without any optimization. It’s time to introduce our first optimization feature: walking distance. We’ll implement this using a walking distance matrix, which records the distances between gates. To accommodate flights that aren’t linked, we introduce an additional gate, A0, representing the security checkpoint.

Figure 3: Walking Distance Matrix [Image by Author]

In the matrix shown, the diagonal entries are zeros, since they represent the distance between a gate and itself. As another example, the distance from security gate A0 to boarding gate A3 is 5, indicating a 5-minute walk from the security checkpoint to gate A3. The maximum walking time between gates is capped at 14 minutes. With that, we have our walking distance feature, which will be used to optimize our first objective: customer experience.
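
As a concrete sketch, the matrix can be stored as a small NumPy array indexed by gate, with A0 standing in for the security checkpoint. Apart from the 5-minute A0-to-A3 walk and the 14-minute cap noted above, the minute values below are placeholders rather than the figures from Figure 3.

import numpy as np

gates = ["A0", "A1", "A2", "A3"]  # A0 = security checkpoint
idx = {g: i for i, g in enumerate(gates)}

# Symmetric walking times in minutes: zero diagonal, capped at 14.
walk_time = np.array([
    [ 0,  2,  3,  5],
    [ 2,  0,  4,  8],
    [ 3,  4,  0, 14],
    [ 5,  8, 14,  0],
])

def walking_minutes(src, dst):
    return walk_time[idx[src], idx[dst]]

assert walking_minutes("A0", "A3") == 5  # the example above
assert walk_time.max() == 14             # the capped walking time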

Introducing a Multi-Objective Framework

We will focus on two objectives for simplicity in our toy problem.

  • Objective 1 — Customer Experience: in the scenario of a connecting flight, if the goal is to maximize customer experience, the solution allocates flights to gates that ensure a safe connection while also offering sufficient amenities, such as restrooms and cafes, along a path with an optimal walking time.
  • Objective 2 — Revenue: on the other hand, if the revenue objective is prioritized over customer experience, the solution adjusts to favor revenue-earning touchpoints along the path, such as shops and duty-free outlets, while not compromising the safe connection.

To enhance this, we add two more matrices: one for the number of restrooms between gates and another for the number of retail outlets. Our final feature sets will look like the image below.

Figure 4: Complete Feature Set [Image by Author]
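
These additional matrices can live in the same indexed structure as the walking distance matrix. All counts below are made-up placeholders meant to show the shape of the data, not the values in Figure 4.

# Restrooms along the path between each pair of gates
# (feeds the customer experience objective).
restrooms = np.array([
    [0, 1, 1, 2],
    [1, 0, 2, 3],
    [1, 2, 0, 2],
    [2, 3, 2, 0],
])

# Retail outlets (shops, duty-free) along each path
# (feeds the revenue objective).
retail = np.array([
    [0, 2, 1, 3],
    [2, 0, 1, 2],
    [1, 1, 0, 1],
    [3, 2, 1, 0],
])

features = {"walk": walk_time, "restrooms": restrooms, "retail": retail}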

The first objective, customer experience, will balance walking distance and restrooms/amenities, while the second, revenue, will balance walking distance and retail outlets.

Markov Decision Process (MDP) Formulation

Having defined the airport toy problem through the mock schedule (Figure 1), the as-is gate assignment methodology that the multi-objective framework will enhance (Figure 2), and the objectives with their feature sets (Figures 3 and 4), it's now time to formulate an MDP.

  • The state space S comprises all possible states s, defined over gates g and timesteps t, such that s ∈ S, g ∈ G, and t ∈ T.
  • The actions taken by the RL agent are all possible gate allocations for scheduled flights.
  • The distance between gates g(i) and g(j) is represented as D(g(i), g(j)); these distances are stored as matrices that form the reward components (feature sets). In addition to the distance matrix, touchpoints such as cafes and restrooms are stored in the same structure.
  • The rewards, given a state and action, are computed as a weighted combination, since this is a multi-objective solution. This is shown in [Equation 1] below, where α is the weight associated with the customer experience objective, β is the weight associated with the revenue objective, and R represents the reward for each objective.

R(s, a) = α · R(CE) + β · R(RE)

Equation 1: Weighted Reward
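
In code, Equation 1 reduces to a weighted sum over the per-objective rewards. The sketch below assumes R(CE) and R(RE) have already been computed for a given state-action pair; the weights are illustrative, not tuned values.

def weighted_reward(r_ce, r_re, alpha=0.6, beta=0.4):
    """Equation 1: combine the customer experience reward r_ce and the
    revenue reward r_re using weights alpha and beta."""
    return alpha * r_ce + beta * r_re

Setting alpha = 1 and beta = 0 recovers a pure customer-experience policy, and vice versa; anything in between trades the two objectives off.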

Reward Decomposition

While the weighted rewards are calculated using [Equation 1], the terms R(CE) and R(RE) include subcomponents based on the feature sets from [Figure 4]. Both objectives share walking distance, with customer experience also considering the number of restrooms and revenue factoring in the number of retail outlets. The sub-components are computed as shown in [Figure 5], where Rw is the reward for optimal walking distance, Rr is for restrooms, and Rs is for shops.

Figure 5: Reward sub-components [Image by Author]
Figure 6: Legend [Image by Author]
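
The exact formulas live in Figure 5, but one plausible sketch normalizes each term against its matrix maximum so that shorter walks and richer paths score higher. The inverse normalization for Rw and the simple additive composition are assumptions for illustration, not the article's exact definitions.

def reward_components(src, dst, features):
    i, j = idx[src], idx[dst]
    # Rw: shorter walks earn more reward (assumed inverse normalization).
    r_w = 1.0 - features["walk"][i, j] / features["walk"].max()
    # Rr: more restrooms along the path earn more reward.
    r_r = features["restrooms"][i, j] / features["restrooms"].max()
    # Rs: more shops along the path earn more reward.
    r_s = features["retail"][i, j] / features["retail"].max()
    return r_w, r_r, r_s

def objective_rewards(src, dst, features):
    r_w, r_r, r_s = reward_components(src, dst, features)
    r_ce = r_w + r_r  # customer experience: walking + restrooms
    r_re = r_w + r_s  # revenue: walking + retail outlets
    return r_ce, r_re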

This reward structure directs the RL agent to optimize gate allocations by balancing customer experience and revenue, while also penalizing choices that may cause operational inefficiencies. Through iterative learning, the agent enhances its strategy to maximize cumulative rewards and align gate assignments with the airport’s strategic objectives.

Continuous Integration & Training

In a dynamic airport setting, objectives can evolve with continuous integration and training. New objectives can be added with or without new feature sets, leveraging existing ones with different reward components. This design ensures adaptability and flexibility. The image below demonstrates the seamless integration and ongoing training of objectives within the environment.

Figure 7: Integration Capabilities [Image by Author]
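
One way to realize this extensibility, sketched here under the assumption of the feature dictionary built earlier, is a small objective registry: each objective declares the feature matrices it consumes, its reward function, and its weight, so a new objective plugs in without touching the environment. This structure is an illustrative design choice, not code from the article.

OBJECTIVES = {}

def register_objective(name, feature_keys, reward_fn, weight):
    """Register an objective with the features it reads, its reward
    function, and its weight in the combined reward."""
    OBJECTIVES[name] = {"features": feature_keys,
                        "reward": reward_fn,
                        "weight": weight}

register_objective("customer_experience", ["walk", "restrooms"],
                   lambda s, d, f: objective_rewards(s, d, f)[0], 0.6)
register_objective("revenue", ["walk", "retail"],
                   lambda s, d, f: objective_rewards(s, d, f)[1], 0.4)

def total_reward(src, dst, features):
    return sum(o["weight"] * o["reward"](src, dst, features)
               for o in OBJECTIVES.values())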

The Importance of Dynamic Gate Allocation

Before diving into Part 2, it’s crucial to understand the need for optimization. Traditional airport systems often lead to sub-optimal decision-making due to their complexity and reliance on manual expertise. Framing this as a reinforcement learning problem allows airports to move from static to dynamic, data-driven approaches. RL leverages advanced simulations to anticipate and manage real-time disruptions, training on both historical and new data to optimize gate allocations. Transitioning from rule-based models to AI-driven solutions represents a significant leap toward more efficient, responsive, and passenger-friendly airport operations.

Summary

To summarize what we have covered so far:

  • Defined an airport gate assignment problem using a multi-objective RL framework.
  • Created a toy problem with a mini-airport and mock schedule to simulate flight arrivals.
  • Mimicked traditional gate allocation without optimization to highlight its limitations for both customer experience and operational efficiency.
  • Introduced two optimization objectives with their feature sets.
  • Formulated the MDP and decomposed rewards into sub-components with a weighted approach.
  • Addressed solution integration for new objectives and features.

In Part 2, we’ll dive into the code to build the RL environment and implement the solution. If this topic, or any related to airline operations, travel tech, or data science, interests you, connect with me on LinkedIn to continue the conversation.
