
Optimizing Airport Gate Assignments Using Multi Objective Reinforcement Learning (MORL) — Part 1

Last Updated on September 27, 2024 by Editorial Team

Author(s): Ranjith Menon

Originally published on Towards AI.

RL is the Gate-way to Airport Efficiency (pun intended)

Ever found yourself sprinting across an airport to board a connecting flight? Or wandering endless corridors in search of your boarding gate? Efficient gate assignments can transform air travel experiences, making journeys smoother and more enjoyable. In this series, we dive into how airport gate assignments are traditionally managed and look at a novel approach using a multi-objective reinforcement learning (MORL) framework. This blog assumes familiarity with Reinforcement Learning (RL), Markov Decision Process (MDP) and Python.

Part 1 of this series defines the problem, provides an overview of the solution, and covers environment definition, features, and MDP formulation.

Photo by Rob Wilson on Unsplash

A Primer on Airport Gate Assignments

Traditionally, gate assignments at airports rely on resource management systems (RMS) combined with a human-in-the-loop approach. These systems, typically linear solvers, use a set of pre-defined rules and constraints. While they get the job done, they adapt poorly to new information and deviations, and they don't use historical data to detect the underlying patterns that could impact gate assignments. This invariably inconveniences passengers and hampers airport operations. Building next-gen systems requires a strategic framework that balances multiple objectives, such as enhancing customer experience and maximizing revenue. MORL offers a novel solution by optimizing airport gate assignments so that passengers navigate with ease while airports benefit from increased customer spending.

To train the RL agent effectively, we simulate an airport environment using a toy problem: a mini-airport with 3 gates and 4 flights operating in 30-minute intervals. The mock schedule includes a linked-flight scenario, illustrating how passengers with connections navigate between gates. A snapshot of this schedule is shown below.

Figure 1: Mock Schedule [Image by Author]

For non-connecting flights, walking time is calculated from an imaginary security gate. The goal is to ensure optimal walking time for passengers when assigning gates.
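To make the setup concrete, here is one way the mock schedule could be encoded in Python. The flight IDs, slot times, and the linked pair below are illustrative placeholders standing in for the actual values in Figure 1.

```python
# A hypothetical encoding of the mock schedule (Figure 1).
# Flight IDs, slots, and the linked pair are illustrative placeholders.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Flight:
    flight_id: str
    arrival_slot: int                # index of a 30-minute timestep
    departure_slot: int
    linked_to: Optional[str] = None  # onward connecting flight, if any

mock_schedule = [
    Flight("F1", arrival_slot=0, departure_slot=2),
    Flight("F2", arrival_slot=1, departure_slot=3, linked_to="F3"),
    Flight("F3", arrival_slot=2, departure_slot=4),  # receives F2's passengers
    Flight("F4", arrival_slot=3, departure_slot=5),
]

GATES = ["A1", "A2", "A3"]  # three boarding gates; A0 is the security checkpoint
```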

For the given schedule, a basic gate assignment plan checks available gates at each time step, evaluates the incoming flight, and assigns a gate accordingly before moving to the next time step. The image below illustrates how this transition works.

Figure 2: Sample Gate Assignment [Image by Author]
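The availability-only logic in Figure 2 can be sketched as a simple greedy loop, building on the Flight encoding above. This is a minimal reconstruction for illustration, not the actual RMS implementation.

```python
def baseline_assignment(schedule, gates):
    """Availability-only gate assignment, with no optimization.

    At each timestep, the incoming flight takes the first free gate.
    """
    gate_free_at = {g: 0 for g in gates}  # timestep at which each gate frees up
    assignments = {}
    for flight in sorted(schedule, key=lambda f: f.arrival_slot):
        for gate in gates:
            if gate_free_at[gate] <= flight.arrival_slot:
                assignments[flight.flight_id] = gate
                gate_free_at[gate] = flight.departure_slot
                break
        else:
            assignments[flight.flight_id] = None  # no gate free at this timestep
    return assignments

print(baseline_assignment(mock_schedule, GATES))
```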

Introducing Optimized Allocations

As seen in the previous image, the allocation process is based solely on availability, as handled by RMS, without any optimization. It’s time to introduce our first optimization feature: walking distance. We’ll implement this using a walking distance matrix, which records the distances between gates. To accommodate flights that aren’t linked, we introduce an additional gate, A0, representing the security checkpoint.

Figure 3: Walking Distance Matrix [Image by Author]

In the matrix shown, the diagonal entries are zero, as they represent the distance between a gate and itself. For example, the distance from security gate A0 to boarding gate A3 is 5, indicating a 5-minute walk from the security checkpoint to gate A3. The maximum walking time between gates is capped at 14 minutes. With that, the walking distance feature will be used to optimize our first objective: customer experience.
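In code, the matrix is a small 2D array. Apart from the zero diagonal, the A0-to-A3 entry of 5, and the 14-minute cap, the values below are placeholders rather than the exact numbers from Figure 3.

```python
import numpy as np

# Walking time in minutes between locations, indexed as [A0, A1, A2, A3],
# where A0 is the security checkpoint. The diagonal is zero and no entry
# exceeds the 14-minute cap; off-diagonal values (other than A0->A3 = 5)
# are illustrative placeholders.
LOCATIONS = ["A0", "A1", "A2", "A3"]
WALK_MIN = np.array([
    [0, 3, 4, 5],
    [3, 0, 6, 9],
    [4, 6, 0, 7],
    [5, 9, 7, 0],
])

def walking_time(src: str, dst: str) -> int:
    return int(WALK_MIN[LOCATIONS.index(src), LOCATIONS.index(dst)])
```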

Introducing a Multi-Objective Framework

We will focus on two objectives for simplicity in our toy problem.

  • Objective 1 (Customer Experience): in a connecting-flight scenario, if the goal is to maximize customer experience, we hypothesize that the solution will allocate flights to gates that ensure a safe connection while offering sufficient amenities, such as restrooms and cafes, along a path with an optimal walking time.
  • Objective 2 (Revenue): if, on the other hand, the revenue objective is prioritized over customer experience, we hypothesize that the solution will shift to favor revenue-earning touchpoints along the path, such as shops and duty-free outlets, while still not compromising the safe connection.

To enhance this, we add two more matrices: one for the number of restrooms between gates and another for the number of retail outlets. Our final feature sets will look like the image below.

Figure 4: Complete Feature Set [Image by Author]

The first objective, customer experience, will balance walking distance and restrooms/amenities, while the second, revenue, will balance walking distance and retail outlets.
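Continuing the snippets above, the two amenity matrices can follow the same indexing as the walking distance matrix. All counts here are invented for illustration; the real values are in Figure 4.

```python
# Amenity counts along the path between locations, using the same
# [A0, A1, A2, A3] indexing as WALK_MIN. All values are illustrative.
RESTROOMS = np.array([
    [0, 1, 1, 2],
    [1, 0, 2, 3],
    [1, 2, 0, 2],
    [2, 3, 2, 0],
])
RETAIL = np.array([
    [0, 2, 1, 3],
    [2, 0, 1, 4],
    [1, 1, 0, 2],
    [3, 4, 2, 0],
])

# The complete feature set: one matrix per feature.
FEATURES = {"walk": WALK_MIN, "restrooms": RESTROOMS, "retail": RETAIL}
```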

Markov Decision Process (MDP) Formulation

Having defined the airport toy problem through the mock schedule (Figure 1), the as-is gate assignment methodology that the multi-objective framework will enhance (Figure 2), and the objectives with their feature sets (Figures 3 and 4), it's now time to formulate an MDP.

  • The state space S comprises all possible states s over gates g and timesteps t, such that s ∈ S, g ∈ G, t ∈ T.
  • The actions taken by the RL agent are all possible gate allocations for scheduled flights.
  • The distance between a pair of gates, say g(i) and g(j), is represented as D(g(i), g(j)); these distances are stored as matrices that form the reward components, or feature sets. In addition to the distance matrix, touchpoints such as cafes and restrooms are stored in the same structure.
  • The rewards, given an action and state, are computed as a weighted combination, since this is a multi-objective solution. This is shown in Equation 1 below, where α is the weight associated with the customer experience objective, β is the weight associated with the revenue objective, and R(CE) and R(RE) are the rewards for the two objectives.
R = α · R(CE) + β · R(RE)
Equation 1: Weighted Reward
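In code, Equation 1 reduces to a one-line blend; how R(CE) and R(RE) are themselves computed from the feature matrices is sketched in the next section.

```python
def weighted_reward(r_ce: float, r_re: float, alpha: float, beta: float) -> float:
    """Equation 1: blend the customer-experience and revenue rewards.

    alpha and beta are the objective weights; raising alpha relative to beta
    prioritizes customer experience, and vice versa.
    """
    return alpha * r_ce + beta * r_re
```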

Reward Decomposition

While the weighted reward is calculated using Equation 1, the terms R(CE) and R(RE) comprise sub-components based on the feature sets from Figure 4. Both objectives share walking distance; customer experience additionally considers the number of restrooms, and revenue the number of retail outlets. The sub-components are computed as shown in Figure 5, where Rw is the reward for optimal walking distance, Rr for restrooms, and Rs for shops.

Figure 5: Reward sub-components [Image by Author]
Figure 6: Legend [Image by Author]
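The exact formulas behind Rw, Rr, and Rs live in Figure 5, so the sketch below is only one plausible shape for them, assuming each term is normalized to [0, 1] and each objective is a simple blend of its two features. The 50/50 blend weights are also assumptions.

```python
MAX_WALK = 14  # cap on walking time between gates, per Figure 3

def sub_rewards(src: str, dst: str):
    """A plausible form for the Figure 5 sub-components (assumed, not exact).

    Rw rewards short walks; Rr and Rs reward amenities along the path.
    Each term is normalized to [0, 1].
    """
    i, j = LOCATIONS.index(src), LOCATIONS.index(dst)
    r_w = 1.0 - WALK_MIN[i, j] / MAX_WALK    # shorter walk -> higher reward
    r_r = RESTROOMS[i, j] / RESTROOMS.max()  # more restrooms -> higher reward
    r_s = RETAIL[i, j] / RETAIL.max()        # more shops -> higher reward
    return r_w, r_r, r_s

def r_ce(src: str, dst: str, w: float = 0.5) -> float:
    """Customer experience: walking distance blended with restrooms."""
    r_w, r_r, _ = sub_rewards(src, dst)
    return w * r_w + (1 - w) * r_r

def r_re(src: str, dst: str, w: float = 0.5) -> float:
    """Revenue: walking distance blended with retail outlets."""
    r_w, _, r_s = sub_rewards(src, dst)
    return w * r_w + (1 - w) * r_s
```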

This reward structure directs the RL agent to optimize gate allocations by balancing customer experience and revenue, while also penalizing choices that may cause operational inefficiencies. Through iterative learning, the agent enhances its strategy to maximize cumulative rewards and align gate assignments with the airport’s strategic objectives.

Continuous Integration & Training

In a dynamic airport setting, objectives can evolve with continuous integration and training. New objectives can be added with or without new feature sets, leveraging existing ones with different reward components. This design ensures adaptability and flexibility. The image below demonstrates the seamless integration and ongoing training of objectives within the environment.

Figure 7: Integration Capabilities [Image by Author]
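Because each objective is just a weight and a reward function over shared feature matrices, adding a new objective need not touch the rest of the environment. A hypothetical registry, building on the functions above, might look like this.

```python
# A hypothetical objective registry: each entry pairs a weight with a reward
# function, so new objectives can reuse the existing feature matrices.
OBJECTIVES = {
    "customer_experience": (0.6, r_ce),
    "revenue": (0.4, r_re),
}

def total_reward(src: str, dst: str) -> float:
    """Generalizes Equation 1 to any number of registered objectives."""
    return sum(w * fn(src, dst) for w, fn in OBJECTIVES.values())
```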

The Importance of Dynamic Gate Allocation

Before diving into Part 2, it’s crucial to understand the need for optimization. Traditional airport systems often lead to sub-optimal decision-making due to their complexity and reliance on manual expertise. Framing this as a reinforcement learning problem allows airports to move from static to dynamic, data-driven approaches. RL leverages advanced simulations to anticipate and manage real-time disruptions, training on both historical and new data to optimize gate allocations. Transitioning from rule-based models to AI-driven solutions represents a significant leap toward more efficient, responsive, and passenger-friendly airport operations.

Summary

To summarize what we have seen so far:

  • Defined an airport gate assignment problem using a multi-objective RL framework.
  • Created a toy problem with a mini-airport and mock schedule to simulate flight arrivals.
  • Mimicked traditional gate allocation without optimization to highlight its limitations for both customer experience and operational efficiency.
  • Introduced two optimization objectives with their feature sets.
  • Formulated the MDP and decomposed rewards into sub-components with a weighted approach.
  • Addressed solution integration for new objectives and features.

In Part 2, we’ll dive into the code to build the RL environment and implement the solution. If this topic, or any related to airline operations, travel tech, or data science, interests you, connect with me on LinkedIn to continue the conversation.


Published via Towards AI
