AI in Finance Panel: Accelerating AI Risk Mitigation with XAI and Continuous Monitoring
Last Updated on July 21, 2023 by Editorial Team
Author(s): Anusha Sethuraman
Originally published on Towards AI.
At the AI in Finance Summit, NY, in December 2020, Fiddler AI held a panel discussion on the state of responsible AI with a group of model risk specialists from the financial services and tech industries. Below is a summary of the discussion. You can watch the full panel discussion here.
How is AI in Finance changing the traditional practice of model risk management?
Model risk management (MRM) is a well-established practice in banking, but one that is also growing and changing rapidly due to advancements in AI. "We run banks with models," said Agus Sudjianto (EVP & Head of Model Risk, Wells Fargo), a statement echoed by Jacob Kosoff (Head of MRM & Validation, Regions Bank), who added that 30% of his team's models are now machine learning models instead of traditional statistical approaches. Innovations from Silicon Valley, such as TensorFlow, PyTorch, and other frameworks built predominantly for deep learning, have made their way to Wall Street, accelerating the adoption of AI in Finance.
The goal of MRM, also called model safety, is to avoid the type of financial and reputational harm that models can cause when they are inevitably wrong. Machine learning models pose new challenges: they are inherently very complex, and even if issues are caught before the model is deployed, changes in the underlying data can completely alter the model's behavior.
MRM teams have to respond to these new requirements and become thought leaders in how to build trustworthy AI systems in finance. At banks, "it's not only upskilling the quants, who have traditionally been using statistical models," said Sri Krishnamurthy (CEO, QuantUniversity). "They have to think about the whole workflow from development to production and build out different frameworks." Silicon Valley is approaching these problems from a holistic viewpoint as well. Tulsee Doshi (Product Lead, Fairness & Responsible AI, Google) explained that responsible AI principles covering everything from scientific excellence to fairness, privacy, and security are built into Google's launch review process, and increasingly need to be applied to every stage of product development.
What are some strategies to implement responsible AI in Finance today?
The panelists shared some approaches they use to institute checks and balances into the model development process. At Google, Doshi said, context is everything: "How a model is deployed in a particular product, who those users are, and how they're using that model is a really important part of the risk management process." As an example, Doshi explained that an ML technology like text-to-speech can have some positive applications, particularly for accessibility, but also the potential for real harm. Instead of open sourcing a text-to-speech model that could be used broadly for any use case, "we want to realize where the context makes sense and prioritize those use cases." Then, the team will design metrics that are appropriate for these use cases.
Banks face high risks and strict regulatory guidelines, so it's crucial to have the right guardrails in place. "In the past, the focus of data scientists was model performance and AutoML… for us, it's very dangerous to focus on that," Sudjianto said. At Wells Fargo, "for every 3 model developers, we have 1 independent model evaluator" reporting to different parts of the organizational chain in order to avoid conflicts of interest. After articulating the use for the model, what can go wrong, and the appetite for risk, the team evaluates all the potential root causes of a wrong prediction, from the data, to the modeling framework, to training. "That's why interpretability is so critical," said Sudjianto.
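The panel doesn't name specific interpretability tooling, but as a minimal sketch of the kind of root-cause check a validator might run, here is a hypothetical post-hoc feature-attribution example using the open-source shap package on a made-up credit-risk classifier (all feature names, data, and thresholds are illustrative assumptions, not any bank's practice):

```python
# Minimal post-hoc interpretability sketch (illustrative only):
# train a gradient-boosted classifier on synthetic "credit" data
# and inspect per-feature attributions with SHAP.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 5_000
X = pd.DataFrame({
    "income": rng.normal(60_000, 15_000, n),
    "debt_to_income": rng.uniform(0.0, 0.6, n),
    "utilization": rng.uniform(0.0, 1.0, n),
    "delinquencies": rng.poisson(0.3, n),
})
# Synthetic default flag driven mostly by debt_to_income and delinquencies
y = ((0.8 * X["debt_to_income"] + 0.3 * X["delinquencies"]
      + rng.normal(0, 0.1, n)) > 0.45).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global view: mean absolute attribution per feature
importance = pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns)
print(importance.sort_values(ascending=False))
```

For non-tree model families, a model-agnostic explainer such as shap.KernelExplainer (or the generic shap.Explainer) could be substituted in the same workflow.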
To implement AI responsibly at a financial institution, having the right culture is essential. The MRM team needs to be "willing to challenge authority, willing to challenge executives, willing to say 'your model is wrong,'" Kosoff said. And from the top down, everyone at the company must understand that "this is not a compliance exercise, this is not a regulatory exercise"; rather, MRM is key to protecting value.
As Krishnamurthy explained, sometimes the cultural change also means recognizing that "it's not all about technology." Focusing on having the latest, most sophisticated tools for deep learning systems can be dangerous for institutions just starting to move off more traditional statistical models: "You will learn how to use the tool, but you won't have the conceptual grounding." Instead, teams might need to take a step back, clearly define their goals for their models, and determine whether they have the required knowledge to use a black-box ML system safely.
How do teams combat algorithmic bias?
Banks are accustomed to fighting bias in order to establish fair lending practices, but as financial institutions implement more AI systems across the board, they are confronting new kinds of algorithmic bias. These are the scenarios that keep our panelists up at night, worried about a model's mistake causing news outlets and government agencies to come knocking.
For example, as Sudjianto noted, there can be marketing models that seem very innocent but actually touch on issues with privacy and discrimination that are heavily regulated; NLP is also a major landmine ("language by nature is very discriminatory"). Kosoff and Krishnamurthy gave a few more examples of potential bias, like fraud detection being more likely to flag transactions in certain zip codes, or minority customers getting a different automated call center experience.
To combat bias, teams need to consider a wide range of factors before launch, such as the model's use cases, limitations, data, performance, and fairness metrics. Google uses "model cards" to capture all this information. "It forces you to document and report on what you're doing, which helps any downstream team that would pick up that model and use it either externally or internally," Doshi said. But even the best practices prior to launch can't prevent the risk of some unforeseen change in the production environment. "We don't know what errors we will see that we didn't think about or didn't have the metrics for," Doshi said.
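Google's model card format is described in its public research; purely as an illustration (a hypothetical structure, not Google's actual schema or tooling), a team could record the same kinds of fields in a small structured artifact that travels with the model:

```python
# Hypothetical model-card record (illustrative structure, not Google's schema)
from dataclasses import dataclass, field, asdict
import json

@dataclass
class ModelCard:
    name: str
    intended_use: str
    out_of_scope_uses: list[str]
    training_data: str
    evaluation_data: str
    performance_metrics: dict[str, float]
    fairness_metrics: dict[str, float] = field(default_factory=dict)
    limitations: list[str] = field(default_factory=list)

card = ModelCard(
    name="card_fraud_v3",
    intended_use="Rank card transactions for fraud review queues",
    out_of_scope_uses=["credit decisioning", "account closure"],
    training_data="Jan 2018 - Dec 2019 card transactions (synthetic example)",
    evaluation_data="Held-out Q1 2020 transactions (synthetic example)",
    performance_metrics={"auc": 0.91, "recall_at_1pct_fpr": 0.62},
    fairness_metrics={"fpr_gap_by_region": 0.03},
    limitations=["Not validated for card-not-present surges"],
)

# Persist next to the model artifact so downstream teams see it
print(json.dumps(asdict(card), indent=2))
```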
Those unforeseen errors are where continuous monitoring comes in. Kosoff shared an example of how monitoring has been especially critical during the COVID-19 crisis. "For fraud on a transaction for debit cards or credit cards, the most predictive variable is card present or card not present," but during February and March of 2020, ML systems were suddenly flagging high volumes of fraud as customers switched to doing most or all of their shopping online.
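The panel doesn't describe Regions Bank's monitoring stack, but the card-present example is essentially a feature-distribution shift. As a minimal sketch, a monitoring job could compare the live distribution of a card-not-present flag against its training baseline using a population stability index; every name and threshold below is a hypothetical illustration:

```python
# Minimal feature-drift check (illustrative): compare the live distribution
# of a feature against its training baseline with a population stability index.
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """Population stability index between a baseline and a live sample."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    # Avoid log(0) on empty bins
    e_frac = np.clip(e_frac, 1e-6, None)
    a_frac = np.clip(a_frac, 1e-6, None)
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))

rng = np.random.default_rng(1)
# Baseline: ~30% of transactions are card-not-present (0/1 flag)
baseline = rng.binomial(1, 0.30, 100_000).astype(float)
# Live window (spring 2020-style shift): ~75% card-not-present
live = rng.binomial(1, 0.75, 20_000).astype(float)

score = psi(baseline, live)
print(f"PSI = {score:.3f}")  # > 0.25 is a common rule of thumb for significant shift
if score > 0.25:
    print("Alert: card_not_present distribution has shifted; review the fraud model.")
```

A PSI above roughly 0.25 is a widely used rule of thumb for a significant shift, which is the kind of signal that would have fired as shopping moved online in early 2020.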
What changes can we expect in 3–5 years?
In the next 3–5 years, we are undoubtedly going to see an explosion of increasingly complex modeling techniques, which will, in turn, put more pressure on monitoring and validation. So what changes can we expect from the responsible AI space in the near future?
Doshi noted that with whitepapers coming from the EU and movement from the US, Singapore, and other governments, "we're going to see more and more regulation come out around actually putting in proper processes around explainability and interpretability." There will most likely also be a shift in computer science education, so that students graduate with training in model risk management and explainability.
Kosoff can imagine a future where there is a kind of "driver's license" certifying that someone understands the risks well enough to build models. As a step in this direction, Regions Bank is exploring the idea of having all new model developer hires spend their first 6 months embedded on the model risk team. Upon joining their permanent teams, "they'll be more trained, more qualified, they'll know more aspects of the bank, and they'll have a strong understanding of fairness and everything we've talked about on model risk and model evaluation."
Krishnamurthy pointed out that currently very few models actually make it out of the exploration phase, but in the next few years, "the production story is going to start getting consolidated." Krishnamurthy also believes that "some of the noise is going to subside": the initial approach of throwing deep learning models at everything will be replaced by a more sober understanding of their limitations. Finally, continuing a trend that began with 2020's stay-at-home orders, cloud tools for ML will become more prominent.
In Sudjianto's opinion, testing is still one of the biggest gaps: "People talk about counterfactual testing, robustness testing; it's still in the academic world… in the real world, it's not scalable." Institutions need to train individuals to be the equivalent of reliability and safety engineers for ML, and they also need the tools to operate at speed and scale and detect failures ahead of time. As Sudjianto said, "Monitoring cannot be passive anymore."
Panelists:
Agus Sudjianto, EVP & Head of Model Risk, Wells Fargo
Jacob Kosoff, Head of MRM & Validation, Regions Bank
Sri Krishnamurthy, CEO, QuantUniversity
Tulsee Doshi, Product Lead, Fairness & Responsible AI, Google
Krishna Gade, Founder & CEO, Fiddler
P.S. We built Fiddler to fill in these tooling gaps and help teams build trust into AI. Teams can easily import their models and datasets into Fiddler to get continuous monitoring and explanations, creating a system of record for ML in production. As the responsible AI space continues to evolve, we're very excited to share more on this topic. If you're interested in seeing what Fiddler can do, you can sign up for a free demo here.
Originally published at https://blog.fiddler.ai on February 26, 2021.
Published via Towards AI