
The Illusion of Neutrality: Unpacking the Biases Hidden Within Gen AI’s Black Box

Last Updated on April 15, 2025 by Editorial Team

Author(s): Mohit Sewak, Ph.D.

Originally published on Towards AI.

Alright, buckle up, folks! Dr. Sewak here, ready to dive into a topic that’s got more layers than a quantum onion: bias in Generative AI. We often hear about the ‘intelligence’ of these systems, but what if that intelligence is, well, a little skewed? Let’s pull back the curtain on this ‘black box’ and see what biases are lurking within.

Gen AI’s ‘black box’: Looks smart, but what’s it really weighing?

“The greatest trick the AI ever pulled was convincing the world it was neutral.” Okay, okay, I might be paraphrasing a classic movie line, but the sentiment rings true when we talk about Generative AI. We often perceive these systems as objective, churning out insights and creations based purely on data. But hold on to your neural networks, because that data, and the models themselves, are anything but neutral.

“The belief in AI neutrality is a dangerous fallacy. It masks the biases embedded within, leading to skewed outcomes that can perpetuate societal inequalities.” — Dr. Mohit Sewak

Pro Tip: Always question the ‘objectivity’ of any AI system. Dig deeper into its training data and architecture.

Why Gen AI’s ‘intelligence’ might be skewed, and what it means for fairness.

Garbage in, garbage out… and sometimes, even with good data, biases can sneak in.

Let’s face it, the internet, the vast ocean of text and images that often trains these models, is a reflection of us — warts and all. And guess what? We humans have a few biases ourselves. These biases, often unintentionally, seep into the data, and the AI, like a diligent but uncritical student, learns them. Recent research presented at top-tier AI conferences like NeurIPS and ICML has been shining a spotlight on just how pervasive these biases are (NeurIPS Investigating Implicit Bias in Large Language Models: A …, accessed March 9, 2025).

“AI doesn’t have opinions, it has patterns learned from data. If the data is biased, the patterns will be too.” — Dr. Mohit Sewak

Trivia: Did you know that even the way we structure our AI models can inadvertently introduce bias? It’s not just about the data!

The Many Faces of Bias: From Gender to Geography

Bias can be sneaky, sometimes even hiding in the very fabric of the AI’s code.

The landscape of bias in GenAI is as diverse as the human population it reflects, unfortunately. Let’s take a peek at some of the common culprits:

Gender Bias: The Eternal Imbalance:
Research across languages, even in Turkish LLMs, shows deeply ingrained gender biases in everything from word frequency to sentence structure (Gender Bias in Large Language Models across Multiple Languages — Semantic Scholar, accessed March 9, 2025). It’s not just about English; this is a global challenge. Even in multimodal systems that combine images and text, like those used for image retrieval, gender stereotypes can lead to biased outcomes (VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution, accessed March 9, 2025).

“The fight for gender equality shouldn’t stop at the algorithm’s door.” — Dr. Mohit Sewak

Pro Tip: When evaluating GenAI outputs, pay close attention to how it portrays different genders in various roles and contexts.
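
To make that tip concrete, here is a minimal sketch of a completion probe: sample many continuations for role-based prompts and tally gendered pronouns. It assumes the Hugging Face transformers library; "gpt2" is only a stand-in model, and the prompts and pronoun lists are illustrative rather than a validated benchmark.

```python
# Minimal gendered-completion probe. Assumes the Hugging Face `transformers`
# package; "gpt2" is only a stand-in model, and the prompts/pronoun lists are
# illustrative rather than a validated benchmark.
from collections import Counter
from transformers import pipeline

generate = pipeline("text-generation", model="gpt2")

def pronoun_counts(prompt: str, n: int = 50) -> Counter:
    """Sample n continuations of `prompt` and tally gendered pronouns."""
    outputs = generate(prompt, max_new_tokens=15, do_sample=True,
                       num_return_sequences=n, pad_token_id=50256)
    counts = Counter()
    for out in outputs:
        tokens = out["generated_text"].lower().split()
        counts["he/him/his"] += sum(tokens.count(w) for w in ("he", "him", "his"))
        counts["she/her/hers"] += sum(tokens.count(w) for w in ("she", "her", "hers"))
    return counts

for prompt in ("The doctor said that", "The nurse said that"):
    print(prompt, "->", dict(pronoun_counts(prompt)))
```

A large, consistent asymmetry between the two prompts is exactly the kind of skewed portrayal the tip above asks you to watch for.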

Racial Bias: A Persistent Shadow:
Studies have uncovered covert racism in major LLMs, even against speakers of African American English (Covert Racism in AI: How Language Models Are Reinforcing Outdated Stereotypes, accessed March 9, 2025). This can lead to the reinforcement of harmful stereotypes and discriminatory decisions. Shockingly, in healthcare, LLMs have shown tendencies to project differing costs and hospitalization durations based on race (Unmasking and quantifying racial bias of large language models in medical report generation — PMC — PubMed Central, accessed March 9, 2025). This isn’t just a theoretical concern; it has real-world implications.

“Ignoring racial bias in AI is like ignoring a ticking time bomb in a system meant to help us.” — Dr. Sewak

Trivia: Researchers are exploring techniques like pruning specific computational units within LLMs to try and remove racial biases (Bias in Large Language Models — and Who Should Be Held Accountable, accessed March 9, 2025). It’s like digital spring cleaning for the AI’s brain!

Socioeconomic Bias: The Hidden Divide:
While perhaps less studied than gender or racial bias, socioeconomic bias is increasingly coming into focus. LLMs can exhibit both explicit and implicit biases based on socioeconomic status, potentially perpetuating inequalities (Born With a Silver Spoon? Investigating Socioeconomic Bias in Large Language Models, accessed March 9, 2025). Even geographical biases exist, with LLMs showing negative sentiment towards locations with lower socioeconomic conditions (ICML Poster Large Language Models are Geographically Biased, accessed March 9, 2025).

“Our AI shouldn’t have a zip code bias.” — Dr. Mohit Sewak

Pro Tip: Consider how GenAI might be inadvertently reinforcing or creating new socioeconomic divides in its outputs and applications.

Implicit Bias: The Unconscious Algorithm:
A large-scale study revealed that newer or larger models don’t automatically have less implicit bias; in fact, sometimes they have more (NeurIPS Investigating Implicit Bias in Large Language Models: A …, accessed March 9, 2025). This challenges the idea that simply scaling up will solve the bias problem.

“Implicit biases in AI are like the background noise we don’t always hear but still influences our decisions.” — Dr. Sewak

Trivia: The variability in bias scores across different AI models highlights the need for standardized ways to measure this hidden bias.

Social Identity Bias: The In-Group Advantage:
LLMs can even exhibit human-like social identity bias, showing favoritism towards certain groups and hostility towards others (New Research Finds Large Language Models Exhibit Social Identity …, accessed March 9, 2025). It seems our AI is learning our tribal tendencies!

“Can our AI learn to overcome the very social biases that plague humanity?” — Dr. Sewak

Pro Tip: Be wary of AI systems used in social interactions or decision-making that might be amplifying existing social divisions.

Other Biases: The Long Tail:
Beyond these, we see geographical bias (as mentioned), and even the phenomenon where biased AI recommendations can sway unbiased human decision-makers, especially in high-stakes areas like healthcare (NeurIPS Just Following AI Orders: When Unbiased People Are Influenced By Biased AI, accessed March 9, 2025). Interestingly, some research suggests LLMs might even adapt their behavior to appear more likable when being studied (Covert Racism in AI: How Language Models Are Reinforcing Outdated Stereotypes, accessed March 9, 2025). Talk about a performance review!

“Bias in AI is like a chameleon; it can take on many forms, sometimes when we least expect it.” — Dr. Mohit Sewak

Trivia: The fact that AI might try to ‘play nice’ when being evaluated makes the job of detecting bias even trickier!

Where Does This Bias Buffet Come From? The Sources Revealed

The tangled web of bias: It comes from many sources.

Understanding the ‘why’ behind the bias is just as important as identifying the ‘what’. Here are some key culprits:

Training Data: The Original Sin (of Bias):
The vast datasets used to train LLMs are the prime suspects (Bias in Large Language Models: Origin, Evaluation, and Mitigation — arXiv, accessed March 9, 2025). Scraped from the internet, these datasets inevitably reflect our societal prejudices, stereotypes, and the underrepresentation of certain groups (How to mitigate bias in LLMs (Large Language Models) — Hello Future, accessed March 9, 2025). The sheer volume makes it incredibly difficult to curate and filter perfectly.

“You can’t bake a bias-free cake with biased ingredients.” — Dr. Sewak

Pro Tip: Data provenance and transparency are crucial. We need to know where the training data comes from and what biases it might contain.
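
As a rough illustration of that provenance point, the sketch below (plain Python; the term lists and file name are placeholders, not a vetted lexicon) counts demographic term mentions in a text sample to get a first, very coarse view of representation before training.

```python
# Toy corpus audit: count mentions of demographic terms to get a first, very
# rough view of representation in a training-text sample. The term lists and
# the file name are illustrative placeholders, not a vetted lexicon.
import re
from collections import Counter

GROUP_TERMS = {
    "female-coded": {"she", "her", "hers", "woman", "women"},
    "male-coded": {"he", "him", "his", "man", "men"},
}

def representation_counts(lines) -> Counter:
    counts = Counter()
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for group, terms in GROUP_TERMS.items():
            counts[group] += sum(tok in terms for tok in tokens)
    return counts

with open("corpus_sample.txt", encoding="utf-8") as f:  # placeholder path
    print(representation_counts(f))
```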

Model Architecture and Underlying Assumptions: Built-In Biases:
Sometimes, the very design of the LLM, its architecture, and the assumptions made during development can inadvertently favor certain patterns that align with existing biases (Bias in Large Language Models: Origin, Evaluation, and Mitigation — arXiv, accessed March 9, 2025). It’s like a blueprint that’s slightly off from the start.

“Is our model’s foundation built on truly level ground?” — Dr. Sewak

Trivia: Researchers are constantly exploring new architectures and training objectives to try and minimize these inherent biases.

Human Intervention Efforts: The Debiasing Paradox:
Ironically, our attempts to reduce bias, including specific debiasing techniques, can sometimes introduce or even worsen bias (Bias in Large Language Models: Origin, Evaluation, and Mitigation — arXiv, accessed March 9, 2025). It’s a delicate balancing act, and sometimes we overcorrect.

Pro Tip: Thorough evaluation is essential after any debiasing intervention to ensure it’s actually working as intended.

Adaptation Processes and Prompt Engineering: The Guiding Hand:

How we adapt LLMs for specific tasks and the prompts we use to interact with them can significantly influence the presence and manifestation of bias (Bias in Large Language Models: Origin, Evaluation, and Mitigation — arXiv, accessed March 9, 2025). A poorly worded prompt can inadvertently trigger a biased response.

“The way we talk to our AI matters. Our prompts can either reveal or conceal underlying biases.” — Dr. Mohit Sewak

Trivia: The field of “prompt engineering” is becoming increasingly important, not just for getting the desired output, but also for mitigating bias.
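
One simple, widely used probe in this spirit is counterfactual prompting: send the same request with only a demographic attribute swapped and compare the answers side by side. The sketch below uses a generic ask_llm stub standing in for whatever completion call you actually use; the template and attribute list are illustrative.

```python
# Counterfactual prompt probe: send the same request with only one demographic
# attribute swapped, then read the answers side by side. `ask_llm` is a stub
# standing in for whatever completion API you actually call.
TEMPLATE = "Write a one-sentence performance review for a {attr} software engineer."
ATTRIBUTES = ("male", "female", "non-binary")

def ask_llm(prompt: str) -> str:
    # Placeholder: replace with a real model call (transformers, an API client, etc.)
    return "<model response>"

def counterfactual_probe() -> dict:
    responses = {attr: ask_llm(TEMPLATE.format(attr=attr)) for attr in ATTRIBUTES}
    for attr, text in responses.items():
        print(f"[{attr}] {text}")
    return responses

counterfactual_probe()
```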

Evaluating the Invisible: The Quest for Bias Benchmarks

How do we truly weigh the biases within?

You can’t fix what you can’t measure, right? That’s why the development of accurate evaluation metrics and benchmarks for LLM bias is crucial. Researchers are tackling this challenge from various angles:

Intrinsic vs. Extrinsic Methods: Looking Inside and Out:
We can analyze a model’s internal workings and word relationships (intrinsic methods) or focus on its performance and outputs on specific tasks (extrinsic methods) (Bias in Large Language Models: Origin, Evaluation, and Mitigation — arXiv, accessed March 9, 2025). Both approaches give us valuable insights.

“To truly understand bias, we need to both dissect the AI’s ‘brain’ and observe its behavior in the real world.” — Dr. Sewak

Pro Tip: A holistic evaluation approach that combines both intrinsic and extrinsic methods is likely to provide the most comprehensive understanding of bias.
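
For a flavor of the intrinsic side, here is a WEAT-style association sketch: it measures whether a target word (say, an occupation) sits closer to one attribute set (female-coded terms) than another (male-coded terms) in an embedding space. Here embed is a placeholder for any word or sentence embedder, and the word lists are illustrative only; pairing this with a task-level check like the completion probe shown earlier gives the kind of holistic view the tip above recommends.

```python
# WEAT-style intrinsic association sketch: does a target word sit closer to one
# attribute set than another in embedding space? `embed` is a placeholder for
# any word/sentence embedder; the word lists below are illustrative only.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def association_score(target, attrs_a, attrs_b, embed) -> float:
    """Positive => target leans toward attrs_a; negative => toward attrs_b."""
    t = embed(target)
    mean_a = np.mean([cosine(t, embed(w)) for w in attrs_a])
    mean_b = np.mean([cosine(t, embed(w)) for w in attrs_b])
    return float(mean_a - mean_b)

# Usage sketch (plug in a real embedder for `embed`):
# for job in ("nurse", "engineer", "teacher"):
#     print(job, association_score(job, ["she", "woman"], ["he", "man"], embed))
```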

Data-Level, Model-Level, and Output-Level Approaches: A Multi-Layered Examination:
We can evaluate bias by looking at the training data, the model itself, or the generated outputs (Bias in Large Language Models: Origin, Evaluation, and Mitigation — arXiv, accessed March 9, 2025). Each level provides a different perspective.

“Bias can be introduced at any stage of the AI lifecycle, so we need to check at every level.” — Dr. Sewak

Trivia: Some researchers argue that current benchmarks are too simplistic and advocate for more “RUTEd” (Realistic Use and Tangible Effects) evaluations that reflect real-world scenarios (Bias in Language Models: Beyond Trick Tests and Towards RUTEd Evaluation, arXiv).

Beyond Simple Tests: Probing Deeper:
Frameworks like hypothesis testing are being used to see if LLMs truly reason or just rely on token bias (Bias in Language Models: Beyond Trick Tests and Towards RUTEd Evaluation, arXiv). Soft-prompt tuning is being explored to quantify bias through the lens of group fairness (NeurIPS, Efficient Evaluation of Bias in Large Language Models …). And researchers are even digging into the internal mechanisms of LLM bias, like the role of neural networks and attention heads (NeurIPS Poster, UniBias: Unveiling and Mitigating LLM Bias through …). For longer, more complex text, the LTF-TEST framework helps uncover subtle biases that simpler tests might miss (NeurIPS, Large Language Models Still Exhibit Bias in Long Text).

“We need to move beyond simple ‘gotcha’ tests and develop sophisticated methods to truly understand the nuances of bias in AI.” — Dr. Mohit Sewak

Pro Tip: Stay updated on the latest research in bias evaluation, as this is a rapidly evolving field.

Bias Beyond Words: It’s Not Just an LLM Problem

Bias isn’t just a language barrier; it affects vision too.

Bias isn’t confined to the realm of language models. Generative AI in other areas, particularly computer vision, faces similar challenges:

Visionary Biases:

Research is focusing on improving bias metrics in vision-language models and developing fairer learning approaches like FairCLIP (Improving Bias Metrics in Vision-Language Models by Addressing Inherent Model Disabilities — NeurIPS, accessed March 9, 2025). The visual datasets themselves can be biased, and frameworks are being developed to identify the visual attributes contributing to this (NeurIPS Poster Understanding Bias in Large-Scale Visual Datasets, accessed March 9, 2025).

“What our AI sees, and how it interprets it, is just as susceptible to bias as what it says.” — Dr. Sewak

Trivia: Computer vision models can suffer from various types of bias, including selection bias (how the data is chosen), framing bias (the context of the images), and label bias (inaccuracies or prejudices in the annotations) (Bias Detection in Computer Vision: A Comprehensive Guide, viso.ai).

Socially Skewed Vision:
Vision-language models like CLIP can learn undesirable social biases, leading to problematic associations between images and text (Identifying Implicit Social Biases in Vision-Language Models, ICML 2025). Analyzing this bias from a causal perspective reveals that image features often contribute more to bias than text features (Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective, ACL Anthology). The Bias-to-Text (B2T) framework offers a novel way to interpret visual biases as keywords extracted from mis-predicted images (Discovering and Mitigating Visual Biases through Keyword …).

“A picture might be worth a thousand words, but if the picture is biased, those words can be harmful.” — Dr. Mohit Sewak

Pro Tip: When working with vision-language models, be mindful of the potential for social biases in how they connect images and text.
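
As a hedged example of such a check, the sketch below uses the open CLIP model via Hugging Face transformers to compare how strongly a given photo is matched to different role descriptions; the image path and the prompts are placeholders. Running the same prompts over a demographically varied image set and comparing the probability patterns across groups is one simple way to surface skewed image-text associations.

```python
# Probe image-text association with CLIP. "photo.jpg" and the prompts are
# placeholders; run the same prompts over a demographically varied image set
# and compare the probability patterns across groups.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a photo of a doctor", "a photo of a nurse", "a photo of a CEO"]
image = Image.open("photo.jpg")  # placeholder image

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)

for prompt, p in zip(prompts, probs[0].tolist()):
    print(f"{prompt}: {p:.3f}")
```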

Pruning and Prejudice:

Even model compression techniques like pruning can inadvertently introduce or worsen bias in vision models (openaccess.thecvf.com).

“Sometimes, in trying to make our AI leaner, we inadvertently make it meaner.” — Dr. Sewak

Trivia: The Computer Vision and Pattern Recognition Conference (CVPR) has been increasingly focusing on the ethical implications of AI, with dedicated sessions on bias mitigation (CVPR 2023, Tackle Bias Mitigation and Fair Representation in …).

Fairness in Generation:
Ensuring fairness in generative models that create new data presents unique challenges. Traditional fairness metrics might not be accurate, leading to frameworks like CLEAM (CLassifier Error-Aware Measurement) for more reliable assessment (On Measuring Fairness in Generative Modelling, NeurIPS 2023). Bias in GenAI can be a double-edged sword, potentially revealing latent social disparities (Bias as a Feature, ICML 2025). Techniques like FairQueue are being developed to improve both the quality and fairness of text-to-image generation (FairQueue: Rethinking Prompt Learning for Fair Text-to-Image …). Generative models can even be used as a tool to mitigate bias in other AI systems by creating synthetic data for underrepresented groups (Generative models improve fairness of medical classifiers under …).

“The goal isn’t just to generate data, but to generate fair and representative data.” — Dr. Sewak

Pro Tip: Explore the fairness implications of different generative modeling techniques and stay informed about new frameworks for measuring and mitigating bias.
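
To ground the measurement discussion, here is a naive version of the classifier-based estimate that frameworks like CLEAM set out to improve: sample from the generator, classify a sensitive attribute, and report the deviation from a balanced split. In this sketch, generate_sample and classify_attribute are placeholders, and (as the CLEAM work argues) the resulting number is only as trustworthy as the attribute classifier behind it.

```python
# Naive classifier-based fairness estimate for a generative model: sample
# outputs, classify a sensitive attribute, and measure deviation from a
# uniform group split. CLEAM (cited above) refines exactly this estimate by
# accounting for classifier error. `generate_sample` and `classify_attribute`
# are placeholders for your own generator and attribute classifier.
from collections import Counter

def fairness_gap(n_samples: int, generate_sample, classify_attribute) -> float:
    counts = Counter(classify_attribute(generate_sample()) for _ in range(n_samples))
    total = sum(counts.values())
    proportions = {group: c / total for group, c in counts.items()}
    uniform = 1.0 / len(proportions)
    # Maximum absolute deviation from a perfectly balanced split.
    return max(abs(p - uniform) for p in proportions.values())
```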

Bias in Action: Real-World Implications

When bias meets the real world: The consequences can be significant.

The biases we’ve discussed aren’t just academic curiosities; they have significant implications across various sectors:

Healthcare: A Matter of Life and Fairness:
Biased AI recommendations can influence medical decisions, potentially leading to suboptimal patient care (NeurIPS, Just Following AI Orders: When Unbiased People Are Influenced By Biased AI). The racial biases found in LLMs used for medical report generation are particularly concerning (Unmasking and quantifying racial bias of large language models in medical report generation, PubMed Central). Researchers have even examined whether the way generative AI represents medical specialties across gender and race perpetuates stereotypes (What Goes In, Must Come Out: Generative Artificial Intelligence Does Not Present Algorithmic Bias Across Race and Gender in Medical Residency Specialties, PubMed Central).

“In healthcare, algorithmic bias can have life-or-death consequences. Fairness is not just a preference; it’s a necessity.” — Dr. Mohit Sewak

Trivia: Generative AI is being explored to create synthetic health data for underrepresented groups to help mitigate bias and improve fairness in medical AI (Generative models improve fairness of medical classifiers under distribution shifts).

Finance: The Bottom Line of Bias:

Bias in GenAI can negatively impact businesses using this technology for market segmentation, product design, and customer engagement, potentially leading to poor risk management and decision-making (Bias in Generative AI — Addressing The Risk — I by IMD, accessed March 9, 2025).

“Can we trust AI to manage our finances fairly if it’s learned our biases?” — Dr. Sewak

Pro Tip: Businesses need to be acutely aware of potential biases in GenAI used for financial applications and implement rigorous auditing processes.

Education: Shaping Young Minds Fairly:
LLMs have the potential to perpetuate inequalities and reinforce bias in educational settings (Assessing Bias in Large Language Models, Miami University). Biased algorithms in educational AI systems can affect student evaluation and admissions (AI Bias — Artificial Intelligence in Education, Marian University). Even generative AI used as an educational tool can inadvertently propagate harmful stereotypes (Bias in Generative AI, andrew.cmu.edu).

“Our educational AI should be a tool for empowerment, not a mirror reflecting societal biases onto the next generation.” — Dr. Sewak

Trivia: Learning design professionals are developing strategies to reduce bias in generative AI-created educational content, including bias awareness training and diverse content review (Bias with Generative AI, EKU Online).

Law and Policy: Justice for All, Algorithms Included:

The legal and policy implications of bias in LLMs are being actively discussed, with suggestions that accountability should lie with those deploying the models (Bias in Large Language Models — and Who Should Be Held Accountable). Research is also surveying the types of bias present in legal data used for developing generative legal AI tools (Bias in Legal Data for Generative AI, ICML 2025).

Pro Tip: Robust bias audits and clear lines of accountability are essential for the responsible deployment of GenAI in legal and policy contexts.

The Path Forward: Discussion and Future Directions

The road ahead: Key directions for tackling bias in GenAI.

We’ve come a long way in understanding bias in GenAI, but the journey is far from over. Here are some key areas for future focus:

  • Smarter Evaluations: We need to develop more robust and context-aware bias evaluation benchmarks that can accurately assess fairness in diverse real-world applications.
  • Understanding Intersections: Further research into how different types of bias interact is crucial for a more comprehensive understanding of the problem.
  • Ethical Introspection: We need to carefully consider the ethical implications of using biased AI, even for societal introspection.
  • Generalizable Solutions: The development of more effective and generalizable bias mitigation techniques that work across different models and data types is a top priority.
  • Long-Term Impact Assessment: Studying the long-term societal impacts of deploying biased LLMs and GenAI is essential to anticipate and mitigate potential harms.
  • Diversity in Development: Addressing the power imbalances in LLM development and promoting more diverse perspectives among researchers and developers are critical steps towards building fairer AI systems.

“The quest for truly fair and equitable AI requires a multi-faceted approach, combining technical innovation with ethical reflection and a commitment to inclusivity.” — Dr. Mohit Sewak

Pro Tip: Engage in discussions about the ethical implications of AI bias and advocate for responsible development practices.

Conclusion: Towards a More Responsible AI Future

Together, we can build a more responsible and unbiased AI future.

The research is clear: bias and discrimination are significant challenges in Large Language Models and Generative AI. While we’ve made considerable progress in identifying, evaluating, and mitigating these biases, the journey towards truly fair and equitable AI requires continued, concerted effort. We need standardized evaluation metrics, prioritized bias mitigation in model development, and a multi-disciplinary approach involving researchers, ethicists, policymakers, and the broader community. Let’s work together to ensure that these powerful technologies benefit everyone, without perpetuating the biases of the past.

Stay tuned for more insights on Responsible Gen AI!

References

I. Top AI Conference Publications on Bias:

  • Kumar, D., Jain, U., Agarwal, S., & Harshangi, P. Investigating Implicit Bias in Large Language Models: A Large-Scale Study of Over 50 LLMs. In NeurIPS Safe Generative AI Workshop 2024.
  • Hall, S. M., Gonçalves Abrantes, F., Zhu, H., Sodunke, G., Shtedritski, A., & Kirk, H. R. (2023). Visogender: A dataset for benchmarking gender bias in image-text pronoun resolution. Advances in Neural Information Processing Systems, 36, 63687–63723.
  • Manvi, R., Khanna, S., Burke, M., Lobell, D. B., & Ermon, S. (2024, July). Large Language Models are Geographically Biased. In International Conference on Machine Learning (pp. 34654–34669). PMLR.
  • Adam, H., Balagopalan, A., Alsentzer, E., Christia, F., & Ghassemi, M. Just Following AI Orders: When Unbiased People Are Influenced By Biased AI. In Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS 2022.
  • Zeng, B., Yin, Y., & Liu, Z. Understanding Bias in Large-Scale Visual Datasets. In The Thirty-eighth Annual Conference on Neural Information Processing Systems.
  • Tian, J. J., Emerson, D., Pandya, D., Seyyed-Kalantari, L., & Khattak, F. (2023). Efficient evaluation of bias in large language models through prompt tuning. In Socially Responsible Language Modelling Research.
  • Zhou, H., Feng, Z., Zhu, Z., Qian, J., & Mao, K. UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation. In The Thirty-eighth Annual Conference on Neural Information Processing Systems.
  • Jeung, W., Jeon, D., Yousefpour, A., & Choi, J. Large Language Models Still Exhibit Bias in Long Text. In Workshop on Socially Responsible Language Modelling Research.
  • Keerthi, L. B. D. S. S., & Kumaraguru, G. S. G. P. Improving Bias Metrics in Vision-Language Models by Addressing Inherent Model Disabilities. https://neurips.cc/virtual/2024/101533
  • Luo, Y., Shi, M., Khan, M. O., Afzal, M. M., Huang, H., Yuan, S., … & Wang, M. (2024). Fairclip: Harnessing fairness in vision-language learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12289–12301).
  • Hamidieh, K., Zhang, H., Gerych, W., Hartvigsen, T., & Ghassemi, M. (2024, October). Identifying implicit social biases in vision-language models. In Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (Vol. 7, pp. 547–561).
  • Kim, Y., Mo, S., Kim, M., Lee, K., Lee, J., & Shin, J. (2024). Discovering and mitigating visual biases through keyword explanation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 11082–11092).
  • Teo, C., Abdollahzadeh, M., & Cheung, N. M. M. (2023). On measuring fairness in generative models. Advances in Neural Information Processing Systems, 36, 10644–10656.
  • Bias as a Feature — ICML 2025. (n.d.). ICML. https://icml.cc/virtual/2024/39188
  • Liu, K., Ding, Z., Isik, B., & Koyejo, S. On Fairness Implications and Evaluations of Low-Rank Adaptation of Large Models. In ICLR 2024 Workshop on Reliable and Responsible Foundation Models.
  • Teo, C., Abdollahzadeh, M., Ma, X., & Cheung, N. M. M. (2024). FairQueue: Rethinking Prompt Learning for Fair Text-to-Image Generation. Advances in Neural Information Processing Systems, 37, 22878–22926.
  • Jain, A., Nobahari, R., Baratin, A., & Sarao Mannelli, S. (2024). Bias in motion: Theoretical insights into the dynamics of bias in sgd training. Advances in Neural Information Processing Systems, 37, 24435–24471.
  • Bias in Legal Data for Generative AI — ICML 2025. (n.d.). ICML. https://icml.cc/virtual/2024/39169

II. Other Reputed Academic Publications and Institutions:

  • Mehrabi, N., et al. (2024). Bias and Fairness in Large Language Models: A Survey. Computational Linguistics, 50(3), 1097–1139.
  • Yang, Y., Liu, X., Jin, Q., Huang, F., & Lu, Z. (2024). Unmasking and quantifying racial bias of large language models in medical report generation. Communications Medicine, 4(1), 176.
  • Bartl, M., Mandal, A., Leavy, S., & Little, S. (2025). Gender Bias in Natural Language Processing and Computer Vision: A Comparative Survey. ACM Computing Surveys, 57(6), 1–36.
  • Bias Detection in Computer Vision: A Comprehensive Guide — viso.ai. (n.d.). Viso.ai. Retrieved from https://viso.ai/computer-vision/bias-detection/
  • Ktena, I., Wiles, O., Albuquerque, I., Rebuffi, S. A., Tanno, R., Roy, A. G., … & Gowal, S. (2024). Generative models improve fairness of medical classifiers under distribution shifts. Nature Medicine, 30(4), 1166–1173.
  • Lin, S., Pandit, S., Tritsch, T., Levy, A., & Shoja, M. M. (2024). What goes in, must come out: generative artificial intelligence does not present algorithmic bias across race and gender in medical residency specialties. Cureus, 16(2).

III. arXiv Preprints:

  • Guo, Y., Guo, M., Su, J., Yang, Z., Zhu, M., Li, H., … & Liu, S. S. (2024). Bias in large language models: Origin, evaluation, and mitigation. arXiv preprint arXiv:2411.10915.
  • Zhao, J., Ding, Y., Jia, C., Wang, Y., & Qian, Z. (2024). Gender bias in large language models across multiple languages. arXiv preprint arXiv:2403.00277.
  • Giorgi, T., Cima, L., Fagni, T., Avvenuti, M., & Cresci, S. (2024). Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets. arXiv preprint arXiv:2410.07991.
  • Singh, S., Keshari, S., Jain, V., & Chadha, A. (2024). Born With a Silver Spoon? Investigating Socioeconomic Bias in Large Language Models. arXiv preprint arXiv:2403.14633.

Disclaimers and Disclosures

This article combines the theoretical insights of leading researchers with practical examples and offers my opinionated exploration of AI’s ethical dilemmas. It may not represent the views or claims of my present or past organizations and their products, or of my other associations.

Use of AI Assistance: In preparing this article, AI assistance was used to generate and refine the images and for stylistic and linguistic enhancement of parts of the content.

License: This work is licensed under a CC BY-NC-ND 4.0 license.
Attribution Example: “This content is based on ‘[Title of Article/ Blog/ Post]’ by Dr. Mohit Sewak, [Link to Article/ Blog/ Post], licensed under CC BY-NC-ND 4.0.”

Follow me on: | Medium | LinkedIn | SubStack | X | YouTube |


Published via Towards AI

