Organizers:
Omer Reingold, Stanford University
Speakers:
Avrim Blum, Toyota Technological Institute at Chicago (TTIC)
Sarah H. Cen, Carnegie Mellon University
Natalie Collina, University of Pennsylvania
Cynthia Dwork, Harvard University
Noémie Elhadad, Columbia University
Allison Koenecke, Cornell Tech
Frauke Kreuter, LMU Munich and University of Maryland
Charlotte Peale, Stanford University
Meeting Goals:
The Simons Collaboration on the Theory of Algorithmic Fairness continues its mission of establishing firm mathematical foundations, through the lens of computer science theory, for the evolving field of algorithmic fairness.
The fifth annual meeting will highlight recent progress across the collaboration and the wider research community, drawing together theoretical advances and insights from related disciplines. Alongside presenting new results, the meeting offers space for reflection on ongoing challenges and opportunities, recognizing how the field continues to develop in response to both technological and societal change.
As in previous years, the meeting is designed to foster exchange of ideas, inspire collaborations, and deepen our shared understanding of fairness in algorithms.
Visit the Simons Collaboration on the Theory of Algorithmic Fairness Website:
Previous Meetings:
• 2022 Annual Meeting
• 2023 Annual Meeting
• 2024 Annual Meeting
• 2025 Annual Meeting
-
The fifth annual meeting of the Simons Collaboration on the Theory of Algorithmic Fairness explored recent advances in the field from a multidisciplinary perspective. The eight invited talks combined core Theory of Computing research with work from outside the collaboration to inspire new directions and connections. This year's meeting continued the investigation of large language models (LLMs), which raise new concerns about discrimination that must be rigorously formalized and addressed. Another recurring theme was the impact of data collection on fairness, particularly the importance of relying on comprehensive sets of individual features.
The meeting opened with Avrim Blum’s talk on pessimism traps and replicability in sequential decision-making. Pessimism traps were modeled through information cascades as a rational phenomenon in which communities become locked into suboptimal behavior due to self-reinforcing beliefs about the success of more ambitious actions. Blum showed how algorithmic interventions can break these traps in both single- and multi-community settings, even without knowing which actions are best for each community. He also extended the notion of replicability to adversarial online decision-making, proposing algorithms that achieve both replicability and sublinear regret, while leaving open the question of optimal regret under such constraints.
Charlotte Peale addressed a central reliability question in modern machine learning: how to quantify uncertainty in decision-relevant ways. She highlighted two limitations of calibration: it does not decompose uncertainty into epistemic and aleatoric components, and it may yield weaker error predictions than externally trained loss predictors. Peale introduced higher-order calibration as a framework for principled uncertainty decomposition and established an equivalence between multicalibration and competitiveness with external loss predictors, clarifying when models can reliably assess their own limitations.
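In symbols, the aleatoric/epistemic split can be grounded in the law of total variance for a binary outcome $Y \sim \mathrm{Bernoulli}(p)$ with the parameter $p$ itself uncertain; this standard identity is offered here as orientation, not as the talk's exact formalization:

$$
\mathrm{Var}(Y) \;=\; \underbrace{\mathbb{E}\big[p(1-p)\big]}_{\text{aleatoric}} \;+\; \underbrace{\mathrm{Var}(p)}_{\text{epistemic}}.
$$

The first term is irreducible outcome noise; the second reflects uncertainty about $p$ itself, which more data or a better model could in principle shrink.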
Sarah H. Cen directly addressed the gap between research and policy in AI safety and accountability. She examined AI supply chains and how the growing ecosystem of actors complicates objectives and accountability. Turning to audits and evidentiary burdens, Cen proposed performance–fairness Pareto frontiers as tools for assessing tradeoffs, showing how closed-form expressions for these frontiers can help plaintiffs and auditors overcome a lack of access to models in legal contexts. She concluded with a longitudinal study of LLM behavior during the 2024 U.S. election season, underscoring the need for ongoing monitoring as models and policies evolve.
In a complementary alignment direction, Frauke Kreuter examined how LLMs implicitly prioritize certain cultural values, often reflecting dominant viewpoints while underrepresenting diversity and value change. She introduced adaptive alignment, an approach aimed at incorporating diverse societal values and acknowledging normative tradeoffs. A central proposal was to leverage decades of population-level social science data to create dynamic, “living” benchmarks.
Allison Koenecke examined algorithmic decisions in the SNAP benefits pipeline, highlighting the role of social workers across outreach, eligibility, and benefits adjustments. Her first project studied biases in online advertising for SNAP benefits due to cost differences in reaching English- versus Spanish-speaking recipients, proposing a framework for more equitable ad allocation and finding bipartisan support for incorporating equity alongside efficiency. Her second project evaluated an LLM-enabled chatbot assisting social workers. Randomized experiments showed that access to LLM suggestions improved performance, but gains plateaued, suggesting an LLM-distrust effect.
Noémie Elhadad focused on foundation models in medicine and healthcare, noting their growing deployment despite limited understanding of their effects on patient outcomes and clinician wellbeing. She highlighted limitations of current LLMs, particularly in complex reasoning and causal inference, and outlined research directions for foundation models tailored to clinical and electronic health record data, integrating empirical observations with principles of human physiology. The talk situated fairness within clinical validity and safety, emphasizing reliability in high-stakes environments.
Several talks addressed a central challenge for the next phase of algorithmic fairness: many deployments involve humans interacting with, relying on, or competing through AI systems rather than being passively evaluated by them.
Natalie Collina examined learning and incentives in human–AI collaboration. Even when human and AI knowledge is partial and incomparable, repeated interaction can yield distribution-free guarantees that joint predictions outperform either alone. She then considered settings with misaligned incentives among AI providers and showed that under a “market alignment” assumption, competition can yield outcomes comparable to those of a perfectly aligned provider.
Finally, Cynthia Dwork addressed equitable evaluation when self-presentation differs. Equally qualified individuals may vary in confidence and communication style, making direct evaluation from self-descriptions problematic. Dwork presented an interactive AI for skill elicitation that determines skills while allowing individuals to speak in their own voice. Using LLMs as synthetic humans for training, the approach mitigates endogenous bias from self-report differences and enforces a rigorous equitability condition bounding the covariance between self-presentation manner and evaluation error.
Across eight talks, the 2026 meeting highlighted how algorithmic fairness research continues to expand while remaining grounded in rigorous theory. Fairness increasingly intersects with reliability and uncertainty; it is deeply shaped by institutional realities such as supply chains and evidentiary constraints; and it often emerges from human–AI interaction rather than from algorithms in isolation.
-
Thursday, February 5, 2026
8:30 AM   CHECK-IN & BREAKFAST
9:30 AM   Avrim Blum | Pessimism Traps, Algorithmic Interventions, and Replicability
11:00 AM  Charlotte Peale | Uncertainty Quantification Beyond Calibration
1:00 PM   Sarah H. Cen | Bridging the Gap Between Research and Policy in AI Safety and Accountability
2:30 PM   Frauke Kreuter | Adaptive Integrity of AI Models Aligned with Society
4:00 PM   Noémie Elhadad | Foundation Models in Medicine and Healthcare

Friday, February 6, 2026
8:30 AM   CHECK-IN & BREAKFAST
9:30 AM   Allison Koenecke | Algorithmic Decisions in the SNAP Benefits Pipeline
11:00 AM  Natalie Collina | Learning and Incentives in Human–AI Collaboration
1:00 PM   Cynthia Dwork | Equitable Evaluation via Elicitation

-
Avrim Blum
Toyota Technological Institute at Chicago
Pessimism Traps, Algorithmic Interventions, and Replicability
View Slides (PDF)

In this talk, Avrim Blum will discuss two lines of work involving sequential decision-making. The first involves “pessimism traps,” a societal phenomenon in which a community gets locked into a cycle of suboptimal decisions due to a self-reinforcing pessimism about their chance of success at more ambitious activities. In this work, we use information cascades to model this phenomenon as rational behavior under uncertainty, and examine how algorithmic interventions can be used to break these traps. We show this for both single-community and multi-community models, even when the intervening entity does not know which actions are best for which community. The second line of work involves replicability in single-agent sequential decision-making. Replicability has generally been studied in the i.i.d. data setting. Here, we propose a natural definition for replicability in the context of adversarial online decision-making, and provide algorithms that achieve both replicability and sublinear regret. We leave open the question of the optimal regret achievable in this setting.
This work is joint with Saba Ahmadi, Siddharth Bhandari, Emily Diana, Kavya Ravichandran, and Alexander Williams Tolbert.
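As a concrete illustration of the trap mechanism, the Python sketch below simulates a classic Bikhchandani–Hirshleifer–Welch information cascade together with a naive intervention that forces a few early agents to act and publicizes their outcomes. The model, parameters, and intervention are toy assumptions for exposition, not the constructions analyzed in the talk.

```python
import random

def simulate(n_agents, q=0.7, intervene_first_k=0, seed=0):
    """BHW-style cascade. The ambitious action is truly good; each agent
    receives a private signal that is correct with probability q, observes
    all previous choices, and acts on the resulting posterior. A forced
    early adopter publicly reveals a (noisy) outcome, re-injecting
    information into the public history."""
    rng = random.Random(seed)
    net = 0          # inferred (#good signals - #bad signals) in public history
    choices = []
    for i in range(n_agents):
        signal = 1 if rng.random() < q else -1   # +1 means "action looks good"
        if i < intervene_first_k:
            outcome = 1 if rng.random() < q else -1
            net += outcome                        # outcome is publicly observed
            choices.append(1)
        elif net >= 2:                            # up-cascade: adopt regardless
            choices.append(1)
        elif net <= -2:                           # pessimism trap: reject regardless
            choices.append(0)
        else:                                     # no cascade: follow own signal,
            act = net + signal > 0 or (net + signal == 0 and signal == 1)
            choices.append(1 if act else 0)       # which observers can then infer
            net += signal
    return choices

trials = 2000
trap = lambda k: sum(simulate(50, intervene_first_k=k, seed=s)[-1] == 0
                     for s in range(trials)) / trials
print(f"trap rate: {trap(0):.2f} without intervention, {trap(5):.2f} with 5 seeded adopters")
```

Even though the action is good, a few unlucky early signals lock the unintervened community into rejection; seeding a handful of adopters whose outcomes are observable sharply reduces that rate.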
Sarah H. Cen
Carnegie Mellon University
Bridging the Gap Between Research and Policy in AI Safety and Accountability
As AI becomes increasingly integrated into both the private and public sectors, challenges around AI safety and accountability have arisen. There is a growing, compelling body of work around the legal and societal challenges that come with AI, but there is a gap in our rigorous understanding of these problems.
In this talk, Sarah H. Cen will dive deep into a few topics in AI safety and accountability. We will discuss AI supply chains (the increasingly complex ecosystem of AI actors and components that contribute to AI products) and study how AI supply chains complicate machine learning objectives. We’ll then shift our discussion to AI audits and evidentiary burdens in cases involving AI. Using Pareto frontiers as a tool for assessing performance-fairness tradeoffs, we will show how a closed-form expression for these frontiers can help plaintiffs (or auditors) overcome evidentiary burdens or a lack of access in AI contexts. Cen will conclude with a longitudinal study of LLMs during the 2024 US election season. If time permits, we may touch on formal notions of trustworthiness.
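As a schematic of the Pareto-frontier idea (not the closed-form characterization from the talk), the sketch below sweeps group-specific decision thresholds on synthetic scores and keeps the accuracy-maximal point at each demographic-parity budget; the data model and fairness metric are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic scores: noisy labels, with group B's scores shifted down.
def make_group(n, shift):
    y = rng.integers(0, 2, n)
    return y + rng.normal(0, 0.8, n) - shift, y

sA, yA = make_group(5000, shift=0.0)
sB, yB = make_group(5000, shift=0.4)

points = []
for tA in np.linspace(-1.0, 2.0, 40):
    for tB in np.linspace(-1.0, 2.0, 40):
        pA, pB = sA > tA, sB > tB
        acc = (np.mean(pA == yA) + np.mean(pB == yB)) / 2
        gap = abs(pA.mean() - pB.mean())      # demographic-parity gap
        points.append((gap, acc))

# Frontier: best accuracy attainable within each unfairness budget.
points.sort(key=lambda t: (t[0], -t[1]))
frontier, best = [], -1.0
for gap, acc in points:
    if acc > best:
        best = acc
        frontier.append((round(gap, 3), round(acc, 3)))
print(frontier)
```

One way such a frontier can support an evidentiary argument is by showing that a deployed model sits far inside it, i.e., that a fairer model of comparable performance was attainable; per the abstract, the talk's closed-form expressions make this kind of argument possible even without full access to the model.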
Natalie Collina
University of Pennsylvania
Learning and Incentives in Human–AI Collaboration
As AI systems become more capable, a central challenge is designing them to work effectively with humans. Natalie Collina will first consider collaborative prediction, motivated by a doctor consulting an AI that shares the goal of accurate diagnosis. Even when the doctor and AI have only partial and incomparable knowledge, repeated interaction enables richer forms of collaboration: we give distribution-free guarantees that their combined predictions are strictly better than either alone, with regret bounds against benchmarks defined on their joint information. She will then revisit the alignment assumption itself. If an AI is developed by, say, a pharmaceutical company with its own incentives, how can we encourage helpful behavior? A natural scenario is that the doctor has access to multiple models, each from a different provider. Under a milder “market alignment” assumption—that the doctor’s utility lies in the convex hull of the providers’ utilities—we show that in Nash equilibrium of this competition, the doctor can achieve the same outcomes as if a perfectly aligned provider were present.
Based on joint work: Tractable Agreement Protocols (STOC’25), Collaborative Prediction (SODA’26), and Emergent Alignment via Competition (in submission).
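The guarantees above are distribution-free; as a much simpler point of reference, the sketch below shows the textbook special case in which two calibrated, conditionally independent forecasts are pooled by adding log-odds, which is already enough to beat either forecaster alone. All modeling choices here are assumptions for illustration, not the protocols from these papers.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
theta = rng.integers(0, 2, n)            # ground truth (uniform prior)

def expert_posterior(accuracy):
    """Each expert sees an independent signal equal to theta with the
    given probability and reports the exact Bayesian posterior."""
    signal = np.where(rng.random(n) < accuracy, theta, 1 - theta)
    return np.where(signal == 1, accuracy, 1 - accuracy)

logit = lambda p: np.log(p / (1 - p))
sigmoid = lambda z: 1 / (1 + np.exp(-z))

p_doctor, p_ai = expert_posterior(0.7), expert_posterior(0.8)
# Conditionally independent evidence combines by summing log-odds
# (the uniform prior contributes log-odds zero).
p_joint = sigmoid(logit(p_doctor) + logit(p_ai))

brier = lambda p: np.mean((p - theta) ** 2)
print(f"doctor: {brier(p_doctor):.3f}  AI: {brier(p_ai):.3f}  joint: {brier(p_joint):.3f}")
```

The joint forecast attains a lower Brier score than either expert; the work presented in the talk shows what can still be guaranteed when no such independence or shared-prior structure is assumed.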
Cynthia Dwork
Harvard University
Equitable Evaluation via Elicitation
Individuals with similar qualifications and skills may vary in their demeanor, or outward manner: some tend toward self-promotion while others are modest to the point of omitting crucial information. Equitable evaluation based on the self-descriptions of equally qualified job-seekers with different self-presentation styles is therefore problematic.
We build an interactive AI for skill elicitation that provides accurate determination of skills while simultaneously allowing individuals to speak in their own voice. Such a system can be deployed, for example, when a new user joins a professional networking platform, or when matching employees to needs during a company reorganization. To obtain sufficient training data, we use LLMs to act as synthetic humans.
Elicitation mitigates endogenous bias arising from individuals’ own self‑reports. To address systematic model bias we enforce a mathematically rigorous notion of equitability ensuring that the covariance between self-presentation manner and skill evaluation error is small.
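A minimal sketch of how that covariance condition might be audited empirically; the variables, the self-promotion score, and the plug-in estimator below are hypothetical illustrations, not the paper's construction.

```python
import numpy as np

def style_error_covariance(manner, true_skill, estimated_skill):
    """Empirical covariance between self-presentation manner and
    evaluation error; equitability asks that this be close to zero."""
    error = np.asarray(estimated_skill) - np.asarray(true_skill)
    return float(np.cov(np.asarray(manner), error)[0, 1])

rng = np.random.default_rng(2)
skill = rng.normal(0.0, 1.0, 1000)
manner = rng.normal(0.0, 1.0, 1000)              # style, independent of skill
naive = skill + 0.5 * manner                     # evaluator swayed by self-promotion
elicited = skill + rng.normal(0.0, 0.3, 1000)    # style-blind but noisy estimate

print(style_error_covariance(manner, skill, naive))     # ~0.5: inequitable
print(style_error_covariance(manner, skill, elicited))  # ~0.0: satisfies the bound
```

The contrast makes the point of the condition: an evaluator may be noisy, but its errors must not systematically track how confidently people describe themselves.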
Noémie Elhadad
Columbia University
Foundation Models in Medicine and Healthcare
Large language models have rapidly emerged as a transformative force in medicine and have inspired a range of applications now deployed at the point of patient care. Despite their growing adoption in clinical practice and research, our understanding of their capabilities remains incomplete—particularly with respect to their impact on patient outcomes and clinician wellbeing. Moreover, there is increasing recognition that current LLMs exhibit fundamental limitations in domains such as complex reasoning and causal inference.
In this presentation, Noémie Elhadad will outline current research directions in foundation models, beyond conventional large language models, that are tailored to the unique structure of clinical and electronic health record data and that integrate both empirical observations and the underlying principles of human physiology.
Allison Koenecke
Cornell Tech
Algorithmic Decisions in the SNAP Benefits Pipeline
America’s Supplemental Nutrition Assistance Program (SNAP), formerly known as food stamps, helps low-income households buy nutritious food. Social workers provide pivotal support at many points of the SNAP pipeline: from spreading awareness of the program, to helping applicants determine eligibility and fill out forms, to advising on changes to benefits. In this talk, I describe two projects that ask how algorithms can play a role in easing social worker burdens without perpetuating algorithmic biases. First, we study biases in online advertising for SNAP benefits, arising from the cost differentials between English-speaking and Spanish-speaking ad recipients. We propose a methodological framework for advertisers to determine a demographically equitable allocation for ads, and find broad consensus across political identities for some degree of equity over pure efficiency in this context. Second, we study the efficiency-bias tradeoffs of a chatbot for assisting social workers in answering SNAP eligibility questions. In a randomized experiment varying the quality of LLM-generated responses, we find that social workers with access to LLM suggestions perform significantly better than those without; however, while human accuracy increases with LLM response accuracy, improvement plateaus—likely due to an LLM-distrust effect. Taken together, these studies can inform government service providers on procurement strategies in the AI age while ameliorating administrative burdens and biases.
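As a toy rendering of the efficiency-equity tension in ad allocation (the per-impression costs, two-group setup, and equity penalty below are assumptions, not the paper's framework), the sketch sweeps budget splits and shows how the optimum jumps from the all-cheap-group allocation to a balanced one as equity gains weight.

```python
import numpy as np

BUDGET = 1000.0
COST = {"english": 2.0, "spanish": 5.0}   # assumed cost per ad impression

def best_split(equity_weight):
    """Maximize total reach + weight * equity over budget splits,
    where equity penalizes the gap between per-group reach."""
    best = (-np.inf, None)
    for share in np.linspace(0.0, 1.0, 1001):   # fraction spent on English ads
        reach_e = share * BUDGET / COST["english"]
        reach_s = (1 - share) * BUDGET / COST["spanish"]
        score = (reach_e + reach_s) - equity_weight * abs(reach_e - reach_s)
        if score > best[0]:
            best = (score, (share, reach_e, reach_s))
    return best[1]

for w in (0.0, 0.2, 1.0):
    share, re_, rs = best_split(w)
    print(f"weight {w}: english share {share:.2f}, reach ({re_:.0f}, {rs:.0f})")
```

In this toy model, the survey finding reported in the talk, broad support for some degree of equity over pure efficiency, corresponds to respondents preferring a strictly positive equity weight.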
Frauke Kreuter
LMU Munich and University of Maryland
Adaptive Integrity of AI Models Aligned with Society
Consider disparate cases: a teenager asking an AI assistant whether it is “okay to have sex before marriage,” or a research institution using AI to test the cultural sensitivity of new employment surveys. In both scenarios, these systems’ responses implicitly prioritize certain cultural values over others. Large language models (LLMs) reproduce common viewpoints from training data while failing to represent population diversity or value change. As LLMs shape the most value-laden areas of our lives, this risks profound individual and societal harm. Social scientists have developed population-level datasets for studying diverse attitudes and behaviors, yet alignment methods remain static, opaque, and disconnected from this resource.
This talk introduces the idea of adaptive alignment: a novel approach to improving generative AI that represents diverse societal values and normative principles while acknowledging trade-offs and conflicts. With the aim of creating dynamic, living benchmarks, we propose a method to leverage decades of carefully selected social science data currently unused in LLMs due to privacy and format challenges. This talk will present pilots in public health and labor, domains with contested values and cross-cultural variation. The goal of this presentation is to foster an interdisciplinary discussion on usefulness, scalability, and challenges of this approach.
Charlotte Peale
Stanford University
Uncertainty Quantification Beyond Calibration
View Slides (PDF)

Calibration has emerged as a standard approach to uncertainty quantification, providing valuable insights into model reliability. However, for modern machine learning, calibration exhibits two fundamental limitations that restrict its utility:
- Inability to decompose uncertainty: Calibration fails to distinguish between epistemic uncertainty (model-based) and aleatoric uncertainty (data-based). This distinction is vital for understanding prediction errors, especially in complex, subjective tasks (e.g., language modeling), and for determining whether collecting more data could improve performance.
- Suboptimal error prediction: The uncertainty estimates produced by a calibrated model can be substantially worse than those derived from externally trained models specifically designed to predict a model’s error. This gap suggests that stronger notions of uncertainty quantification performance are required to guarantee a model’s ability to accurately self-assess its limitations.
This talk will overview two research contributions that address these fundamental limitations. First, we introduce higher-order calibration, a rigorous theoretical foundation for decomposing a model’s total uncertainty into its aleatoric and epistemic components, with formal guarantees relating the decomposition to real-world data distributions. We demonstrate the practical utility of this decomposition in uncertainty-aware model routing, where estimates are used to efficiently route queries to small models, larger models, or human experts. Second, we establish an equivalence between a model’s level of multicalibration and its competitiveness with externally-trained loss predictors. This equivalence reveals the precise conditions under which models can—or cannot—accurately assess their own limitations.
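As a small numerical illustration of the decomposition (the ensemble-style second-order prediction below is an assumed toy example; higher-order calibration itself is a stronger requirement than this identity):

```python
import numpy as np

# A second-order prediction: a distribution over Bernoulli parameters p,
# here represented by an (assumed) ensemble of per-model probabilities.
ps = np.array([0.2, 0.3, 0.7, 0.8])
w = np.full(len(ps), 1 / len(ps))

mean_p = np.sum(w * ps)                         # the usual first-order prediction
aleatoric = np.sum(w * ps * (1 - ps))           # E[p(1-p)]: irreducible noise
epistemic = np.sum(w * (ps - mean_p) ** 2)      # Var(p): model uncertainty
total = mean_p * (1 - mean_p)                   # Var(Y) under the mixture

print(f"total {total:.3f} = aleatoric {aleatoric:.3f} + epistemic {epistemic:.3f}")
# 0.250 = 0.185 + 0.065: here most of the uncertainty is aleatoric, so routing
# the query to a larger model would help less than the raw 0.5 prediction suggests.
```

Splits like this are what drive the uncertainty-aware routing application: queries whose uncertainty is mostly epistemic are the ones worth escalating to larger models or human experts.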
-
Watch a playlist of all presentations from this meeting here.