2024 Simons Collaboration on the Theory of Algorithmic Fairness Annual Meeting

Organizers:
Omer Reingold, Stanford University

Speakers:
Ran Balicer, Clalit Research Institute and Ben-Gurion University
Parikshit Gopalan, Apple
Toni Pitassi, University of Toronto
Omer Reingold, Stanford University
Aaron Roth, University of Pennsylvania
Guy Rothblum, Apple and Weizmann Institute of Science
Jessica Sorrell, University of Pennsylvania
Ashia Wilson, MIT

Meeting Goals:
The third annual meeting of the Simons Collaboration on the Theory of Algorithmic Fairness served as an opportunity to reflect on the collaboration’s journey so far and the envisioned path ahead. Committed to establishing firm mathematical foundations for the emerging area of algorithmic fairness through the lens of computer science theory, the collaboration has made significant advances. These include foundational work, community building, and outreach to other communities within this highly multidisciplinary domain. This year’s meeting spotlighted selected threads of our research and explored focus areas for the future.

  • Meeting Report

    The third annual meeting of the Simons Collaboration on the Theory of Algorithmic Fairness was organized to reflect on the collaboration’s journey so far and the envisioned path ahead. As such, the meeting started with a survey talk by the collaboration’s director, Omer Reingold, who formulated the collaboration’s mission of establishing firm mathematical foundations for the emerging area of algorithmic fairness through the lens of computer science theory. Reingold discussed the pivotal role CS theory has played within this highly multidisciplinary area and the important advances carried out by the collaboration. One focus of the talk was multigroup fairness and the theory of multicalibration, which has grown dramatically through close collaboration between the PIs and is now the focus of a major research endeavor extending beyond the collaboration itself. The talk emphasized the new perspective on fairness, as well as paradigm shifts in major approaches in machine learning (loss minimization) and statistics (propensity scoring), and it covered the contributions of multicalibration to problems in complexity theory (such as stronger regularity lemmas and dense-model theorems). Reingold then discussed three additional research threads pursued by the collaboration. The first two — the work on algorithmic monoculture and the work on strategic learning — represent initial, yet powerful, efforts in formalizing and understanding major challenges. For example, the work on algorithmic monoculture studies the impact of a wide array of users employing the same algorithmic tools (e.g., CV screening), and how this affects fairness towards individuals and utility to society. Finally, Reingold discussed the breakthrough result by PI Lin on program obfuscation.
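
    For readers less familiar with the central notion, a rough formalization of multicalibration (notation ours, following the standard definition rather than the talk itself) is the following: a predictor $f$ is $\alpha$-multicalibrated with respect to a collection $\mathcal{C}$ of subpopulations if, for every $c \in \mathcal{C}$ and every value $v$ in the range of $f$,

    \[ \bigl|\, \mathbb{E}[\, y - f(x) \mid f(x) = v,\ x \in c \,] \,\bigr| \;\le\; \alpha . \]

    That is, $f$ must be calibrated not merely on average over the whole population but simultaneously on every subpopulation in $\mathcal{C}$ (setting aside technical caveats about rarely attained values $v$).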

    The following two talks went into more detail on the theory of multicalibration and its implications. Parikshit Gopalan gave a fascinating survey of the recent notion of omnipredictors, which grew out of the study of multicalibration. Omnipredictors offer a powerful alternative to the traditional loss-minimization paradigm: one learns a single predictor that can later be used to optimize any of a large collection of objectives. PI Roth’s talk drew on insights from multicalibration to give new meaning to trustworthy machine learning. On a technical level, Roth showed how to use specialized notions of multicalibration in extremely high-dimensional domains to give efficient algorithms for a variety of adversarial learning problems. The technique relies on making predictions that are statistically unbiased, subject to a modest number of conditioning events, in a way that gives downstream decision makers strong guarantees when acting optimally as if the predictions were correct.

    The next two talks, by PI Pitassi and by Jessica Sorrell, a collaboration postdoctoral fellow, discussed a new theory of replicable algorithms initiated in their work. Informally, a replicable learning algorithm is resilient to variations in its samples — with high probability, it returns the exact same output when run on different samples from the same underlying distribution. Pitassi discussed the new concept and its relation to notions of differential privacy. Sorrell discussed replicable learning in the highly motivated but challenging setting of reinforcement learning, presenting the first formal replicability results for control problems: algorithms that converge to the same policy with high probability, even when exploring stochastic environments.

    PI Rothblum described a research agenda he initiated on verifiable data science. The goal is to provide interactive proof systems that prove a complicated data analysis was performed correctly. While verifying that the algorithm itself was applied correctly falls squarely within the classical theory of verification, a challenge unique to data analysis is verifying that the algorithm relied on correct and representative data. Accordingly, Rothblum demonstrated the potential of systems that verify properties of distributions, where the verification is super-efficient in terms of data access, communication and runtime.

    The next talk was by Ran Balicer, an expert in public health with a remarkable record of academic and applied impact. Balicer gave a fascinating account of the growing crisis facing the healthcare industry, which is already having grave consequences; the urgency of wide-scale deployment of AI in healthcare; and the risks of such deployment, particularly in terms of fairness. Balicer discussed the impact of the collaboration on an organization-level self-regulation framework, in ongoing use when introducing AI-driven tools in Israel’s largest integrated healthcare organization. Balicer also raised additional real-world needs that merit the collaboration’s attention.

    Finally, Ashia Wilson, who is joining the collaboration, gave a thought-provoking talk about the fairness challenges in foundation models of generative AI, such as ChatGPT. Given the transformative nature of this technology, the collaboration will continue to study the unique challenges and opportunities discussed by Wilson.

  • Agenda

    Thursday

    9:30 AM  Omer Reingold | ToC for Fairness
    11:00 AM  Parikshit Gopalan | Omniprediction and Multigroup Fairness
    1:00 PM  Aaron Roth | What Should We Trust in "Trustworthy ML"?
    2:30 PM  Toni Pitassi | Replicability
    4:00 PM  Jessica Sorrell | Replicable Reinforcement Learning

    Friday

    9:30 AM  Guy Rothblum | Verifiable Data Science: Why and How
    11:00 AM  Ran Balicer | Algorithms and Fairness in Healthcare Practice
    1:00 PM  Ashia Wilson | Societal Challenges and Opportunities in Generative AI
  • Abstracts & Slides

    Ran Balicer
    Clalit Research Institute and Ben-Gurion University

    Algorithms and Fairness in Healthcare Practice
    View Slides (PDF)

    Healthcare organizations have watched the global race of recent years to develop ever more powerful AI tools with interest, hope and sometimes fear — fear of missing out, but also fear of introducing error and bias. While some medical-domain machine learning tools have become mainstream, most notably in the domain of imaging, adoption in most other potential use cases has largely lagged. Evidence of unfairness introduced by the use of decision support tools is now abundant in many domains, but less so in healthcare. In the absence of clear, concrete regulation for the use of AI in healthcare, or of relevant real-world case studies, healthcare organizations face a significant challenge as they consider frameworks for self-regulation and risk management of AI-powered solutions. Such frameworks can serve as an opportunity for confluence between the issues discussed by the fairness collaboration and the real-world needs of practitioners and healthcare organizations.

    In this talk, Ran Balicer will discuss the strategic view of a large healthcare organization when faced with the challenge of using AI-driven tools in practice — aims, priorities and risk management. He will provide case studies from practice at scale, early assessments of impact, and ongoing work to address issues of fairness and societal impact. Balicer will also present an implemented organization-level self-regulation framework, in ongoing use when introducing AI-driven tools in Israel’s largest integrated healthcare organization. This framework, heavily influenced by the work of the fairness collaboration and its annual meetings, will serve as a basis for discussing potential high-impact further work of the collaboration, as well as additional real-world needs that merit further attention in the future.
     

    Parikshit Gopalan
    Apple

    Omniprediction and Multigroup Fairness
    View Slides (PDF)

    Consider a scenario where we are learning a predictor whose predictions will be evaluated by their expected loss. What if we do not know the precise loss at the time of learning, beyond some generic properties like convexity? What if the same predictor will be used in several applications in the future, each with its own loss function? Can we learn predictors that have strong guarantees?

    This motivates the notion of omnipredictors: predictors with strong loss minimization guarantees across a broad family of loss functions, relative to a benchmark hypothesis class. Omniprediction turns out to be intimately connected to multigroup fairness notions such as multicalibration, and also to other topics like boosting, swap regret minimization, and the approximate rank of matrices. This talk will survey some recent work in this area, emphasizing these connections.
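
    As a hedged sketch of the formal notion (notation ours): a predictor $f$ is an $(\mathcal{L}, \mathcal{C})$-omnipredictor if, for every loss $\ell \in \mathcal{L}$, there is a simple post-processing function $k_\ell$ such that

    \[ \mathbb{E}[\, \ell(k_\ell(f(x)),\, y) \,] \;\le\; \min_{c \in \mathcal{C}} \mathbb{E}[\, \ell(c(x),\, y) \,] + \varepsilon . \]

    A single predictor, learned once, thus competes with the best hypothesis in $\mathcal{C}$ for each loss in $\mathcal{L}$ after only cheap post-processing of its predictions.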
     

    Toni Pitassi
    University of Toronto

    Replicability
    View Slides (PDF)

    Replicability is vital to ensuring scientific conclusions are reliable, but failures of replicability have been a major issue in nearly all scientific areas of study in recent decades. A key issue underlying the replicability crisis is the explosion of methods for data generation, screening, testing, and analysis, where, crucially, only the combinations producing the most significant results are reported. Such practices (e.g., p-hacking, data dredging) can lead to erroneous findings that appear to be significant but that don’t hold up when other researchers attempt to replicate them.

    In this talk, Toni Pitassi will initiate a theory of replicable algorithms. Informally, a replicable learning algorithm is resilient to variations in its samples — with high probability, it returns the exact same output when run on different samples from the same underlying distribution. Pitassi will begin by unpacking the definition, clarifying how randomness is instrumental in balancing accuracy and replicability. She will then demonstrate the utility of the concept by giving sample-efficient learning algorithms for a variety of problems, including standard statistical tasks and any problem that can be learned differentially privately.
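
    A hedged sketch of the definition (notation ours, matching the informal description above): an algorithm $A$ is $\rho$-replicable if, for independent samples $S, S'$ drawn from the same distribution $D$ and a shared internal random string $r$,

    \[ \Pr_{S, S' \sim D^n,\; r} \bigl[\, A(S; r) = A(S'; r) \,\bigr] \;\ge\; 1 - \rho . \]

    The shared randomness $r$ is the crucial device: it coordinates the two runs so that exact agreement on the output becomes achievable with high probability.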

    Finally, Pitassi will discuss the computational limitations imposed by enforcing replicability for machine-learning algorithms and argue that these constraints go hand in hand with the goals of differential privacy and generalization. She will conclude with a discussion of recent developments in replicability and open problems.
     

    Omer Reingold
    Stanford University

    ToC for Fairness
    View Slides (PDF)

    The Simons Collaboration on the Theory of Algorithmic Fairness aims at establishing firm mathematical foundations, through the lens of computer science (CS) theory, for algorithmic fairness. Omer Reingold will discuss the pivotal role CS theory has within this highly multidisciplinary area. Reingold will also consider the collaboration’s journey so far and the envisioned path ahead.
     

    Aaron Roth
    University of Pennsylvania

    What Should We Trust in “Trustworthy ML”?
    View Slides (PDF)

    “Trustworthy” machine learning has become a buzzword in recent years. But what exactly are the semantics of the promise that we are supposed to trust? In this talk, we will make a proposal, through the lens of downstream decision makers using machine learning predictions of payoff-relevant states: predictions are “trustworthy” if it is in the interests of the downstream decision makers to act as if the predictions are correct, as opposed to gaming the system in some way. We will find that this is a fruitful idea. For many kinds of downstream tasks, predictions of the payoff-relevant state that are statistically unbiased, subject to a modest number of conditioning events, suffice to give downstream decision makers strong guarantees when acting optimally as if the predictions were correct — and it is possible to efficiently produce predictions (even in adversarial environments!) that satisfy these bias properties. This methodology also yields an algorithm design principle that gives new, efficient algorithms for a variety of adversarial learning problems, including obtaining subsequence regret in online combinatorial optimization problems and extensive-form games, and obtaining sequential prediction sets for multiclass classification problems that have strong, conditional coverage guarantees — directly from a black-box prediction technology, avoiding the need to choose a “score function” as in conformal prediction.
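
    As a hedged sketch of the bias condition underlying these results (notation ours): the learner produces predictions $p_t$ of the payoff-relevant state $y_t$ such that, for every event $E$ in a fixed collection of conditioning events (which may depend on the predictions and on the decision maker's best responses),

    \[ \Bigl|\, \textstyle\sum_{t \le T} \mathbb{1}[E_t] \, (y_t - p_t) \,\Bigr| \;=\; o(T) , \]

    i.e., the predictions are statistically unbiased conditional on each event in the collection, which is what licenses a downstream decision maker to act as if they were correct.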

    This is joint work with Georgy Noarov, Ramya Ramalingam and Stephan Xie, and builds on many foundational works arising from the Simons Collaboration on the Theory of Algorithmic Fairness.
     

    Guy Rothblum
    Apple and Weizmann Institute of Science

    Verifiable Data Science: Why and How
    View Slides (PDF)

    With the explosive growth and impact of machine learning and data analysis algorithms, there are growing concerns that these algorithms might be corrupted. How can we guarantee that a complicated data analysis was performed correctly? Interactive proof systems, originally studied in the context of cryptography, are protocols that allow a weak verifier to verify that a complex computation was performed correctly. Guy Rothblum will survey a line of recent work, showing how to use such proof systems to verify the results of complex data analyses, where verification is super-efficient in terms of data access, communication and runtime, and generating the proof is not much more expensive than simply running the computation.
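
    As a hedged sketch of the guarantee such a protocol provides (standard interactive-proof conventions, not notation from the talk): for a property $\Pi$ of distributions, completeness requires that an honest prover convinces the verifier whenever the data distribution $D$ satisfies $\Pi$, while soundness requires that no cheating prover succeeds when $D$ is far from $\Pi$:

    \[ D \in \Pi \;\Rightarrow\; \Pr[\text{verifier accepts}] \ge \tfrac{2}{3}, \qquad D \text{ far from } \Pi \;\Rightarrow\; \Pr[\text{verifier accepts}] \le \tfrac{1}{3} \text{ for every prover}, \]

    with the verifier drawing far fewer samples from $D$ than would be needed to decide the property without help.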
     

    Jessica Sorrell
    University of Pennsylvania

    Replicable Reinforcement Learning
    View Slides (PDF)

    The replicability crisis in the social, behavioral and data sciences has led to the formulation of algorithmic frameworks for replicability — i.e., a requirement that an algorithm produce identical outputs (with high probability) when run on two different samples from the same underlying distribution. While the study of replicability is still in its infancy, provably replicable algorithms have been developed for many fundamental tasks in machine learning and statistics, including statistical query learning, the heavy hitters problem and distribution testing. In this talk, Jessica Sorrell will introduce the study of replicable reinforcement learning and discuss the first formal replicability results for control problems, giving algorithms that converge to the same policy with high probability, even when exploring stochastic environments.

    This talk is based on joint work with Eric Eaton, Marcel Hussing and Michael Kearns.
     

    Ashia Wilson
    MIT

    Societal Challenges and Opportunities in Generative AI
    View Slides (PDF)

    Ashia Wilson will outline both the promising opportunities and the critical challenges raised by the new generative AI technologies of current interest. Wilson will then turn the spotlight on the audience and delve into perspectives on where researchers should channel their efforts, particularly as they relate to aligning generative AI technologies with social values like equity and transparency. The primary aim of this presentation is to foster an interactive discussion.
