Simons Collaboration on the Theory of Algorithmic Fairness Annual Meeting 2022
Organizer:
Omer Reingold, Stanford University
Meeting Goals:
The first annual meeting of the Simons Collaboration on the Theory of Algorithmic Fairness will bring together participants of the collaboration as well as others working in algorithmic fairness, theoretical computer science and other relevant fields. The meeting will expose participants to the unique challenges and opportunities of this field, an area of research that is intrinsically complex, subtle and multidisciplinary. No model or definition can fully capture such complexities, and partial models and solutions carry the risk of legitimizing discrimination; talks will highlight some of these challenges.
The rising importance of this area and the significant attention given to the unfairness of algorithms provide opportunities for exceptional societal impact, which is rare in theoretical research, while the early state of the field offers the prospect of intellectual impact. The meeting will expose participants to some of the ways in which both sorts of impact have started to materialize and will explore avenues for additional impact.
Speakers:
Noam Barda, Ben-Gurion University
Noa Dagan, Ben-Gurion University and Clalit Research Institute
Avrim Blum, Toyota Technological Institute at Chicago
Cynthia Dwork, Harvard University
Shafi Goldwasser, Simons Institute for the Theory of Computing
Michael Kim, UC Berkeley
Irene Lo, Stanford University
Jamie Morgenstern, University of Washington
Helen Nissenbaum, Cornell Tech
Avrim Blum
Toyota Technological Institute at Chicago
On Learning in the Presence of Biased Data and Strategic Behavior
In this talk Blum will discuss two lines of work involving learning in the presence of biased data and strategic behavior. In the first, he will ask whether fairness constraints on learning algorithms can actually improve the accuracy of the classifier produced when training data is unrepresentative or corrupted due to bias. Typically, fairness constraints are analyzed as a tradeoff with classical objectives such as accuracy. Blum’s results show there are natural scenarios that can be win-win, helping to improve overall accuracy. In the second line of work, he will consider strategic classification: settings where the entities being measured and classified wish to be classified as positive (e.g., college admissions) and will try to modify their observable features if possible to make that happen. Blum will consider this in the online setting, where a particular challenge is that updates made by the learning algorithm will change how the inputs behave as well.
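A toy simulation, in the spirit of this line of work but not reproducing Blum's actual models, illustrates the win-win: when one group's positive examples are systematically under-labeled in the training data, unconstrained per-group empirical risk minimization learns a badly skewed rule for that group, while forcing a single shared decision rule (a crude stand-in for a fairness constraint) recovers the correct classifier for both groups. All numbers and names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

def sample_group(n, flip_pos_prob):
    """Feature x ~ U(-1, 1); true label is 1[x > 0]; a fraction of the positives
    is mislabeled as negative in the training data (labeling bias)."""
    x = rng.uniform(-1, 1, n)
    y_true = (x > 0).astype(int)
    y_obs = y_true.copy()
    y_obs[(rng.uniform(size=n) < flip_pos_prob) & (y_true == 1)] = 0
    return x, y_true, y_obs

xA, yA, obsA = sample_group(n, flip_pos_prob=0.0)   # group A: clean labels
xB, yB, obsB = sample_group(n, flip_pos_prob=0.7)   # group B: positives under-labeled

ts = np.linspace(-1, 1, 201)
train_err = lambda x, obs, t: np.mean((x > t).astype(int) != obs)

# Unconstrained ERM: pick each group's threshold to fit the (biased) training labels.
tA = ts[np.argmin([train_err(xA, obsA, t) for t in ts])]
tB = ts[np.argmin([train_err(xB, obsB, t) for t in ts])]

# "Constrained" ERM: force one shared threshold, which here (identical feature
# distributions) also equalizes the groups' positive-prediction rates.
t_shared = ts[np.argmin([train_err(xA, obsA, t) + train_err(xB, obsB, t) for t in ts])]

true_acc = lambda x, y, t: np.mean((x > t).astype(int) == y)
print("unconstrained:", tA, tB, true_acc(xA, yA, tA), true_acc(xB, yB, tB))
print("shared:       ", t_shared, true_acc(xA, yA, t_shared), true_acc(xB, yB, t_shared))
```

With these parameters, the unconstrained learner drives group B's threshold toward 1 (labeling almost everyone negative), while the shared threshold stays near 0 and both groups' true accuracy approaches 1.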
Noam Barda
Ben-Gurion University
Noa Dagan
Ben-Gurion University and Clalit Research Institute
Analytic tasks of prediction and causal inference in health care often focus on average effects across entire study populations. These estimates ignore the heterogeneity that exists across subgroups, which can result in erroneous decisions for those subgroups. This talk will focus on experience with improving fairness toward subgroups at Clalit Health Services, a large health care organization in Israel.
Barda and Dagan will focus first on prediction and describe the application and performance of an algorithm for improving subpopulation calibration in predictive models. The algorithm is meant to address concerns regarding potential unfairness of the models toward groups that are underrepresented in the training data set and thus might receive uncalibrated scores. The algorithm was implemented on widely used risk models, including the ACC/AHA 2013 model for cardiovascular disease and the FRAX model for osteoporotic fractures. It also played a major role in the development of a COVID-19 mortality risk prediction model at a time when individual-level data on COVID-19 patients was not yet available in Israel. The development process for this predictor will be presented. The resulting predictor was widely used within Clalit Health Services for prevention, with the intention of notifying high-risk members of their increased risk of mortality should they become infected, for prioritizing COVID-19 RT-PCR tests and for treatment decisions in confirmed cases.
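As an illustration only (the algorithm actually deployed at Clalit is described in the speakers' papers and is not reproduced here), one generic way to improve subpopulation calibration is to post-process an existing risk score with a separate recalibration model per subgroup. The sketch below uses per-group Platt scaling on the base score's logit; all function and variable names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def recalibrate_by_subgroup(base_score, y, subgroup):
    """Fit one logistic ("Platt") recalibration of an existing risk score per subgroup.
    Assumes every subgroup in the training data contains both outcome classes.
    Returns a function mapping (score, subgroup) -> recalibrated score."""
    eps = 1e-6
    to_logit = lambda s: np.log(np.clip(s, eps, 1 - eps) / (1 - np.clip(s, eps, 1 - eps)))
    models = {}
    for g in np.unique(subgroup):
        mask = subgroup == g
        models[g] = LogisticRegression().fit(
            to_logit(base_score[mask]).reshape(-1, 1), y[mask])

    def predict(score, group):
        out = np.empty(len(score), dtype=float)
        for g, m in models.items():
            mask = group == g
            if mask.any():
                out[mask] = m.predict_proba(to_logit(score[mask]).reshape(-1, 1))[:, 1]
        return out

    return predict
```

A real deployment additionally has to handle subgroups that are small, overlapping or absent from the training data, which is where the concerns about underrepresentation bite.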
They will then shift to causality. The effectiveness and safety of SARS-CoV-2 vaccines are a major concern during the COVID-19 pandemic. Evidence generated by randomized clinical trials (RCTs), on which regulatory bodies rely, usually targets mixed populations that are not representative of the composition of the target population for the vaccines. Barda and Dagan will discuss the importance and challenges of applying causal inference methodologies to post-marketing observational data to bridge this gap and generate valuable evidence for subgroups that are not well represented in RCTs, such as pregnant women, older individuals and those with multiple comorbidities. This subgroup-relevant clinical information is crucial for helping individuals within these subgroups make informed decisions.
This talk is based on several joint works with Guy Rothblum, Gal Yona, Uri Shalit, Eitan Bachmat, Marc Lipsitch, Miguel Hernán, Ben Reis and Ran Balicer, among others.
Cynthia Dwork
Harvard University
Scaffolding Sets
Predictors map individual instances in a population to the interval [0, 1]. For a collection C of subsets of a population, a predictor is multicalibrated with respect to C if it is simultaneously calibrated on each set in C. Dwork will initiate the study of the construction of scaffolding sets, a small collection S of sets with the property that multicalibration with respect to S ensures correctness, and not just calibration, of the predictor. Dwork and her collaborators’ approach is inspired by the wisdom that the intermediate layers of a neural net learn a highly structured and useful data representation.
This talk is based on joint work with Maya Burhanpurkar, Zhun Deng and Linjun Zhang.
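For context, the sketch below is a minimal in-sample version of the standard multicalibration post-processing loop: it repeatedly patches the predictor on any (set, prediction-bin) pair whose average prediction drifts more than alpha from the average outcome. The scaffolding-sets contribution, namely how to choose a small collection S so that this process also forces correctness, is the new result and is not reproduced here; all names below are illustrative.

```python
import numpy as np

def multicalibrate(p, y, sets, alpha=0.01, n_bins=10, max_iters=1000):
    """Iteratively patch predictor values until, on every set S in `sets` and every
    prediction bin, the mean prediction is within alpha of the mean outcome.
    `p`: initial scores in [0, 1]; `y`: binary outcomes; `sets`: boolean masks."""
    p = p.astype(float).copy()
    for _ in range(max_iters):
        updated = False
        for S in sets:
            bins = np.clip((p * n_bins).astype(int), 0, n_bins - 1)
            for b in range(n_bins):
                idx = S & (bins == b)
                if idx.sum() == 0:
                    continue
                gap = y[idx].mean() - p[idx].mean()
                if abs(gap) > alpha:          # miscalibrated on this (set, bin) pair
                    p[idx] = np.clip(p[idx] + gap, 0.0, 1.0)
                    updated = True
        if not updated:                        # multicalibrated w.r.t. all sets
            break
    return p
```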
Shafi Goldwasser
Simons Institute for the Theory of Computing
The Ability to Deny in an Online World
Plausible deniability seems like the ultimate get-out-of-jail-free card. But how can we make it work when it comes to digital information sent over a public network? Deniable encryption, defined by Canetti et al. (Crypto 1997), suggests a method by which the sender of encrypted messages can achieve such deniability. The idea is especially interesting in the context of electronic elections, where it could eliminate the threat of vote buying after a vote has been cast. However, their encryption scheme is not homomorphic, which is necessary for adding encrypted votes, and it achieves only an inverse-polynomial probability of successful denial unless the ciphertexts are allowed to be non-compact.
Goldwasser will present two new works on the subject.
1) With Agarwal and S. Mossel (Crypto 2021), we define and construct sender-deniable fully homomorphic encryption (FHE) based on the polynomial hardness of the Learning With Errors (LWE) assumption. Deniable FHE enables encrypted data stored in the cloud to be processed securely without decryption while maintaining deniability of the encrypted data, and it enables the prevention of vote buying in electronic voting schemes where encrypted votes can be tallied without decryption. The construction achieves compact ciphertexts independently of the level of deniability: both the size of the public key and the size of the ciphertexts are bounded by a fixed polynomial, independent of the detection probability achieved by the scheme. However, the offline running time of our encryption algorithm depends on the inverse of the detection probability, so the scheme falls short of simultaneously achieving compactness, negligible deniability and polynomial encryption time. The online running time is polynomial.
2) More recently, with Coladangelo and Vazirani, we show a sender-deniable encryption scheme, secure under the polynomial hardness of LWE, in which the encryption algorithm is quantum but the ciphertext is classical. This scheme is the first to achieve compactness, negligible deniability and polynomial encryption time simultaneously under LWE. Furthermore, it is possible to extend the scheme so that coercion in an election cannot take place even when the coercer is able to dictate all inputs to the deniable encryption algorithm prior to encryption.
Michael Kim
UC Berkeley
A key challenge in modern statistics is to ensure statistically valid inferences across diverse target populations from a fixed source of training data. Statistical techniques that guarantee this type of adaptability not only make statistical conclusions more robust to sampling bias but can also extend the benefits of evidence-based decision-making to communities that do not have the resources to collect high-quality data or to run computationally intensive estimation procedures on their own. In this talk, Kim will describe a surprising technical connection between this statistical inference problem and multicalibration, a technique developed in the context of algorithmic fairness.
Concretely, the approach derives a correspondence between the fairness goal “to protect subpopulations from miscalibrated predictions” and the statistical goal “to ensure unbiased estimates on downstream target populations.” Exploiting this connection, Kim derives a single-source estimator that provides inferences that are universally adaptable to any downstream target population. The performance of the estimator is comparable to the performance of propensity score reweighting, a widely used technique that explicitly models the underlying source-target shift for every target.
This is based on joint work with Shafi Goldwasser, Christoph Kern, Frauke Kreuter and Omer Reingold.
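The schematic comparison below uses assumed names (a labeled source sample `X_s, y_s`, unlabeled target covariates `X_t`, and a predictor `f` trained once on the source, ideally multicalibrated): propensity score reweighting must model the source-to-target shift afresh for every target, whereas the prediction-based estimate simply averages `f` over the target covariates. This is a sketch of the general idea, not the paper's exact estimator.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def propensity_weighted_estimate(X_s, y_s, X_t):
    """Classic reweighting: model P(target | x) from pooled covariates and weight
    source outcomes by the implied density ratio. Re-fit for every new target."""
    X = np.vstack([X_s, X_t])
    z = np.concatenate([np.zeros(len(X_s)), np.ones(len(X_t))])
    clf = LogisticRegression(max_iter=1000).fit(X, z)
    pi = clf.predict_proba(X_s)[:, 1]
    w = pi / (1 - pi)              # odds = density ratio up to a constant (normalized below)
    return np.average(y_s, weights=w)

def prediction_based_estimate(f, X_t):
    """Universal-adaptability-style estimate: average a predictor f, trained once on
    the source (and multicalibrated there), over the target covariates. No refit."""
    return f(X_t).mean()
```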
Irene Lo
Stanford University
Improving Diversity and Equity in San Francisco School Choice
More than 65 years after school segregation was ruled unconstitutional, public schools across the United States are resegregating along racial and socioeconomic lines. Many cities have attempted to disentangle school and neighborhood segregation and improve equitable access using policies for city-wide choice. However, these policies have largely not improved patterns of segregation and inequity. From 2018 to 2020, we worked with the San Francisco Unified School District (SFUSD) to design a new student assignment policy that meets the district's goals of diversity, predictability, proximity and equity of access. We used optimization techniques to augment and operationalize district-proposed policy concepts and compared the proposed policy to approaches typically suggested by the school choice literature. Using predictive choice models developed from historical choice data, we show that the policy can improve on the status quo with respect to all district goals, including decreasing the percentage of racial minorities in high-poverty schools from 29 percent to 11 percent. The policy design was enabled by close collaboration with SFUSD and was approved in 2020 for use starting in the 2024–25 school year.
This talk is based on joint work with Itai Ashlagi, Katherine Mentzer, and Max Allman.
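A stripped-down sketch of the kind of optimization involved (not SFUSD's actual zone-based policy or data; every number and name below is made up): assign students to schools to maximize preference subject to capacity and a simple diversity constraint, solved here as a fractional linear program with SciPy.

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance: n students, m schools. pref[i, j] is student i's utility for school j,
# cap[j] is capacity, disadv[i] flags a priority group. Constraint: at each school,
# non-priority students fill at most 60% of assigned seats (a stand-in for an
# equity-of-access constraint).
n, m = 6, 2
rng = np.random.default_rng(1)
pref = rng.uniform(size=(n, m))
cap = np.array([3, 3])
disadv = np.array([1, 1, 0, 0, 0, 1])

c = -pref.ravel()                                   # maximize total utility (x is row-major)

A_eq = np.zeros((n, n * m)); b_eq = np.ones(n)      # each student fully assigned
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0

A_ub, b_ub = [], []
for j in range(m):                                  # capacity at school j
    row = np.zeros(n * m); row[j::m] = 1.0
    A_ub.append(row); b_ub.append(cap[j])
for j in range(m):                                  # sum_i ((1 - disadv_i) - 0.6) x_ij <= 0
    row = np.zeros(n * m); row[j::m] = (1 - disadv) - 0.6
    A_ub.append(row); b_ub.append(0.0)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
print(res.x.reshape(n, m).round(2))                 # fractional assignment matrix
```

The actual design problem also involves predicted family choices, zone structure and political feasibility, which is why the collaboration with SFUSD was essential.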
Jamie Morgenstern
University of Washington
Many real-world ML problems violate the assumptions that (1) training data is generated i.i.d. from a single distribution and (2) the training and test distributions are the same. For example, many different hospitals in different locations, serving different populations, might collect data about their patients. These datasets would ideally be combined and used to train a model that performs well on many distributions. In this talk, Morgenstern will describe one approach to formalizing how to use data generated from multiple distributions, and for which test distributions we can guarantee this method will perform well.
This talk is based on joint work with Christopher Jung and Pranjal Awasthi.
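One common way to make "perform well on many distributions" concrete is a minimax objective over the data sources. The sketch below is an assumed formalization, not necessarily the one used in this work: logistic regression trained with gradient descent on the model and multiplicative-weights ascent on a distribution over sources, so training effort shifts toward whichever source currently has the highest loss.

```python
import numpy as np

def minimax_logreg(datasets, lr=0.1, eta=0.5, iters=2000):
    """Train a logistic regression to do well on the *worst* of several data sources.
    `datasets` is a list of (X, y) pairs with y in {0, 1}. Returns the weight vector
    and the final distribution q over sources (highest weight = hardest source)."""
    d = datasets[0][0].shape[1]
    w = np.zeros(d)
    q = np.ones(len(datasets)) / len(datasets)
    for _ in range(iters):
        grads, losses = [], []
        for X, y in datasets:
            p = 1 / (1 + np.exp(-X @ w))
            losses.append(-np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12)))
            grads.append(X.T @ (p - y) / len(y))
        q = q * np.exp(eta * np.array(losses))      # up-weight the worst-off source
        q /= q.sum()
        w -= lr * sum(qi * g for qi, g in zip(q, grads))
    return w, q
```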