Greg Bryan, Columbia University
The Simons Collaboration on Learning the Universe has made considerable progress in its second year, developing a new framework to create accelerated forward models of cosmological structure formation and to use these techniques, along with implicit and explicit descriptions of the likelihood, to infer the initial state of the observable universe.
In detail, this work has progressed on many fronts. Collaboration scientists are developing, implementing and testing subgrid models for star formation and feedback while also laying the foundations for a deeper understanding of black hole accretion, dynamics and feedback.
Collaboration members have also made progress on accelerated forward models for both dark matter and baryons, while exploring new ways to carry out inference in high-dimensional parameter spaces. However, one of the key aspects of the last year is the push to generate a demonstration platform for our methodology using real data, based on the LtU Express idea that came out of last year’s collaboration gathering.
The 2023 annual meeting brought together collaboration members to hear about and discuss these advances, as well as plan for next year’s work.
The Learning the Universe Collaboration’s second annual meeting, held in September 2023 in New York City, brought together 75 researchers from a dozen institutions across the globe. We met to discuss our progress, challenges and plans after two years of collaboration work. The second year of the Learning the Universe collaboration has seen an explosion of work and connections across all of our scientific focus areas. The collaboration has settled into a set of eight working groups, each of which encompasses a set of closely intertwined projects. During the meeting, we heard summaries and plans from each of these groups (see abstracts for more details of each group).
We had many results to discuss, but in this report we focus on three milestones, each of which required contributions from different working groups and could not have been accomplished by any group alone, confirming the key role played by Simons Foundation collaboration funding.
The first achievement is the first development and implementation of a “sub-grid” model for galaxy formation in a large-scale cosmological simulation that is based on accurate small-scale models. The physics of star formation and its associated feedback on the rest of the galaxy through radiation from young stars and supernovae requires very high-resolution numerical simulations to resolve accurately. Including it in large-scale cosmological simulations therefore requires a sub-grid treatment, which in the past has relied on ad hoc models and free parameters, resulting in uncertainty in cosmological constraints. The first of the new physically based models, the “Arkenstone” wind model, has now been developed from high-resolution small-scale simulations for inclusion in the next generation of cosmological runs that we are planning.
The second important milestone is the creation of an end-to-end pipeline for implicit-likelihood inference that goes from cosmological parameters all the way to observations, and its first application to large-scale data (which we call the “Go Big” initiative). This tool gives us the ability, for the first time, to make full use of the constraining power of the observations, since we no longer need to write down the full form of the likelihood for the forward models we will use. The pipeline is modular and flexible and will be able to include the next generation of forward models and observational probes that the rest of the collaboration is developing.
The third key outcome is the creation of the largest and most accurate reconstruction of the initial conditions of the observable universe carried out to date. This effort, spearheaded by the BORG Explicit Likelihood working group, has involved new forward modeling and inference techniques developed by other members of the collaboration. This map of the initial state of the universe marks a significant achievement that is already being used to advance cosmological science in other areas.
These three achievements are just some of the remarkable science that the collaboration is doing and are a down payment on the improved tools for cosmology, enhanced galactic models and new inference and AI techniques that our members are hard at work on. The annual meeting was preceded by a three-day focused meeting of the working groups, held largely in parallel to push forward the collaborative efforts and make concrete plans for each of the focus areas. In addition, we laid out a clear set of timelines and deliverables for the coming years, which promise to change the way cosmological inference and galaxy formation modeling are done.
9:30 AM Volker Springel | Interfacing Galaxy Formation with Precision Cosmology
11:00 AM Rachel Somerville | Creating Synthetic Observations for Learning the Universe
1:00 PM Jens Jasche | BORG: Inferring the Three-Dimensional Initial Conditions of the Universe
2:30 PM Eve Ostriker | Star Formation, the Interstellar Medium, and Galactic Winds: Developing Physics-Based Subgrid Models and Interfacing with Cosmological Simulations
4:00 PM Pablo Lemos | Accelerated Forward Models of the Universe
9:30 AM Shy Genel | An Expanding Set of Cosmological Simulations for Training and Testing LtU Machines
11:00 AM Ben Wandelt | Towards the Origins of the Universe with Galaxy Surveys and CMB Data
1:00 PM Greg Bryan | Learning the Universe after Two Years -- Where We Are and Where We Are Going
Abstracts & Slides
Learning the Universe after Two Years — Where We Are and Where We Are Going
In this talk, Greg Bryan will first provide an overview of the collaboration efforts on modeling black holes in cosmological simulations, focusing on their accretion, feedback and dynamics. Bryan will survey the LtU projects that explore various aspects of this program, highlighting recent successes and challenges in understanding this multiscale problem. Bryan will emphasize a collaboration-wide effort to develop a subgrid model that evolves the black hole global properties (mass, spin, etc.) as a function of properties that can be measured in cosmological simulations at the resolution we can afford to carry out. As the final talk in this series, Bryan will then review the current status of the Learning the Universe collaboration in terms of our original collaboration goals and discuss both how much we have accomplished and how much more we need to do.
An Expanding Set of Cosmological Simulations for Training and Testing LtU Machines
In this presentation, Shy Genel will review the work done over the past year to expand the suite of cosmological simulations that can serve as a basis for generating training sets for various parts of the LtU collaboration. The expansions cover a larger number of simulation codes, larger galaxy formation model parameter spaces and larger halo masses, thereby pushing towards the limit of modeling galaxy distributions and CMB secondary anisotropies while fully taking into account the potential effects of uncertain baryonic physics. Genel will further discuss ongoing and upcoming simulation campaigns as well as opportunities that lie in deepening connections with different LtU working groups.
BORG: Inferring the Three-Dimensional Initial Conditions of the Universe
The primordial initial conditions and the laws of nature uniquely determine the phenomenology of observed cosmic structures. While the initial 3D phase distribution determines the specific configuration of matter, orientations of galaxy spins and velocity directions, cosmological parameters impact the average magnitude of effects like clustering strength and amplitudes of velocities and angular momentum. Despite tremendous progress in cosmology during recent decades, a joint analysis of phases and cosmological parameters that would result in a detailed and comprehensive model of the universe and the specific causal formation history of its structure is still lacking. Constructing such physically plausible digital twins of our universe from data is the driving motivation behind the BORG working group of the Simons Collaboration on Learning the Universe.
Leveraging sophisticated nonlinear physics-based structure formation models, our project endeavors to accurately reconstruct the initial conditions and trace the evolutionary trajectories of cosmic structures. Specifically, we pursue a Bayesian physical forward modeling approach to match cosmological observations at the field level and describe the universe on an object-by-object basis.
In this talk, Jens Jasche will describe a significant milestone in this endeavor. The project aims to provide the largest, most detailed and most comprehensive inference to date of three-dimensional cosmic initial conditions and dynamics of cosmic structures, over an unprecedented cosmological volume of (6 Gpc)^3. The project uses the existing gold-standard galaxy catalogs, including the 2M++, SDSS main galaxy, BOSS CMASS and LOWZ galaxy samples, covering a redshift range of 0 < z < 0.8. While traditional cosmological simulations only reproduce random realizations of the universe, our project constructs a plausible universe representation mirroring the observed matter distribution, enabling an in-depth field-level comparison of cosmic structures. Besides providing a stringent test for the ΛCDM paradigm, this comprehensive analysis facilitates cross-correlations, particularly with the cosmic microwave background, and tests predictions for physics phenomena such as neutrinos and dark matter.
Jasche will further present extensions of our field-level inference approach to analyzing the dynamic large-scale structure with next-generation transient observations. Significant challenges in field-level inference include the need for accurate data models for the observed galaxy distribution and for sufficiently fast physics models to scale the analysis to larger volumes with higher accuracy. To address these challenges, Jasche and collaborators are developing deep learning models to overcome the galaxy biasing issue at non-linear scales, and Jasche will showcase the promise of neural emulators of cosmological simulations and their performance gains for field-level inference. In summary, the current state of the project opens a promising and exciting avenue forward to bridge the gap between theoretical predictions and observational data using Bayesian inference and physical posterior simulations, enabling a deeper understanding of structure formation in the universe.
MILA & Université de Montréal
Accelerated Forward Models of the Universe
Pablo Lemos will present updates from the Accelerated Forward Modelling group. Lemos will begin with the first working forward model, the SimBIG model, describing the model and showing its first results. Lemos will then present multiple projects that aim to improve and accelerate the SimBIG model, with the goal of generating fast, reliable forward models of the Universe.
Star Formation, the Interstellar Medium and Galactic Winds: Developing Physics-Based Subgrid Models and Interfacing with Cosmological Simulations
In this talk, Eve Ostriker will review the past year’s accomplishments and the coming year’s plans for the working group (WG) that focuses on Star Formation (SF), the Interstellar Medium (ISM) and Galactic Winds (GW). Within LtU, the main responsibilities of this WG are to develop, test and deploy new subgrid models that are crucial ingredients for galaxy formation modeling in a cosmological context. The directly observable properties of galaxies from which we infer cosmological quantities are primarily their stellar contents, and the processes that directly regulate SF to produce galaxies’ stellar content take place on scales (~parsec) that are extremely small relative to cosmologically observable volumes, and a factor of ~10^4 smaller than galaxies themselves. It is therefore not possible to directly resolve the physics of SF within cosmological simulations. In real galaxies, the gas content and state are affected by SF through “feedback”: the return of energy and the products of nucleosynthesis (in shorthand, “metals”) by stars to their environment, primarily via stellar radiation and supernova explosions, where the latter are also responsible for the generation of cosmic rays via shock acceleration. Feedback sets the ambient ISM pressure, limiting gravitational collapse, and also produces high-velocity, multiphase winds that escape the galaxy and may contain amounts of material comparable to or greater than that locked up in stars.
In cosmological simulations, since it is not possible to follow these processes directly, it is necessary to apply subgrid models to treat both SF and GW, but traditionally these have relied on a combination of ad hoc prescriptions and observational tuning. To advance the frontier of galaxy formation modeling, it is necessary to replace traditional approaches with physics-based subgrid models. With high-resolution numerical simulations that directly follow gravitational collapse and feedback processes, members of our WG are studying and characterizing the ISM, SF and GW. The simulations cover a wide variety of conditions, representing the range of properties that have prevailed in galaxies over the history of the universe. For the models, we are using a range of computational codes with differing algorithmic implementations and domains that range in size from galactic patches to whole galaxies. The results of the simulations have motivated new mathematical representations of thermal and kinetic energy distributions in GW and new approaches to the effective equation of state and rate determination of SF, and enabled the calibration of dependence on parameters that are resolution-insensitive and hence accessible in cosmological simulations. Ostriker and collaborators have completed an initial comparison between results employing traditional and new subgrid models for SF in cosmological simulations. They have also completed tests that confirm the robustness of their wind results across different computational platforms and domains. Finally, Ostriker and collaborators have completed and tested a pathfinder implementation of a new algorithm for multiphase wind launching and propagation in cosmological simulations. They have begun collaborating with the Cosmological Simulation WG on implementing new subgrid models, and looking ahead, they are very excited to see the new predictions of cosmic star formation history that will emerge from their joint work.
Creating Synthetic Observations for Learning the Universe
It is critical to understand how observable tracers are related to underlying physical quantities, and how observational selection effects impact our attempts to constrain cosmological and astrophysical parameters from data. Rachel Somerville will discuss the progress made by the Synthetic Observations working group in creating mock galaxy surveys and mock CMB skies.
Max Planck Institute for Astrophysics
Interfacing Galaxy Formation with Precision Cosmology
Cosmological inference with large galaxy surveys requires theoretical models that combine precise predictions for large-scale structure with robust and flexible galaxy formation modeling throughout a sufficiently large cosmic volume. The MillenniumTNG (MTNG) simulations make progress on this goal by combining the hydrodynamical galaxy formation model of IllustrisTNG with the large volume of the Millennium simulation. High time-resolution merger trees and direct lightcone outputs facilitate the construction of a new generation of semi-analytic galaxy formation models that can be calibrated against both the hydro simulation and observations, and then be applied to even larger volumes, as done in this project for a calculation with 1.1 trillion dark matter particles and massive neutrinos in a volume 3,000 Mpc across.
In his talk, Volker Springel will describe the methodology and selected results of MTNG, as well as the current developments within LtU to realize a next generation of cosmological simulations that aim to improve the physical fidelity of the modeling of baryonic processes related to galaxy formation.
Institut d’Astrophysique de Paris
Towards the Origins of the Universe with Galaxy Surveys and CMB Data
The Learning the Universe collaboration has achieved a series of advances that pave the way to determining the initial conditions of the cosmos. These methods extract optimal information from cosmological data, massively accelerate simulations, and perform full forward-modeling Bayesian inference that was previously intractable. Using advanced neural density estimation methods, complex multivariate posterior distributions can now be captured orders of magnitude faster than with traditional Markov Chain Monte Carlo approaches. The suite of techniques has successfully reconstructed spatial matter fluctuations in the early universe from simulated late-time observations. These techniques have also allowed Ben Wandelt and collaborators to calibrate hydrodynamical simulations against observed galaxy demographics with unprecedented fidelity. Importantly, they have invented new ways to validate and compare computational models. The breakthroughs underscore the remarkable potential of synergistically combining cosmological modeling with cutting-edge techniques from machine learning, statistical inference and high-performance computing. Wandelt and collaborators are charting a clear path to insights into the origins and evolution of the universe as they advance towards extracting joint constraints from galaxy survey and CMB data.