Record-Breaking Suite of Cosmic Simulations Aims to Identify Universe’s Parameters

The CAMELS project is the largest set of detailed cosmological simulations designed to train machine-learning algorithms.

**Simulating density:** A collection of gas density maps taken from 80 different simulations run by the CAMELS project using code taken from SIMBA. Each panel represents a region 120 million light-years square. F. Villaescusa-Navarro, D. Angles-Alcazar, S. Genel *et al.*/*Astrophysical Journal* 2021

A handful of numbers rule the universe. These parameters determine how the universe behaves and how it looks. They describe how much of the universe is matter, the curvature of space and the nature of the dark energy pulling the universe apart. Understanding these numbers is key to understanding the universe.

A new project led by researchers at the Flatiron Institute’s Center for Computational Astrophysics (CCA) in New York City aims to improve the estimation of those cosmic parameters drastically. The researchers are doing this by using machine learning to bridge the physics of the cosmos and the physics of individual galaxies. Dubbed the Cosmology and Astrophysics with MachinE Learning Simulations, or CAMELS, the project comprises a collection of thousands of cosmological simulations, each a cube measuring 120 million light-years on each side.

“This is a new way of doing cosmology — people have been trying to avoid it for years,” says CAMELS project co-leader Daniel Anglés-Alcázar, an associate research scientist at the CCA and an assistant professor at the University of Connecticut. “It requires combining the expertise of people that traditionally don’t talk to each other — in this case, the galaxy formation experts and the cosmologists.”

The plan is to feed the CAMELS simulations to machine-learning algorithms, which will then generate detailed full-scale universe simulations billions of light-years across. Those simulations will let scientists extract information across all scales of the universe, maximizing the scientific return of observational missions and increasing the chances of new discoveries. With 4,233 simulations, about half of them including the smaller-scale physics important for galaxies, CAMELS is the largest suite of detailed cosmological simulations designed to train machine-learning algorithms.

CAMELS combines these pieces in a single suite, and “machine learning glues it all together,” says project co-leader Francisco Villaescusa-Navarro, an associate research scholar at Princeton University.

Villaescusa-Navarro and Anglés-Alcázar lead the CAMELS project alongside CCA associate research scientist Shy Genel. The three researchers and their colleagues present the CAMELS project in a paper published July 7 in the Astrophysical Journal.

CAMELS is preparing for the slew of cutting-edge astronomical observatories that will come online in the coming years, including the Nancy Grace Roman Space Telescope, the Euclid spacecraft and the Simons Observatory. Those big-budget missions aim in part to improve estimations of cosmic and astrophysical parameters.

The traditional way to extract information from such studies, however, leaves a lot on the table. Scientists generate large-scale simulations of the universe with different values for the cosmic parameters. They then compare the simulations with observations to see which values best reflect reality. But these large-scale simulations purposefully ignore the smaller-scale processes that shape the evolution of galaxies, such as the effects of supermassive black holes. Such processes have been just too complex and computationally demanding to model at large scales. Leaving them out, though, limits how much information astrophysicists can glean from observations. That’s an omission CAMELS is remedying.

The top panels show the large-scale distribution of dark matter for thousands of simulations performed with the IllustrisTNG (left) and SIMBA (right) galaxy formation models as part of the CAMELS project. The bottom panels compare the distributions of dark matter, galaxies (and their stars), gas density and gas temperature for one representative simulation as performed by each model with the same initial conditions. F. Villaescusa-Navarro, D. Angles-Alcazar, S. Genel *et al.*/*Astrophysical Journal* 2021

In the new paper, the researchers focus on two cosmic parameters: what fraction of the universe is matter and how evenly mass is distributed throughout the cosmos. CAMELS can also handle other cosmological parameters, such as the total mass of neutrinos, the lightest particles in nature, but the researchers first wanted “an illustrative demonstration that the method works,” says Villaescusa-Navarro.

In the CAMELS suite, the values of the two chosen cosmological parameters differ from simulation to simulation. This variation provides examples of how the universe might look under different conditions, which scientists can then compare to the real, observable universe.
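As a rough illustration of that comparison, the Python sketch below stands in a made-up formula for a full simulation, evaluates it over a grid of the two parameters and picks the pair that best matches a mock observation. Everything in it is an assumption for illustration: `toy_simulation`, its formula, the parameter ranges, the noise level and the regular grid are hypothetical and are not the CAMELS pipeline, which relies on real simulations and machine learning.

```python
import math
import random

# Illustrative parameter grids (assumed ranges, not taken from this article).
OMEGA_M_GRID = [round(0.10 + 0.02 * i, 2) for i in range(21)]  # matter fraction
SIGMA_8_GRID = [round(0.60 + 0.02 * i, 2) for i in range(21)]  # clumpiness of matter


def toy_simulation(omega_m, sigma_8, n_bins=8):
    """Hypothetical stand-in for a simulation: returns a made-up summary
    statistic in n_bins bins. The formula is not real physics."""
    return [sigma_8 ** 2 * math.exp(-omega_m * k) for k in range(n_bins)]


def chi_square(observed, predicted, noise):
    """Sum of squared residuals in units of the assumed measurement noise."""
    return sum(((o - p) / noise) ** 2 for o, p in zip(observed, predicted))


def best_fit(observed, noise):
    """Return the (omega_m, sigma_8) grid point whose toy prediction is
    closest to the observation."""
    candidates = ((om, s8) for om in OMEGA_M_GRID for s8 in SIGMA_8_GRID)
    return min(
        candidates,
        key=lambda p: chi_square(observed, toy_simulation(*p), noise),
    )


def mock_observation(omega_m, sigma_8, noise, seed=0):
    """Toy 'observation': the toy prediction plus Gaussian noise."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise) for v in toy_simulation(omega_m, sigma_8)]


if __name__ == "__main__":
    observed = mock_observation(0.30, 0.80, noise=0.005)
    print(best_fit(observed, noise=0.005))
```

The exhaustive grid search is only practical here because the toy "simulation" is a one-line formula. With real simulations that each take substantial supercomputer time, such a scan is not feasible, which is part of the motivation for training machine-learning models on a fixed suite of runs.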

Traditionally, most universe-sized cosmological simulations have considered only gravity. That choice makes sense, as gravity is by far the dominant force on the universe’s largest scales and therefore provides the best bang for the buck computationally. The smaller-scale hydrodynamic forces and other processes responsible for star formation and black hole behavior are left out. But those forces are essential: Supernova explosions and supermassive black holes can blast material across a galaxy, significantly changing the galaxy’s properties and the distribution of matter on even larger scales.

“Galaxy formation physics is ignored in these gravity-only cosmological models,” Anglés-Alcázar says. “For cosmologists, that’s noise. That’s what they don’t want. For others, like me, that’s what we love.”

Genel adds, “In addition to the cosmological parameters, each CAMELS simulation has its own unique set of parameters that control the behavior and interaction of stars and black holes with their small-scale environment. This diversity of galaxy formation physics, which is a unique feature of CAMELS, allows us to study the response of cosmic properties to the assumptions in our models in a much more systematic way than previously possible.”

The CAMELS team introduced hydrodynamic forces into their simulations using code taken from two previous projects, IllustrisTNG and Simba. The CAMELS team includes multiple members of both projects, with Genel a part of the core team of IllustrisTNG and Anglés-Alcázar on the team that developed Simba. Although both IllustrisTNG and Simba model galaxy formation, they approach the problem in different ways. The CAMELS project ran sets of simulations using each project’s code to capture a more complete range of predicted possibilities.

This video shows the evolution of dark matter density, gas density and gas temperature across multiple simulations from the CAMELS project. The simulations were run with code from either the IllustrisTNG (left) or SIMBA (right) project. F. Villaescusa-Navarro, D. Angles-Alcazar, S. Genel *et al.*/*Astrophysical Journal* 2021

The simulations required a herculean amount of computational muscle to run. Each CAMELS simulation models over 16 million dark matter particles (which interact only through gravity), with about half of the simulations also modeling over 16 million gas-like particles (which interact through gravity and hydrodynamic forces). It took months altogether for a supercomputer to churn out all the simulations.
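The "over 16 million" and "120 million light-years" figures can be checked with a few lines of arithmetic. The sketch below assumes the setup reported in the CAMELS paper rather than in this article: 256 particles along each side of the cube, a box side of 25 megaparsecs divided by the dimensionless Hubble parameter h, and h = 0.6711. The constant and function names are hypothetical.

```python
# Assumed configuration (from the CAMELS paper, not stated in this article).
PARTICLES_PER_SIDE = 256
BOX_SIDE_MPC_OVER_H = 25.0   # box side in units of megaparsecs / h
HUBBLE_H = 0.6711            # dimensionless Hubble parameter
LIGHT_YEARS_PER_MPC = 3.2616e6


def particle_count(per_side=PARTICLES_PER_SIDE):
    """Number of particles in a cube with per_side particles along each edge."""
    return per_side ** 3


def box_side_million_light_years(
    side_mpc_over_h=BOX_SIDE_MPC_OVER_H, h=HUBBLE_H
):
    """Convert the box side from Mpc/h to millions of light-years."""
    side_mpc = side_mpc_over_h / h
    return side_mpc * LIGHT_YEARS_PER_MPC / 1e6


if __name__ == "__main__":
    print(particle_count())                # dark matter particles per simulation
    print(box_side_million_light_years())  # box side, millions of light-years
```

Under these assumptions the particle count comes to 256 cubed, or 16,777,216, and the box side to roughly 121 million light-years, consistent with the rounded figures quoted in the article.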

The team’s future goal is to use the new simulations to train machine-learning algorithms to produce full-scale universe simulations that incorporate galaxy formation physics. The team will then use machine learning to extract useful information about galaxy formation and the universe from those simulations and observational data. Though that hasn’t happened yet, the team has tested their new simulations with a few quick sample applications. For instance, they used machine learning to accurately predict the density of star formation in the universe.

“We have gone to very, very small scales and tried to extract cosmological information, and it seems to be working,” Villaescusa-Navarro says, referring to ongoing work with the CAMELS suite.

Villaescusa-Navarro says that the CAMELS simulations will be freely available for anyone to download and use in their projects. “It will be a powerful and exciting dataset for the whole community,” he says. The openness could enable scientific breakthroughs far beyond what the team originally envisioned, he adds.

CAMELS’s success is born out of the CCA’s diverse expertise across cosmology, galaxy formation and machine learning, and its ample computational resources, Anglés-Alcázar says. “The CCA is a natural place for this to happen,” he says.

Other CCA coauthors of the new paper are former director David Spergel, group leader Rachel Somerville, research fellow Yin Li, guest researcher Valentina La Torre, former CCA research analyst Ana Maria Delgado, interim director Shirley Ho, research fellow Sultan Hassan, associate research scientist Blakesley Burkhart and research fellow Gabriella Contardo.