Taking a Fresh Look at the Cosmos

Astrophysicist Shy Genel is leveraging machine learning to reveal the most basic properties of the universe.

Shy Genel is an associate research scientist at the Flatiron Institute’s Center for Computational Astrophysics. Credit: John Smock

The universe is full of mysteries. What is it made of? Why is it expanding so fast? Are there fundamental particles and physics we’re missing?

These are some of the questions astrophysicist Shy Genel hopes to answer with his cutting-edge simulations. Though he started his career studying the formation and evolution of galaxies, he now designs and runs simulations of mini universes. Using a unique technique that leverages machine learning, he aims to give researchers tools that could help answer some of the biggest questions in cosmology.

Genel is an associate research scientist at the Flatiron Institute’s Center for Computational Astrophysics (CCA). Prior to that, he was a Hubble Fellow at Columbia University and completed a postdoctoral fellowship at the Harvard-Smithsonian Center for Astrophysics. Genel has a doctorate in astrophysics from the Max Planck Institute for Extraterrestrial Physics and a bachelor’s degree in physics and electrical and electronic engineering from Tel Aviv University.

Genel recently spoke to the Simons Foundation about his work and what it will take to understand the universe. The conversation has been edited for length and clarity.


What is your current research focus?

In the past couple of years, I’ve been focusing on a project called CAMELS, which stands for Cosmology and Astrophysics with MachinE Learning Simulations. It’s a very large suite of simulations of the universe.

A single cosmological volume simulated with five different galaxy formation models. The cosmic web is seen in cold gas filaments (blue) that are interspersed with hot gas jets and halos (red). The various models produce different results due to different assumptions and implementations of the galaxy formation physics. Credit: Francisco Villaescusa-Navarro and the CAMELS project

The unique feature of these simulations is that we vary the assumptions we make in the simulation. Usually, cosmological simulations use one set of “best” assumptions to create a digital universe. We’re using a different approach. We first simulate a very large number of universes, all with different assumptions. We then use machine learning to effectively average out the assumptions that we made.


What are some examples of these assumptions?

Let’s say we want to measure basic components of our universe, like the amount of dark matter it contains. To do that, we have to make assumptions about processes in the universe, like the total energy released when a supernova explodes, or how fast black holes swallow gas around them. Both of these processes affect not only the object itself but also its surrounding environment, which in turn can create effects on galactic and intergalactic scales. These things are not easy to measure, and there are entire subfields in astronomy dedicated to understanding them. If we want to do these types of universe-scale simulations, we have to make these kinds of assumptions. But by leveraging machine learning, we can run thousands of simulations with different assumptions about the energy of supernovae to find the best solution. Machine learning can also help illuminate connections between combinations of parameters that we might not normally see.
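
As a rough illustration of the idea described in this answer, here is a minimal toy sketch in Python. It is not the CAMELS code or method. The "simulation" is a made-up formula, the names `omega_m` (matter density) and `a_sn` (supernova energy), the parameter ranges, and the sensitivity values are all assumptions chosen for illustration, and a simple least-squares fit stands in for the machine learning model. The point it tries to show is that a model trained on many mock universes with different supernova assumptions can learn to recover the matter density without knowing which assumption was used.

```python
import numpy as np

# Made-up sensitivities of four mock summary statistics (illustrative only).
MATTER_SENS = np.array([1.0, 0.9, 0.7, 0.5])         # response to log(omega_m)
FEEDBACK_SENS = np.array([-0.05, -0.1, -0.3, -0.6])  # response to a_sn


def run_toy_suite(n_sims, rng):
    """Hypothetical stand-in for a simulation suite.

    Draws a different (omega_m, a_sn) pair for every mock universe and
    returns noisy summary statistics that depend on both parameters.
    """
    omega_m = rng.uniform(0.1, 0.5, n_sims)  # quantity we want to measure
    a_sn = rng.uniform(0.25, 4.0, n_sims)    # uncertain assumption, varied on purpose
    stats = (np.outer(np.log(omega_m), MATTER_SENS)
             + np.outer(a_sn, FEEDBACK_SENS)
             + rng.normal(0.0, 0.01, (n_sims, 4)))
    return stats, omega_m


def fit(stats, omega_m):
    """Least-squares map from statistics (plus a constant) to log(omega_m)."""
    design = np.column_stack([stats, np.ones(len(stats))])
    weights, *_ = np.linalg.lstsq(design, np.log(omega_m), rcond=None)
    return weights


def predict(weights, stats):
    """Apply fitted weights and convert back from log(omega_m) to omega_m."""
    design = np.column_stack([stats, np.ones(len(stats))])
    return np.exp(design @ weights)


def mean_rel_error(pred, truth):
    """Mean relative error of the predicted omega_m."""
    return float(np.mean(np.abs(pred - truth) / truth))


rng = np.random.default_rng(0)
train_stats, train_omega = run_toy_suite(2000, rng)
test_stats, test_omega = run_toy_suite(500, rng)

# Model trained on all statistics: it can learn a combination in which the
# supernova dependence cancels, without ever being told a_sn.
w_full = fit(train_stats, train_omega)
full_err = mean_rel_error(predict(w_full, test_stats), test_omega)

# Baseline using only the most feedback-sensitive statistic: it cannot
# separate the effect of omega_m from the effect of a_sn.
w_naive = fit(train_stats[:, 3:], train_omega)
naive_err = mean_rel_error(predict(w_naive, test_stats[:, 3:]), test_omega)

print(f"mean relative error, all statistics: {full_err:.3f}")
print(f"mean relative error, one statistic:  {naive_err:.3f}")
```

In this toy setup the model that sees all four statistics should recover the matter density far more accurately than the single-statistic baseline, because only the former can combine measurements so that the supernova assumption drops out. Real simulations and real neural networks are far more complex, so this is only meant to convey the logic.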

What does the day-to-day work on this project look like?

This project was started with two of my CCA colleagues, Francisco Villaescusa-Navarro and Daniel Anglés-Alcázar, but it’s really grown into this extended community of researchers who are interested in improving the simulations and developing extensions for their own research interests. As such, there’s a lot of organizational work. We have millions of files taking up close to a petabyte of data, which all needs to be cataloged, organized, and maintained so that it can be used effectively.

We’re also continually running new simulations, so we have to be sure we’re utilizing our supercomputer efficiently and that the simulations are running smoothly. And then there’s designing new simulations, which requires thinking about which parameters to vary, what size of universe to simulate, and the overall setup of the simulation. For this, I spend a lot of time on calls with various research teams, figuring out what would be most useful for them. Overall, it’s a nice combination of some really hands-on technical work, project organization, and then the core scientific research itself.


What do you hope researchers will ultimately be able to learn about the universe from these simulations?

We ultimately want to measure the basic properties of our universe, like how much dark matter there is or how fast the universe is expanding. Those properties are important because they can provide insight into other, larger problems, such as whether the current description of our universe, the Standard Model, is accurate. These basic properties could ultimately tell us if there are new particles or new physics missing from that description.

Currently, observations alone can’t give us these answers. That’s because there’s a lot of analysis and assumptions you have to make to go from the things you can observe — like the distribution of galaxies in the sky — to how much dark matter is in the universe. You can do this with observations plus analytical theories, but you’re limited to studying only the largest scales. If we want to get precise measurements, we need to go to smaller scales, which requires simulations.

At this point, the project is in a proof-of-concept phase, showing that this new approach is effective. We’re still developing our methods, but the idea is to compare the simulations to observations and eventually use the simulations to train sophisticated neural networks to determine the basic properties of the universe given observations from even a small part of the sky. And hopefully someday we can answer some of these big, fundamental questions about our universe.