13 Publications

Automated Machine Learning Pipeline: Large Language Models-Assisted Automated Data set Generation for Training Machine-Learned Interatomic Potentials

Adam Lahouari, J. Rogal, Mark E. Tuckerman

Machine learning interatomic potentials (MLIPs) have become powerful tools to extend molecular simulations beyond the limits of quantum methods, offering near-quantum accuracy at much lower computational cost. Yet, developing reliable MLIPs remains difficult because it requires generating high-quality datasets, preprocessing atomic structures, and carefully training and validating models. In this work, we introduce an Automated Machine Learning Pipeline (AMLP) that unifies the entire workflow from dataset creation to model validation. AMLP employs large-language-model agents to assist with electronic-structure code selection, input preparation, and output conversion, while its analysis suite (AMLP-Analysis), based on ASE, supports a range of molecular simulations. The pipeline is built on the MACE architecture and validated on acridine polymorphs, where, with a straightforward fine-tuning of a foundation model, mean absolute errors of 1.7 meV/atom in energies and 7.0 meV/Å in forces are achieved. The fitted MLIP reproduces DFT geometries with sub-Å accuracy and demonstrates stability during molecular dynamics simulations in the microcanonical and canonical ensembles.
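As a rough illustration of how error metrics like those quoted above are commonly computed, the sketch below evaluates a per-atom energy MAE and a per-component force MAE with NumPy. The abstract does not state the exact error definition used in AMLP, so the normalization here (energy error divided by atom count, force error averaged over all Cartesian components) is an assumption, and the function names `energy_mae_per_atom` and `force_mae` are hypothetical.

```python
import numpy as np


def energy_mae_per_atom(e_pred, e_ref, n_atoms):
    """Mean absolute error of per-atom energies.

    Inputs are per-structure total energies and atom counts; the result is in
    the energy unit of the inputs per atom (e.g. eV -> eV/atom).
    """
    e_pred = np.asarray(e_pred, dtype=float)
    e_ref = np.asarray(e_ref, dtype=float)
    n_atoms = np.asarray(n_atoms, dtype=float)
    return float(np.mean(np.abs(e_pred - e_ref) / n_atoms))


def force_mae(f_pred, f_ref):
    """Mean absolute error over all force components.

    Inputs are sequences of (n_atoms, 3) arrays, one per structure, so
    structures with different atom counts can be mixed.
    """
    diffs = [
        np.abs(np.asarray(p, dtype=float) - np.asarray(r, dtype=float)).ravel()
        for p, r in zip(f_pred, f_ref)
    ]
    return float(np.mean(np.concatenate(diffs)))


if __name__ == "__main__":
    # Toy numbers only; these are not data from the paper.
    e_mae = energy_mae_per_atom([1.0, 2.0], [1.1, 2.4], [1, 2])
    f_mae = force_mae([[[0.0, 0.0, 1.0]]], [[[0.0, 0.0, 0.0]]])
    print(f"energy MAE per atom: {e_mae:.3f}, force MAE: {f_mae:.3f}")
```

Multiplying an eV/atom result by 1000 gives meV/atom, the unit used in the abstract.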

Size-Consistent Adiabatic Connection Functionals via Orbital-Based Matrix Interpolation

We introduce a size-consistent and orbital-invariant formalism for constructing correlation functionals based on the adiabatic connection for density functional theory (DFT). By constructing correlation energy matrices for the weak and strong correlation limits in the space of occupied orbitals, our method, which we call orbital-based size-consistent matrix interpolation (OSMI), avoids previous difficulties in the construction of size-consistent adiabatic connection functionals. We design a simple, nonempirical adiabatic connection and a one-parameter strong-interaction limit functional, and we show that the resulting method reproduces the correlation energy of the uniform electron gas over a wide range of densities. When applied to subsets of the GMTKN55 thermochemistry database, OSMI is more accurate on average than MP2 and nonempirical density functionals. Most notably, OSMI provides excellent predictions of the barrier heights we tested, with average errors of less than 2 kcal mol$^{-1}$.

Modifying electronic and structural properties of 2D van der Waals materials via cavity quantum vacuum fluctuations: a first-principles QEDFT study

Hang Liu, Simone Latini, I-Te Lu, Dongbin Shin, A. Rubio

Structuring the photon density of states and light-matter coupling in optical cavities has emerged as a promising approach to modifying the equilibrium properties of materials through strong light-matter interactions. In this article, we employ state-of-the-art quantum electrodynamical density functional theory (QEDFT) to study the modifications of the electronic and structural properties of two-dimensional (2D) van der Waals (vdW) layered materials by the cavity vacuum field fluctuations. We find that cavity photons modify the electronic density through localization along the photon polarization directions, a universal effect observed for all the 2D materials studied here. This modification of the electronic structure tunes the material properties, such as the shifting of energy valleys in monolayer h-BN and 2H-MoS$_2$, enabling tunable band gaps. Also, it tunes the interlayer spacing in bilayer 2H-MoS$_2$ and T$_d$-MoTe$_2$, allowing for adjustable ferroelectric, nonlinear Hall effect, and optical properties, as a function of light-matter coupling strength. Our findings open an avenue for engineering a broad range of 2D layered quantum materials by tuning vdW interactions through fluctuating cavity photon fields.

Driven Similarity Renormalization Group with a Large Active Space: Applications to Oligoacenes, Zeaxanthin, and Chromium Dimer

Chenyang Li, Xiaoxue Wang, H. Zhai, Wei-Hai Fang

We present a new implementation of the driven similarity renormalization group (DSRG) based on a density matrix renormalization group (DMRG) reference. The explicit build of high-order reduced density matrices is avoided by forming matrix-product-state compressed intermediates. This algorithm facilitates the application of DSRG second- and third-order perturbation theories to dodecacene with an active space of 50 electrons in 50 orbitals. This active space appears to be the largest employed to date within the framework of internally contracted multireference formalism. The DMRG-DSRG approach is applied to several challenging systems, including the singlet-triplet gaps ($\Delta_{\rm ST}$) of oligoacenes ranging from naphthalene to dodecacene, the vertical excitation energies of zeaxanthin, and the ground-state potential energy curve (PEC) of the Cr$_2$ molecule. Our best estimate for the vertical $\Delta_{\rm ST}$ of dodecacene is 0.22 eV, in excellent agreement with that of the linearized adiabatic connection method (0.24 eV). For zeaxanthin, all DSRG schemes suggest the order of $\rm 2\, ^1 A_g^- < 1\, ^1 B_u^+ < 1\, ^1 B_u^-$ for excited states. Both the equilibrium and the shoulder regions of the Cr$_2$ PEC are reasonably reproduced by the linearized DSRG with one- and two-body operators.

The Good, the Bad, and the Ugly of Atomistic Learning for “Clusters-to-Bulk” Generalization

Mikołaj J. Gawkowski, Mingjia Li, B. Shi, Venkat Kapil

Generalizing atomistic machine learning models from small molecular clusters to bulk systems is a significant challenge in computational chemistry and materials science. While models trained on clusters can leverage high-accuracy quantum chemical data, their performance in bulk environments often deteriorates. In this work, we systematically investigate the factors influencing "clusters-to-bulk" generalization for several state-of-the-art atomistic learning architectures. We identify "the good"—effective strategies for data selection and representation that enhance transferability; "the bad"—common pitfalls such as overfitting to cluster-specific motifs and neglecting long-range interactions; and "the ugly"—inherent limitations of local descriptors in capturing bulk emergent properties. Our findings provide a detailed assessment of current methodologies and offer practical recommendations for developing more robust and generalizable atomistic potentials for complex condensed-phase systems.

Improved energies and wave function accuracy with Weighted Variational Monte Carlo

Huan Zhang, Robert J. Webber, Michael Lindsey, T. Berkelbach, Jonathan Weare

Neural network parametrizations have increasingly been used to represent the ground and excited states in variational Monte Carlo (VMC) with promising results. However, traditional VMC methods optimize the wave function only in regions of peak probability. The wave function is uncontrolled in the tails of the probability distribution, which can limit the accuracy of the trained wave function approximation. To improve the approximation accuracy in the probability tails, this paper interprets VMC as a gradient flow in the space of wave functions, followed by a projection step.

Reaction dynamics of lithium-mediated electrolyte decomposition using machine learning potentials

Sohang Kundu, Diana Chamaki, Hong-Zhou Ye, Garvit Agarwal, T. Berkelbach

The solid-electrolyte interphase (SEI) is a complex, multicomponent film that forms on the lithium metal anode in lithium-ion batteries. The SEI is formed through the decomposition of the electrolyte, which is a process that is mediated by the lithium metal surface. In this work, we use machine learning potentials to study the reaction dynamics of lithium-mediated electrolyte decomposition. We use a combination of active learning and enhanced sampling to efficiently explore the reaction pathways and calculate the free energy profiles of the decomposition reactions. Our results show that the decomposition of the electrolyte is a complex process that involves multiple steps and intermediate species. We also find that the lithium metal surface plays a crucial role in the decomposition process, as it provides a platform for the reactions to occur and stabilizes the reaction intermediates. Our work provides new insights into the mechanism of SEI formation and highlights the power of machine learning potentials for studying complex chemical reactions in battery systems.

Efficient Implementation of the Random Phase Approximation with Domain-based Local Pair Natural Orbitals

Yu Hsuan Liang, Xing Zhang, G. K. Chan, T. Berkelbach, Hong-Zhou Ye

We present an efficient implementation of the random phase approximation (RPA) for molecular systems within the domain-based local pair natural orbital (DLPNO) framework. With optimized parameters, DLPNO-RPA achieves approximately 99.9% accuracy in the total correlation energy compared to a canonical implementation, enabling highly accurate reaction energies and potential energy surfaces to be computed while substantially reducing computational costs. As an application, we demonstrate the capability of DLPNO-RPA to efficiently calculate basis set-converged binding energies for a set of large molecules, with results showing excellent agreement with high-level reference data from both coupled cluster and diffusion Monte Carlo. This development paves the way for the routine use of RPA-based methods in molecular quantum chemistry.

Diabatic states of charge transfer with constrained charge equilibration

Sohang Kundu, Hong-Zhou Ye, T. Berkelbach

Charge transfer (CT) processes that are electronically non-adiabatic are ubiquitous in chemistry, biology, and materials science, but their theoretical description requires diabatic states or adiabatic excited states. For complex systems, these latter states are more difficult to calculate than the adiabatic ground state. Here, we propose a simple method to obtain diabatic states, including energies and charges, by constraining the atomic charges within the charge equilibration framework. For two-state systems, the exact diabatic coupling can be determined, from which the adiabatic excited-state energy can also be calculated. The method can be viewed as an affordable alternative to constrained density functional theory (CDFT), and so we call it constrained charge equilibration (CQEq). We test the CQEq method on the anthracene-tetracyanoethylene CT complex and the reductive decomposition of ethylene carbonate on a lithium metal surface. We find that CQEq predicts diabatic energies, charges, and adiabatic excitation energies in good agreement with CDFT, and we propose that CQEq is promising for combination with machine learning force fields to study non-adiabatic CT in the condensed phase.
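The two-state statement above follows from generic two-level algebra, sketched here under stated assumptions. For a symmetric 2x2 Hamiltonian with diabatic energies `e1` and `e2` on the diagonal and coupling `v` off the diagonal, the trace and determinant give `v**2 = (e1 - e_ground) * (e2 - e_ground)` and `e_excited = e1 + e2 - e_ground`, where `e_ground` is the lower adiabatic energy. This is textbook linear algebra, not necessarily how CQEq is implemented, and the function name `two_state_coupling` is hypothetical.

```python
import math


def two_state_coupling(e1, e2, e_ground):
    """Return (|v|, e_excited) for H = [[e1, v], [v, e2]].

    e1, e2   : diabatic state energies
    e_ground : lower eigenvalue of H (the adiabatic ground-state energy)

    Uses the trace (e_ground + e_excited = e1 + e2) and the determinant
    (e_ground * e_excited = e1 * e2 - v**2) of the 2x2 matrix.
    """
    if e_ground > min(e1, e2):
        # The lower eigenvalue of a 2x2 symmetric matrix can never exceed
        # either diagonal element, so such inputs are inconsistent.
        raise ValueError("ground-state energy must not exceed the diabatic energies")
    v_squared = (e1 - e_ground) * (e2 - e_ground)
    e_excited = e1 + e2 - e_ground
    return math.sqrt(v_squared), e_excited


if __name__ == "__main__":
    # Toy numbers only: build the ground state from a known coupling of 0.5,
    # then recover the coupling and the excited-state energy.
    e1, e2, v = 0.0, 1.0, 0.5
    gap = math.sqrt((e1 - e2) ** 2 + 4.0 * v**2)
    e_ground = 0.5 * (e1 + e2 - gap)
    print(two_state_coupling(e1, e2, e_ground))
```

Only the magnitude of the coupling is recoverable this way; its sign depends on the phase convention of the diabatic states.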

Periodic Local Coupled-Cluster Theory for Insulators and Metals

Hong-Zhou Ye, T. Berkelbach

We describe the implementation details of periodic local coupled-cluster theory with single and double excitations (CCSD) and perturbative triple excitations [CCSD(T)] using local natural orbitals (LNOs) and $k$-point symmetry. We discuss and compare several choices for orbital localization, fragmentation, and LNO construction. By studying diamond and lithium, we demonstrate that periodic LNO-CC theory can be applied with equal success to both insulators and metals, achieving speedups of two to three orders of magnitude even for moderately sized $k$-point meshes. Our final predictions of the equilibrium cohesive energy, lattice constant, and bulk modulus for diamond and lithium are in good agreement with previous theoretical predictions and experimental results.
