Representation Learning for Hard Data Problems in Quantum Matter
ABSTRACT
Correlated quantum materials pose a distinctive class of data problems: the relevant degrees of freedom are often only indirectly observed, measurements underconstrain the underlying electronic or structural state, and the resulting datasets are heterogeneous, high-dimensional, and history-dependent. In this setting, the central challenge is not simply prediction but the construction of representations that connect complex experimental data to physically meaningful inference. In this talk, I will discuss how machine learning can be used in this more physics-native way across several problems in quantum matter. I will highlight attention-based approaches to many-body state characterization, X-TEC for unsupervised analysis of evolving X-ray diffraction data, and an interpretable Gaussian-process framework for superconductivity discovery that incorporates chemistry- and structure-aware graphlet representations. Across these examples, the emphasis is on models that respect the character of the data, expose organizing principles rather than mere correlations, and return uncertainties that can guide experiments. I will conclude with the case of superconductivity, where structure-aware and interpretable learning not only achieves strong predictive performance but also helps identify promising materials and supports a closed-loop path from data to synthesis to experimental confirmation.