FPD Seminar

Machine-Learning Scaling Laws for LHC Physics: has the Bitter Lesson caught up to HEP?

by Matthias Vigl (Technical University of Munich)

America/Los_Angeles
48/2-224 - Madrone (SLAC)

Description
High Energy Physics and deep learning have increasingly diverged in how they process data. In collider physics, performance has been driven by deep, hand-engineered pipelines that encode decades of domain knowledge, while modern machine learning has advanced primarily through scale, leveraging large datasets and increasingly generic model architectures, with performance following predictable scaling laws as a function of compute. While ML has long been embedded in the HEP analysis pipeline, its rate of improvement has remained slower than the rapid, scale-driven progress observed in industry.
In this talk, I contrast physics-driven and scale-driven approaches and show how foundation-model principles (scaling laws, transfer learning, and end-to-end optimization) can be applied to LHC analyses. Using the ATLAS GN3 jet flavour tagger as a case study, I demonstrate that scaling model size and training compute yields substantial gains in jet-tagging performance, following predictable power-law behaviour analogous to that observed in language and vision models, and leading to order-of-magnitude improvements in background rejection when extrapolating to frontier industry scales.
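The power-law extrapolation described above is conventionally done by fitting a relation of the form metric = a · C^(−b) to performance measured at several compute budgets C. A minimal illustrative sketch of such a fit (synthetic numbers, not results from the GN3 study; function names are hypothetical):

```python
import numpy as np

def fit_power_law(compute, metric):
    """Fit metric = a * compute**(-b) by least squares in log-log space.

    Assumes the metric (e.g. loss or background efficiency) decreases
    with compute. Returns the fitted (a, b).
    """
    log_c = np.log(compute)
    log_m = np.log(metric)
    # In log-log space the power law is a straight line:
    # log(metric) = log(a) - b * log(compute)
    slope, intercept = np.polyfit(log_c, log_m, 1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic scaling data following metric = 2.0 * C**(-0.3)
compute = np.logspace(18, 22, 8)       # training FLOPs at each run
metric = 2.0 * compute ** -0.3
a, b = fit_power_law(compute, metric)

# Extrapolate the fit to a 100x larger compute budget
projected = a * (compute[-1] * 100) ** -b
```

On a clean power law the fit recovers the exponent exactly; on real measurements one would also propagate fit uncertainties before trusting an extrapolation to frontier scales.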
 
https://stanford.zoom.us/j/98973156241?pwd=cEU5RFdlVXoyc0JTeTlDMkozKzQ5UT09