15–19 Jun 2026
UC Irvine
America/New_York timezone

Adapting Vision-Language Models for Neutrino Event Classification in High-Energy Physics

17 Jun 2026, 11:00
20m
The Interdisciplinary Science and Engineering Building (UC Irvine)

419 Physical Sciences Quad, Irvine, CA 92697

Speaker

Mr Dikshant Sagar (University of California, Irvine)

Description

Recent advances in machine learning, particularly in multimodal models, have created new opportunities for analyzing complex data in high-energy physics, where accurate identification of particle interactions is critical for scientific discovery. However, existing approaches rely heavily on convolutional neural networks, which lack interpretability and do not fully leverage multimodal reasoning capabilities. Here we show that a fine-tuned vision-language model (VLM) based on LLaMA 3.2 can effectively identify neutrino interactions in pixelated detector data, outperforming both a state-of-the-art convolutional neural network and a Vision Transformer baseline in classification accuracy and robustness. In addition, the VLM offers improved explainability through reasoning-based, interpretable predictions and supports the integration of auxiliary semantic information. These results demonstrate the potential of multimodal transformer architectures as general-purpose tools for physics event classification, paving the way for more transparent, flexible, and scalable analysis methods in future high-energy physics experiments.

Author

Mr Dikshant Sagar (University of California, Irvine)
