Theoretical Physics Benchmark (TPBench) - a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics

Published in arXiv, 2025

This paper presents a study of the reasoning capabilities of state-of-the-art AI models on advanced theoretical physics (TP) problems. To this end, we develop a new TP benchmark dataset consisting of problems ranging in difficulty from easy undergraduate level to research level.

Download paper here