General scales unlock AI evaluation with explanatory and predictive power.

Bibliographic Details
Title: General scales unlock AI evaluation with explanatory and predictive power.
Authors: Zhou L; Princeton University, Princeton, NJ, USA. lz5066@princeton.edu.; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK. lz5066@princeton.edu.; Microsoft Research Asia, Beijing, China. lz5066@princeton.edu.; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain. lz5066@princeton.edu., Pacchiardi L; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK., Martínez-Plumed F; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain., Collins KM; Department of Engineering, University of Cambridge, Cambridge, UK., Moros-Daval Y; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain., Zhang S; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK.; Department of Psychology, University of Cambridge, Cambridge, UK., Zhao Q; Microsoft Research Asia, Beijing, China., Huang Y; Microsoft Research Asia, Beijing, China., Sun L; The Psychometrics Centre, University of Cambridge, Cambridge, UK., Prunty JE; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK., Li Z; Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK., Sánchez-García P; KU Leuven, Leuven, Belgium., Jiang-Chen K; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain., Casares PAM; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain., Zu J; Educational Testing Service, Princeton, NJ, USA., Burden J; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK., Mehrbakhsh B; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain., Stillwell D; The Psychometrics Centre, University of Cambridge, Cambridge, UK., Cebrian M; Center for Automation and Robotics (CAR), Spanish National Research Council (CSIC-UPM), Madrid, Spain., Wang J; William & Mary, Williamsburg, VA, USA., Henderson P; Princeton University, Princeton, NJ, USA., Wu ST; Carnegie Mellon University, Pittsburgh, PA, USA., Kyllonen PC; Educational Testing Service, Princeton, NJ, USA., Cheke L; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK.; Department of Psychology, University of Cambridge, Cambridge, UK., Xie X; Microsoft Research Asia, Beijing, China. xing.xie@microsoft.com., Hernández-Orallo J; Leverhulme Centre for the Future of Intelligence, University of Cambridge, Cambridge, UK. josephorallo@gmail.com.; Valencian Research Institute for Artificial Intelligence (VRAIN), Universitat Politècnica de València, València, Spain. josephorallo@gmail.com.
Source: Nature [Nature] 2026 Apr; Vol. 652 (8108), pp. 58-67. Date of Electronic Publication: 2026 Apr 01.
Publication Type: Journal Article; Research Support, Non-U.S. Gov't
Journal Info: Publisher: Nature Publishing Group; Country of Publication: England; NLM ID: 0410462; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1476-4687 (Electronic); Linking ISSN: 00280836; NLM ISO Abbreviation: Nature; Subsets: MEDLINE
Database: MEDLINE Ultimate
Description
ISSN: 1476-4687
DOI: 10.1038/s41586-026-10303-2