Kavli Affiliate: Max Tegmark
| First 5 Authors: Wes Gurnee, Max Tegmark
| Summary:
The capabilities of large language models (LLMs) have sparked debate over
whether such systems merely learn an enormous collection of superficial
statistics or a coherent model of the data-generating process, i.e., a world model.
We find evidence for the latter by analyzing the learned representations of
three spatial datasets (world, US, NYC places) and three temporal datasets
(historical figures, artworks, news headlines) in the Llama-2 family of models.
We discover that LLMs learn linear representations of space and time across
multiple scales. These representations are robust to prompting variations and
unified across different entity types (e.g. cities and landmarks). In addition,
we identify individual “space neurons” and “time neurons” that reliably
encode spatial and temporal coordinates. Our analysis demonstrates that modern
LLMs acquire structured knowledge about fundamental dimensions such as space
and time, supporting the view that they learn not merely superficial
statistics, but literal world models.
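
A minimal sketch of the linear-probing idea the summary describes: fit a linear map from a hidden-layer activation to spatial coordinates, and read recoverability of the coordinates as evidence of a linear representation. This is not the authors' released code; the model checkpoint, probed layer index, and the three-row toy place list are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import Ridge

MODEL = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint; any Llama-2 model would do
LAYER = 20                          # hypothetical intermediate layer to probe

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

# Toy spatial dataset: (entity name, latitude, longitude). The paper probes
# thousands of world, US, and NYC places; three rows are enough for a sketch.
places = [
    ("Paris", 48.86, 2.35),
    ("Tokyo", 35.68, 139.69),
    ("New York City", 40.71, -74.01),
]

def last_token_activation(text: str) -> torch.Tensor:
    """Hidden state of the final token at the probed layer."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0, -1].float()

X = torch.stack([last_token_activation(name) for name, _, _ in places]).numpy()
y = [[lat, lon] for _, lat, lon in places]

# A linear probe: if Ridge regression on the activations recovers the
# coordinates, they are (approximately) linearly represented at this layer.
probe = Ridge(alpha=1.0).fit(X, y)
print(probe.predict(X))
```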
| Search Query: ArXiv Query: search_query=au:"Max Tegmark"&id_list=&start=0&max_results=3
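
For reference, a small sketch of how the search query above could be issued against the public arXiv API (export.arxiv.org/api/query). The URL-encoding and Atom-feed parsing shown here are illustrative assumptions, not the feed generator actually used for this listing.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Same parameters as the listed query: author search, first 3 results.
params = {
    "search_query": 'au:"Max Tegmark"',
    "start": 0,
    "max_results": 3,
}
url = "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(params)

with urllib.request.urlopen(url) as resp:
    feed = resp.read()

# The API returns an Atom feed; print each entry's title.
ns = {"atom": "http://www.w3.org/2005/Atom"}
root = ET.fromstring(feed)
for entry in root.findall("atom:entry", ns):
    print(entry.findtext("atom:title", namespaces=ns))
```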