The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks

Kavli Affiliate: Max Tegmark

| First 5 Authors: Ziqian Zhong, Ziming Liu, Max Tegmark, Jacob Andreas,

| Summary:

Do neural networks, trained on well-understood algorithmic tasks, reliably
rediscover known algorithms for solving those tasks? Several recent studies, on
tasks ranging from group arithmetic to in-context linear regression, have
suggested that the answer is yes. Using modular addition as a prototypical
problem, we show that algorithm discovery in neural networks is sometimes more
complex. Small changes to model hyperparameters and initializations can induce
the discovery of qualitatively different algorithms from a fixed training set,
and even parallel implementations of multiple such algorithms. Some networks
trained to perform modular addition implement a familiar Clock algorithm;
others implement a previously undescribed, less intuitive, but comprehensible
procedure which we term the Pizza algorithm, or a variety of even more complex
procedures. Our results show that even simple learning problems can admit a
surprising diversity of solutions, motivating the development of new tools for
characterizing the behavior of neural networks across their algorithmic phase
space.

| Search Query: ArXiv Query: search_query=au:”Max Tegmark”&id_list=&start=0&max_results=10

Read More