Kavli Affiliate: Xiang Zhang | First 5 Authors: Xiang Zhang, Juntai Cao, Chenyu You | Summary: Transformers, the backbone of modern large language models (LLMs), face inherent architectural limitations that impede their reasoning capabilities. Unlike recurrent networks, Transformers lack recurrent connections, confining them to constant-depth computation. This restriction places them in the complexity class […]
Continue reading: Counting Ability of Large Language Models and Impact of Tokenization
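As a rough illustration of the tokenization issue named in the title (not taken from the paper), the toy sketch below shows how a BPE-style tokenizer can merge repeated characters into multi-character tokens, so a character-counting task no longer corresponds to simply reading the input token by token. The merge table and function names here are hypothetical.

```python
# Illustrative sketch (not from the paper): subword tokenization can hide the
# character-level structure that a counting task depends on.
# TOY_MERGES is a hypothetical BPE-style merge table.

TOY_MERGES = ["aa", "aaa", "aaaa"]


def toy_tokenize(text: str) -> list[str]:
    """Greedily match the longest known merge; fall back to single characters."""
    tokens, i = [], 0
    while i < len(text):
        match = next(
            (m for m in sorted(TOY_MERGES, key=len, reverse=True)
             if text.startswith(m, i)),
            text[i],
        )
        tokens.append(match)
        i += len(match)
    return tokens


if __name__ == "__main__":
    word = "aaaaaaa"  # 7 copies of 'a'
    tokens = toy_tokenize(word)
    # The model sees 2 tokens rather than 7 characters, so "count the a's"
    # requires recovering per-token lengths instead of reading characters.
    print(tokens)                  # ['aaaa', 'aaa']
    print(len(tokens), len(word))  # 2 7
```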