Kavli Affiliate: Ke Wang | First 5 Authors: Yuxuan Hu, Ke Wang, Xiaokang Zhang, Fanjin Zhang, Cuiping Li | Summary: Large Language Models (LLMs) have revolutionized natural language processing by unifying tasks into text generation, yet their large parameter sizes and autoregressive nature limit inference speed. SAM-Decoding addresses this by introducing a novel retrieval-based speculative […]
SAM Decoding: Speculative Decoding via Suffix Automaton