Kavli Affiliate: Feng Yuan | First 5 Authors: Hui Wu, Yi Gan, Feng Yuan, Jing Ma, Wei Zhu | Summary: Transformer-based Large Language Models (LLMs) are widely used in many fields, and the efficiency of LLM inference has become a hot topic in real applications. However, LLMs usually have a complicated model structure with […]
Continue reading: Efficient LLM inference solution on Intel GPU