Kavli Affiliate: Wei Gao | First 5 Authors: Qiaoling Chen, Shenggui Li, Wei Gao, Peng Sun, Yonggang Wen | Summary: In recent years, Large Language Models (LLMs) have exhibited remarkable capabilities, driving advancements in real-world applications. However, training LLMs on increasingly long input sequences imposes significant challenges due to high GPU memory and computational demands. […]
Continue reading: SPPO: Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading