External Large Foundation Model: How to Efficiently Serve Trillions of Parameters for Online Ads Recommendation

Kavli Affiliate: Xian Chen

| First 5 Authors: Mingfu Liang, Xi Liu, Rong Jin, Boyang Liu, Qiuling Suo

| Summary:

Ads recommendation is a prominent service of online advertising systems and
has been actively studied. Recent studies indicate that scaling up recommendation
models and advancing their design can bring significant performance improvements.
However, as model scale grows, these studies diverge increasingly from industrial
practice because they neglect two fundamental challenges of industrial-scale
applications. First, the training and inference budgets of a served model are
restricted; exceeding them incurs latency and impairs user experience. Second,
large volumes of data arrive in a streaming fashion with dynamically shifting
distributions, as new users/ads join and existing users/ads leave the system. We
propose the External
Large Foundation Model (ExFM) framework to address the overlooked challenges.
Specifically, we develop external distillation and a data augmentation system
(DAS) to control the computational cost of training/inference while maintaining
high performance. We design the teacher as a foundation model (FM) that serves
multiple students as vertical models (VMs), amortizing its building cost across
them. We propose an Auxiliary Head and a Student Adapter to mitigate the
data-distribution gap between the FM and VMs caused by streaming data.
Comprehensive experiments on internal industrial-scale applications and public
datasets demonstrate significant performance gains from ExFM.
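The abstract's external-distillation idea can be sketched in miniature: a student ("vertical model") trains on streaming labels through its serving head, while a separate auxiliary head absorbs externally logged teacher predictions, so the teacher's supervision influences training without adding inference cost. All names, the linear models, and the loss form below are illustrative assumptions for this sketch, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d = 8
w_serve = np.zeros(d)   # serving head: trained on true labels, used online
w_aux = np.zeros(d)     # auxiliary head: absorbs teacher supervision offline

def train_step(x, y, teacher_p, lr=0.1, alpha=0.5):
    """One SGD step on a hypothetical linear student.

    The serving head fits the observed label plus a consistency term
    toward the auxiliary head; the auxiliary head fits the teacher's
    (externally logged) prediction. For BCE on a logit z, dL/dz = p - target.
    """
    global w_serve, w_aux
    p_serve = sigmoid(x @ w_serve)
    p_aux = sigmoid(x @ w_aux)
    w_serve -= lr * ((p_serve - y) + alpha * (p_serve - p_aux)) * x
    w_aux -= lr * (p_aux - teacher_p) * x

# Streaming loop over synthetic impressions; teacher scores are assumed
# to arrive alongside the logged data (the role the paper's DAS plays).
w_true = rng.normal(size=d)
for _ in range(2000):
    x = rng.normal(size=d)
    y = float(rng.random() < sigmoid(x @ w_true))
    teacher_p = sigmoid(x @ w_true + 0.1 * rng.normal())  # near-oracle teacher
    train_step(x, y, teacher_p)

# The serving head should align with the ground-truth direction.
print(float(w_serve @ w_true) > 0)
```

Keeping the teacher's signal on a separate head, rather than mixing it into the serving head's label loss, is one simple way to shield online predictions from a distribution gap between teacher and student data, which is the problem the paper's Auxiliary Head targets.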

| Search Query: ArXiv Query: search_query=au:"Xian Chen"&id_list=&start=0&max_results=3
