Kavli Affiliate: Ke Wang
| First 5 Authors: Zhuoran Liu, Leqi Zou, Xuan Zou, Caihua Wang, Biao Zhang
| Summary:
Building a scalable and real-time recommendation system is vital for many
businesses driven by time-sensitive customer feedback, such as short-video
ranking or online ads. Despite the ubiquitous adoption of production-scale deep
learning frameworks like TensorFlow or PyTorch, these general-purpose
frameworks fall short of business demands in recommendation scenarios for
various reasons: on one hand, tweaking systems based on static parameters and
dense computations for recommendation with dynamic and sparse features is
detrimental to model quality; on the other hand, such frameworks are designed
with the batch-training and serving stages completely separated, preventing
the model from interacting with customer feedback in real time. These issues
led us to reexamine traditional approaches and explore radically different
design choices. In this paper, we present Monolith, a system tailored for
online training. Our design has been driven by observations of our application
workloads and production environment that reflect a marked departure from
other recommendation systems. Our contributions are manifold: first, we
crafted a collisionless embedding table with optimizations such as expirable
embeddings and frequency filtering to reduce its memory footprint; second, we
provide a production-ready online training architecture with high
fault tolerance; finally, we proved that system reliability could be traded off
for real-time learning. Monolith has successfully landed in the BytePlus
Recommend product.
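
The first contribution above can be illustrated with a minimal sketch. This is not Monolith's implementation: the class name, the `min_count` and `ttl` parameters, and the use of a plain Python dict are all assumptions made for illustration. It only shows the general idea of keying embeddings by raw feature ID (so distinct IDs never share a row), admitting an ID only after it has been seen often enough (frequency filtering), and evicting rows that have been idle too long (expirable embeddings).

```python
import random


class SketchEmbeddingTable:
    """Toy table keyed by raw feature ID, so distinct IDs never share a row.

    Hypothetical illustration only; not the data structure used by Monolith.
    """

    def __init__(self, dim, min_count, ttl, seed=0):
        self.dim = dim
        self.min_count = min_count  # assumed admission threshold (occurrences)
        self.ttl = ttl              # assumed idle lifetime, in training steps
        self.rows = {}              # feature ID -> embedding vector
        self.counts = {}            # feature ID -> occurrences before admission
        self.last_seen = {}         # feature ID -> step of the last lookup
        self.rng = random.Random(seed)

    def lookup(self, feature_id, step):
        """Return the embedding for feature_id, or None if not yet admitted."""
        if feature_id not in self.rows:
            # Frequency filtering: count sightings until the threshold is met.
            self.counts[feature_id] = self.counts.get(feature_id, 0) + 1
            if self.counts[feature_id] < self.min_count:
                return None
            del self.counts[feature_id]
            self.rows[feature_id] = [
                self.rng.uniform(-0.1, 0.1) for _ in range(self.dim)
            ]
        self.last_seen[feature_id] = step
        return self.rows[feature_id]

    def expire(self, step):
        """Evict rows idle for more than ttl steps; return how many were evicted."""
        stale = [
            fid for fid, seen in self.last_seen.items()
            if step - seen > self.ttl
        ]
        for fid in stale:
            del self.rows[fid]
            del self.last_seen[fid]
        return len(stale)


table = SketchEmbeddingTable(dim=4, min_count=2, ttl=10)
first = table.lookup("user_42", step=0)    # below the threshold, not admitted
second = table.lookup("user_42", step=1)   # second sighting admits the ID
evicted = table.expire(step=20)            # idle for 19 steps, so it is evicted
```

In this sketch, memory is bounded only for admitted rows; the pre-admission counters would themselves need a cap or a probabilistic filter in any real system.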
| Search Query: ArXiv Query: search_query=au:"Ke Wang"&id_list=&start=0&max_results=10