Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models

Kavli Affiliate: Jia Liu | First 5 Authors: Changxin Tian | Summary: Mixture-of-Experts (MoE) has become a dominant architecture for scaling Large Language Models (LLMs) efficiently by decoupling total parameters from computational cost. However, this decoupling creates a critical challenge: predicting the model capacity of a given MoE configuration (e.g., expert […]



Improving Multislice Electron Ptychography with a Generative Prior

Kavli Affiliate: David A. Muller | First 5 Authors: Christian K. Belardi | Summary: Multislice electron ptychography (MEP) is an inverse imaging technique that computationally reconstructs the highest-resolution images of atomic crystal structures from diffraction patterns. Available algorithms often solve this inverse problem iteratively but are both time consuming and […]



WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training

Kavli Affiliate: Jia Liu | First 5 Authors: Changxin Tian | Summary: Recent advances in learning rate (LR) scheduling have demonstrated the effectiveness of decay-free approaches that eliminate the traditional decay phase while maintaining competitive performance. Model merging techniques have emerged as particularly promising solutions in this domain. We present Warmup-Stable […]



HLFormer: Enhancing Partially Relevant Video Retrieval with Hyperbolic Learning

Kavli Affiliate: Long Zhang | First 5 Authors: Li Jun | Summary: Partially Relevant Video Retrieval (PRVR) addresses the critical challenge of matching untrimmed videos with text queries describing only partial content. Existing methods suffer from geometric distortion in Euclidean space that sometimes misrepresents the intrinsic hierarchical structure of videos and […]



Explicit Formulas for Estimating Trace of Reduced Density Matrix Powers via Single-Circuit Measurement Probabilities

Kavli Affiliate: Jing Wang | Summary: In the fields of quantum mechanics and quantum information science, the traces of reduced density matrix powers play a crucial role in the study of quantum systems and have numerous important applications. In this paper, we propose a universal framework to […]



Mapple: A Domain-Specific Language for Mapping Distributed Heterogeneous Parallel Programs

Kavli Affiliate: Ke Wang | First 5 Authors: Anjiang Wei | Summary: Optimizing parallel programs for distributed heterogeneous systems remains a complex task, often requiring significant code modifications. Task-based programming systems improve modularity by separating performance decisions from core application logic, but their mapping interfaces are often too low-level. In this […]



mGluR4-Npdc1 complex mediates α-synuclein fibril-induced neurodegeneration

Kavli Affiliate: Stephen Strittmatter | Authors: Azucena Perez-Canamas, Mingming Chen, Leire Almandoz-Gil, Nabab Khan, Si Jie Tang, Allyson Ho, Erik C Gunther and Stephen M. Strittmatter | Summary: Fibrils of misfolded α-synuclein (α-syn) accumulate in Parkinson’s disease and other synucleinopathies, spreading between cells to template further misfolding and drive neurodegeneration. α-syn fibril entry into healthy […]



StreamME: Simplify 3D Gaussian Avatar within Live Stream

Kavli Affiliate: Yi Zhou | First 5 Authors: Luchuan Song | Summary: We propose StreamME, a method that focuses on fast 3D avatar reconstruction. StreamME synchronously records and reconstructs a head avatar from live video streams without any pre-cached data, enabling seamless integration of the reconstructed appearance into downstream applications. This […]



On-chip stencil lithography for superconducting qubits

Kavli Affiliate: Gary A. Steele | First 5 Authors: Roudy Hanna | Summary: Improvements in circuit design and more recently in materials and surface cleaning have contributed to a rapid development of coherent superconducting qubits. However, organic resists commonly used for shadow evaporation of Josephson junctions (JJs) pose limitations due to […]


