Balanced SNR-Aware Distillation for Guided Text-to-Audio Generation

Kavli Affiliate: Yi Zhou | First 5 Authors: Bingzhi Liu, Yin Cao, Haohe Liu, Yi Zhou | Summary: Diffusion models have demonstrated promising results in text-to-audio generation tasks. However, their practical usability is hindered by slow sampling speeds, limiting their applicability in high-throughput scenarios. To address this challenge, progressive distillation methods have been effective in […]


Room temperature ferromagnetic semiconductors through metal-semiconductor transition in monolayer MnSe2

Kavli Affiliate: Bo Gu | First 5 Authors: Jia-Wen Li, Gang Su, Bo Gu | Summary: Realizing room temperature ferromagnetic semiconductors remains a challenge in spintronics. Recent experiments have obtained two-dimensional (2D) room temperature ferromagnetic metals, such as monolayer MnSe2. In this paper, we propose a way to obtain room temperature ferromagnetic […]


Room temperature ferromagnetic semiconductors through metal-semiconductor transition in monolayer MnSe2

Kavli Affiliate: Gang Su | First 5 Authors: Jia-Wen Li, Gang Su, Bo Gu | Summary: Realizing room temperature ferromagnetic semiconductors remains a challenge in spintronics. Recent experiments have obtained two-dimensional (2D) room temperature ferromagnetic metals, such as monolayer MnSe2. In this paper, we propose a way to obtain room temperature ferromagnetic […]


Generative Pretraining at Scale: Transformer-Based Encoding of Transactional Behavior for Fraud Detection

Kavli Affiliate: Zheng Zhu | First 5 Authors: Ze Yu Zhao, Zheng Zhu, Guilin Li, Wenhan Wang, Bo Wang | Summary: In this work, we introduce an autoregressive model built on Generative Pretrained Transformer (GPT) architectures, tailored for fraud detection in payment systems. Our approach confronts token explosion and reconstructs behavioral sequences, providing a […]


Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

Kavli Affiliate: Yi Zhou | First 5 Authors: Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu | Summary: Recent advancements in the text-to-3D task leverage finetuned text-to-image diffusion models to generate multi-view images, followed by NeRF reconstruction. Yet, existing supervised finetuned (SFT) diffusion models still suffer from multi-view inconsistency and the resulting NeRF […]


Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning

Kavli Affiliate: Yi Zhou | First 5 Authors: Desai Xie, Jiahao Li, Hao Tan, Xin Sun, Zhixin Shu | Summary: Multi-view diffusion models, obtained by applying Supervised Finetuning (SFT) to text-to-image diffusion models, have driven recent breakthroughs in text-to-3D research. However, due to the limited size and quality of existing 3D datasets, they still suffer […]


Hyperuniformity on Mars: Pebbles scattered on sand

Kavli Affiliate: Zheng Zhu | First 5 Authors: Zheng Zhu, Bernard Hallet, András A. Sipos, Gábor Domokos, Quan-Xing Liu | Summary: In Gale Crater near Mars’ equator, dunes and ripples of sand stand out from the general orderless, rocky terrain. In addition, images from Curiosity, the Mars Science Laboratory rover, reveal more subtle orderly forms: […]


Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting

Kavli Affiliate: Cheng Peng | First 5 Authors: Junwu Zhang, Zhenyu Tang, Yatian Pang, Xinhua Cheng, Peng Jin | Summary: Recent one image to 3D generation methods commonly adopt Score Distillation Sampling (SDS). Despite the impressive results, there are multiple deficiencies including multi-view inconsistency, over-saturated and over-smoothed textures, as well as the slow generation speed. […]


Learning Subject-Aware Cropping by Outpainting Professional Photos

Kavli Affiliate: Matthew Fisher | First 5 Authors: James Hong, Lu Yuan, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian | Summary: How to frame (or crop) a photo often depends on the image subject and its context; e.g., a human portrait. Recent works have defined the subject-aware image cropping task as a nuanced and practical version […]

