Lost in Translation: When GPT-4V(ision) Can’t See Eye to Eye with Text. A Vision-Language-Consistency Analysis of VLLMs and Beyond

Kavli Affiliate: Xiang Zhang | First 5 Authors: Xiang Zhang, Senyu Li, Zijun Wu, Ning Shi | Summary: Recent advancements in multimodal techniques open exciting possibilities for models excelling in diverse tasks involving text, audio, and image processing. Models like GPT-4V, blending computer vision and language modeling, excel in complex text and image tasks. Numerous […]



Efficient Sim-to-real Transfer of Contact-Rich Manipulation Skills with Online Admittance Residual Learning

Kavli Affiliate: Xiang Zhang | First 5 Authors: Xiang Zhang, Changhao Wang, Lingfeng Sun, Zheng Wu, Xinghao Zhu | Summary: Learning contact-rich manipulation skills is essential. Such skills require the robots to interact with the environment with feasible manipulation trajectories and suitable compliance control parameters to enable safe and stable contact. However, learning these skills […]



Semi-Supervised End-To-End Contrastive Learning For Time Series Classification

Kavli Affiliate: Xiang Zhang | First 5 Authors: Huili Cai, Xiang Zhang, Xiaofeng Liu | Summary: Time series classification is a critical task in various domains, such as finance, healthcare, and sensor data analysis. Unsupervised contrastive learning has garnered significant interest in learning effective representations from time series data with limited labels. The prevalent […]



Terahertz phonon engineering and spectroscopy with van der Waals heterostructures

Kavli Affiliate: Feng Wang | First 5 Authors: Yoseob Yoon, Zheyu Lu, Can Uzundal, Ruishi Qi, Wenyu Zhao | Summary: Phononic engineering at GHz frequencies forms the foundation of microwave acoustic filters, high-speed acousto-optic modulators, and quantum transducers. THz phononic engineering could lead to acoustic filters and modulators at higher bandwidth and speed, as well […]



Terahertz phonon engineering with van der Waals heterostructures

Kavli Affiliate: Feng Wang | First 5 Authors: Yoseob Yoon, Zheyu Lu, Can Uzundal, Ruishi Qi, Wenyu Zhao | Summary: Phononic engineering at gigahertz (GHz) frequencies forms the foundation of microwave acoustic filters, acousto-optic modulators, and quantum transducers. Terahertz (THz) phononic engineering could lead to acoustic filters and modulators at higher bandwidth and speed, as […]



Certifiably Robust Graph Contrastive Learning

Kavli Affiliate: Xiang Zhang | First 5 Authors: Minhua Lin, Teng Xiao, Enyan Dai, Xiang Zhang, Suhang Wang | Summary: Graph Contrastive Learning (GCL) has emerged as a popular unsupervised graph representation learning method. However, it has been shown that GCL is vulnerable to adversarial attacks on both the graph structure and node attributes. Although […]



Dual Prompt Tuning for Domain-Aware Federated Learning

Kavli Affiliate: Feng Wang | First 5 Authors: Guoyizhe Wei, Feng Wang, Anshul Shah, Rama Chellappa | Summary: Federated learning is a distributed machine learning paradigm that allows multiple clients to collaboratively train a shared model with their local data. Nonetheless, conventional federated learning algorithms often struggle to generalize well due to the ubiquitous domain […]


