Kavli Affiliate: Jiansheng Chen
| First 5 Authors: Yu Shang, Jiansheng Chen, Hangyu Fan, Jingtao Ding, Jie Feng
| Summary:
Cities, as the most fundamental environment of human life, encompass diverse
physical elements such as buildings, roads and vegetation with complex
interconnections. Crafting realistic, interactive 3D urban environments plays a
crucial role in constructing AI agents capable of perceiving, making decisions,
and acting like humans in real-world environments. However, creating
high-fidelity 3D urban environments usually entails extensive manual labor from
designers, involving intricate detailing and accurate representation of complex
urban features. Automating this process therefore remains
a longstanding challenge. To address this problem, we propose UrbanWorld, the first
generative urban world model that can automatically create a customized,
realistic and interactive 3D urban world with flexible control conditions.
UrbanWorld incorporates four key stages in its automatic crafting pipeline:
3D layout generation from openly accessible OSM data, urban scene planning and
designing with a powerful urban multimodal large language model (Urban MLLM),
controllable urban asset rendering with advanced 3D diffusion techniques, and
finally, MLLM-assisted scene refinement. The crafted high-fidelity 3D urban
environments enable realistic feedback and interactions for general AI and
machine perceptual systems in simulations. We are working on contributing
UrbanWorld as an open-source and versatile platform for evaluating and
improving AI abilities in perception, decision-making, and interaction in
realistic urban environments.
| Search Query: ArXiv Query: search_query=au:"Jiansheng Chen"&id_list=&start=0&max_results=3