Oral Presentation World Sustainable Built Environment Conference 2026

SYNBUILD-3D: A large and semantically rich synthetic dataset for multi-modal 3D building generation at Level of Detail 4 (132348)

Kevin Mayer 1, Alex Vesel 1, Xinyi Zhao 1, Martin Fischer 1
  1. Stanford University, San Francisco, CA, United States

Today, energy auditors spend substantial time and effort manually converting floor plan images into 3D building models for energy simulations. Automating this process remains a major challenge because large-scale annotated datasets are scarce in the public domain. Inspired by the success of synthetic data in computer vision, we introduce SYNBUILD-3D, a large, diverse, and multi-modal dataset of over 6.2 million synthetic 3D residential buildings at Level of Detail (LoD) 4, designed to enable AI-driven 3D building generation. Each building in the dataset is represented through three distinct modalities: a semantically enriched 3D wireframe graph at LoD 4 (Modality I), the corresponding floor plan images (Modality II), and a LiDAR-like roof point cloud (Modality III). The semantic annotations for each building wireframe are derived from the corresponding floor plan images and include information on rooms, doors, and windows. The tri-modal nature of SYNBUILD-3D allows future work to develop novel generative AI algorithms that automate the creation of 3D building models at LoD 4, subject to predefined floor plan layouts and roof geometries, while enforcing semantic–geometric consistency. The dataset and code samples are publicly available (URL: TBA).
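
The three modalities described above can be pictured as one record per building. The following is a minimal sketch only, assuming a simple in-memory layout; the class names, field names, label set, and the `is_consistent` check are hypothetical illustrations and do not reflect the actual SYNBUILD-3D file format or API.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point3D = Tuple[float, float, float]

# Assumed label set, taken from the annotation types named in the abstract.
SEMANTIC_LABELS = {"room", "door", "window"}


@dataclass
class SemanticElement:
    """A labeled element that references wireframe vertices by index."""
    label: str
    vertex_ids: List[int]


@dataclass
class WireframeGraph:
    """Modality I (hypothetical layout): vertices, edges, semantic elements."""
    vertices: List[Point3D]
    edges: List[Tuple[int, int]]
    elements: List[SemanticElement]


@dataclass
class BuildingSample:
    """One building with all three modalities (hypothetical layout)."""
    wireframe: WireframeGraph              # Modality I
    floor_plans: List[List[List[int]]]     # Modality II: one pixel grid per floor
    roof_points: List[Point3D]             # Modality III: LiDAR-like roof points


def is_consistent(sample: BuildingSample) -> bool:
    """Basic structural check: valid indices, known labels, no empty modality."""
    graph = sample.wireframe
    n = len(graph.vertices)
    # Every edge must connect two distinct, existing vertices.
    for a, b in graph.edges:
        if a == b or not (0 <= a < n and 0 <= b < n):
            return False
    # Every semantic element must use a known label and existing vertices.
    for element in graph.elements:
        if element.label not in SEMANTIC_LABELS:
            return False
        if any(not (0 <= v < n) for v in element.vertex_ids):
            return False
    return n > 0 and len(sample.floor_plans) > 0 and len(sample.roof_points) > 0
```

A check of this kind only validates indices and labels; it is far weaker than the semantic–geometric consistency the abstract refers to, which would also require the annotations to agree geometrically with the floor plans and roof.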