Generating complete 360° panoramas from narrow field-of-view images remains an open research problem, as omnidirectional RGB data is not readily available. Existing GAN-based approaches face barriers to achieving higher-quality output and generalize poorly across different mask types. In this paper, we present our 360° indoor RGB panorama outpainting model using latent diffusion models (LDM), called PanoDiffusion. We introduce a new bi-modal latent diffusion structure that utilizes both RGB and depth panoramic data during training, which works surprisingly well to outpaint depth-free RGB images during inference. We further propose a novel technique of introducing progressive camera rotations during each diffusion denoising step, which leads to a substantial improvement in panorama wraparound consistency. Results show that our PanoDiffusion not only significantly outperforms state-of-the-art methods on RGB-D panorama outpainting by producing diverse, well-structured results for different types of masks, but can also synthesize high-quality depth panoramas to provide realistic 3D indoor models.
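To make the rotation idea concrete, below is a minimal sketch of how progressive camera rotations can be interleaved with denoising: because the panorama is equirectangular, a yaw of the camera corresponds to a circular shift of the latent along its width, so rolling the latent by a small amount at every denoising step moves the left/right seam around the full 360° and encourages wraparound consistency. This is an illustrative sketch only, assuming a diffusers-style UNet and scheduler API; the function name and exact step sizes are hypothetical and not taken from the paper.

```python
# Illustrative sketch: rotation-aware denoising for wraparound consistency.
# Assumptions (not from the paper): a diffusers-style scheduler and UNet with
# the usual (sample, timestep) call signature; latent layout is
# (batch, channels, height, width), where width spans the full 360 degrees.
import torch


def denoise_with_rotation(unet, scheduler, latent, num_steps=50):
    """Denoise a panorama latent, rolling it horizontally each step so the
    left/right seam lands at a different longitude every iteration."""
    scheduler.set_timesteps(num_steps)
    width = latent.shape[-1]
    shift = max(width // num_steps, 1)   # per-step rotation in latent pixels
    total = 0                            # cumulative rotation applied so far
    for t in scheduler.timesteps:
        # "Rotate the camera": a circular shift along the width axis is the
        # equirectangular equivalent of yawing the panorama.
        latent = torch.roll(latent, shifts=shift, dims=-1)
        total = (total + shift) % width
        noise_pred = unet(latent, t).sample
        latent = scheduler.step(noise_pred, t, latent).prev_sample
    # Undo the accumulated rotation so the output is in the original frame.
    return torch.roll(latent, shifts=-total, dims=-1)
```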
The overall pipeline of our proposed PanoDiffusion method.
PanoDiffusion effectively generates semantically meaningful content and plausible appearances across various masks, offering multiple diverse solutions.
RGB input
Depth GT
PanoDiffusion output
Given complete RGB images, our PanoDiffusion can generate the corresponding accurate absolute depth images.
We provide some synthesized RGB-D panorama examples where RGB is partially visible and depth is fully masked. The results show that our PanoDiffusion can simultaneously outpaint plausible and consistent RGB-D panoramas.
This website is adapted from GLIGEN.
@misc{wu2023ipoldm,
title={PanoDiffusion: Depth-aided 360-degree Indoor RGB Panorama Outpainting via Latent Diffusion Model},
author={Tianhao Wu and Chuanxia Zheng and Tat-Jen Cham},
year={2023},
eprint={2307.03177},
archivePrefix={arXiv},
primaryClass={cs.CV}
}