Joint Camera-LiDAR Scene Synthesis and Perception for Autonomous Driving
Document Type
Article
Publication Title
IEEE Access
Abstract
The advancement of autonomous driving and embedded AI systems has intensified the need for large-scale, richly annotated multimodal datasets encompassing RGB images, semantic labels, and 3D LiDAR data. Manual collection and annotation of such datasets remain costly and time-consuming, especially when temporal and cross-modal consistency is required. This work introduces Joint Camera-LiDAR Scene Synthesis and Perception (JCLSP), a unified generative framework that simultaneously synthesizes photorealistic RGB images, semantic segmentation maps, and LiDAR range images through a single compact, optimized diffusion process. Unlike prior approaches that employ separate diffusion branches for each modality, JCLSP fuses the image and LiDAR modalities early in the pipeline and leverages a shared latent space for coherent multimodal generation. The architecture integrates three key elements: BKSDM, which streamlines the diffusion process by eliminating redundant blocks; a joint image-LiDAR diffusion module that applies the BKSDM framework to enable depth-aware synthesis with geometric fidelity; and modality-specific decoders that extract semantic masks, LiDAR range images, and image scenes from the shared latent representation. Experimental results on synthetic datasets indicate that JCLSP captures meaningful cross-modal correlations and preserves spatial features. By generating joint camera and LiDAR representations together with their semantic segmentation annotations, the method shows promise for cross-modal representation learning with labeled data.
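The abstract refers to LiDAR range images without stating how they are formed. A common way to obtain one is a spherical projection of the point cloud onto a 2D grid indexed by elevation and azimuth. The following is a minimal sketch of that projection, not the authors' implementation: the function name, the image size, and the vertical field-of-view defaults are all assumptions chosen for illustration.

```python
import numpy as np


def points_to_range_image(points, height=32, width=512,
                          fov_up_deg=10.0, fov_down_deg=-30.0):
    """Project (N, 3) x, y, z points onto a (height, width) range image.

    All defaults are illustrative assumptions, not values from the paper.
    Empty pixels are 0; when several points share a pixel, the nearest wins.
    """
    points = np.asarray(points, dtype=np.float64)
    fov_up = np.radians(fov_up_deg)
    fov = fov_up - np.radians(fov_down_deg)

    # Range of each point; drop points at the sensor origin.
    ranges = np.linalg.norm(points, axis=1)
    keep = ranges > 0
    points, ranges = points[keep], ranges[keep]

    yaw = np.arctan2(points[:, 1], points[:, 0])   # azimuth in [-pi, pi]
    pitch = np.arcsin(points[:, 2] / ranges)       # elevation angle

    # Map angles to fractional pixel coordinates (row 0 is the top beam).
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (fov_up - pitch) / fov * height

    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # Keep the minimum range per pixel, then mark untouched pixels as 0.
    image = np.full((height, width), np.inf, dtype=np.float64)
    np.minimum.at(image, (rows, cols), ranges)
    image[np.isinf(image)] = 0.0
    return image


if __name__ == "__main__":
    cloud = np.random.default_rng(0).normal(size=(1000, 3)) * 10.0
    print(points_to_range_image(cloud).shape)
```

Because the result is an ordinary 2D array, it can be handled with the same image-style operations as an RGB frame, which is what makes a shared image-LiDAR representation practical.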
First Page
166740
Last Page
166759
DOI
10.1109/ACCESS.2025.3613054
Publication Date
1-1-2025
Recommended Citation
Raghavendra, S.; Abhilash, S. K.; Madhav Nookala, Venu; and Arun Kumar, P. V., "Joint Camera-LiDAR Scene Synthesis and Perception for Autonomous Driving" (2025). Open Access archive. 14283.
https://impressions.manipal.edu/open-access-archive/14283