LiteReality

Graphics-Ready 3D Scene Reconstruction from RGB-D Scans

University of Cambridge, The University of Hong Kong, Technical University of Munich

We are excited to present LiteReality ✨, an automatic pipeline that converts RGB-D scans of indoor environments into graphics-ready 🏠 scenes. In these scenes, all objects are represented as high-quality meshes with PBR materials 🎨 that match their real-world appearance. The scenes also include articulated objects 🔧 and are ready to integrate into graphics pipelines for rendering 💡 and physics-based interactions 🕹️.

Video

🎬 Watch our explainer video now and understand everything in just 4 minutes! ⏱️💡✨

Abstract

We propose LiteReality, a novel pipeline that converts RGB-D scans of indoor environments into compact, realistic, and interactive 3D virtual replicas. LiteReality not only reconstructs scenes that visually resemble reality but also supports key features essential for graphics pipelines—such as object individuality, articulation, high-quality physically based rendering materials, and physically based interaction. At its core, LiteReality first performs scene understanding and parses the results into a coherent 3D layout and objects, with the help of a structured scene graph. It then reconstructs the scene by retrieving the most visually similar 3D artist-crafted models from a curated asset database. Next, the Material Painting module enhances the realism of the retrieved objects by recovering high-quality, spatially varying materials. Finally, the reconstructed scene is integrated into a simulation engine with basic physical properties applied to enable interactive behavior. The resulting scenes are compact, editable, and fully compatible with standard graphics pipelines, making them suitable for applications in AR/VR, gaming, robotics, and digital twins. In addition, LiteReality introduces a training-free object retrieval module that achieves state-of-the-art similarity performance, as benchmarked on the Scan2CAD dataset, along with a robust Material Painting module capable of transferring appearances from images of any style to 3D assets—even in the presence of severe misalignment, occlusion, and poor lighting. We demonstrate the effectiveness of LiteReality on both real-life scans and public datasets.

Side-by-Side Comparison

🏠 Left: Rendered 3D scene reconstruction
📸 Right: Original RGB capture
Swipe through the carousel above to see more examples. 🔄

Pipeline of LiteReality

LiteReality Pipeline

Given input RGB-D scans, the process begins with scene perception and parsing, where the room layout and 3D object bounding boxes are detected and organized into a structured, physically plausible arrangement with the help of a scene graph. In the object reconstruction stage, identity clustering first groups repeated instances of the same object, and a hierarchical retrieval approach then matches each object to a 3D model from the LiteReality database. The material painting stage retrieves and optimizes PBR materials by referencing the observed images. Finally, procedural reconstruction assembles all of this information into a graphics-ready environment that offers both realism and interactivity.
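
To make the flow concrete, here is a minimal Python sketch of how the four stages chain together. Every function and attribute name in it (detect_layout_and_objects, parse_scene, retrieve_model, and so on) is a hypothetical placeholder for illustration, not the released API.

# Hypothetical end-to-end sketch of the four LiteReality stages;
# every name below is a placeholder, not the released API.

def reconstruct(rgbd_scan):
    # 1. Scene perception and parsing: detect the room layout and object
    #    bounding boxes, then resolve them into a physically plausible
    #    arrangement via a structured scene graph.
    layout, boxes = detect_layout_and_objects(rgbd_scan)        # hypothetical
    scene_graph = parse_scene(layout, boxes)                    # hypothetical

    # 2. Object reconstruction: group repeated instances of the same
    #    object, then retrieve the most visually similar artist-crafted
    #    mesh for each group from the asset database.
    clusters = cluster_identical_objects(scene_graph.objects)   # hypothetical
    meshes = {c.id: retrieve_model(c, database="LiteReality") for c in clusters}

    # 3. Material painting: recover spatially varying PBR materials for
    #    each object by referencing the observed RGB frames.
    for obj in scene_graph.objects:
        obj.material = paint_materials(meshes[obj.cluster_id], rgbd_scan.frames)

    # 4. Procedural reconstruction: assemble layout, meshes, and
    #    materials into a graphics-ready, interactive scene.
    return assemble_scene(scene_graph, meshes)

Retrieving artist-crafted geometry rather than meshing noisy scan fragments is what keeps the output compact, clean, and directly usable in downstream graphics pipelines.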

Material Painting

One of the key challenges in the pipeline—which prior work often struggles with—is reliably recovering PBR materials at scale. While single-image methods perform well on clear views, they frequently fail on room-level scans under occlusion, poor lighting, and geometric misalignment between retrieved objects and input images. To address these limitations, we introduce an MLLM-based retrieval-and-optimization framework for robust, scalable material recovery.
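
To illustrate the retrieve-then-optimize idea, here is a minimal sketch assuming a material library, an MLLM-backed ranking helper (rank_materials), and a differentiable renderer (render) are available; both helper names are hypothetical stand-ins, and the actual module is detailed in the paper.

import torch

def recover_material(mesh, reference_crop, library):
    # Retrieval: ask the MLLM to rank candidate materials from the
    # library against the observed crop; semantic ranking tolerates
    # occlusion and misalignment better than pixel-space matching.
    material = rank_materials(reference_crop, library)[0]       # hypothetical

    # Optimization: refine a few scalar material parameters so the
    # rendered appearance matches the reference image.
    params = torch.tensor([*material.base_color, material.roughness],
                          requires_grad=True)
    optimizer = torch.optim.Adam([params], lr=1e-2)
    for _ in range(200):
        optimizer.zero_grad()
        rendered = render(mesh, base_color=params[:3],          # hypothetical
                          roughness=params[3])
        loss = torch.nn.functional.l1_loss(rendered, reference_crop)
        loss.backward()
        optimizer.step()
    return params.detach()

Retrieving a plausible material first and only then fine-tuning a handful of parameters is what makes the step robust: even when the reference crop is occluded or poorly lit, the optimization starts from a sensible prior rather than a blank texture.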

👇 Examples:

  • 🖌️📸 Left: Model painted using multiple reference images
  • 🎨✨ Right: Single reference image used to paint three different models

Applications

Creating interactive, graphics-ready scenes unlocks a wide range of application scenarios. Here, we showcase a few examples, including flexible relighting, physics-based interactions, and object-level manipulation. These capabilities open the door to even more applications, such as VR/AR experiences, robotics simulation, interior design, and digital twin systems.
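
Because the output is a standard mesh-and-material scene, loading it into a physics engine takes only a few lines. Below is a minimal PyBullet sketch that drops a single reconstructed object onto a ground plane; chair.obj is a placeholder for any mesh exported by the pipeline, and the mass is an arbitrary illustrative value.

import pybullet as p
import pybullet_data

p.connect(p.DIRECT)                       # p.GUI opens an interactive viewer
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.81)
p.loadURDF("plane.urdf")                  # ground plane shipped with pybullet_data

# "chair.obj" is a placeholder file name; 5 kg is an arbitrary mass.
collision = p.createCollisionShape(p.GEOM_MESH, fileName="chair.obj")
visual = p.createVisualShape(p.GEOM_MESH, fileName="chair.obj")
chair = p.createMultiBody(baseMass=5.0,
                          baseCollisionShapeIndex=collision,
                          baseVisualShapeIndex=visual,
                          basePosition=[0, 0, 1.0])

for _ in range(240):                      # let the object settle for one second
    p.stepSimulation()

print(p.getBasePositionAndOrientation(chair))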

Limitations and Future Improvements

At present, LiteReality targets typical indoor environments rather than spaces with extreme design variation. To generalize to more diverse scenes, several key limitations need to be addressed:

  • 🔄 Generalization to complex object relationships:
    Object relationships are currently inferred from bounding boxes at the detection stage, and scene parsing is performed agnostic to those relationships in order to guarantee physically plausible layouts. This prevents the system from handling more intricate configurations—e.g., a sink embedded in a cabinet, which is common in kitchen environments.
  • 💡 Lack of lighting estimation:
    Our material-painting stage skips any explicit lighting estimation. While this simplifies the pipeline, it also reduces the photorealism of the rendered scenes compared to the original captures.
  • 🐁 Exclusion of small objects:
    LiteReality currently focuses on reconstructing room-defining elements (similar to Apple’s RoomPlan); smaller objects are not yet included in the pipeline.

More Details?

For more details on methodologies, evaluation metrics, comparison with baselines, and the LiteReality database, please refer to our paper.

Read the Paper

Citation

@misc{huang2025literealitygraphicsready3dscene,
  title={LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans},
  author={Zhening Huang and Xiaoyang Wu and Fangcheng Zhong and Hengshuang Zhao and Matthias Nießner and Joan Lasenby},
  year={2025},
  eprint={2507.02861},
  url={https://arxiv.org/abs/2507.02861}
}