Point-Guided Latent Diffusion Model for Novel View Synthesis in Laparoscopic Liver Surgery.

Despite recent progress in diffusion-based video synthesis, synthesizing accurate novel views from sparse input frames in laparoscopic liver surgery remains challenging due to occlusions, the complex shapes of anatomical structures, and a limited field of view. We propose a point-guided latent diffusion model specifically designed for generating high-quality intermediate frames in laparoscopic liver surgery from only the first and last video frames. Our method combines the generative capability of latent diffusion models with geometric cues from 3D point clouds reconstructed via dense stereo matching. To robustly handle occlusions and shape deformation, we use an adaptive camera trajectory planning strategy based on next-best-view algorithms. Furthermore, we introduce a spatial-transformer-enhanced decoder to preserve detailed anatomical features from reference frames and minimize visual artefacts in generated views. Extensive experiments on the clinically relevant P2ILF challenge dataset validate our method's effectiveness and superior performance in producing visually coherent and structurally accurate novel views, highlighting its ability to enhance the quality of surgical scene reconstruction.

Journal

Healthcare Technology Letters

Permalink to this Record

https://hdl.handle.net/20.500.14838/2449

Link to Publisher Site (DOI)

10.1049/htl2.70032

Collections

External Publications

