LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang

Yuqing Zhang, Hangqi Li, Shengyu Zhang*, Runzhong Wang, Baoyi He, Huaiyong Dou, Junchi Yan*, Yongquan Zhang, Fei Wu

Abstract


"Restoring ancient manuscript fragments, such as those from Dunhuang, is crucial for preserving human historical culture. However, their worldwide dispersal and the shifts in cultural and historical contexts pose significant restoration challenges. Traditional archaeological efforts primarily focus on manually piecing major fragments together, yet the smaller and more intricate pieces remain largely unexplored, owing to the technical difficulty posed by their irregular shapes, sparse textual content, and extensive combinatorial space for reassembly. In this paper, we formalize the task of restoring an ancient manuscript from fragments as a cardinality-constrained combinatorial optimization problem, and propose a framework named LLMCO4MR: (Multimodal) Large Language Model-aided Combinatorial Optimization Neural Networks for Ancient Manuscript Restoration. Specifically, LLMCO4MR encapsulates a neural combinatorial solver equipped with a differentiable optimal transport (OT) layer to efficiently predict the Top-K most likely mutual reassembly candidates. A Multimodal Large Language Model (MLLM) is then adopted and prompted to yield pairwise matching confidence and relative directions for final restoration. Experiments on synthetic data and case studies with famous real-world Dunhuang fragments demonstrate our approach’s practical potential in assisting archaeologists. Our method provides a novel perspective for ancient manuscript restoration."
