Enhancing Cross-Subject fMRI-to-Video Decoding with Global-Local Functional Alignment
Chong Li*, Xuelin Qian, Yun Wang, Jingyang Huo, Xiangyang Xue*, Yanwei Fu*, Jianfeng Feng
Abstract
"Advancements in brain imaging enable the decoding of thoughts and intentions from neural activities. However, the fMRI-to-video decoding of brain signals across multiple subjects encounters challenges arising from structural and coding disparities among individual brains, further compounded by the scarcity of paired fMRI-stimulus data. Addressing this issue, this paper introduces the fMRI Global-Local Functional Alignment (GLFA) projection, a novel approach that aligns fMRI frames from diverse subjects into a unified brain space, thereby enhancing cross-subject decoding. Additionally, we present a meticulously curated fMRI-video paired dataset comprising a total of 75k fMRI-stimulus paired samples from 8 individuals. This dataset is approximately 4.5 times larger than the previous benchmark dataset. Building on this, we augment a transformer-based fMRI encoder with a diffusion video generator, delving into the realm of cross-subject fMRI-based video reconstruction. This innovative methodology faithfully captures semantic information from diverse brain signals, resulting in the generation of vivid videos and achieving an impressive average accuracy of 84.7% in cross-subject semantic classification tasks."