Nymeria: A Massive Collection of Egocentric Multi-modal Human Motion in the Wild

Lingni Ma*, Yuting Ye, Rowan Postyeni, Alexander J Gamino, Vijay Baiyya, Luis Pesqueira, Kevin M Bailey, David Soriano Fosas, Fangzhou Hong, Vladimir Guzov, Yifeng Jiang, Hyo Jin Kim, Jakob Engel, Karen Liu, Ziwei Liu, Renzo De Nardi, Richard Newcombe

Abstract


We introduce Nymeria, a large-scale, diverse, richly annotated human motion dataset collected in the wild with multiple multimodal egocentric devices. The dataset comes with a) full-body ground-truth motion; b) multimodal egocentric data from multiple Project Aria devices, including video, eye tracking, and IMUs; and c) a third-person perspective from an additional “observer”. All devices are precisely synchronized and localized in one metric 3D world. We derive a hierarchical protocol to add in-context language descriptions of human motion, from fine-grained motion narrations, to simplified atomic actions and high-level activity summarization. To the best of our knowledge, the Nymeria dataset is the world’s largest collection of human motion in the wild; the first of its kind to provide synchronized and localized multi-device multimodal egocentric data; and the world’s largest motion-language dataset. It provides hours of daily activities from participants across locations, total travelling distance over . The language descriptions contain sentences in words from a vocabulary size of 6545. To demonstrate the potential of the dataset, we evaluate several SOTA algorithms for egocentric body tracking, motion synthesis, and action recognition.
