Kavli Affiliate: Jiansheng Chen
| First 5 Authors: Bochao Zou, Zizheng Guo, Jiansheng Chen, Huimin Ma,
| Summary:
Remote photoplethysmography (rPPG) is a non-contact method for detecting
physiological signals from facial videos, with high potential in applications
such as healthcare, affective computing, and anti-spoofing. Due to the
periodic nature of rPPG, the Transformer's capacity for capturing long-range
dependencies was assumed to be advantageous for such signals. However,
existing approaches have not conclusively demonstrated that Transformers
outperform traditional convolutional neural network methods; this gap
may stem from a lack of thorough exploration of rPPG periodicity. In this
paper, we propose RhythmFormer, a fully end-to-end transformer-based method for
extracting rPPG signals by explicitly leveraging the quasi-periodic nature of
rPPG. The core module, the Hierarchical Temporal Periodic Transformer,
hierarchically extracts periodic features across multiple temporal scales. It
utilizes dynamic sparse attention based on periodicity in the temporal domain,
allowing for fine-grained modeling of rPPG features. Furthermore, a fusion stem
is proposed to effectively guide self-attention toward rPPG features, and it can
be easily transferred to existing methods to enhance their performance
significantly. In comprehensive experiments, RhythmFormer achieves
state-of-the-art performance with fewer parameters and lower computational
complexity than previous approaches. The code is available at
https://github.com/zizheng-guo/RhythmFormer.
| Search Query: ArXiv Query: search_query=au:"Jiansheng Chen"&id_list=&start=0&max_results=3
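
The summary mentions dynamic sparse attention driven by temporal periodicity. As a hedged illustration only (this is not the paper's implementation; `estimate_period`, `periodic_sparse_attention`, the tolerance `tol`, and the FFT-based period estimate are all assumptions made here for the sketch), the toy example below shows one way an estimated pulse period could gate which temporal positions attend to each other:

```python
# Hypothetical sketch (not RhythmFormer's actual design): each frame attends
# only to frames whose temporal offset is approximately a multiple of an
# estimated pulse period, giving a periodicity-guided sparse attention mask.
import torch
import torch.nn.functional as F


def estimate_period(signal: torch.Tensor) -> int:
    """Crudely estimate the dominant period (in frames) via the FFT peak."""
    spectrum = torch.fft.rfft(signal - signal.mean()).abs()
    spectrum[0] = 0.0  # ignore the DC component
    peak_bin = int(torch.argmax(spectrum))
    return max(1, round(len(signal) / max(peak_bin, 1)))


def periodic_sparse_attention(x: torch.Tensor, period: int, tol: int = 1) -> torch.Tensor:
    """Temporal self-attention where frame i may attend to frame j only if
    |i - j| is within `tol` frames of a multiple of `period`.

    x: (T, D) sequence of per-frame features.
    """
    T, D = x.shape
    q, k, v = x, x, x  # identity projections keep the toy example minimal
    scores = q @ k.t() / D**0.5  # (T, T) similarity scores
    offsets = torch.abs(torch.arange(T)[:, None] - torch.arange(T)[None, :])
    # Distance from each temporal offset to the nearest multiple of the period.
    residue = torch.minimum(offsets % period, period - offsets % period)
    mask = residue <= tol  # keep near-periodic (and local) positions only
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v  # (T, D) attended features


if __name__ == "__main__":
    T, D = 120, 16  # e.g., 4 seconds of 30 fps video, toy feature dimension
    t = torch.arange(T, dtype=torch.float32)
    # Synthetic quasi-periodic "pulse" feature (~1.5 Hz at 30 fps => ~20-frame period)
    pulse = torch.sin(2 * torch.pi * t / 20.0)
    x = pulse[:, None] + 0.1 * torch.randn(T, D)
    period = estimate_period(pulse)
    out = periodic_sparse_attention(x, period)
    print(period, out.shape)  # e.g., 20 torch.Size([120, 16])
```

The design choice illustrated is only the general idea stated in the abstract: sparsifying attention along the temporal axis according to periodicity, so that computation concentrates on frames that are in phase with the quasi-periodic pulse signal.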