Abstract: In this paper, we present our work on Visual Speech Recognition (VSR) for the Mandarin Audio-Visual Speech Recognition (MAVSR) Challenge 2025, with a particular focus on improving lipreading ...
Abstract: Visual Speech Recognition (VSR), also known as lip-reading, requires accurate mouth tracking and alignment to achieve high performance. Conventional approaches rely on complex pipelines ...