QUICK REVIEW

[Paper Review] Indian Sign Language Recognition Using Mediapipe Holistic

Velmathi Guruviah, Kaushal Goyal|arXiv (Cornell University)|Apr 20, 2023

Hand Gesture Recognition Systems | Cited by 14

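One-Sentence Summary

This paper proposes an Indian Sign Language recognition system built on MediaPipe Holistic and compares CNN and LSTM models on static and gesture signs: the CNN performs better on static letters and characters, while the LSTM performs better on gesture phrases and sentences.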

ABSTRACT

Deaf individuals confront significant communication obstacles on a daily basis. Their inability to hear makes it difficult for them to communicate with those who do not understand sign language. Moreover, it presents difficulties in educational, occupational, and social contexts. By providing alternative communication channels, technology can play a crucial role in overcoming these obstacles. One such technology that can facilitate communication between deaf and hearing individuals is sign language recognition. We will create a robust system for sign language recognition in order to convert Indian Sign Language to text or speech. We will evaluate the proposed system and compare CNN and LSTM models. Since there are both static and gesture sign languages, a robust model is required to distinguish between them. In this study, we discovered that a CNN model captures letters and characters for recognition of static sign language better than an LSTM model, but it outperforms CNN by monitoring hands, faces, and pose in gesture sign language phrases and sentences. The creation of a text-to-sign language paradigm is essential since it will enhance the sign language-dependent deaf and hard-of-hearing population's communication skills. Even though the sign-to-text translation is just one side of communication, not all deaf or hard-of-hearing people are proficient in reading or writing text. Some may have difficulty comprehending written language due to educational or literacy issues. Therefore, a text-to-sign language paradigm would allow them to comprehend text-based information and participate in a variety of social, educational, and professional settings. Keywords: deaf and hard-of-hearing, DHH, Indian sign language, CNN, LSTM, static and gesture sign languages, text-to-sign language model, MediaPipe Holistic, sign language recognition, SLR, SLT

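Research Motivation and Objectives

Support accessible communication for deaf and hard-of-hearing people in educational, occupational, and social settings.
Develop a robust sign language recognition system that converts Indian Sign Language to text or speech.
Evaluate and compare CNN and LSTM models on static and gesture sign language.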

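Proposed Method

Use MediaPipe Holistic to extract holistic features covering the hands, face, and pose.
Train and compare CNN and LSTM models on static sign recognition (letters and characters).
Extend the evaluation to gesture sign language phrases and sentences that rely on holistic cues.
Motivate a text-to-sign-language paradigm as the complementary direction of communication.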
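
As a rough illustration of the feature-extraction step, the sketch below shows one common way to turn a single frame of holistic landmarks into a fixed-length feature vector. It assumes the landmark counts that MediaPipe Holistic reports (33 pose, 468 face without iris refinement, 21 per hand) and a hypothetical input layout of plain coordinate tuples. The name `flatten_frame` and the zero-filling of undetected parts are illustrative choices, not details taken from the paper.

```python
# Landmark counts reported by MediaPipe Holistic (face count assumes no iris refinement).
POSE_POINTS, FACE_POINTS, HAND_POINTS = 33, 468, 21
POSE_DIMS, XYZ_DIMS = 4, 3  # pose: x, y, z, visibility; face/hands: x, y, z


def _flatten(points, count, dims):
    """Flatten `count` landmarks of `dims` values each; zeros if the part was not detected."""
    if points is None:
        return [0.0] * (count * dims)
    if len(points) != count or any(len(p) != dims for p in points):
        raise ValueError(f"expected {count} landmarks with {dims} values each")
    return [float(v) for p in points for v in p]


def flatten_frame(pose=None, face=None, left_hand=None, right_hand=None):
    """Concatenate pose, face, and both hands into one feature vector (hypothetical layout)."""
    return (
        _flatten(pose, POSE_POINTS, POSE_DIMS)
        + _flatten(face, FACE_POINTS, XYZ_DIMS)
        + _flatten(left_hand, HAND_POINTS, XYZ_DIMS)
        + _flatten(right_hand, HAND_POINTS, XYZ_DIMS)
    )


# Total vector length implied by the counts above.
FEATURE_SIZE = (POSE_POINTS * POSE_DIMS + FACE_POINTS * XYZ_DIMS
                + 2 * HAND_POINTS * XYZ_DIMS)

if __name__ == "__main__":
    # Synthetic frame in which only the right hand is detected.
    right = [(0.5, 0.5, 0.0)] * HAND_POINTS
    vec = flatten_frame(right_hand=right)
    print(len(vec), FEATURE_SIZE)
```

Under these assumptions every frame maps to a vector of the same length regardless of which body parts were detected, which keeps the input shape stable for either a CNN or an LSTM.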

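Experimental Results

Research Questions

RQ1: Does a CNN outperform an LSTM at recognizing static Indian Sign Language signs when holistic features are used?
RQ2: Which model better exploits holistic cues (hands, face, pose) for gesture sign language recognition?
RQ3: Is a text-to-sign-language paradigm feasible and beneficial for giving deaf and hard-of-hearing people access to text-based information?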

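Key Findings

The CNN captures static sign language letters and characters better than the LSTM.
For gesture sign language phrases and sentences, the LSTM outperforms the CNN by tracking the hands, face, and pose.
The study highlights the potential of a text-to-sign-language framework for supporting sign-language-based communication.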
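
The static/gesture split in these findings corresponds to two input shapes: a static sign can be classified from a single frame, whereas a gesture is a sequence of frames. The sketch below pads or truncates a variable-length clip to a fixed number of frames, a common preprocessing step before a sequence model such as an LSTM. The default clip length of 30 frames, the function name `to_fixed_length`, and the zero-padding strategy are assumptions for illustration; this review does not state how the authors prepared their sequences.

```python
def to_fixed_length(frames, num_frames=30):
    """Pad with zero vectors or truncate so the clip has exactly `num_frames` frames.

    `frames` is a list of equal-length per-frame feature vectors.
    """
    if not frames:
        raise ValueError("clip must contain at least one frame")
    width = len(frames[0])
    if any(len(f) != width for f in frames):
        raise ValueError("all frames must have the same feature size")
    # Keep at most the first `num_frames` frames.
    clipped = [list(f) for f in frames[:num_frames]]
    # Append all-zero frames if the clip is shorter than `num_frames`.
    padding = [[0.0] * width for _ in range(num_frames - len(clipped))]
    return clipped + padding


if __name__ == "__main__":
    short_clip = [[0.1, 0.2], [0.3, 0.4]]             # 2 frames, 2 features each
    long_clip = [[float(i), 0.0] for i in range(50)]  # 50 frames
    print(len(to_fixed_length(short_clip)), len(to_fixed_length(long_clip)))
```

A static sign would skip this step and pass one frame's features straight to the classifier.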

This review was generated by AI and reviewed by human editors.