Machine translation of sign languages

The machine translation of sign languages has been possible, albeit in a limited fashion, since 1977, when a research project successfully matched English letters from a keyboard to ASL manual alphabet letters simulated on a robotic hand. These technologies translate signed languages into written or spoken language, and written or spoken language into sign language, without the use of a human interpreter. Sign languages possess phonological features different from those of spoken languages, which has created obstacles for developers. Developers use computer vision and machine learning to recognize specific phonological parameters and epentheses [1] unique to sign languages, while speech recognition and natural language processing allow interactive communication between hearing and deaf people.

Limitations

Sign language translation technologies are limited in the same way as spoken language translation: none can translate with 100% accuracy. In fact, sign language translation technologies lag far behind their spoken language counterparts, in no small part because signed languages have multiple articulators. Where spoken languages are articulated through the vocal tract, signed languages are articulated through the hands, arms, head, shoulders, torso, and parts of the face. This multi-channel articulation makes translating sign languages very difficult. An additional challenge for sign language MT is that there is no formal written format for signed languages. There are notation systems, but no writing system has been adopted widely enough by the international Deaf community to be considered the 'written form' of a given sign language. Sign languages are instead recorded in various video formats. There is, for example, no gold-standard parallel corpus large enough for statistical machine translation (SMT).

History

The history of automatic sign language translation started with the development of hardware such as finger-spelling robotic hands. In 1977, a finger-spelling hand project called RALPH (short for "Robotic Alphabet") created a robotic hand that could translate letters of the alphabet into fingerspellings. [2] Later, gloves with motion sensors became mainstream, and projects such as the CyberGlove and VPL Data Glove were born. [3] The wearable hardware made it possible to capture signers' hand shapes and movements with the help of computer software. However, with the development of computer vision, wearable devices were replaced by cameras because of their efficiency and the fewer physical restrictions they place on signers. [3] To process the data collected through the devices, researchers implemented neural networks such as the Stuttgart Neural Network Simulator [4] for pattern recognition in projects such as the CyberGlove. Researchers also use many other approaches for sign recognition. For example, Hidden Markov Models are used to analyze data statistically, [3] and GRASP and other machine learning programs use training sets to improve the accuracy of sign recognition. [5] Fusion of non-wearable technologies such as cameras and Leap Motion controllers has been shown to increase the ability of automatic sign language recognition and translation software. [6]
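In the Hidden Markov Model approach mentioned above, each sign is modeled as a sequence of feature vectors (for example, glove sensor readings or hand positions over time), and an observed sequence is assigned to the sign whose model explains it best. The following is a minimal sketch of that general idea using the hmmlearn library and synthetic data; the feature dimensions, state counts, and training set are assumptions for illustration, not the setup of any of the projects cited here.

```python
# Toy Hidden-Markov-Model sign recognizer: one Gaussian HMM per sign,
# classification by maximum log-likelihood. Illustrative only.
import numpy as np
from hmmlearn import hmm  # pip install hmmlearn

def train_sign_model(sequences, n_states=4):
    """Fit one Gaussian HMM for a sign from a list of (frames, features) arrays."""
    X = np.concatenate(sequences)          # stack the frames of every example
    lengths = [len(s) for s in sequences]  # remember where each example ends
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=50, random_state=0)
    model.fit(X, lengths)
    return model

def recognize(observation, models):
    """Return the sign whose HMM assigns the observation the highest likelihood."""
    return max(models, key=lambda sign: models[sign].score(observation))

# Synthetic training data: 5 examples per sign, 20 frames, 6 features per frame.
rng = np.random.default_rng(0)
training = {
    "HELLO": [rng.normal(0.0, 1.0, (20, 6)) for _ in range(5)],
    "THANK-YOU": [rng.normal(2.0, 1.0, (20, 6)) for _ in range(5)],
}
models = {sign: train_sign_model(seqs) for sign, seqs in training.items()}
print(recognize(rng.normal(2.0, 1.0, (20, 6)), models))  # likely "THANK-YOU"
```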

Technologies

VISICAST

http://www.visicast.cmp.uea.ac.uk/Visicast_index.html

eSIGN project

http://www.visicast.cmp.uea.ac.uk/eSIGN/index.html

The American Sign Language Avatar Project at DePaul University

http://asl.cs.depaul.edu/

Spanish to LSE

SignAloud

SignAloud is a technology that incorporates a pair of gloves, made by a group of students at the University of Washington, that transliterate [7] American Sign Language (ASL) into English. [8] In February 2015 Thomas Pryor, a hearing student from the University of Washington, created the first prototype for this device at Hack Arizona, a hackathon at the University of Arizona. Pryor continued to develop the invention, and in October 2015 he brought Navid Azodi onto the SignAloud project to handle marketing and public relations. Azodi has a background in business administration, while Pryor has experience in engineering. [9] In May 2016, the duo told NPR that they were working more closely with people who use ASL so that they could better understand their audience and tailor the product to those users' actual needs rather than their assumed needs. [10] However, no further versions have been released since then. The invention was one of seven to win the Lemelson-MIT Student Prize, which seeks to award and applaud young inventors. Their invention fell under the "Use it!" category of the award, which covers technological advances to existing products. They were awarded $10,000. [11] [12]

The gloves have sensors that track the user's hand movements and send the data to a computer system via Bluetooth. The computer system analyzes the data and matches it to English words, which are then spoken aloud by a digital voice. [10] The gloves cannot take written English input and produce glove movement output, nor can they hear spoken language and sign it to a deaf person, which means they do not provide reciprocal communication. The device also does not incorporate facial expressions and other nonmanual markers of sign languages, which may alter the actual interpretation from ASL. [13]
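As a rough illustration of the one-way pipeline described above (sensor frames in, spoken English out), the following sketch classifies fixed-length glove readings with a nearest-neighbour model and speaks the result through a text-to-speech engine. The sensor format, feature count, and the read_glove_frame function are hypothetical; the actual SignAloud hardware protocol and recognition model are not described at this level of detail in the sources here.

```python
# Hypothetical glove-to-speech pipeline: classify one sensor frame, then speak
# the matched English word aloud. Not SignAloud's actual implementation.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier  # pip install scikit-learn
import pyttsx3                                       # pip install pyttsx3

N_FEATURES = 15  # assumed: flex-sensor and accelerometer readings per frame

def read_glove_frame():
    """Placeholder for reading one frame from the gloves over Bluetooth."""
    return np.random.rand(N_FEATURES)

# Training data would normally come from labelled recordings of each sign.
X_train = np.random.rand(100, N_FEATURES)                # synthetic frames
y_train = np.random.choice(["HELLO", "YES", "NO"], 100)  # synthetic labels

classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(X_train, y_train)

engine = pyttsx3.init()  # the "digital voice"
frame = read_glove_frame()
word = classifier.predict(frame.reshape(1, -1))[0]
engine.say(str(word))    # speak the recognized word aloud
engine.runAndWait()
```

Note that, like the gloves themselves, this sketch runs in one direction only: there is no path from spoken or written English back to signed output.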

ProDeaf

ProDeaf (WebLibras) [14] is a software application that can translate both Portuguese text and voice into Libras (Brazilian Sign Language) "with the goal of improving communication between the deaf and hearing." [15] There is currently a beta edition in production for American Sign Language as well. The original team began the project in 2010 with a combination of experts including linguists, designers, programmers, and translators, both hearing and deaf. The team originated at the Federal University of Pernambuco (UFPE), from a group of students involved in a computer science project. The group had a deaf team member who had difficulty communicating with the rest of the group. In order to complete the project and help the teammate communicate, the group created Proativa Soluções and has been moving forward ever since. [16] The current beta version in American Sign Language is very limited. For example, there is a dictionary section, and the only word under the letter 'j' is 'jump'. If the device has not been programmed with a word, the digital avatar must fingerspell it. The last update of the app was in June 2016, but ProDeaf has been featured in over 400 stories across Brazil's most popular media outlets. [17]

The application cannot read sign language and turn it into words or text, so it serves only as one-way communication. Additionally, the user cannot sign to the app and receive an English translation in any form, as the English version is still in beta.

Kinect Sign Language Translator

Since 2012, researchers from the Chinese Academy of Sciences and specialists in deaf education from Beijing Union University in China have been collaborating with the Microsoft Research Asia team to create the Kinect Sign Language Translator. [18] The translator has two modes: translator mode and communication mode. The translator mode is capable of translating single words from sign into written words and vice versa. The communication mode can translate full sentences, and the conversation can be automatically translated with the use of a 3D avatar. The translator mode can also detect the postures and hand shapes of a signer, as well as the movement trajectory, using machine learning, pattern recognition, and computer vision. The device also allows for reciprocal communication, because speech recognition allows spoken language to be translated into sign language and the 3D avatar can sign back to deaf users. [19]
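One common way to compare a signer's movement trajectory against stored examples, as the translator mode is described as doing, is dynamic time warping (DTW) over joint positions captured by the depth sensor. The sketch below is a generic illustration with made-up 2D hand trajectories; it is not the specific recognition method used by the Microsoft Research project described above.

```python
# Generic dynamic-time-warping matcher for hand trajectories from a depth
# sensor such as Kinect. Illustrative only; not the project's own code.
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two trajectories of shape (frames, dimensions)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def classify(trajectory, templates):
    """Return the sign whose stored template trajectory is closest under DTW."""
    return min(templates, key=lambda sign: dtw_distance(trajectory, templates[sign]))

# Toy templates: right-hand (x, y) positions over time for two signs.
templates = {
    "HELLO": np.column_stack([np.linspace(0, 1, 30), np.zeros(30)]),      # sweep right
    "THANK-YOU": np.column_stack([np.zeros(30), np.linspace(1, 0, 30)]),  # move down
}
observed = np.column_stack([np.linspace(0, 1, 25), 0.05 * np.ones(25)])
print(classify(observed, templates))  # likely "HELLO"
```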

The original project was started in China and focused on translating Chinese Sign Language. In 2013, the project was presented at the Microsoft Research Faculty Summit and a Microsoft company meeting. [20] The project is also being worked on by researchers in the United States to implement American Sign Language translation. [21] As of now, the device is still a prototype, and the accuracy of translation in communication mode is still not perfect.

SignAll

SignAll [22] is an automatic sign language translation system provided by Dolphio Technologies [23] in Hungary. The team is "pioneering the first automated sign language translation solution, based on computer vision and natural language processing (NLP), to enable everyday communication between individuals with hearing who use spoken English and deaf or hard of hearing individuals who use ASL." The SignAll system uses Microsoft's Kinect and other web cameras with depth sensors connected to a computer. The computer vision technology recognizes the handshape and movement of a signer, and a natural language processing component converts the data collected through computer vision into a simple English phrase. The developer of the device is deaf, and the rest of the project team consists of many engineers and linguistics specialists from deaf and hearing communities. The technology has the capability of incorporating all five parameters of ASL, which helps the device accurately interpret the signer. SignAll has been endorsed by many companies, including Deloitte and LT-innovate, and has created partnerships with Microsoft Bizspark and Hungary's Renewal. [24] The technology is currently being used at Fort Bend Christian Academy in Sugar Land, Texas and at Sam Houston State University. [25]
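The final step described above, turning recognized sign data into a simple English phrase, can be pictured as mapping a sequence of recognized glosses onto an English sentence. The toy sketch below does this with a few hand-written rules; the gloss conventions and rewrite rules are assumptions for illustration, and a production system such as SignAll relies on considerably more sophisticated natural language processing.

```python
# Toy gloss-sequence-to-English converter illustrating the NLP step that turns
# recognized signs into a simple English phrase. Rules are hypothetical.
GLOSS_TO_ENGLISH = {"IX-1": "I", "STORE": "the store", "GO": "go"}

def glosses_to_english(glosses):
    """Reorder a topic-comment gloss sequence into a simple English sentence."""
    past = "FINISH" in glosses  # FINISH often marks a completed action in ASL
    words = [GLOSS_TO_ENGLISH.get(g, g.lower()) for g in glosses if g != "FINISH"]
    # Naive reordering: ASL commonly fronts the topic (e.g. STORE IX-1 GO),
    # whereas English prefers subject-verb order with the topic at the end.
    if len(words) == 3:
        topic, subject, verb = words
        words = [subject, verb, "to", topic]
    sentence = " ".join(words)
    if past:
        sentence = sentence.replace(" go ", " went ")  # crude tense adjustment
    return sentence.capitalize() + "."

print(glosses_to_english(["STORE", "IX-1", "GO", "FINISH"]))  # "I went to the store."
```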

MotionSavvy

MotionSavvy [26] was the first sign-language-to-voice system. The device was created in 2012 by a group from the Rochester Institute of Technology / National Technical Institute for the Deaf and "emerged from the Leap Motion accelerator AXLR8R." [27] The team used a tablet case that leverages the Leap Motion controller. The entire six-person team was made up of deaf students from the school's deaf-education branch. [28] The device is currently one of only two reciprocal communication devices solely for American Sign Language: it allows deaf individuals to sign to the device, which then interprets the signs, and it also takes spoken English and interprets it into American Sign Language. The device ships for $198. Other features include the ability to interact, real-time feedback, a sign builder, and crowdsign.

The device has been reviewed by outlets ranging from technology magazines to Time. Wired said, "It wasn't hard to see just how transformative a technology like [UNI] could be" and that "[UNI] struck me as sort of magical." Katy Steinmetz at TIME said, "This technology could change the way deaf people live." Sean Buckley at Engadget mentioned, "UNI could become an incredible communication tool."

Related Research Articles

American Sign Language

American Sign Language (ASL) is a natural language that serves as the predominant sign language of Deaf communities in the United States and most of Anglophone Canada. ASL is a complete and organized visual language that is expressed by employing both manual and nonmanual features. Besides North America, dialects of ASL and ASL-based creoles are used in many countries around the world, including much of West Africa and parts of Southeast Asia. ASL is also widely learned as a second language, serving as a lingua franca. ASL is most closely related to French Sign Language (LSF). It has been proposed that ASL is a creole language of LSF, although ASL shows features atypical of creole languages, such as agglutinative morphology.

Sign language

Sign languages are languages that use the visual-manual modality to convey meaning, instead of spoken words. Sign languages are expressed through manual articulation in combination with non-manual markers. Sign languages are full-fledged natural languages with their own grammar and lexicon. Sign languages are not universal and are usually not mutually intelligible, although there are also similarities among different sign languages.

Sutton SignWriting, or simply SignWriting, is a system of writing sign languages. It is highly featural and visually iconic, both in the shapes of the characters, which are abstract pictures of the hands, face, and body, and in their spatial arrangement on the page, which does not follow a sequential order like the letters that make up written English words. It was developed in 1974 by Valerie Sutton, a dancer who had, two years earlier, developed DanceWriting. Some newer standardized forms are known as the International Sign Writing Alphabet (ISWA).

A sign language glove is an electronic device which attempts to convert the motions of a sign language into written or spoken words. Some critics of such technologies have argued that the potential of sensor-enabled gloves to do this is commonly overstated or misunderstood, because many sign languages have a complex grammar that includes use of the sign space and facial expressions.

Nicaraguan Sign Language is a form of sign language which developed largely spontaneously among deaf children in a number of schools in Nicaragua in the 1980s. It is of particular interest to linguists as it offers them a unique opportunity to study what they believe to be the birth of a new language.

Auslan is the sign language used by the majority of the Australian Deaf community. The term Auslan is a portmanteau of "Australian Sign Language", coined by Trevor Johnston in the 1980s, although the language itself is much older. Auslan is related to British Sign Language (BSL) and New Zealand Sign Language (NZSL); the three have descended from the same parent language, and together comprise the BANZSL language family. Auslan has also been influenced by Irish Sign Language (ISL) and more recently has borrowed signs from American Sign Language (ASL).

Signing Exact English (SEE-II) is a system of manual communication that strives to be an exact representation of English language vocabulary and grammar. It is one of a number of such systems in use in English-speaking countries. It is related to Seeing Essential English (SEE-I), a manual sign system created in 1945, based on the morphemes of English words. SEE-II models much of its sign vocabulary from American Sign Language (ASL), but modifies the handshapes used in ASL in order to use the handshape of the first letter of the corresponding English word.

The American Manual Alphabet (AMA) is a manual alphabet that augments the vocabulary of American Sign Language.

Gesture recognition

Gesture recognition is an area of research and development in computer science and language technology concerned with the recognition and interpretation of human gestures. A subdiscipline of computer vision, it employs mathematical algorithms to interpret gestures.

Tactile signing is a common means of communication used by people with deafblindness. It is based on a sign language or another system of manual communication.

Wired glove

A wired glove is an input device for human–computer interaction worn like a glove.

A contact sign language, or contact sign, is a variety or style of language that arises from contact between deaf individuals using a sign language and hearing individuals using an oral language. Contact languages also arise between different sign languages, although the term pidgin rather than contact sign is used to describe such phenomena.

Bimodal bilingualism is an individual or community's bilingual competency in at least one oral language and at least one sign language, which utilize two different modalities. An oral language consists of a vocal-aural modality versus a signed language which consists of a visual-spatial modality. A substantial number of bimodal bilinguals are children of deaf adults (CODA) or other hearing people who learn sign language for various reasons. Deaf people as a group have their own sign language(s) and culture that is referred to as Deaf, but invariably live within a larger hearing culture with its own oral language. Thus, "most deaf people are bilingual to some extent in [an oral] language in some form". In discussions of multilingualism in the United States, bimodal bilingualism and bimodal bilinguals have often not been mentioned or even considered. This is in part because American Sign Language, the predominant sign language used in the U.S., only began to be acknowledged as a natural language in the 1960s. However, bimodal bilinguals share many of the same traits as traditional bilinguals, as well as differing in some interesting ways, due to the unique characteristics of the Deaf community. Bimodal bilinguals also experience similar neurological benefits as do unimodal bilinguals, with significantly increased grey matter in various brain areas and evidence of increased plasticity as well as neuroprotective advantages that can help slow or even prevent the onset of age-related cognitive diseases, such as Alzheimer's and dementia.

Kinect

Kinect is a line of motion sensing input devices produced by Microsoft and first released in 2010. The devices generally contain RGB cameras, and infrared projectors and detectors that map depth through either structured light or time of flight calculations, which can in turn be used to perform real-time gesture recognition and body skeletal detection, among other capabilities. They also contain microphones that can be used for speech recognition and voice control.

American Sign Language literature is one of the most important shared cultural experiences in the American deaf community. Literary genres initially developed in residential Deaf institutes, such as American School for the Deaf in Hartford, Connecticut, which is where American Sign Language developed as a language in the early 19th century. There are many genres of ASL literature, such as narratives of personal experience, poetry, cinematographic stories, folktales, translated works, original fiction and stories with handshape constraints. Authors of ASL literature use their body as the text of their work, which is visually read and comprehended by their audience viewers. In the early development of ASL literary genres, the works were generally not analyzed as written texts are, but the increased dissemination of ASL literature on video has led to greater analysis of these genres.

Language acquisition is a natural process in which infants and children develop proficiency in the first language or languages that they are exposed to. The process of language acquisition is varied among deaf children. Deaf children born to deaf parents are typically exposed to a sign language at birth and their language acquisition follows a typical developmental timeline. However, at least 90% of deaf children are born to hearing parents who use a spoken language at home. Hearing loss prevents many deaf children from hearing spoken language to the degree necessary for language acquisition. For many deaf children, language acquisition is delayed until the time that they are exposed to a sign language or until they begin using amplification devices such as hearing aids or cochlear implants. Deaf children who experience delayed language acquisition, sometimes called language deprivation, are at risk for lower language and cognitive outcomes. However, profoundly deaf children who receive cochlear implants and auditory habilitation early in life often achieve expressive and receptive language skills within the norms of their hearing peers; age at implantation is strongly and positively correlated with speech recognition ability. Early access to language, whether through signed language or technology, has been shown to prepare children who are deaf to achieve fluency in literacy skills.

Black American Sign Language

Black American Sign Language (BASL) or Black Sign Variation (BSV) is a dialect of American Sign Language (ASL) used most commonly by deaf African Americans in the United States. The divergence from ASL was influenced largely by the segregation of schools in the American South. Like other schools at the time, schools for the deaf were segregated based upon race, creating two language communities among deaf signers: black deaf signers at black schools and white deaf signers at white schools. As of the mid 2010s, BASL is still used by signers in the South despite public schools having been legally desegregated since 1954.

Nonmanual feature

A nonmanual feature, also sometimes called a nonmanual signal or sign language expression, is a feature of signed languages that does not use the hands. Nonmanual features are grammaticised and are a necessary component of many signs, in the same way that manual features are. Nonmanual features serve a similar function to intonation in spoken languages.

Disability dongles are devices designed for individuals with disabilities. The term was introduced in 2019 by Liz Jackson as a satirical label within the discourse on assistive technology, highlighting devices that are designed for individuals with disabilities without reflecting their genuine needs. While originating as a critical descriptor, the concept has come to capture the intersection of innovation, societal expectations, and the actual needs of the disabled community.

References

  1. Mocialov, Boris; Turner, Graham; Lohan, Katrin; Hastie, Helen (2017). "Towards Continuous Sign Language Recognition with Deep Learning" (PDF). Creating Meaning with Robot Assistants: The Gap Left by Smart Devices (IEEE-RAS International Conference on Humanoid Robots). S2CID   5525834. Archived from the original (PDF) on 2021-01-10. Retrieved 2020-05-04.
  2. Jaffe, DL (August 1994). "Evolution of mechanical fingerspelling hands for people who are deaf-blind". Journal of Rehabilitation Research and Development. 31 (3): 236–244. PMID   7965881.
  3. Parton, B. S. (12 October 2005). "Sign Language Recognition and Translation: A Multidisciplined Approach From the Field of Artificial Intelligence". Journal of Deaf Studies and Deaf Education. 11 (1): 94–101. doi: 10.1093/deafed/enj003 . PMID 16192405.
  4. Weissmann, J.; Salomon, R. (1999). "Gesture recognition for virtual reality applications using data gloves and neural networks". IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339). Vol. 3. pp. 2043–2046. doi:10.1109/IJCNN.1999.832699. ISBN   978-0-7803-5529-3. S2CID   18434944.
  5. Bowden, Richard; Zisserman, Andrew; Windridge, Dave; Kadir, Timor; Brady, Mike (2003). "Vision based Interpretation of Natural Sign Languages" (PDF). S2CID 67094263.
  6. Bird, Jordan J.; Ekárt, Anikó; Faria, Diego R. (9 September 2020). "British Sign Language Recognition via Late Fusion of Computer Vision and Leap Motion with Transfer Learning to American Sign Language". Sensors. 20 (18): 5151. Bibcode:2020Senso..20.5151B. doi: 10.3390/s20185151 . PMC   7571093 . PMID   32917024.
  7. "What is the difference between translation and transliteration". english.stackexchange.com. Retrieved 2017-04-06.
  8. "SignAloud". Archived from the original on 2020-09-21. Retrieved 2017-02-28.
  9. "Thomas Pryor and Navid Azodi | Lemelson-MIT Program". lemelson.mit.edu. Archived from the original on 2020-09-21. Retrieved 2019-07-04.
  10. 1 2 "These Gloves Offer A Modern Twist On Sign Language". All Tech Considered. NPR. 17 May 2016.
  11. "Collegiate Inventors Awarded Lemelson-MIT Student Prize". Lemelson-MIT Program. Archived from the original on 2021-01-13. Retrieved 2017-03-09.
  12. "UW undergraduate team wins $10,000 Lemelson-MIT Student Prize for gloves that translate sign language". University of Washington. 2016-04-12. Retrieved 2017-04-09.
  13. "Nonmanual markers in American Sign Language (ASL)". www.lifeprint.com. Retrieved 2017-04-06.
  14. "ProDeaf". prodeaf.net. Archived from the original on 2021-03-12. Retrieved 2017-04-09.
  15. "ProDeaf". www.prodeaf.net. Retrieved 2017-03-09.
  16. "ProDeaf". www.prodeaf.net. Retrieved 2017-03-16.
  17. "ProDeaf Tradutor para Libras on the App Store". App Store. Retrieved 2017-03-09.
  18. Chen, Xilin; Li, Hanjing; Pan, Tim; Tansley, Stewart; Zhou, Ming. "Kinect Sign Language Translator expands communication possibilities" (PDF). Microsoft Research Connections. Archived from the original (PDF) on 29 March 2014.
  19. Chai, Xiujuan; Li, Guang; Lin, Yushun; Xu, Zhihao; Tang, Y. B.; Chen, Xilin (2013). "Sign Language Recognition and Translation with Kinect" (PDF). CiteSeerX 10.1.1.711.4714. S2CID 17957882.
  20. "Kinect Sign Language Translator". Microsoft . 29 October 2013.
  21. Zafrulla, Zahoor; Brashear, Helene; Starner, Thad; Hamilton, Harley; Presti, Peter (2011). "American sign language recognition with the kinect". Proceedings of the 13th international conference on multimodal interfaces - ICMI '11. p. 279. doi:10.1145/2070481.2070532. ISBN   978-1-4503-0641-6. S2CID   5488882.
  22. "SignAll. We translate sign language. Automatically". www.signall.us. Archived from the original on 2021-02-02. Retrieved 2017-04-09.
  23. "Dolphio | Unique IT Technologies". www.dolphio.hu. Retrieved 2017-04-06.
  24. "SignAll. We translate sign language. Automatically". www.signall.us. Archived from the original on 2021-02-02. Retrieved 2017-03-09.
  25. "Fort Bend Christian Academy American Sign Language Program Pilots New Technology | Fort Bend Focus Magazine" . Retrieved 2023-08-08.
  26. "MotionSavvy UNI: 1st sign language to voice system". Indiegogo. Retrieved 2017-03-09.
  27. "Rochester Institute of Technology (RIT)". Rochester Institute of Technology (RIT). Retrieved 2017-04-06.
  28. Tsotsis, Alexia (6 June 2014). "MotionSavvy Is A Tablet App That Understands Sign Language". TechCrunch. Retrieved 2017-04-09.