Koemiru: ICT Tool for Special Needs Schools
This article introduces Koemiru, our ICT (information and communications technology) system developed for special needs schools. Koemiru, which literally means "see the voice" in Japanese, uses speech recognition technology to support hearing-impaired elementary school pupils. Our system converts utterances into text and displays them on an interactive whiteboard and portable game terminals. Validation experiments were conducted in Okinawa and Tottori to identify the strengths and weaknesses of Koemiru.
1.1 Challenges in using ICT in the education field
The NTT Group is implementing the Education Square × ICT project with the aim of leveraging information and communications technology (ICT) to develop new learning methods. Through this project, we found that there was a compelling need to use ICT to support the teachers and pupils of special needs schools, but that existing solutions were unable to meet their requirements.
1.2 Our solution for special needs schools
We surveyed special needs schools and consulted with NTT CLARUTY Corporation, and found a desire for ICT to be applied to the fundamental needs of their pupils. These needs include a visible voice for hearing-impaired pupils, audible text for sight-impaired pupils, and easier conversation for developmentally disabled pupils. Our market research clarified the importance of using ICT for conversation support and for providing information to people with disabilities by alternative methods. We also noticed that pupils wanted to use at school the devices that were already popular at home and in their daily lives. NTT has been researching speech recognition technology for free conversation and has already developed a speech auto-answer system that is used over the telephone. In addition, because schools for the deaf are much quieter than conventional schools, their environment is well suited to speech recognition. Accordingly, we decided to create a conversation support system for hearing-impaired pupils by applying speech recognition technology (Fig. 1). We focused on three key goals: visualization of the teacher’s voice, utterance training, and conversation support in everyday life.
We developed Koemiru (see the voice) to support hearing-impaired pupils in special needs schools. The system consists of several servers and terminals. The speech recognition server and the administration server run as cloud services (Fig. 2). The teachers use a personal computer (PC) or smartphone as a terminal, the pupils use Nintendo DS portable game terminals, and the classroom has an interactive whiteboard. We chose the game terminal for three reasons. First, it is highly popular with pupils, and they can use it easily. Second, it is a familiar device in the community and does not look out of place when the pupils use it outside school. Third, it is very durable and easy to reboot.
2.2 Main functions
Koemiru has three modes: classroom mode, training mode, and conversation mode.
Classroom mode is used in school lessons. When the teacher speaks into the wireless microphone, the audio is sent to the speech recognition server on the cloud computing platform, which recognizes it and outputs text. The text is displayed on the interactive whiteboard and on the portable game terminals of authorized pupils. Japanese is written with both kanji and kana (hiragana and katakana). Because the pupils can read only kana, teachers can choose either kana alone or both kanji and kana when outputting the recognition results (Fig. 3). This selection is managed by the administration server; the recognition results are displayed only to authorized teachers and pupils. Additionally, the teacher can correct any mistakes in the recognition results via this server.
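The script selection and authorized delivery described above can be illustrated with a minimal Python sketch. It is not the Koemiru implementation: the names `Token`, `Classroom`, `render`, and `deliver` are hypothetical, and the sketch assumes that the recognizer returns each word with both its written form and its kana reading.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """One recognized word (assumed recognizer output, hypothetical shape)."""
    surface: str  # written form, which may contain kanji
    reading: str  # kana reading of the same word


@dataclass
class Classroom:
    """Settings that the administration server is assumed to hold."""
    script_mode: str = "kana"  # "kana" or "mixed" (kanji and kana)
    authorized: set = field(default_factory=set)  # terminal IDs allowed to view


def render(tokens, script_mode):
    """Build the display text in the script chosen by the teacher."""
    if script_mode == "kana":
        return "".join(t.reading for t in tokens)
    if script_mode == "mixed":
        return "".join(t.surface for t in tokens)
    raise ValueError(f"unknown script mode: {script_mode}")


def deliver(tokens, classroom, terminals):
    """Return the text for each terminal, skipping unauthorized ones."""
    text = render(tokens, classroom.script_mode)
    return {t: text for t in terminals if t in classroom.authorized}


tokens = [Token("今日", "きょう"), Token("は", "は"), Token("晴れ", "はれ")]
room = Classroom(script_mode="kana", authorized={"whiteboard", "ds-01"})
print(deliver(tokens, room, ["whiteboard", "ds-01", "ds-99"]))
```

Keeping the script choice in one place means every authorized display shows the same text, which matches the role the article gives to the administration server.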
For voice training, hearing-impaired pupils use the training mode on the Nintendo DS, as shown in Fig. 4. When a pupil speaks, the recognition results are displayed on the DS. Voice-volume training is supported by an on-screen volume bar.
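A volume bar of this kind can be sketched as follows. This is only an illustration under stated assumptions: the article does not say how Koemiru measures volume, so the sketch uses the root mean square (RMS) of audio samples normalized to the range -1.0 to 1.0, and the function names `rms` and `volume_bar` are hypothetical.

```python
import math


def rms(samples):
    """Root mean square of the samples; 0.0 for an empty frame."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def volume_bar(samples, width=10, full_scale=1.0):
    """Draw a text bar whose filled length grows with the RMS level."""
    level = min(rms(samples) / full_scale, 1.0)  # clamp to the bar's range
    filled = round(level * width)
    return "#" * filled + "-" * (width - filled)


# A frame alternating between +0.5 and -0.5 has an RMS of 0.5.
print(volume_bar([0.5, -0.5, 0.5, -0.5]))  # prints "#####-----"
```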
When pupils are at home or outside of school, they can use the conversation mode to communicate with other people. They can pose questions on the Nintendo DS by selecting stored questions or by entering the text themselves. When the respondent speaks into the Nintendo DS to reply, the recognized text appears on its display.
3. Demonstration experiment
We conducted trials to evaluate the effectiveness of Koemiru. These trials were held from the end of January to early March 2012 in a fifth-grade class at Tottori Prefectural School for the Deaf, Himawari campus, and in a first-grade class at Okinawa Prefectural School for the Deaf. We prepared broadband connections by establishing Wi-Fi links from each terminal to a Wi-Fi router in the classroom, which was linked via optical fiber to the Internet. We optimized the acoustic model of the speech recognition server by using speech samples recorded by the actual teacher in the classroom. We also added the words in the textbooks used in the classroom to the speech recognition dictionary, so the server could accurately recognize the speech of the teacher. The teachers requested that all recognition results be output in hiragana because they wanted to teach the pupils how to read kanji (Fig. 5). We also conducted free-form interviews with the teachers using a prepared questionnaire to determine service acceptability.
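The dictionary preparation mentioned above can be pictured with a small Python sketch. The class `RecognitionDictionary` and its methods are hypothetical and do not represent the actual server interface; the sketch assumes each entry pairs a written word with its kana reading, and the sample words are invented for illustration.

```python
class RecognitionDictionary:
    """Toy word list mapping a written form to its kana reading."""

    def __init__(self):
        self._entries = {}

    def add_many(self, pairs):
        """Add (surface, reading) pairs and return how many were new."""
        added = 0
        for surface, reading in pairs:
            if surface not in self._entries:
                self._entries[surface] = reading
                added += 1
        return added

    def reading(self, surface):
        """Return the stored reading, or None for an unknown word."""
        return self._entries.get(surface)

    def __contains__(self, surface):
        return surface in self._entries


# Hypothetical textbook vocabulary; the duplicate is counted only once.
textbook_words = [
    ("光合成", "こうごうせい"),
    ("葉緑体", "ようりょくたい"),
    ("光合成", "こうごうせい"),
]
dictionary = RecognitionDictionary()
print(dictionary.add_many(textbook_words))  # prints 2
```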
The results of the trials (Fig. 6) are summarized as follows.
(1) This system is effective in teaching new words to pupils, since physical objects can be imagined from the words.
(2) This system stores all the speech recognition results of the teacher. Thus, teachers and pupils can review what was said at the end of the class.
(3) All devices are very user-friendly for both teachers and pupils. In particular, the Nintendo DS, one of the most popular portable game terminals, was well received by the pupils because no special instruction is needed to use it. Moreover, the pupils do not need to hold it when using it; they can simply put it on their desks.
(4) This system provided a very easy-to-use way for the teacher to edit the recognized text. The teachers used this function often, even though recognition accuracy was extremely high, to improve the content of their messages.
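Points (2) and (4) above, storing every recognition result and letting the teacher edit it, can be sketched together in Python. The names `Utterance` and `LessonTranscript` are hypothetical, and the sketch assumes that a correction is kept alongside the original recognition result rather than overwriting it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Utterance:
    recognized: str                  # text as returned by the recognizer
    corrected: Optional[str] = None  # teacher's edit, if any

    @property
    def text(self) -> str:
        """Text to display: the correction when one exists."""
        return self.corrected if self.corrected is not None else self.recognized


class LessonTranscript:
    """Keeps every utterance of a lesson so it can be reviewed later."""

    def __init__(self):
        self._utterances: List[Utterance] = []

    def append(self, recognized: str) -> int:
        """Store a recognition result and return its index."""
        self._utterances.append(Utterance(recognized))
        return len(self._utterances) - 1

    def correct(self, index: int, new_text: str) -> None:
        """Record the teacher's correction for one utterance."""
        self._utterances[index].corrected = new_text

    def review(self) -> List[str]:
        """Return the lesson text in order, with corrections applied."""
        return [u.text for u in self._utterances]


lesson = LessonTranscript()
lesson.append("きょうははれです")
idx = lesson.append("きょうかしょをひらいて")
lesson.correct(idx, "きょうかしょをひらいてください")
print(lesson.review())
```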
The results also highlighted the following challenges.
- The response time should be reduced to improve service acceptability, particularly for short sentences. With short sentences, the current version did not start to return text until the teacher had finished speaking, and the teachers found the wait annoying.
- Output correction required that the teacher access his or her PC. However, because the teacher is often in front of the classroom’s blackboard or whiteboard, it is desirable to enable output correction at the interactive whiteboard.
- It is preferable for teachers to be able to add words to the dictionary by themselves.
- The teachers had difficulty using the smartphone as the interface. Because they must keep their hands free to sign to the pupils, they cannot hold a phone at the same time.
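The first challenge above concerns when text is returned. The following Python sketch contrasts the behavior the teachers described, where text appears only after the sentence ends, with an incremental display that updates as each fragment is recognized. It is an illustration only, not a description of a planned Koemiru change: the function names are hypothetical, the fragments are assumed to arrive in order, and real recognizers may also revise earlier fragments, which this sketch ignores.

```python
def final_only(fragments):
    """Yield one display update, after the whole sentence is available."""
    yield "".join(fragments)


def incremental(fragments):
    """Yield a display update each time a new fragment is recognized."""
    text = ""
    for fragment in fragments:
        text += fragment
        yield text


fragments = ["きょう", "は", "はれ"]
print(list(final_only(fragments)))   # one update at the end
print(list(incremental(fragments)))  # the text grows fragment by fragment
```

Both versions end with the same sentence; the difference is that the incremental version gives the pupils something to read while the teacher is still speaking.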
4. Future plans
We developed the Koemiru system based on speech recognition technology to support conversations with and among pupils with hearing disabilities. The results of trials indicated that Koemiru is effective in providing support to pupils in special needs schools. We plan to work on eliminating the aforementioned problems so that we can introduce this system into the market at the earliest possible date.