“Gowajee” — a Thai Speech-Recognition AI from Chula

An engineering professor from Chula has designed “Gowajee”, a Thai-language speech recognition AI capable of delivering speech-to-text and text-to-speech with the accuracy of a native speaker while keeping users’ data secure. Already rolled out in call centers and in the screening of patients for depression, Gowajee is set to be adapted for many other uses.

‘OK, Google’

We’re getting used to giving voice commands to AIs like Google or Siri to search or carry out tasks instead of typing them out. But if you are a Thai speaker, have you ever felt that those AIs don’t seem to understand the Thai tone of voice that we use? Many times we get a transcription that doesn’t match our words, which means we have to adjust our Thai pronunciation to suit an AI that was developed by a foreign company for multilingual use and tuned mostly for widely used languages like English.

Realizing this problem, a team led by Dr. Ekapol Chuangsuwanich of the Department of Computer Engineering, Faculty of Engineering, Chulalongkorn University, has developed “Gowajee”, a genuine Thai speech-recognition AI that understands and executes commands in the Thai language more naturally and accurately. Actual usage has shown an error rate of only 9%, compared with 15% for other speech-recognition AIs.

The name Gowajee derives from the words ‘Go’ and ‘Wajee’, which means words. It is designed as a command similar to ‘OK Google’ or ‘Hey Siri’, and was coined so as not to duplicate any other word used in the Thai language.

Thai language AI with a Thai sound database

Dr. Ekapol and his team have been compiling a Thai sound database since 2017. As he recalled,

“…we applied a variety of methods and formats such as creating a website for people to log in and read a text to be stored as a sound database, getting people to engage in a conversation or actors to perform emotional speaking. Altogether, we achieved a compilation totaling five thousand hours which made us confident that we had a big enough database to transcribe Thai accurately.”

This database was large enough to enable the Gowajee team to develop an accurate Thai speech-recognition AI that offers three main features.

1. Automated Speech Recognition (ASR)

ASR turns speech into text. “For example, if we record a lecture, the AI will transcribe it into text for us to read without our having to transcribe it ourselves,” Dr. Ekapol suggested.

2. Text-to-Speech (TTS)

TTS converts a written passage into spoken words, in the same way we may be familiar with from Google or Siri, except that Gowajee delivers more natural speech thanks to its larger Thai database.

3. Automatic Speaker Verification (ASV)

ASV verifies a person’s identity by voice. It can be used when someone contacts a call center, or to indicate who is speaking and during which time frame.
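
To make the idea of verifying identity by voice concrete, here is a minimal sketch of one common approach: a fixed-length "voiceprint" vector from an enrolled speaker is compared with one from a new call, and the caller is accepted when the cosine similarity clears a threshold. This is an illustration only and not a description of how Gowajee works. The function names, the example vectors and the 0.75 threshold are all hypothetical, and a real system would obtain the vectors from a trained speaker model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def is_same_speaker(enrolled, candidate, threshold=0.75):
    """Accept the candidate if it is close enough to the enrolled voiceprint.

    The threshold is a made-up value; in practice it would be tuned
    on real data to balance false accepts against false rejects.
    """
    return cosine_similarity(enrolled, candidate) >= threshold


# Hypothetical voiceprints (real ones would have hundreds of dimensions).
enrolled_voice = [0.9, 0.1, 0.3]
new_call_voice = [0.8, 0.2, 0.35]
print(is_same_speaker(enrolled_voice, new_call_voice))
```

Raising the threshold makes the check stricter, which suits settings such as banking, at the cost of rejecting more genuine callers.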

Gowajee – a perfect solution for call centers

Since it was developed, Gowajee has been used by universities and by public- and private-sector organizations, especially in call centers, for both speech-to-text and text-to-speech. Gowajee’s error rate is only 9%, compared with 15% for other AIs.

“Most clients have been satisfied with Gowajee’s level of accuracy. It is an improved version of what they have previously used and the price is also more affordable. As for the errors, we are certain that they will decrease as the database grows.”
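
The article does not say how the 9% and 15% figures were measured. Speech recognizers are commonly scored by counting the edits (substitutions, deletions and insertions) needed to turn the system's output into a human reference transcript, then dividing by the length of the reference. The sketch below assumes a metric of that kind; the function names and the sample transcripts are hypothetical. Because Thai is written without spaces between words, the tokens could be characters or the output of a word segmenter.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # reference token missing from output
                curr[j - 1] + 1,     # extra token in output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref, hyp):
    """Edits needed to fix the output, per reference token."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one wrong token and one missing token.
reference = "a b c d".split()
recognized = "a x c".split()
print(error_rate(reference, recognized))  # 2 edits over 4 tokens -> 0.5
```

Under a metric like this, moving from a 15% to a 9% error rate would mean roughly 40% fewer errors for the same amount of speech.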

In search of meaning in the voice: Gowajee helps screen patients for depression

Because its data include voices that convey a range of emotions, Gowajee has been able to help develop the systems used in DMIND for screening patients for depression.

“DMIND proved to be very challenging for us. Aside from transcription, a model for classifying and decoding emotions from the voices of at-risk groups is also needed. Crying is usually involved, which makes voices difficult to transcribe and decode, but Gowajee was able to do considerably well by determining the important keywords for decoding.”

How can Gowajee be adapted for use in other areas?

Gowajee and AI technology can be used in many other areas, such as:

  • Acting as a dental assistant that takes notes while the dentist is working on a patient.
  • Detecting stroke risk in patients with slurred speech.
  • Acting as a life coach that asks questions and analyzes people’s life goals from video interviews, for use as part of student and employee orientation.
  • Modifying and amplifying sounds for the hard-of-hearing so that they can hear more clearly.

Your data is safe with Gowajee

“Data safety” is what puts Gowajee above other speech-recognition AIs. As Dr. Ekapol tells us, “Normally, other transcription programs store their data on the cloud or compile it on users’ computers. With Gowajee, all the data is stored in the user’s own database, ensuring its safety. This is useful for organizations like banks, which need high data security.”

AIs are becoming increasingly clever, with linguistic abilities that are getting closer and closer to those of human beings, which has caused many people to worry about being replaced by technology. When it comes to AIs for Thai-language transcription, however, Dr. Ekapol sees them only as enablers that will make life easier for us now and in the future.

“AIs aren’t that disruptive to our lives. We are disrupting ourselves. Aging societies and a shortage of working-age labor are making it necessary for us to create technologies to substitute for what we can’t find humans to do.” Dr. Ekapol concluded by saying, “I’m not expecting my work to be helpful to the elderly of today, but I think that in the future, when I reach old age, I will be making use of these technologies.”

For more information and a trial of Gowajee Thai speech recognition AI, please visit: https://www.gowajee.ai/.
