Monash Uni to develop AI translation app for diplomatic talks

Researchers at Monash University will lead a $5 million project, backed by the US Department of Defense's Defense Advanced Research Projects Agency (DARPA), to develop an AI-based smartphone application that could assist with real-time interpretation during diplomatic talks, international business, and tourism.

The research will be led by Monash University's Vision and Language Group in the Faculty of IT, in collaboration with researchers from the David Nazarian College of Business and Economics at California State University and King's College London.

According to Monash, the project will involve developing a language processing system to be used together with smart glasses. The hope is that the system can recognise and adapt to the emotional, social, and cultural cues that vary between societies and languages.

For instance, the system could recognise an imminent communication breakdown by analysing audio-visual cues in real time, then send a notification to the user's smart glasses suggesting a more appropriate response, such as addressing the other person more respectfully to put them at ease, Monash said.

“In addition to interpreting the content of the speech, the system will be ‘translating’ body language and facial expressions, providing cultural cues to prevent a breakdown in communications and ensuring smoother cross-cultural dialogue. During this project, we will be focussing mainly on negotiation-based dialogues,” the researchers explained.
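
The article does not describe an implementation, but the workflow Monash outlines — analysing audio-visual cues in real time, estimating the risk of a breakdown, and pushing a culturally aware suggestion to the glasses — could be sketched roughly as follows. All names, weights, and the 0.7 threshold are hypothetical illustrations, not details of the project.

```python
from dataclasses import dataclass

# Hypothetical sketch of the workflow described above; none of these names,
# weights, or thresholds come from the Monash/DARPA project itself.

@dataclass
class CueAnalysis:
    """Scores extracted from a short window of audio-visual input."""
    sentiment: float        # -1.0 (hostile) .. 1.0 (warm)
    formality_gap: float    # 0.0 (matched register) .. 1.0 (large mismatch)
    facial_tension: float   # 0.0 (relaxed) .. 1.0 (visibly tense)


def breakdown_risk(cues: CueAnalysis) -> float:
    """Combine cue scores into a single risk estimate (illustrative weighting)."""
    risk = (
        0.4 * (1 - (cues.sentiment + 1) / 2)
        + 0.3 * cues.formality_gap
        + 0.3 * cues.facial_tension
    )
    return max(0.0, min(1.0, risk))


def suggest_response(cues: CueAnalysis) -> str:
    """Pick a culturally aware suggestion for the most pressing cue."""
    if cues.formality_gap > 0.5:
        return "Address the other party more formally (use title and surname)."
    if cues.facial_tension > 0.5:
        return "Slow down and acknowledge the other party's concerns."
    return "Soften the phrasing of your next point."


def notify_glasses(message: str) -> None:
    """Stand-in for pushing a notification to the paired smart glasses."""
    print(f"[glasses] {message}")


def process_window(cues: CueAnalysis, threshold: float = 0.7) -> None:
    """Run one real-time analysis window and alert the wearer if needed."""
    if breakdown_risk(cues) >= threshold:
        notify_glasses(suggest_response(cues))


if __name__ == "__main__":
    # Example: negative sentiment plus a large formality mismatch pushes the
    # risk estimate to 0.78, above the threshold, so a suggestion is sent.
    process_window(CueAnalysis(sentiment=-0.8, formality_gap=0.8, facial_tension=0.6))
```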

Over the next three years, the researchers will develop the initial prototype in two phases, with a release planned by March 2023.

“Current AI-enabled systems are not capable of accurately analysing the many nuances of human communication or of providing useful assistance beyond basic machine translation,” said Faculty of IT deputy dean Professor Maria Garcia de la Banda.

“In this project, our researchers will combine sophisticated speech technology with advanced multimedia analysis and cultural knowledge to build systems that provide a holistic solution.”
