Speaking to a virtual agent today usually means conversing with Alexa over thin air or typing into a chat box with a picture of an operator above it.
These interactions can be improved by making virtual agents more lifelike, the same way game characters are becoming more human-like, according to Singapore-based startup Connectome.
The brainchild of two Japanese co-founders, it wants to create a virtual human agent that can respond with eye contact as well as facial expressions. To do this, it first has to understand a user’s non-verbal cues – in real time.
The grand challenge, said Atsushi Ishii, one of the co-founders, is to have an agent react to situations in the real world, and follow up by conversing and communicating as a human would.
So, if you say you’re happy but your face is not, then a virtual human agent can detect the subtle difference and understand that complex emotion, he told Techgoondu in an interview last week.
Based in Singapore since late last year, Connectome is a spin-off from Couger, another company that Ishii had set up in Japan, which has worked on top games such as Final Fantasy.
He and fellow Connectome co-founder, Yasunori Motani, now aim to use the artificial intelligence (AI) already deployed in games today to build better virtual agents.
The idea is to have human-like intelligent traits, like speech and emotion recognition. A working prototype, now moving from early alpha to beta, is called Rachel, a nod to the Blade Runner sci-fi movies.
The technology is being set up in a Taiwanese university’s cafeteria. There, a virtual agent on a monitor will be able to gauge a person’s mood by analysing his face as he walks past.
If a student looks happy, then the agent will ask something to the effect of “what’s up?”. If a person looks downcast, it might ask if he is okay.
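The greeting behaviour described above is essentially a simple mapping from a detected mood to an opening line. A minimal sketch in Python, assuming hypothetical mood labels – this is an illustration, not Connectome’s actual code:

```python
# Hypothetical sketch of the cafeteria agent's greeting rule.
# The mood labels ("happy", "downcast") are assumptions for
# illustration; a real system would get them from a computer
# vision model analysing the face.

def choose_greeting(mood: str) -> str:
    """Map a detected facial mood to an opening line for the agent."""
    if mood == "happy":
        return "What's up?"
    if mood == "downcast":
        return "Are you okay?"
    return ""  # stay quiet when the mood is ambiguous


print(choose_greeting("happy"))     # -> What's up?
print(choose_greeting("downcast"))  # -> Are you okay?
```

The hard part, as the article goes on to note, is not this mapping but producing the mood label reliably and in real time.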
The conversations that can be had are still limited now, according to Ishii. However, a lot of the technology that is required is being built at a fast pace, he added.
For example, emotion sensing is used today in interactive advertising displays to gauge how people react to an advertisement. Similarly, computer vision can analyse whether a driver is tired or sleepy at the wheel.
However, combining the inputs from these sources and creating a human-like response in real time is a big challenge, said Ishii.
Language, for example, is one hurdle, he noted. “Especially Japanese sentences… if I say something, the meaning is decided at the end, so it’s hard for the AI to respond in real time because it has to wait until the end of the sentence.”
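The latency problem Ishii describes can be sketched as an agent that buffers incoming words and only commits to a reply once a sentence-final marker arrives. This is an illustrative toy, not Connectome’s implementation; the punctuation markers stand in for Japanese sentence-ending particles:

```python
# Illustrative sketch of the "wait until the end of the sentence"
# problem: the agent cannot respond until a sentence-final token
# arrives, so its reaction time grows with sentence length.

SENTENCE_END = {".", "?", "!"}  # stand-ins for sentence-final markers


def stream_response(tokens):
    """Buffer streamed words; yield a reply only after a full sentence."""
    buffer = []
    for tok in tokens:
        buffer.append(tok)
        if tok in SENTENCE_END:
            sentence = " ".join(buffer[:-1]) + buffer[-1]
            buffer = []
            yield f"(responding to: {sentence!r})"


for reply in stream_response(["I", "am", "happy", "."]):
    print(reply)
```

Making the agent respond incrementally, before the sentence is complete, is exactly the kind of real-time problem the startup is working on.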
Still, Connectome is confident that it can iron out these problems as it merges the various technologies, such as natural language and AI, in its tests.
Later this year, the company is set to showcase a proof of concept in Singapore with enterprises here. While the development is still done in Japan, Singapore presents a way to reach the international market.
Japan itself is a big enough market, said Ishii, but he did not want Connectome’s technology to be overtaken by alternatives developed elsewhere.
He pointed to the difficulties that Japanese pioneers in video games, mobile phone services and social networking faced when trying to expand overseas. Eventually, they fell behind, he noted.
“If we start a business in Japan, it has enough of a market,” he stressed. “But if we give our product to fit Japan too much, it’s difficult to expand overseas.”