Microsoft has demonstrated Skype carrying out voice translation almost instantaneously. It plans to roll out the feature to the public starting later this year.
Voice translation has been around for several years, with varying degrees of success. In 2010, Google unveiled a voice feature in Google Translate that, in theory at least, allowed two people speaking different languages to converse by passing a phone back and forth. It later demonstrated the idea taken to its logical conclusion: running on two smartphones and translating during a phone call.
At a tech conference titled Code, Microsoft demonstrated Skype Translator. Two Microsoft staff used the technology to hold a conversation, one speaking English and the other German. Some delegates described the translation as imperfect but impressive, and certainly good enough to convey the essence of the conversation.
From a technical standpoint, it appears the system could translate in true real time, meaning the translated speech for a sentence would start before the speaker had finished. However, Microsoft says it has chosen a set-up where the speaker first finishes a sentence and the machine then speaks the translation. That seems better suited to conversation, though it means people won't be able to speak over one another.
Microsoft plans to have the feature available in beta form by the end of the year. It’s probably going to be 2016 before a “finished” version is available, at which point it may become a paid service.
Singh Pall, who took part in the demonstration, noted there may be a battle between technology and privacy in the works. In theory, Microsoft could use the audio from real Skype conversations to gather more data on how people really speak and pronounce words. The way people talk when chatting to a friend or colleague is very different from the slow, deliberate speech, with perfect diction, that they use when trying to help speech recognition along. In practice, Microsoft concedes it would have to get permission to use calls in this way, and that could raise privacy concerns.