Meta announces plans to build an AI-powered ‘universal speech translator’

Meta, the owner of Facebook, Instagram, and WhatsApp, has announced an ambitious new AI research project to create translation software that works for “everyone in the world.” The project was announced as part of an event focusing on the broad range of benefits Meta believes AI can offer the company’s metaverse plans.

“The ability to communicate with anyone in any language — that’s a superpower people have dreamed of forever, and AI is going to deliver that within our lifetimes,” said Meta CEO Mark Zuckerberg in an online presentation.

The company says that although commonly spoken languages like English, Mandarin, and Spanish are well catered to by current translation tools, roughly 20 percent of the world’s population do not speak languages covered by these systems. Often, these under-served languages do not have easily accessible corpuses of written text that are needed to train AI systems or sometimes have no standardized writing system at all.

Meta says it wants to overcome these challenges by deploying new machine learning techniques in two specific areas. The first focus, dubbed No Language Left Behind, will concentrate on building AI models that can learn to translate language using fewer training examples. The second, Universal Speech Translator, will aim to build systems that directly translate speech in real-time from one language to another without the need for a written component to serve as an intermediary (a common technique for many translation apps).

In a blog post announcing the news, Meta researchers did not offer a timeframe for completing these projects or even a roadmap for major milestones in reaching their goal. Instead, the company stressed the utopian possibilities of universal language translation.

“Eliminating language barriers would be profound, making it possible for billions of people to access information online in their native or preferred language,” they write. “Advances in [machine translation] won’t just help those people who don’t speak one of the languages that dominates the internet today; they’ll also fundamentally change the way people in the world connect and share ideas.”

Crucially, Meta also envisions that such technology would hugely benefit its globe-spanning products, furthering their reach and turning them into essential communication tools for millions. The blog post notes that universal translation software would be a killer app for future wearable devices like AR glasses (which Meta is building) and would also break down boundaries in “immersive” VR and AR spaces (which Meta is also building). In other words, though developing universal translation tools may have humanitarian benefits, it also makes good business sense for a company like Meta.

It’s certainly true that advances in machine learning in recent years have hugely improved the speed and accuracy of machine translation. A number of big tech companies, from Google to Apple, now offer users free AI translation tools, which are used for work and tourism and undoubtedly provide incalculable benefits around the world. But the underlying technology has its problems, too, with critics noting that machine translation misses nuances critical for human speakers, injects gendered bias into its outputs, and is capable of throwing up those weird, unexpected errors only a computer can. Some speakers of uncommon languages also say they fear losing hold of their speech and culture if the ability to translate their words is controlled solely by big tech.

Accounting for such errors is critical when massive platforms like Facebook and Instagram apply machine translations automatically. Consider, for example, a case from 2017 when a Palestinian man was arrested by Israeli police after Facebook’s machine translation software mistranslated a post he shared. The man wrote “good morning” in Arabic, but Facebook translated this as “hurt them” in English and “attack them” in Hebrew.

And while Meta has long aspired to global access, the company’s own products remain biased towards countries that provide the bulk of its revenue. Internal documents published as part of the Facebook Papers revealed how the company struggles to moderate hate speech and abuse in languages other than English. These blind spots can have incredibly deadly consequences, as when the company failed to tackle misinformation and hate speech in Myanmar prior to the Rohingya genocide. And similar cases involving questionable translations occupy Facebook’s Oversight Board to this day.

So while a universal translator is an incredible aspiration, Meta will need to prove not only that its technology is equal to the task but that, as a company, it can apply its research fairly.

Repost: Original Source and Author Link


Microsoft taps AI techniques to bring Translator to 100 languages

Today, Microsoft announced that Microsoft Translator, its AI-powered text translation service, now supports more than 100 different languages and dialects. With the addition of 12 new languages including Georgian, Macedonian, Tibetan, and Uyghur, Microsoft claims that Translator can now make text and information in documents accessible to 5.66 billion people worldwide.

Translator isn’t the first service to support more than 100 languages: Google Translate reached that milestone in February 2016. (Amazon Translate supports only 71.) But Microsoft says that the new languages are underpinned by unique advances in AI and will be available in the Translator apps, Office, and Translator for Bing, as well as Azure Cognitive Services Translator and Azure Cognitive Services Speech.

“One hundred languages is a good milestone for us to achieve our ambition for everyone to be able to communicate regardless of the language they speak,” Microsoft Azure AI chief technology officer Xuedong Huang said in a statement. “We can leverage [commonalities between languages] and use that … to improve whole language famil[ies].”


As of today, Translator supports the following new languages, which Microsoft says are natively spoken by 84.6 million people collectively:

  • Bashkir
  • Dhivehi
  • Georgian
  • Kyrgyz
  • Macedonian
  • Mongolian (Cyrillic)
  • Mongolian (Traditional)
  • Tatar
  • Tibetan
  • Turkmen
  • Uyghur
  • Uzbek (Latin)

Powering Translator’s upgrades is Z-code, a part of Microsoft’s larger XYZ-code initiative to combine AI models for text, vision, audio, and language in order to create AI systems that can speak, see, hear, and understand. The team behind it comprises scientists and engineers from Azure AI and the Project Turing research group who focus on building multilingual, large-scale language models that support various production teams.

Z-code provides the framework, architecture, and models for text-based, multilingual AI language translation for whole families of languages. Because of the sharing of linguistic elements across similar languages and transfer learning, which applies knowledge from one task to another related task, Microsoft claims it managed to dramatically improve the quality and reduce costs for its machine translation capabilities.

With Z-code, Microsoft is using transfer learning to move beyond the most common languages and improve translation accuracy for “low-resource” languages, defined as languages with under 1 million sentences of training data. (Like all models, Microsoft’s models learn from examples in large datasets sourced from a mixture of public and private archives.) Approximately 1,500 known languages fit this criterion, which is why Microsoft developed a multilingual translation training process that marries language families and language models.
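The low-resource cutoff described above can be made concrete with a small sketch. Only the 1 million sentence threshold comes from the article; the function name `is_low_resource` is hypothetical, and the corpus sizes and placeholder language names are invented purely for illustration.

```python
# Cutoff taken from the article: "low-resource" means under
# 1 million sentences of training data.
LOW_RESOURCE_THRESHOLD = 1_000_000


def is_low_resource(sentence_count: int) -> bool:
    """Return True when a language has fewer training sentences than the cutoff."""
    return sentence_count < LOW_RESOURCE_THRESHOLD


# Hypothetical corpus sizes, made up for illustration only.
example_corpora = {
    "language_a": 250_000,
    "language_b": 40_000_000,
    "language_c": 999_999,
}

# Collect the placeholder languages that fall under the cutoff.
low_resource = sorted(
    name for name, count in example_corpora.items() if is_low_resource(count)
)
print(low_resource)
```

Note that the comparison is strict: a language with exactly 1 million sentences would not count as low-resource under this reading of "under 1 million."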

Techniques like neural machine translation, rewriting-based paradigms, and on-device processing have led to quantifiable leaps in machine translation accuracy. But until recently, even the state-of-the-art algorithms lagged behind human performance. Efforts beyond Microsoft illustrate the magnitude of the problem — the Masakhane project, which aims to render thousands of languages on the African continent automatically translatable, has yet to move beyond the data-gathering and transcription phase. Additionally, Common Voice, Mozilla’s effort to build an open source collection of transcribed speech data, has vetted only dozens of languages since its 2017 launch.

Z-code language models are trained multilingually across many languages, and that knowledge is transferred between languages. Another round of training transfers knowledge between translation tasks. For example, the models’ translation skills (“machine translation”) are used to help improve their ability to understand natural language (“natural language understanding”).

In August, Microsoft said that a Z-code model with 10 billion parameters could achieve state-of-the-art results on machine translation and cross-lingual summarization tasks. In machine learning, parameters are internal configuration variables that a model uses when making predictions, and their values essentially — but not always — define the model’s skill on a problem.
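To give a sense of what a parameter count measures, the following sketch tallies the weights and biases in a small stack of fully connected layers. The layer sizes are arbitrary illustrative choices and say nothing about Z-code's actual architecture; `count_dense_parameters` is a hypothetical helper.

```python
from typing import Sequence


def count_dense_parameters(layer_sizes: Sequence[int]) -> int:
    """Count weights plus biases for a stack of fully connected layers."""
    total = 0
    # Pair each layer width with the next one: (input size, output size).
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input/output connection
        total += n_out         # one bias per output unit
    return total


# Arbitrary toy network: 512 inputs, a 2048-unit hidden layer, 512 outputs.
toy_network = [512, 2048, 512]
print(count_dense_parameters(toy_network))  # 2099712
```

Even this toy network has about 2.1 million parameters, so a 10-billion-parameter model is several thousand times larger.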

Microsoft is also working to train a 200-billion-parameter version of the aforementioned benchmark-beating model. For reference, OpenAI’s GPT-3, one of the world’s largest language models, has 175 billion parameters.

Market momentum

Chief rival Google is also using emerging AI techniques to improve the language-translation quality across its service. Not to be outdone, Facebook recently revealed a model that uses a combination of word-for-word translations and back-translations to outperform systems for more than 100 language pairings. And in academia, MIT CSAIL researchers have presented an unsupervised model (that is, a model that learns from data that hasn’t been explicitly labeled or categorized) that can translate between texts in two languages without direct translational data between the two.
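The idea behind back-translation can be shown with a toy sketch: text that exists only in the target language is translated back into the source language, and the resulting synthetic pairs become extra training data for the forward direction. This is not Facebook's model. The tiny lexicon below is a hypothetical stand-in for a real reverse translation system, and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Toy target-to-source lexicon standing in for a reverse translation
# model (hypothetical; a real system would be a trained model).
TARGET_TO_SOURCE: Dict[str, str] = {
    "bonjour": "hello",
    "monde": "world",
    "chat": "cat",
}


def word_for_word(sentence: str, lexicon: Dict[str, str]) -> str:
    """Translate token by token; unknown words pass through unchanged."""
    return " ".join(lexicon.get(token, token) for token in sentence.split())


def back_translate(monolingual_target: List[str]) -> List[Tuple[str, str]]:
    """Build synthetic (source, target) pairs from target-language-only text."""
    return [
        (word_for_word(sentence, TARGET_TO_SOURCE), sentence)
        for sentence in monolingual_target
    ]


pairs = back_translate(["bonjour monde", "bonjour chat"])
for source, target in pairs:
    print(f"{source!r} -> {target!r}")
```

The target side of each pair is genuine human-written text, which is why this approach is useful when parallel data is scarce but monolingual text is available.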

Of course, no machine translation system is perfect. Some researchers claim that AI-translated text is less “lexically” rich than human translations, and there’s ample evidence that language models amplify biases present in the datasets they’re trained on. AI researchers from MIT, Intel, and the Canadian initiative CIFAR have found high levels of bias in language models including BERT, XLNet, OpenAI’s GPT-2, and RoBERTa. Beyond this, Google identified (and claims to have addressed) gender bias in the translation models underpinning Google Translate, particularly with regard to resource-poor languages like Turkish, Finnish, Persian, and Hungarian.

Microsoft, for its part, points to Translator’s traction as evidence of the platform’s sophistication. In a blog post, the company notes that thousands of organizations around the world use Translator for their translation needs, including Volkswagen.

“The Volkswagen Group is using the machine translation technology to serve customers in more than 60 languages — translating more than 1 billion words each year,” Microsoft’s John Roach writes. “The reduced data requirements … enable the Translator team to build models for languages with limited resources or that are endangered due to dwindling populations of native speakers.”

