
Synthesia, which is developing AI to generate synthetic videos, secures $50M



Synthesia, a company leveraging AI to generate videos of avatars, today announced that it raised $50 million in a series B round, bringing its total raised to $66.5 million. Kleiner Perkins led the round, with participation from GV, FirstMark Capital, LDV Capital, Seedcamp, MMC Ventures, and individual investors. CEO Victor Riparbelli says the new capital will be put toward supporting growth and advancing Synthesia’s technology.

As pandemic restrictions make conventional filming tricky and risky, the benefits of AI-generated video have been magnified. According to Dogtown Media, under normal circumstances, an education campaign might require as many as 20 different scripts to address a business’ worldwide workforce, with each video costing tens of thousands of dollars.

Synthesia says its technology can reduce the expense to as low as $30.

“Synthesia is focused on reducing the friction of video creation and making it possible for anyone to create professional-looking videos in minutes, directly from their browser,” Riparbelli told VentureBeat via email. “Synthesia’s first commercial product, Synthesia Studio, launched in public beta in the summer of 2020. It is now used by thousands of companies, including several Fortune 500 companies.”

Generating synthetic videos

Like rivals Soul Machines, Brud, Wave, Samsung-backed STAR Labs, and others, Synthesia employs a combination of AI techniques to create visual chatbots, product demonstrations, and sales videos for clients without actors, film crews, studios, or cameras. Founded in 2017 by Riparbelli, Steffen Tjerrild, and computer vision researchers Matthias Niessner and Lourdes Agapito, Synthesia claims to have generated more than six million videos for over 4,000 clients — including SAP and Accenture — in the last year alone.

Above: A synthetic avatar created with Synthesia’s tools. (Image credit: Synthesia)

Synthesia customers choose from a gallery of in-house, AI-generated presenters or create their own by recording about 5 to 40 minutes’ worth of voice clips. After typing or pasting in a video script, Synthesia generates a video “in minutes” with custom backgrounds and an avatar that mimics a person’s facial movements and how they pronounce different phonemes, the units of speech distinguishing one word from another.
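In practice, that workflow is script in, rendered avatar video out. As a rough illustration only, a request might look like the sketch below; the endpoint, field names, and avatar ID are all invented for this example and are not Synthesia’s actual API:

```python
import requests

# Hypothetical endpoint and payload, invented for illustration;
# this is NOT Synthesia's real API.
API_URL = "https://api.example.com/v1/videos"

payload = {
    "avatar": "gallery_presenter_01",  # a stock presenter or a custom avatar
    "background": "office",
    "script": "Welcome to this quarter's compliance training.",
}

resp = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": "Bearer <token>"},
)
video_id = resp.json()["id"]  # then poll until rendering completes
print(video_id)
```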

Synthesia says that client CraftWW used its platform to ideate an advertising campaign for JustEat in the Australian market featuring an AI-manipulated Snoop Dogg. The company also worked with director Ridley Scott’s production studio to create a film for the nonprofit Malaria Must Die, which translated David Beckham’s voice into nine languages. And it partnered with Reuters to develop a prototype for automated video sports reports.

Synthesia recently made generally available a product that personalizes videos for specific customer segments. Aptly called Personalize, it can translate videos featuring actors or staff members into over 40 languages. Wired reports that more than 35 partners at EY, formerly Ernst & Young, have used Personalize to create what they call “artificial reality identities,” or ARIs: client presentations and emails with synthetic video clips starring virtual body doubles of themselves.

Above: Synthesia’s avatar creation dashboard. (Image credit: Synthesia)

“Our core use case today [is] learning and development and internal communications videos, where the front-facing AI avatars work really well,” Riparbelli said. “The new investment will partly go to expand our core AI platform, which will allow more use cases.”

Deepfake concerns

Some experts have expressed concern that tools like Synthesia’s could be used to create deepfakes, or AI-generated videos that take a person in an existing video and replace them with someone else’s likeness. The fear is that these fakes might be used to do things like sway opinion during an election or implicate a person in a crime.

Recently, a group of fraudsters made off with $35 million after using forged email messages and deepfake audio to convince an employee of a United Arab Emirates company that a director requested the money. And just last month, Japanese police arrested a man for using deepfake technology to effectively unblur censored pornographic videos.

“[Deepfake technology is] becoming cheaper and more accessible every day … Audiographic evidence must be viewed with greater skepticism and must meet higher standards,” researchers in a new study on deepfakes commissioned by the European Parliament’s Technology Assessment Committee wrote. “[Individuals and institutions] will need to develop new skills and methods to construct a trustworthy picture of reality as they will inevitably be confronted with deceptive information.”


For its part, Synthesia, which has more than 60 employees, says it has posted ethics rules online and vets its customers and their scripts. It also requires formal consent from a person before it will synthesize their appearance, and it refuses to touch political content.

“Our main competition is text — boring PDFs that people don’t read. Synthesia is driving a paradigm shift in how we create video content,” Riparbelli continued. “Looking into the next decade of Synthesia, we’re building for a future where you can create Hollywood-grade video on a laptop. On our way there, we’ll be solving some of the hardest and most fundamental problems in AI and computer vision. With the new funds, we’ll invest even deeper in advancing our core AI research to accelerate this vision. In parallel, we will also slowly open up some of our research to the world and begin actively contributing to the broader research community.”




WellSaid raises $10M to generate synthetic voices



WellSaid Labs, a startup developing synthetic voice technology, today announced it has raised $10 million in a series A round led by Fuse, with participation from Voyager, Qualcomm Ventures, and GoodFriends. The oversubscribed round will support the company’s R&D and help grow its team, according to CEO Matt Hocking.

Creating natural-sounding speech from text is considered a grand challenge in the field of AI and has been a research goal for decades. Content creators and product designers have long faced tradeoffs between quality and scalability when using text-to-speech tools versus human voiceovers. But with AI, creators, product developers, and brands have the potential to power experiences with a wide variety of voice styles, accents, and languages at scale. Startups creating virtual beings, or artificial people powered by AI, have collectively raised more than $320 million in venture capital to date.

WellSaid launched in 2018 as a research project at the Allen Institute for Artificial Intelligence, a lab started by Microsoft cofounder Paul Allen with the mission of conducting pivotal AI research and engineering. WellSaid’s team set out to create the most lifelike synthetic voices possible, with CTO Michael Petrochuck leading R&D on the core AI technology.

“What started as a research project … is now a growth-stage startup with thousands of customers in media and advertising, technology, manufacturing, defense, pharmaceuticals, healthcare, and education,” Hocking told VentureBeat via email. “In terms of the fundamentals of the business, [due to the pandemic] our mid-market and enterprise customers [have] accelerated and shifted a substantial amount of their voiceover and media productions from in-person to remote locations. This added more moving pieces and quality issues to their productions.”

AI-powered speech

Using WellSaid, companies can pick from a range of voice avatars and create voiceovers straight from a script, with one or many voices based on style, gender, and production type. Users can edit the copy, change the pausing, switch to a different voice, and teach the platform to say terms with unique spellings and pronunciations. WellSaid also lets users share projects and files with team members and build voice avatars for branded content, creating avatars from the voice of a real person with only a few hours of recordings.
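That “teach the platform” step is, in essence, a pronunciation lexicon applied before synthesis. A minimal sketch of how such a preprocessor might work; the lexicon entries and respellings below are invented, and the article doesn’t describe WellSaid’s actual mechanism:

```python
# Toy pronunciation lexicon: map terms with unusual spellings to phonetic
# respellings before sending the script to the synthesizer.
# The entries below are invented examples.
PRONUNCIATIONS = {
    "EBITDA": "ee bit dah",
    "Qualcomm": "kwahl kom",
}

def apply_lexicon(script: str) -> str:
    """Substitute respellings so a TTS engine pronounces terms as intended."""
    for term, respelling in PRONUNCIATIONS.items():
        script = script.replace(term, respelling)
    return script

print(apply_lexicon("Qualcomm reported strong EBITDA growth."))
# -> "kwahl kom reported strong ee bit dah growth."
```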

Over two years, WellSaid incrementally improved the naturalness of its synthetic voices, aiming for “human parity,” according to Hocking. In a July 2019 study, the company asked participants to listen to a set of randomized recordings created by WellSaid and by human voice actors and rate them on a scale of 1 to 5, with 5 being the highest quality. The voice actors achieved an average rating of around 4.5, while WellSaid’s voices earned scores close to their human counterparts (4.282).
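That study design is a standard mean opinion score (MOS) comparison. With illustrative numbers (the per-listener ratings below are made up; only the reported averages come from the article), the computation is simply:

```python
from statistics import mean

# Made-up listener ratings on the 1-5 scale; only the ~4.5 human average
# and the 4.282 synthetic average are reported in the article.
human_ratings = [5, 4, 5, 4, 5, 4, 5]
synthetic_ratings = [4, 4, 5, 4, 4, 5, 4]

print(f"human MOS:     {mean(human_ratings):.3f}")
print(f"synthetic MOS: {mean(synthetic_ratings):.3f}")
```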

The current focus for Seattle, Washington-based WellSaid, which has 12 employees, is improving the platform’s handling of different text lengths and styles, as well as speeding up voice generation. The company said it takes about 4 seconds to create a 10-second audio file.


“Enterprises use WellSaid Studio to create voiceovers for training and corporate content. They choose WellSaid to optimize their workflows because of the high-quality voices available and to gain cost efficiencies,” Hocking continued. “Product developers integrate [our] API to their experiences to enable voice across their user experience. They rely on the quality of the voices, scalability of the infrastructure, and real-time rendering unmatched by other providers. [As for] brands and creators, [they] use WellSaid to create their own and exclusive AI voice avatars to spec. We partner with them to design, build, host, and deploy their unique AI voices according to their needs and production specs.”

WellSaid’s technology and comparable offerings from Microsoft, Amazon, Resemble AI, Synthesia, Deepdub, Papercup, and others have fueled concerns around misuse and deepfakes, or synthetic media used for nefarious purposes like imitating executives during earnings calls. But Hocking said WellSaid doesn’t create voice avatars without actors’ permission and subscribes to the “Hippocratic Oath for AI” proposed by Microsoft executives Brad Smith and Harry Shum.

“With WellSaid, companies that might have not been ready to deploy synthetic media can now invest in the technology, as it gives them the ability to continue to produce and publish mission-critical content without sacrificing quality,” Hocking said. “We are proud of what we’ve accomplished and grateful for the business we’ve built.”

This latest round brings WellSaid’s total raised to date to $12 million.




This AI lets you generate new verses from your favorite rappers

If you’ve ever dreamed of making songs with Tupac or Jay-Z, an AI tool called Uberduck can take you close to fulfilling your fantasies.

Uberduck is one of a range of tools that let you choose a celebrity voice and then enter text for them to speak. What sets it apart from the others is that it can do a pretty impressive job of replicating a rapper’s flow.

You can synthesize speech into a “calm” or “intense” Tupac verse, for example, or try an Eminem “freestyle” verse or “pre-Eminem Show” flow.

Uberduck’s creator says they started working on the system with the goal of creating an open-ended dialog agent:

I built an interactive audio chatbot over WebRTC that generated text responses with a Transformer model and synthesized them to audio, but I found that speech synthesis was the most exciting part of the project.

The tool blew up on TikTok after a lawsuit forced the app to swap its text-to-speech voice for a different version. Many users were unimpressed by the replacement and made the switch to Uberduck.


They’ve gone on to use the tool in a range of creative ways, from adding Biggie verses to their own tracks to making Kanye West rap the lyrics to Bohemian Rhapsody.

Even Linkin Park’s Mike Shinoda has tried it out:

[Embedded TikTok: Mike Shinoda, “Happy Endings by me feat. Notorious B.I.G.” — original sound]

If you’d rather have a conversation with your favorite MC — or force them to give you a shout-out — that’s also possible.

The synthesized voices are far from perfect, but with tweaks to the text — adding extra vowels, for instance, to extend a syllable — you can generate some pretty accurate imitations of your favorite rappers, like the Tupac bars I made.

To my ears, the voice is more convincing than the Faux-Pac from that classic Dave Chappelle skit.
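The vowel trick is easy to mechanize before pasting text into the tool. A toy sketch of the idea; the stretching rule here is my own invention, not anything Uberduck documents:

```python
import re

def stretch_vowels(lyric: str, factor: int = 3) -> str:
    """Repeat vowels so a TTS voice holds each syllable longer
    (a made-up rule for illustration)."""
    return re.sub(r"[aeiou]", lambda m: m.group(0) * factor, lyric)

print(stretch_vowels("West coast, baby"))
# -> "Weeest coooaaast, baaaby"  (then paste the result into the text box)
```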

The tool does, however, have some potential to be used for disinformation and defamation.

Uberduck’s terms attempt to allay these concerns. Users are prohibited from using the outputs for commercial purposes or the production of defamatory material. They must also clearly identify that their creations were generated by AI, and will be banned if they violate the rules. In addition, the tool’s inventor says they’ll remove voices from the site upon request.

Putting words into a replica of someone’s voice without their permission could also be viewed as disrespectful, but these feel more like impersonations than recreations.

HT — mrcomposition09




Facebook’s AI reverse-engineers models used to generate deepfakes



Some experts have expressed concern that machine learning tools could be used to create deepfakes, or media that takes a person in an existing video, photo, or audio file and replaces them with someone else’s likeness. The fear is that these fakes might be used to do things like sway opinion during an election or implicate an innocent person in a crime. Deepfakes have already been abused to generate pornographic material of actors and defraud a major energy producer.

While much of the discussion around deepfakes has focused on social media, pornography, and fraud, it’s worth noting that deepfakes pose a threat to anyone portrayed in manipulated videos and their circle of trust. As a result, deepfakes represent an existential threat to businesses, particularly in industries that depend on digital media to make important decisions. The FBI earlier this year warned that deepfakes are a critical emerging threat targeting businesses.

To address this challenge, Facebook today announced a collaboration with researchers at Michigan State University (MSU) to develop a method of detecting deepfakes that relies on taking an AI-generated image and reverse-engineering the system used to create it. While this approach is not being used in production at Facebook, the company claims the technique will support deepfake detection and tracing efforts in “real-world” settings, where deepfakes themselves are the only information detectors have to work with.

A new way to detect deepfakes

Current methods of identifying deepfakes focus on distinguishing real from fake images and determining whether an image was generated by an AI model seen during training or not. For example, Microsoft recently launched a deepfake-combating solution in Video Authenticator, a tool that can analyze a still photo or video to provide a score for its level of confidence that the media hasn’t been artificially manipulated. And the winners of Facebook’s Deepfake Detection Challenge, which ended last June, produced a system that can pick out distorted videos with up to 82% accuracy.

But Facebook argues that solving the problem of deepfakes requires taking the discussion one step further. Reverse engineering isn’t a new concept in machine learning — current techniques can arrive at a model by examining its input and output data or examining hardware information like CPU and memory usage. However, these techniques depend on preexisting knowledge about the model itself, which limits their applicability in cases where such information is unavailable.

By contrast, Facebook and MSU’s approach begins with attribution and then works on discovering the properties of the model used to generate the deepfake. By generalizing image attribution and tracing similarities between patterns of a collection of deepfakes, it can ostensibly infer more about the generative model used to create a deepfake and tell whether a series of images originated from a single source.

How it works

The system begins by running a deepfake image through what the researchers call a fingerprint estimation network (FEN) that extracts details about the “fingerprint” left by the model that generated it. These fingerprints are unique patterns left on deepfakes that can be used to identify the generative models the deepfakes originated from.

The researchers estimated fingerprints using different constraints based on properties of deepfake fingerprints found in the wild. They used these constraints to generate a dataset of fingerprints, which they then tapped to train a model to detect fingerprints it hadn’t seen before.

Facebook and MSU say their system can estimate both the network architecture of an algorithm used to create a deepfake and its training loss functions, which evaluate how the algorithm models its training data. It also reveals the features — or the measurable pieces of data that can be used for analysis — of the model used to create the deepfake.
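The article gives no implementation details, but the two-stage pipeline it describes can be sketched in PyTorch. Everything below — layer sizes, the number of architecture attributes, the set of loss-function types — is an assumption for illustration, not the actual Facebook/MSU model:

```python
import torch
import torch.nn as nn

class FingerprintEstimator(nn.Module):
    """FEN sketch: predicts the residual 'fingerprint' a generator left in an image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),  # fingerprint, same shape as the image
        )

    def forward(self, img):
        return self.net(img)

class ModelParser(nn.Module):
    """Maps a fingerprint to generator properties: architecture hyperparameters
    (regression head) and training loss type (classification head). Sizes invented."""
    def __init__(self, n_arch_attrs=15, n_loss_types=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.arch_head = nn.Linear(64, n_arch_attrs)   # e.g. layer/block counts
        self.loss_head = nn.Linear(64, n_loss_types)   # loss-function logits

    def forward(self, fingerprint):
        z = self.encoder(fingerprint)
        return self.arch_head(z), self.loss_head(z)

fen, parser = FingerprintEstimator(), ModelParser()
deepfake = torch.randn(1, 3, 128, 128)        # stand-in for an unseen fake image
arch_pred, loss_logits = parser(fen(deepfake))
```

Comparing estimated fingerprints across many images is what would let such a system tell whether a series of fakes came from a single source.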

To test this approach, the MSU research team put together a fake image dataset with 100,000 synthetic images generated from 100 publicly available models. Some of the open source projects already had fake images released, in which case the team randomly selected 1,000 deepfakes from the datasets. In cases where there weren’t any fake images available, the researchers ran their code to generate 1,000 images.

The researchers found that their approach performed “substantially better” than chance and was “competitive” with state-of-the-art methods for deepfake detection and attribution. Moreover, they say it could be applied to detect coordinated disinformation attacks where varied deepfakes are uploaded to different platforms but all originate from the same source.

“Importantly, whereas the term deepfake is often associated with swapping someone’s face — their identity — onto new media, the method we describe allows reverse engineering of any fake scene. In particular, it can help with detecting fake text in images,” Facebook AI researcher Tal Hassner told VentureBeat via email. “Beyond detection of malicious attacks — faces or otherwise — our work can help improve AI methods designed for generating images: exploring the unlimited variability of model design in the same way that hardware camera designers improve their cameras. Unlike the world of cameras, however, generative models are new, and with their growing popularity comes a need to develop tools to study and improve them.”

Looming threat

Since 2019, the number of deepfakes online has grown from 14,678 to 145,227, an uptick of roughly 900% year over year, according to Sentinel. Meanwhile, Forrester Research estimated in October 2019 that deepfake fraud scams would cost $250 million by the end of 2020. But businesses remain largely unprepared. In a survey conducted by data authentication startup Attestiv, fewer than 30% of executives say they’ve taken steps to mitigate fallout from a deepfake attack.

Deepfakes are likely to remain a challenge, especially as media generation techniques continue to improve. Earlier this year, deepfake footage of actor Tom Cruise posted to an unverified TikTok account racked up 11 million views on the app and millions more on other platforms. When scanned through several of the best publicly available deepfake detection tools, the deepfakes avoided discovery, according to Vice.

Still, a growing number of commercial and open source efforts promise to put to rest the deepfake threat — at least temporarily. Amsterdam-based Sensity offers a suite of monitoring products that purport to classify deepfakes uploaded on social media, video hosting platforms, and disinformation networks. Dessa has proposed techniques for improving deepfake detectors trained on datasets of manipulated videos. And Jigsaw, Google’s internal technology incubator, released a large corpus of visual deepfakes that was incorporated into a benchmark made freely available to researchers for synthetic video detection system development.

Facebook and MSU plan to open-source the dataset, code, and trained models used to create their system to facilitate research in various domains, including deepfake detection, image attribution, and reverse-engineering of generative models. “Deepfakes are becoming easier to produce and harder to detect. Companies, as well as individuals, should know that methods are being developed, not only to detect malicious deep fakes but also to make it harder for bad actors to get away with spreading them,” Hassner added. “Our method provides new capabilities in detecting coordinated attacks and in identifying the origins of malicious deepfakes. In other words, this is a new forensic tool for those seeking to keep us safe online.”






Scientists made an AI that reads your mind so it can generate portraits you’ll find attractive

A team of researchers recently developed a mind-reading AI that uses an individual’s personal preferences to generate portraits of attractive people who don’t exist.

Computer-generated beauty truly is in the AI of the beholder.

The big idea: Scientists from the University of Helsinki and the University of Copenhagen today published a paper detailing a system in which a brain-computer interface transmits data to an AI system, which then interprets that data and uses it to train an image generator.

According to a press release from the University of Helsinki:

Initially, the researchers gave a generative adversarial neural network (GAN) the task of creating hundreds of artificial portraits. The images were shown, one at a time, to 30 volunteers who were asked to pay attention to faces they found attractive while their brain responses were recorded via electroencephalography (EEG) …

The researchers analysed the EEG data with machine learning techniques, connecting individual EEG data through a brain-computer-interface (BCI) to a generative neural network.

Once the user’s preferences were interpreted, the machine then generated a new series of images, tweaked to be more attractive to the individual whose data it was trained on. Upon review, the researchers found that 80% of the personalized images generated by the machines stood up to the attractiveness test.
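Put together, the loop looks roughly like the sketch below. The generator and the EEG classifier are stand-ins (the study used a pretrained GAN and a classifier over recorded brain responses); the sample counts, latent size, and scoring rule are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM = 512

def generate_image(z):
    # Stand-in for a pretrained GAN generator G(z) -> image.
    return np.tanh(z[:48].reshape(3, 4, 4))

def eeg_preference_score(image):
    # Stand-in for the BCI step: in the study, single-trial EEG responses
    # to each displayed face were classified as attractive-to-this-user or not.
    return float(image.mean() > 0)

# One iteration of the closed loop: show candidates, score brain responses,
# then move toward the preferred region of the GAN's latent space.
zs = rng.standard_normal((200, LATENT_DIM))            # candidate latents
scores = np.array([eeg_preference_score(generate_image(z)) for z in zs])
liked = zs[scores > 0.5]
z_personal = liked.mean(axis=0)                        # personalized direction
portrait = generate_image(z_personal)                  # new, "more attractive" face
```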

Background: Sentiment analysis is a big deal in AI, but this is a bit different. Typically, machine learning systems designed to observe human sentiment use cameras and rely on facial recognition. That makes them unreliable for use with the general public, at best.

But this system relies on a direct link up to our brainwaves. And that means it should be a fairly reliable indicator of positive or negative sentiment. In other words: the base idea seems sound enough in that you look at a picture you find pleasing and then an AI tries to make more pictures that trigger the same brain response.

Quick take: You could extrapolate the potential uses for such an AI all day long and never decide whether it was ethical or not. On the one hand, there’s a treasure trove of psychological insight to be gleaned from a machine that can abstract what we like about a given image without relying on us to consciously understand it.

But, on the other hand, based on what bad actors can do with just a tiny sprinkling of data, it’s absolutely horrifying to think of what a company such as Facebook (which is currently developing its own BCIs) or a political influence machine like Cambridge Analytica could do with an AI system that knows how to skip someone’s conscious mind and appeal directly to the part of their brain that likes stuff.

You can read the whole paper here.

Published March 5, 2021 — 21:11 UTC





Researchers taught an AI to generate fake DNA

File this one under: While the rest of the world was busy trying to get virtual assistants to tell dad jokes, researchers in Estonia figured out how to Deepfake people at the molecular level.

We’ve seen a delightful onslaught of AI-generated content over the last few years from generative adversarial networks (GANs) attempting to create so-called “Deepfake” imagery.

There’s This Person Does Not Exist. And This Cat Does Not Exist. You can even generate feet, resumes, and waifus that do not exist. It’s astounding how realistic some of the imagery created out of thin air by AI can be.

But this is the first time we’ve seen an AI generate the recipe for a viable, unique human being via the creation of synthetic genomes.

A team of researchers from Estonia has developed a machine learning system capable of generating unique genome sequences. These computer-generated fakes could play a vital role in the future of DNA research.

Per the team’s research paper:

Generative neural networks have been effectively used in many different domains in the last decade, including machine dreamt photo-realistic imagery. In our work, we apply a similar concept to genetic data to automatically learn its structure and, for the first time, produce high quality realistic genomes.

Dubbed “artificial genomes,” these AI-created sequences are indistinguishable from actual human genomes, except that they are completely synthetic. This means researchers don’t have to contend with privacy concerns.

Under the current research paradigm, researchers have to safeguard DNA data to ensure the privacy of the humans it belongs to. This involves extra rigor and, in many cases, a drought of available data due to facilities’ inability to share their datasets. Synthetic genomes should go a long way toward solving these problems.
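In outline, the approach is a GAN over genotype data rather than pixels. A toy sketch of one adversarial step; the network sizes, the SNP count, and the Bernoulli sampling are assumptions for illustration, not the paper’s actual architecture:

```python
import torch
import torch.nn as nn

N_SNPS = 1000  # toy genome-segment length; the real work uses far more sites

# Generator: noise -> per-site allele probabilities.
G = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, N_SNPS), nn.Sigmoid(),
)
# Discriminator: scores whether a genotype vector looks like real cohort data.
D = nn.Sequential(
    nn.Linear(N_SNPS, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
)

# One discriminator step (real data would be 0/1 allele matrices from a cohort):
real = torch.randint(0, 2, (64, N_SNPS)).float()   # stand-in batch
fake = torch.bernoulli(G(torch.randn(64, 128)))    # sampled artificial genomes
loss_d = nn.functional.binary_cross_entropy_with_logits(
    D(real), torch.ones(64, 1)
) + nn.functional.binary_cross_entropy_with_logits(
    D(fake.detach()), torch.zeros(64, 1)
)
```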

Quick take: This is fantastic news for medical researchers and a clear case of GAN tech being used for good. But, this work does shine a light on some near and far future ethical conundrums we’re going to have to face one day.

In the near term, it’s going to get easier for bad actors to create fake personas that can stand up to even the most rigorous inspection. Not that we envision a scenario where a scam artist needs to provide a fake transcript of their genome, but the unknown unknowns are where security holes tend to grow the fastest.

In the far term… if Skynet had this GAN, it wouldn’t have had to make so many machines that look like Arnold Schwarzenegger. One day, this technology could reach a point where the philosophical question “am I a robot?” will be a very valid one for any given person to ask themselves.

Read the whole paper here.

Published February 8, 2021 — 22:07 UTC


