Artificial intelligence tool turns audio into video

Source – digitaljournal.com

Washington – A new artificial intelligence tool can create realistic videos from audio files alone. This technology, developed at the University of Washington, has been tested on speeches made by former President Obama.

The technology is based on newly developed algorithms designed to overcome a limitation in ‘computer vision’: turning audio clips into realistic, lip-synced videos of the person speaking the words. The algorithms learn from videos that exist “in the wild”, such as on the Internet or elsewhere.

Doing so involved two steps. The first was training a neural network (a collection of connected units called artificial neurons) to view videos of an individual and then translate different audio sounds into basic mouth shapes. The second was using a new mouth synthesis technique to realistically superimpose mouth shapes and textures onto an existing reference video of a given person.
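The first step, translating audio sounds into basic mouth shapes, can be illustrated with a toy sketch. This is not the researchers' method: they trained a neural network, whereas the sketch below swaps in a much simpler nearest-centroid classifier so that it runs with the Python standard library alone. The two-number audio features, the three mouth-shape labels and all function names are hypothetical and chosen purely for illustration.

```python
from math import dist
from statistics import fmean


def train_centroids(frames):
    """Average the audio features seen for each mouth shape.

    frames: list of (feature_vector, mouth_shape_label) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    grouped = {}
    for features, shape in frames:
        grouped.setdefault(shape, []).append(features)
    return {
        shape: tuple(fmean(column) for column in zip(*vectors))
        for shape, vectors in grouped.items()
    }


def predict_shape(centroids, features):
    """Pick the mouth shape whose centroid is closest to the features."""
    return min(centroids, key=lambda shape: dist(centroids[shape], features))


def audio_to_mouth_shapes(centroids, audio_frames):
    """Translate a sequence of audio frames into a sequence of mouth shapes."""
    return [predict_shape(centroids, frame) for frame in audio_frames]


# Hypothetical training data: made-up 2-D audio features per video frame,
# each paired with the mouth shape observed in that frame.
TRAINING_FRAMES = [
    ((0.05, 0.10), "closed"),
    ((0.10, 0.05), "closed"),
    ((0.90, 0.20), "open"),
    ((0.80, 0.30), "open"),
    ((0.40, 0.90), "rounded"),
    ((0.50, 0.80), "rounded"),
]

centroids = train_centroids(TRAINING_FRAMES)

# New audio with no video: predict one mouth shape per frame.
new_audio = [(0.07, 0.08), (0.85, 0.25), (0.45, 0.85)]
print(audio_to_mouth_shapes(centroids, new_audio))
# → ['closed', 'open', 'rounded']
```

In the real system the predicted mouth shapes would then feed the second step, where they are rendered and blended into reference footage. That step is not attempted here.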

Individual brain cells within a neural network are highlighted in this image obtained by CMU’s Sandra Kuhlman using a fluorescent imaging technique
Carnegie Mellon University

To test the technology, the research group generated a realistic video of Barack Obama discussing such diverse subjects as terrorism, fatherhood and employment. The video was created from audio clips combined with separate existing footage of the former president. It overcomes a major problem with adding audio to video, where the mouth of the speaker appears unrealistic.

Discussing the outcome, lead researcher Professor Ira Kemelmacher-Shlizerman enthused: “These type of results have never been shown before.” Achieving this required an artificial intelligence algorithm capable of learning and anticipating the intricate patterns of human speech. Obama was chosen for the project because of the sheer volume of available recordings.

President Barack Obama addresses citizens at a town hall meeting in Santa Monica, California, October 9, 2014
White House

The technology will be presented at SIGGRAPH 2017 in August. A paper describing the technology, titled “Synthesizing Obama: Learning Lip Sync from Audio”, has been produced.

What does this technology offer businesses?

The advantages to businesses are considerable. High-quality audio recordings could be made and later turned into videos of a higher resolution than would be possible using a standard camera. The same could be done with archival sound recordings, an area that may appeal to the entertainment industry. Imagine, for example, being able to hold a conversation with a historical figure in virtual reality by creating visuals just from audio.

What could this mean for you?

For consumers, video chat tools like Skype, Google Hangouts or Messenger will enable any person to collect videos that could be used to train computer models. A further appeal to businesses is that, since streaming audio over the Internet requires much less bandwidth than video, the new software could put an end to video chats that ‘time out’ as a result of poor connections. Often with ‘video chats’ the audio is good but the video is poor, which frustrates many business professionals and hampers attempts by businesses to reduce the number of meetings by ‘going digital’. The process can also be reversed, that is, by feeding video into the network instead of just audio.
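To give a rough sense of the bandwidth gap between audio and video, here is a back-of-the-envelope calculation. The two bitrates are illustrative assumptions, not figures from the research or from any particular chat service. Real values vary widely with codec, resolution and network conditions.

```python
def megabytes_per_minute(kilobits_per_second):
    """Convert a stream bitrate in kbit/s to data transferred per minute in MB."""
    return kilobits_per_second * 60 / 8 / 1000


# Assumed, illustrative bitrates (not taken from the article or the research).
AUDIO_KBPS = 32     # hypothetical voice-quality audio stream
VIDEO_KBPS = 1500   # hypothetical video-call stream

audio_mb = megabytes_per_minute(AUDIO_KBPS)
video_mb = megabytes_per_minute(VIDEO_KBPS)

print(f"audio: {audio_mb:.2f} MB per minute")
print(f"video: {video_mb:.2f} MB per minute")
print(f"video needs about {VIDEO_KBPS / AUDIO_KBPS:.0f}x the bandwidth")
```

Under these assumed numbers, the audio stream uses roughly 0.24 MB per minute against about 11.25 MB for video. That gap is why sending only audio and synthesizing the picture at the receiving end could be attractive on poor connections.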

 

 

 
