Google proposes new metrics for evaluating AI-generated audio and video quality

Source: venturebeat.com

What’s the best way to measure the quality of media generated from whole cloth by AI models? It’s not easy. One of the most popular metrics for images is the Fréchet Inception Distance (FID), which takes photos from both the target distribution and the model being evaluated and uses an AI object recognition system to capture important features and suss out similarities. But although several metrics for synthesized audio and video have been proposed, none has yet been widely adopted.

That’s why researchers from Google are throwing their hats into the ring with what they call the Fréchet Audio Distance (FAD) and Fréchet Video Distance (FVD), which measure the holistic quality of synthesized audio and video, respectively. The researchers claim that unlike peak signal-to-noise ratio, the structural similarity index, or other metrics that have been proposed, FVD looks at videos in their entirety. As for FAD, they say it’s reference-free and can be used on any type of audio, in contrast to metrics like source-to-distortion ratio (SDR) that require time-aligned ground truth signals.

“Access to robust metrics for evaluation of generative models is crucial for measuring (and making) progress in the fields of audio and video understanding, but currently no such metrics exist,” wrote software engineers Kevin Kilgour and Thomas Unterthiner in a blog post. “Clearly, some [generated] videos shown below look more realistic than others, but can the differences between them be quantified?”

As it turns out: Yes. In an FAD evaluation, the separation between the distributions of two sets of audio samples (generated and real) is measured. As the magnitude of distortions increases, the overlap between the distributions correspondingly decreases, indicating that the synthetic samples are relatively low in quality.
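To make the idea of "separation between distributions" concrete, here is a minimal Python sketch, not the researchers' implementation. It fits a Gaussian to each set of feature vectors and computes the Fréchet distance between the two Gaussians. Two assumptions are made for brevity: the feature dimensions are treated as independent (diagonal covariance), which turns the general trace term involving a matrix square root into a per-dimension sum, and the "embeddings" are random stand-ins rather than features from a pretrained model. The names `frechet_distance_diag` and `distort` are hypothetical.

```python
import math
import random
from statistics import fmean, pvariance


def gaussian_stats(embeddings):
    """Per-dimension mean and variance of a list of equal-length vectors."""
    columns = list(zip(*embeddings))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]
    return means, variances


def frechet_distance_diag(real, generated):
    """Squared Frechet distance between two diagonal-covariance Gaussians.

    Assumption: feature dimensions are independent, so the covariance
    term reduces to a sum over (std_real - std_generated)^2.
    """
    mu_r, var_r = gaussian_stats(real)
    mu_g, var_g = gaussian_stats(generated)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    cov_term = sum(
        (math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_r, var_g)
    )
    return mean_term + cov_term


def distort(embeddings, amount, rng):
    """Hypothetical distortion: shift and spread every feature by `amount`."""
    return [[x + amount * rng.gauss(1.0, 1.0) for x in vec] for vec in embeddings]


rng = random.Random(0)
# Stand-in "embeddings": 2000 random 8-dimensional vectors.
real = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(2000)]
mild = distort(real, 0.1, rng)
heavy = distort(real, 1.0, rng)

print("real vs. mild distortion: ", frechet_distance_diag(real, mild))
print("real vs. heavy distortion:", frechet_distance_diag(real, heavy))
```

In this toy setup the heavier distortion yields the larger distance, which mirrors the behavior described above: more distortion, less overlap, higher score.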

To evaluate how closely FAD and FVD track human judgement, Kilgour, Unterthiner, and colleagues performed a large-scale study involving human evaluators. Here, the evaluators were tasked with examining 10,000 video pairs and 69,000 5-second audio clips. For the FAD, specifically, they were asked to compare the effect of two different distortions on the same audio segment, and both the pair of distortions that they compared and the order in which they appeared were randomized. The collected set of pairwise evaluations was then ranked using a model that estimates a worth value for each parameter configuration.
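The article does not name the ranking model. As an illustration only, the sketch below uses the Bradley-Terry model, one common way to turn pairwise judgments into a worth value per item, fitted with a simple iterative update. The distortion names, the comparison counts, and the convention that the "winner" is the distortion judged to sound better are all made up for the example.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=200):
    """Estimate worth values from (winner, loser) pairs.

    Returns a dict mapping each item to a worth; worths sum to 1.
    Assumption: Bradley-Terry is used here as a stand-in for the
    unnamed model in the article.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        items.update((winner, loser))

    worth = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if j == i:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0)
                if n_ij:  # only pairs that were actually compared
                    denom += n_ij / (worth[i] + worth[j])
            updated[i] = wins[i] / denom if denom else worth[i]
        total = sum(updated.values())
        worth = {item: value / total for item, value in updated.items()}
    return worth


# Hypothetical judgments: (distortion judged better, distortion judged worse).
comparisons = (
    [("mild_noise", "reverb")] * 7 + [("reverb", "mild_noise")] * 3
    + [("reverb", "heavy_clipping")] * 8 + [("heavy_clipping", "reverb")] * 2
    + [("mild_noise", "heavy_clipping")] * 9 + [("heavy_clipping", "mild_noise")] * 1
)

worth = bradley_terry(comparisons)
for name, value in sorted(worth.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")
```

Worth values estimated this way can then be compared against a metric such as FAD to see how closely the metric tracks the human ranking.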

The team asserts that a comparison of the worth values to the FAD demonstrates that the FAD correlates “quite well” with human judgement.

“We are currently making great strides in generative [AI] models,” said Kilgour and Unterthiner. “FAD and FVD will help us [keep] this progress measurable and will hopefully lead us to improve our models for audio and video generation.”
