Google brings cross-platform AI pipeline framework MediaPipe to the web
Roughly a year ago, Google open-sourced MediaPipe, a framework for building cross-platform AI pipelines consisting of fast inference and media processing (like video decoding). Basically, it’s a quick and dirty way to perform object detection, face detection, hand tracking, multi-hand tracking, hair segmentation, and other such tasks in a modular fashion, with popular machine learning frameworks like Google’s own TensorFlow and TensorFlow Lite.
“Since everything runs directly in the browser, video never leaves the user’s computer and each iteration can be immediately tested on a live webcam stream (and soon, arbitrary video),” explained MediaPipe team members Michael Hays and Tyler Mullen in a blog post.
Google leveraged the above-listed components to integrate preview functionality into a web-based visualizer, a sort of workspace for iterating over MediaPipe flow designs. The visualizer, which is hosted at viz.mediapipe.dev, enables developers to inspect MediaPipe graphs (descriptions of how a pipeline’s processing nodes connect to one another) by pasting graph code into the editor tab or uploading a file to the visualizer. Users can pan around and zoom into the graphical representation using a mouse and scroll wheel, and the visualization reacts to changes made within the editor in real time.
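To make the idea of a graph of processing nodes more concrete, here is a minimal sketch in plain Python. It is not MediaPipe's API or graph format: the `Node` class, the `topological_order` and `run_graph` helpers, and the stream names are all hypothetical, chosen only to show how nodes connected by named streams can be ordered and executed one after another on a single thread.

```python
class Node:
    """Hypothetical processing node: reads named input streams, writes named output streams."""

    def __init__(self, name, inputs, outputs, fn):
        self.name = name
        self.inputs = inputs      # names of the streams this node consumes
        self.outputs = outputs    # names of the streams this node produces
        self.fn = fn              # callable returning a tuple, one value per output


def topological_order(nodes, graph_inputs):
    """Order nodes so that each one runs after the nodes producing its inputs."""
    available = set(graph_inputs)
    pending = list(nodes)
    ordered = []
    while pending:
        ready = [n for n in pending if all(s in available for s in n.inputs)]
        if not ready:
            raise ValueError("cycle or missing input stream")
        for n in ready:
            ordered.append(n)
            available.update(n.outputs)
        pending = [n for n in pending if n not in ready]
    return ordered


def run_graph(nodes, packets):
    """Run every node once, sequentially, and return all stream values."""
    streams = dict(packets)
    for node in topological_order(nodes, list(streams)):
        args = [streams[s] for s in node.inputs]
        results = node.fn(*args)
        streams.update(zip(node.outputs, results))
    return streams


# Toy pipeline: average RGB pixels to gray, then threshold into a mask.
# Nodes are deliberately listed out of order; the graph sorts them.
nodes = [
    Node("threshold", ["gray"], ["mask"], lambda g: ([v > 100 for v in g],)),
    Node("to_gray", ["input_video"], ["gray"],
         lambda px: ([sum(p) // 3 for p in px],)),
]

result = run_graph(nodes, {"input_video": [(255, 255, 255), (0, 0, 0)]})
print(result["mask"])  # [True, False]
```

Declaring the pipeline as data, rather than hard-coding the call order, is what would let a tool such as a visualizer draw it and redraw it as the description changes.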
Hays and Mullen note that currently, web-based MediaPipe support is limited to the demo graphs supplied by Google. Developers must edit one of the template graphs — they can’t provide their own from scratch or add or alter assets. TensorFlow Lite inference isn’t supported, and the graph’s computations must be run on a single processor thread.
A lack of compute shaders (routines compiled for high-throughput accelerators) on the web is to blame for this last limitation, which Hays, Mullen, and team attempted to work around by using graphics cards for image operations where possible and the lightest-weight possible versions of all AI models. They plan to “continue to build upon this new platform” and to provide developers with “much more control” over time.