Google MediaPipe Offers ML-Based Hand-Tracking Tools

Introducing an advanced AI framework for hand, face, pose, and object tracking


Published: July 12, 2022

Rory Greener

Google debuted MediaPipe, an artificial intelligence (AI) framework that detects people and objects in 3D space, in 2019. The machine learning (ML) solution accurately tracks targets to transform 2D media into 3D spatial data.

The ML service increases the accuracy of extended reality (XR) immersive experiences by tracking real-world assets and incorporating rich AI data that predicts movement.

Google designed it as a hardware-agnostic solution, meaning XR developers can distribute their MediaPipe projects on Android and iOS devices, as well as on desktops and head-mounted displays (HMDs).

Key Features

The service delivers end-to-end acceleration, so ML inference and processing in MediaPipe projects run quickly even on standard XR hardware.

The free, open-source Google solution is released under the Apache 2.0 licence, which lets developers extend and fully customize it for their XR projects. MediaPipe also works across cloud, web, and Internet of Things (IoT) platforms for various use cases, such as advertising and construction.

The service contains a wealth of tools for creating custom augmented and mixed reality (AR/MR) content. MediaPipe provides developers with 16 solutions across Android, iOS, C++, Python, JavaScript, and Coral, although not every solution is available on every platform.

MediaPipe’s key XR features include:

  • Face Detection
  • Face Mesh
  • Iris Detection
  • Hand Detection
  • Pose Detection
  • Holistic Detection
  • Selfie Segmentation
  • Hair Segmentation
  • Object Detection
  • Box Tracking
  • Instant Motion Tracking
  • Objectron (Object Box Tracking)
  • KNIFT
  • AutoFlip
  • MediaSequence
  • YouTube 8M

The service also includes an ultrafast face detection solution that returns six landmarks for tracking specific facial features and supports multiple faces, so it can follow several users simultaneously.
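
As a rough illustration of how an application might consume the six landmarks for several faces at once, the sketch below converts keypoints into pixel positions. It assumes the detector reports each keypoint as normalized (x, y) values between 0 and 1, and that the six points follow the order listed in `KEYPOINT_NAMES`. The helper names `to_pixels` and `label_faces` are hypothetical and not part of MediaPipe.

```python
from typing import Dict, List, Sequence, Tuple

# Assumed order of the six face-detection keypoints (an assumption for this sketch).
KEYPOINT_NAMES = (
    "right_eye",
    "left_eye",
    "nose_tip",
    "mouth_center",
    "right_ear_tragion",
    "left_ear_tragion",
)


def to_pixels(norm_xy: Tuple[float, float], width: int, height: int) -> Tuple[int, int]:
    """Convert one normalized (x, y) keypoint to pixel coordinates, clamped to the image."""
    x, y = norm_xy
    px = min(max(int(x * width), 0), width - 1)
    py = min(max(int(y * height), 0), height - 1)
    return px, py


def label_faces(
    faces: Sequence[Sequence[Tuple[float, float]]], width: int, height: int
) -> List[Dict[str, Tuple[int, int]]]:
    """Return one {keypoint name: pixel position} dictionary per detected face."""
    labelled = []
    for keypoints in faces:
        if len(keypoints) != len(KEYPOINT_NAMES):
            raise ValueError("expected six keypoints per face")
        labelled.append(
            {name: to_pixels(pt, width, height) for name, pt in zip(KEYPOINT_NAMES, keypoints)}
        )
    return labelled
```

An AR overlay could then anchor artwork to, for example, the `nose_tip` entry of each face in the returned list.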

The detection technology creates a detailed face mesh that accurately captures facial data, such as gestures and expressions, for animated AR overlays. Google's face mesh solution builds on BlazeFace, a lightweight face detector optimized for mobile GPU inference.

Object Detection and Prediction

MediaPipe supplies developers with tools to track objects within an XR environment, and developers can use its Visualizer tool to manage the core spatial data surrounding a real-world object.

MediaPipe provides XR developers with two tools to enhance 3D object tracking: KNIFT and Objectron.

Objectron maps an object in a 2D image to real-time 3D (RT3D) data points. The tool then predicts the pose of the tracked object using ML algorithms.
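
To make the relationship between a 3D box and a 2D image concrete, here is a minimal sketch of the reverse direction: projecting the eight corners of a posed 3D box back onto the image with a simple pinhole camera. It is not Objectron's implementation. The pose format (a 3x3 rotation matrix, a translation, and a box size), the camera convention (positive z pointing away from the camera), and every function name are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def box_corners(scale: Vec3) -> List[Vec3]:
    """Eight corners of an axis-aligned box of the given size, centred on the origin."""
    sx, sy, sz = (s / 2.0 for s in scale)
    return [
        (dx * sx, dy * sy, dz * sz)
        for dx in (-1, 1)
        for dy in (-1, 1)
        for dz in (-1, 1)
    ]


def transform(point: Vec3, rotation: Sequence[Sequence[float]], translation: Vec3) -> Vec3:
    """Apply the object pose (rotation, then translation) to a point."""
    x, y, z = (
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    return (x, y, z)


def project(point: Vec3, focal: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-space point onto the image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def project_box(
    scale: Vec3,
    rotation: Sequence[Sequence[float]],
    translation: Vec3,
    focal: float,
    cx: float,
    cy: float,
) -> List[Tuple[float, float]]:
    """Project all eight corners of a posed 3D box into pixel coordinates."""
    return [
        project(transform(corner, rotation, translation), focal, cx, cy)
        for corner in box_corners(scale)
    ]
```

Drawing lines between the projected corners gives the familiar wireframe box overlaid on a tracked object.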

The ML pipeline also accounts for outdoor environments. Objectron can track entire external spaces based on 2D video footage.

An example of an Objectron Outdoor Data Map. Photo: Google

MediaPipe also offers Keypoint Neural Invariant Feature Transform (KNIFT) to boost object detection. The tool is an ML system that matches the information displayed on a real-world object, such as a stop sign, and relays that information as spatial data.

KNIFT enables MediaPipe projects to understand the information on an object despite difficult-to-read angles or lighting.
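
The general idea of matching features between a reference object and a camera frame can be sketched with a plain nearest-neighbour search and a ratio test. This is a generic technique shown for illustration, not KNIFT's own matching code. The descriptors are hypothetical short vectors, and `match_descriptors` and its `ratio` parameter are names invented for this example.

```python
import math
from typing import List, Sequence, Tuple


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(
    query: Sequence[Sequence[float]],
    template: Sequence[Sequence[float]],
    ratio: float = 0.8,
) -> List[Tuple[int, int]]:
    """Match each query descriptor to its nearest template descriptor.

    A match is kept only when the best candidate is clearly closer than the
    second best (ratio test), which discards ambiguous matches.
    """
    matches = []
    for qi, q in enumerate(query):
        ranked = sorted((distance(q, t), ti) for ti, t in enumerate(template))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, best_idx), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((qi, best_idx))
    return matches
```

The surviving index pairs link points in the camera frame to points on the reference object, which is the kind of correspondence a tracker needs to locate a sign seen from an awkward angle.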

Building a Business Use Case for MediaPipe

During its development, Google integrated MediaPipe into various internal projects. Last year, the Mountain View-based firm's Arts & Culture application for iOS and Android smartphones incorporated the MediaPipe toolkit to debut Art Filter, which allows users to wear historical objects such as jewellery or transform themselves into classic art pieces.

Furthermore, in April last year, Google announced that SignAll, a firm whose technology accurately tracks and reads sign language, had built its developer kit on MediaPipe. SignAll debuted Ace ASL: Learn Fingerspelling on the iOS App Store as a free American Sign Language (ASL) learning application powered by MediaPipe.

 

 
