How Google's New API Recognizes Objects in Videos

Google has just announced the Cloud Video Intelligence API. This one-of-a-kind machine learning system allows developers to search for specific objects inside video content, something that other APIs can only do with still images.

At the latest Cloud Next ’17 conference, held March 8 to 10 in San Francisco, Google made multiple announcements: the acquisition of the data science community Kaggle, new machine learning tools, Hangouts as a business tool, extensions for Gmail, and many others.

Cloud Video Intelligence API

Most importantly, for those keeping tabs on AI development, Google announced the launch of a new API toolkit that allows developers to use machine learning to scour video content to find searchable objects.

The Video Intelligence API builds on advances in machine learning to recognize objects and elements in a video. Because its models draw on a larger pool of data, users can search inside video content directly with keywords, including nouns (cat, flower) and even verbs (run, draw). Developers can also filter video content with automatic learning models that identify items accurately.
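To make the idea of keyword search over video concrete, here is a minimal sketch of how detected labels could be queried once they exist as metadata. The `LabelSegment` structure, the `search_labels` helper, and the sample values are hypothetical illustrations, not the API's response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabelSegment:
    """One detected label and the time span (in seconds) where it appears."""
    video_id: str
    label: str
    start_s: float
    end_s: float
    confidence: float


def search_labels(segments, keyword, min_confidence=0.5):
    """Return segments whose label matches the keyword, best matches first."""
    wanted = keyword.strip().lower()
    hits = [
        s for s in segments
        if s.label.lower() == wanted and s.confidence >= min_confidence
    ]
    return sorted(hits, key=lambda s: s.confidence, reverse=True)


# Made-up annotations, for illustration only.
SAMPLE = [
    LabelSegment("clip-1", "cat", 0.0, 12.5, 0.94),
    LabelSegment("clip-1", "flower", 3.0, 8.0, 0.71),
    LabelSegment("clip-2", "run", 1.0, 4.0, 0.88),
    LabelSegment("clip-2", "cat", 10.0, 15.0, 0.42),
]

for hit in search_labels(SAMPLE, "cat"):
    print(f"{hit.video_id}: {hit.label} from {hit.start_s}s to {hit.end_s}s")
```

The confidence threshold is a design choice in this sketch: raising it trades recall for fewer false matches.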

Object Recognition Beyond Static Images

Now in private beta, Google’s Video Intelligence API allows developers to create apps that automatically identify elements and describe what is happening in a video. The technique is likely to be integrated into large media platforms or used by companies to tag and filter their video content with specific metadata.

YouTube, for example, could refine search results by tapping into metadata generated by object recognition rather than relying only on the descriptions and keywords attached to videos, which can be misleading when they don’t quite match the query.

Data-Rich Environment

Google requires that video content be uploaded to Google Cloud before the Video Intelligence API can be used. This is a huge data mining opportunity for Google and creates a wealth of material for its machine learning platform to draw on.
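As a rough illustration of the upload requirement, the sketch below builds the body of an annotation request that points at a video already stored in Google Cloud Storage. It assumes the present-day v1 REST interface (the `videos:annotate` endpoint with `inputUri` and `features` fields), which may differ from the private beta described here. The `build_annotate_request` helper and the bucket name are hypothetical, and nothing is actually sent.

```python
import json

# Assumed endpoint of the present-day v1 REST interface; the private beta may differ.
ANNOTATE_URL = "https://videointelligence.googleapis.com/v1/videos:annotate"


def build_annotate_request(input_uri, features=("LABEL_DETECTION",)):
    """Build a request body for a video that already lives in Cloud Storage."""
    # Hypothetical guard mirroring the upload requirement described in the article.
    if not input_uri.startswith("gs://"):
        raise ValueError("video must be in Cloud Storage (gs://bucket/object)")
    return {"inputUri": input_uri, "features": list(features)}


# Example bucket and object names are placeholders.
body = build_annotate_request("gs://example-bucket/holiday.mp4")
print("POST", ANNOTATE_URL)
print(json.dumps(body, indent=2))
```

A real call would also need authentication and would return a long-running operation to poll for results.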

Google now bundles a large portfolio of its products for professionals and developers within its Google Cloud platform. This includes collaboration and productivity apps (G Suite), API sets, and related services such as hosted databases, big data tools, machine learning, and network optimization.

The new Video Intelligence API joins Google Cloud’s other machine learning APIs (Vision, Speech, Natural Language and Jobs) and pushes object recognition technology further, since until now it has been limited to analyzing static images.

As a developer, you can sign up and participate in the beta test of Google’s Video Intelligence API or try a demo here.