Science 3 min read

New Deep Learning Strategy Could Enhance Computer Vision

Deep learning is at the core of all future technology projects. Now, researchers have created a new deep learning strategy which could improve the technology's ability exponentially.

Deep Learning is one of the most important technologies that is currently in development. Now, researchers may have made a new breakthrough. | Image By spainter_vfx | Shutterstock

Deep Learning is one of the most important technologies that is currently in development. Now, researchers may have made a new breakthrough. | Image By spainter_vfx | Shutterstock

A deep learning system takes textual hints from the context of images to describe them without the need for prior human annotations.

Since its humble beginnings at the turn of the millennium, deep learning, as both a scientific discipline and an industry, has come a long way.

From smartphone assistants to pattern recognition software, security solutions, and other applications, deep learning is becoming a multi-billion dollar business poised for great growth over the few next years.

Read More: Deep Learning Network can Diagnose Skin Cancer Better Than Dermatologists

However, for deep learning agents to reach their full potential, they have to “learn” how to learn on their own.

Herein lies the whole difference between supervised and unsupervised deep learning.

Self-Supervised Deep Learning

The power and appeal of deep learning is all about their ability to recognize different types of patterns like faces, voices, objects, images, and codes.

AI software doesn’t understand what these things really are, and all they see is digital data, and they’re pretty good at that.

The great computer vision capability of deep learning algorithms enable them to tell these things apart, categorize, and classify them.

To do so, however, this software needs to be supervised.

They require human manual input in the form of annotations to guide them before they generalize and build on what they learned into new, similar situations.

Building and labeling large datasets is a complicated and time-consuming task.

Unsupervised machines will be completely autonomous as all they need is data taken directly from their environment. From there, they would take the information to make predictions and yield the expected results.

To design unsupervised, or self-supervised deep learning systems, computer scientists take inspirations from how human intelligence works.

Now, an international team of computer vision scientists has devised a method to enable deep learning software to learn the visual features of images without the need for annotated examples.

Researchers from Carnegie Mellon University (U.S.), Universitat Autonoma de Barcelona (Spain), and the International Institute of Information Technology (India), worked on the study,

Unsupervised Computer Vision Algorithms, it’s a Matter of Semantics

In the study, the team built computational models that use textual information about images found on websites, like Wikipedia, and linked them to the visual features of these images.

“We aim to give computers the capability to read and understand textual information in any type of image in the real world,” said Dimosthenis Karatzas, a research team member.

In the next step, researchers used the models to train deep learning algorithms to pick adequate visual features that textually describe images.

Instead of labeled information about the content of a particular image, the algorithm takes non-visual cues from the semantic textual information found around the image.

“Our experiments demonstrate state-of-the-art performance in image classification, object detection, and multi-modal retrieval compared to recent self-supervised or naturally-supervised approaches,” wrote researchers in the paper.

This is not a fully unsupervised system as algorithms still need models to train on, but the technique shows that deep learning algorithms can tap into the internet to enhance their unsupervised learning abilities.

“We will continue our work on the joint-embedding of textual and visual information,” said Karatzas. “looking for novel ways to perform semantic retrieval by tapping on noisy information available in the Web and Social Media.”

Do you think it is a good idea to train AI with data from the Internet?

Found this article interesting?

Let Zayan Guedim know how much you appreciate this article by clicking the heart icon and by sharing this article on social media.


Profile Image

Zayan Guedim

Trilingual poet, investigative journalist, and novelist. Zed loves tackling the big existential questions and all-things quantum.

Comments (0)
Most Recent most recent
You
share Scroll to top

Link Copied Successfully

Sign in

Sign in to access your personalized homepage, follow authors and topics you love, and clap for stories that matter to you.

Sign in with Google Sign in with Facebook

By using our site you agree to our privacy policy.