When it comes to detecting boundaries in images, neural network predictions lack the accuracy needed for several critical computer vision tasks, such as recognizing objects and their boundaries in a visual scene and segmenting objects.
That’s even though AI in general, and deep neural networks in particular, are pretty good at recognizing objects in digital images.
Now, a research team has tackled the challenge of visual boundary prediction, and their results are promising.
Boundary Prediction: STEAL to Make Neural Network Predictions More Accurate
A visual boundary in a natural image is the pixel contour that marks where one object ends and another starts. This is different from simple “edge” detection, which deals with abrupt changes in color or brightness.
Boundary detection has more to do with changes in other image features, such as how texture changes between image locations or within the same area. A densely textured image location could contain many edges but no defined boundaries.
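To make the difference concrete, here is a minimal sketch in Python with NumPy. It is only an illustration: the synthetic test image, the helper names, and the 0.25 threshold are all assumptions chosen for the example. It runs a basic gradient-based edge response over an image whose left half is a fine checkerboard texture and whose right half is flat gray.

```python
import numpy as np

def gradient_magnitude(image):
    """Per-pixel intensity gradient magnitude, a basic 'edge' response."""
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy)

def make_test_image(size=32):
    """Left half: checkerboard of 2x2 blocks. Right half: flat gray."""
    rows, cols = np.indices((size, size))
    image = np.full((size, size), 0.5)
    checker = (rows // 2 + cols // 2) % 2
    image[:, : size // 2] = checker[:, : size // 2]
    return image

image = make_test_image()
edges = gradient_magnitude(image) > 0.25  # threshold is an arbitrary choice

half = image.shape[1] // 2
# Skip the columns right next to the seam between the two halves.
textured_fraction = edges[:, : half - 1].mean()
flat_fraction = edges[:, half + 1 :].mean()

print(f"edge pixels inside the texture: {textured_fraction:.0%}")
print(f"edge pixels inside the flat area: {flat_fraction:.0%}")
```

Almost every pixel inside the textured half is flagged as an edge, even though no object starts or ends there. The only contour a boundary detector should report in this image is the seam between the two halves.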
Computer vision researchers from Nvidia, the University of Toronto, and the Vector Institute for Artificial Intelligence in Toronto have developed a framework to improve AI’s “semantic boundary prediction.”
Called STEAL (Semantically Thinned Edge Alignment Learning), the framework enables existing computer vision models to more accurately detect the pixel boundaries separating objects in a given image.

In a paper published last April, the authors explain their approach:
“… relevant datasets consist of a significant level of label noise, reflecting the fact that precise annotations are laborious to get and thus annotators trade-off quality with efficiency. We aim to learn sharp and precise semantic boundaries by explicitly reasoning about annotation noise during training. We propose a simple new layer and loss that can be used with existing learning-based boundary detectors. Our layer/loss enforces the detector to predict a maximum response along the normal direction at an edge, while also regularizing its direction.”
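The idea of keeping a maximum response along the normal direction can be illustrated with a small Python sketch. This is not the authors' layer or loss, which is trained as part of the network. It is a hard, non-learned analogue, and it assumes the normal direction at each pixel is already known. The function name `thin_along_normal` and the toy input are hypothetical.

```python
import numpy as np

def thin_along_normal(response, normal_angle):
    """Zero out every pixel that is not a local maximum along its normal.

    response:     2-D array of boundary scores.
    normal_angle: 2-D array of normal directions in radians (0 = along +x).
    """
    # Step to the nearest neighbouring pixel along the normal direction.
    dx = np.rint(np.cos(normal_angle)).astype(int)
    dy = np.rint(np.sin(normal_angle)).astype(int)

    height, width = response.shape
    rows, cols = np.indices(response.shape)

    def sample(sign):
        # Read the neighbour one step along (+1) or against (-1) the normal,
        # clamping at the image border.
        r = np.clip(rows + sign * dy, 0, height - 1)
        c = np.clip(cols + sign * dx, 0, width - 1)
        return response[r, c]

    is_peak = (response >= sample(+1)) & (response >= sample(-1))
    return np.where(is_peak, response, 0.0)

# Toy input: a thick vertical boundary, five pixels wide, peaking at column 4.
profile = np.array([0.0, 0.0, 0.2, 0.6, 1.0, 0.6, 0.2, 0.0, 0.0])
thick = np.tile(profile, (6, 1))
normals = np.zeros_like(thick)  # assumed: normal points along +x everywhere

thin = thin_along_normal(thick, normals)
print(np.count_nonzero(thin, axis=1))  # surviving pixels per row
```

In this toy case the five-pixel-wide response collapses to a single pixel per row, the one at the peak. The approach described in the paper pursues a similar sharpening effect through training rather than as a post-processing step.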
Experiments showed that STEAL improves over CASENet, a state-of-the-art semantic edge detection method, by more than 4%.
Beyond improving the performance of computer vision models, the method could also make labeling new data for deep neural network systems more efficient.
Accurate neural network predictions about the visual boundaries of objects can have many computer vision applications, such as object detection, image generation, and 3D modeling.