
Researchers Develop New Method to Shrink Deep Neural Networks

Seanbatty / Pixabay


Researchers from MIT have found a way to shrink deep neural networks, making AI systems faster and more straightforward to train. In a paper from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), the team proposes a new approach that takes advantage of the smaller subnetworks hidden within larger neural networks.

According to the MIT researchers, these subnetworks can be ten times smaller than the entire neural network, yet they can be trained to make predictions just as accurate as the original network's. What's more, in some cases the subnetworks can perform their tasks far faster than the originals.

Jonathan Frankle, a Ph.D. student at MIT and one of the study's co-authors, said in a statement:

“With a neural network, you randomly initialize this large structure, and after training it on a huge amount of data it magically works. This large structure is like buying a big bag of tickets, even though there’s only a small number of tickets that will actually make you rich.”

Shrinking Deep Neural Networks

The team's method of shrinking deep neural networks involves eliminating connections that are not essential to the network's function, so the models can run on low-powered devices. This process is commonly referred to as "pruning."

The researchers select the connections with the lowest weights, or importance, and prune them away. They then reset the remaining weights and train the resulting subnetwork. Over successive rounds, they continue to remove more connections to determine how many can be eliminated without affecting the predictive capability of the AI model.

According to the team, the process was repeated tens of thousands of times across different network models to determine how far the overall network structure could be reduced while preserving its predictive performance.
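For readers who want a concrete picture of how iterative pruning works, the sketch below outlines the general idea in PyTorch: train the network, prune the smallest-magnitude weights, reset the surviving weights to their initial values, and repeat. The toy model, data, and hyperparameters are illustrative assumptions, not the researchers' actual experimental setup.

```python
# A minimal sketch of iterative magnitude pruning with weight resetting,
# in the spirit of the procedure described above. The model, toy data,
# and hyperparameters are illustrative assumptions, not the authors' setup.
import copy
import torch
import torch.nn as nn

def train(model, data, masks, epochs=2):
    """Toy training loop; masks are re-applied after each step so pruned
    connections stay at zero."""
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            with torch.no_grad():
                for name, param in model.named_parameters():
                    if name in masks:
                        param.mul_(masks[name])

def prune_lowest(model, masks, fraction=0.2):
    """Zero out the lowest-magnitude weights that are still unpruned."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name not in masks:
                continue
            alive = param[masks[name].bool()].abs()
            k = max(1, int(fraction * alive.numel()))
            threshold = alive.kthvalue(k).values
            masks[name][param.abs() <= threshold] = 0.0
            param.mul_(masks[name])

# Toy stand-ins for a real network and dataset.
model = nn.Sequential(nn.Linear(784, 300), nn.ReLU(), nn.Linear(300, 10))
data = [(torch.randn(256, 784), torch.randint(0, 10, (256,)))]
initial_state = copy.deepcopy(model.state_dict())   # the original random init
masks = {n: torch.ones_like(p) for n, p in model.named_parameters() if "weight" in n}

for _ in range(5):                        # a few prune/reset/retrain rounds
    train(model, data, masks)             # train the (partially pruned) network
    prune_lowest(model, masks)            # drop the lowest-magnitude connections
    model.load_state_dict(initial_state)  # reset surviving weights to initial values
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])   # re-apply the mask after the reset
```

The reset step is what distinguishes this approach from ordinary pruning: the surviving connections go back to the very same initial values they started with, rather than keeping their trained values or being re-randomized.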

The researchers reported that the subnetworks they identified were consistently less than 10 to 20 percent of the size of their fully connected parent networks. Michael Carbin, an MIT assistant professor and co-author of the study, was quoted as saying:

“It was surprising to see that re-setting a well-performing network would often result in something better. This suggests that whatever we were doing the first time around wasn’t exactly optimal and that there’s room for improving how these models learn to improve themselves.”

Read More: How Neural Networks Cracked Quantum Error Correction

