In the machine learning world, the sizes of artificial neural networks — and their outsize successes — are creating conceptual conundrums. When a network named AlexNet won an annual image recognition competition in 2012, it had about 60 million parameters. These parameters, fine-tuned during training, allowed AlexNet to recognize images that it had never seen before. Two years later, a network named VGG wowed the competition with more than 130 million such parameters. Some artificial neural networks, or ANNs, now have billions of parameters.
These massive networks — astoundingly successful at tasks such as classifying images, recognizing speech and translating text from one language to another — have begun to dominate machine learning and artificial intelligence. Yet they remain enigmatic: the source of their remarkable power is still not well understood.
But a number of researchers are showing that idealized versions of these powerful networks are mathematically equivalent to older, simpler machine learning models called kernel machines. If this equivalence can be extended beyond idealized neural networks, it may explain how practical ANNs achieve their astonishing results.
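To make the term "kernel machine" concrete, the sketch below shows one of the simplest examples: kernel ridge regression with a Gaussian (RBF) kernel on toy data. A kernel machine predicts by comparing a new input to stored training examples through a fixed similarity function (the kernel) and summing the weighted comparisons. The kernel choice, the bandwidth and ridge values, and the synthetic sine-wave data are all illustrative assumptions, not details from the research described above, where the kernel that matches an idealized network is derived from the network itself rather than chosen by hand.

```python
# A minimal sketch of a kernel machine: kernel ridge regression with a
# Gaussian (RBF) kernel on a toy 1-D regression problem. The bandwidth,
# ridge strength and data are illustrative choices, not values from the
# research discussed in the article.
import numpy as np

def rbf_kernel(a, b, bandwidth=0.5):
    """Gram matrix K[i, j] = exp(-||a_i - b_j||^2 / (2 * bandwidth^2))."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

# Toy training data: noisy samples of a sine wave.
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(40, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(40)

# "Training" a kernel machine is a linear solve: find coefficients alpha
# such that (K + ridge * I) alpha = y.
ridge = 1e-3
K = rbf_kernel(x_train, x_train)
alpha = np.linalg.solve(K + ridge * np.eye(len(x_train)), y_train)

# A prediction at a new point is a weighted sum of kernel similarities to
# the training points -- the defining form of a kernel machine.
x_test = np.linspace(-3.0, 3.0, 5).reshape(-1, 1)
y_pred = rbf_kernel(x_test, x_train) @ alpha
print(np.round(y_pred, 3))
```

The key point is that training such a model reduces to solving a linear system, which is why a kernel-machine description of an idealized neural network is so much easier to analyze than the nonconvex optimization used to train a deep network directly.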