The relationship between neuroscience and artificial intelligence (AI) has always been a rich one: biology inspires artificial systems, which are in turn used to model it. The view of the central nervous system as a discrete set of interconnected neurons, through which information propagates as electrical impulses, lies at the origin of the first artificial neural networks.
Conceived as theoretical models of biological systems, Rosenblatt's perceptron (1957) and Fukushima's neocognitron (1980) model human visual perception as a sequence of very simple mathematical operations whose parameters are adjusted automatically by an unsupervised (and still primitive) learning process, at the end of which the network can tell two digits apart after seeing a sufficient number of examples.
Rosenblatt also proposed a learning algorithm for the perceptron in the supervised case, where the identity of the digits is available during learning, but real practical successes had to wait until the 1980s and the invention of convolutional networks by LeCun and colleagues. Apart from a few details, their architecture is the same as that of the neocognitron; what changed is the method of supervised training: the backpropagation technique, invented in the meantime, effectively minimizes the difference between the network's actual response and the expected response. Twenty years later, these networks would lead to the deep learning revolution.
With hindsight, the crucial advance of artificial neural networks, already taking shape in the perceptron, is their ability to learn the representation of the data (images, text, etc.) that they manipulate, whereas in “classical” pattern recognition this representation is defined “by hand” by a specialist, with learning limited to adjusting the parameters that allow the digits to be separated, for example.
Despite the resounding successes of deep learning, the supervised framework requires expensive manual annotation campaigns and, paradoxically, is unlikely to play a primary role in biological visual perception (who teaches a goat to recognize grass from examples?). The large language models of modern generative AI are even more data-hungry, but they benefit from several recent advances: self-supervised learning now makes it possible to learn their parameters by exploiting the internal consistency of the data, training the machine to predict words, or even entire sentences, hidden in the text, which allows gigantic corpora to be exploited without manual annotation.