How Neurocompositional Computing Leads to a New Generation of AI Systems
The Dartmouth workshop in 1956 marked the birth of artificial intelligence as a field. Over the past decade, the field has gained momentum through deep learning. Recent advances in AI are often attributed to engineering progress that has greatly increased the available computing resources and training data. However, Microsoft researchers argue that the latest advances in AI are due not only to quantitative leaps in computing power, but also to qualitative changes in the way that computing power is deployed. These qualitative changes have led to a new type of computing that the researchers call neurocompositional computing.
Microsoft’s article, “Neurocompositional Computing: From the Central Paradox of Cognition to a New Generation of AI Systems,” discusses how neurocompositional computing can address AI challenges such as a lack of transparency and weak learning of general knowledge. According to the article, the new systems can learn in a more robust and understandable way than standard deep learning networks.
In neurocompositional computing, neural networks exploit two principles: compositionality and continuity. The principle of compositionality asserts that encodings of complex information are structures systematically composed from simpler structured encodings. A 2014 Stanford paper titled “Bringing Machine Learning and Compositional Semantics Together” argued that machine learning and compositional semantics are deeply united around the concepts of generalization, meaning, and structural complexity, and that learning-based theories of semantics bring the two worlds together. Compositionality characterizes the recursive nature of the linguistic ability required to generalize creatively. Learning details the conditions under which such an ability can be acquired from data. The principle of compositionality directs researchers to specific model structures, while machine learning provides them with methods for training those models.
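The idea that the encoding of a complex expression is composed systematically from the encodings of its parts can be illustrated with a toy example. The sketch below is only an illustration, not taken from either paper: the lexicon, the tuple-based tree format, and the function name `meaning` are all hypothetical choices made for this example.

```python
# Toy illustration of compositionality: the meaning of a complex expression
# is computed recursively from the meanings of its parts.
# LEXICON and the (left, operator, right) tree format are hypothetical.

LEXICON = {
    "two": 2,
    "three": 3,
    "four": 4,
    "plus": lambda a, b: a + b,
    "times": lambda a, b: a * b,
}

def meaning(tree):
    """Return the meaning of a word (a leaf) or of a (left, op, right) tree."""
    if isinstance(tree, str):
        return LEXICON[tree]          # simple encoding: look the word up
    left, op, right = tree
    # complex encoding: combine the meanings of the constituents
    return meaning(op)(meaning(left), meaning(right))

# The same words composed in two different structures give different meanings.
print(meaning(("two", "plus", ("three", "times", "four"))))   # 2 + (3 * 4)
print(meaning((("two", "plus", "three"), "times", "four")))   # (2 + 3) * 4
```

Because the function is recursive, it handles expressions of any depth built from the same small lexicon, which is the kind of creative generalization the principle describes.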
The continuity principle states that the encoding and processing of information are formalized with real numbers that vary continuously. Recent studies suggest that compositionality can be achieved not only through traditional methods of symbolic computation but also through new forms of continuous neural computation. A 2020 Stanford workshop on compositionality and computer vision detailed how recent computer vision work demonstrated that concepts can be learned from just a few examples using compositional representations. Compositionality allows symbolic clauses to express the hierarchical, tree-like structure of sentences in natural language. Current neural networks exhibit a unique functional form of compositionality that may be able to model the compositional character of cognition even if constituents are altered when they are composed into a complex expression.
Neural computing encodes information in numerical activation vectors, which together form a vector space. The activation vector that encodes an output results from propagating the activation that encodes an input through multiple layers of neurons, via connections of different strengths, or weights, the article says. In a typical neural network, the values of these weights are set by training the model on examples of correct input/output pairs. This allows the model to converge on connection weights that produce the correct output when an input is given.
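A minimal sketch of these two steps, propagation and training, is shown below. It is not the article's method: the tanh nonlinearity, the function names, the learning rate, and the example data are assumptions, and for brevity the training routine fits only a single linear neuron by gradient descent rather than backpropagating through several layers.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of its inputs, then tanh."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def forward(inputs, layers):
    """Propagate an input activation vector through successive layers."""
    activation = inputs
    for weights, biases in layers:
        activation = layer(activation, weights, biases)
    return activation

def train_linear(pairs, lr=0.1, epochs=500):
    """Fit one linear neuron y = w.x + b to input/output pairs (assumed setup)."""
    w, b = [0.0] * len(pairs[0][0]), 0.0
    for _ in range(epochs):
        for x, target in pairs:
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - target
            # move each weight a small step against the error gradient
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Hypothetical training pairs generated by y = 2*x1 - x2 + 0.5
pairs = [([0.0, 0.0], 0.5), ([1.0, 0.0], 2.5), ([0.0, 1.0], -0.5), ([1.0, 1.0], 1.5)]
w, b = train_linear(pairs)
print(w, b)  # the weights should approach [2, -1] and the bias 0.5
```

The weights start at zero and are nudged after every example, so the model converges on connection strengths that reproduce the training outputs.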
Neural computing also follows the principle of continuity. Here, knowledge about the information encoded in one vector automatically generalizes to similar information encoded in neighboring vectors, which results in generalization based on similarity. Continuity also allows deep learning to incrementally improve a model’s statistical inference of outputs from the inputs in its training set.
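Similarity-based generalization follows from the fact that a network computes a continuous function of its activation vectors: nearby vectors yield nearby outputs. The sketch below illustrates this under stated assumptions. The weights, the three example vectors, and the function names `cosine` and `readout` are hypothetical and chosen only for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def readout(weights, x):
    """A continuous function of the vector: a weighted sum squashed by tanh."""
    return math.tanh(sum(w * xi for w, xi in zip(weights, x)))

weights = [0.8, -0.3, 0.5]        # hypothetical weights
known = [1.0, 0.2, 0.4]           # a vector the model has knowledge about
neighbor = [0.95, 0.25, 0.4]      # a slightly different, unseen vector
distant = [-1.0, 0.9, -0.6]       # a dissimilar vector

for name, vec in [("neighbor", neighbor), ("distant", distant)]:
    similarity = cosine(known, vec)
    gap = abs(readout(weights, known) - readout(weights, vec))
    print(name, round(similarity, 3), round(gap, 3))
```

The neighboring vector is highly similar to the known one and receives almost the same output, while the distant vector does not.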
Is modern AI neurocompositional?
The two techniques prevalent in the 20th century, symbolic computing and non-neurocompositional neural computing, each violate one of the two principles. Human intelligence respects both.
However, convolutional neural networks (CNNs) and transformers show considerable potential for a breakthrough. CNN processing incorporates the principle of compositionality through spatial structure: at each layer, the analysis of the whole image comes from composing analyses of larger patches, which in turn are composed from the previous layer’s analyses of smaller patches. CNNs and transformers derive much of their power from this additional compositional structure: spatial structure in the case of CNNs and a type of graph structure in the case of transformers.
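The layer-by-layer composition of patch analyses can be sketched in a few lines. This is a simplified illustration rather than a real CNN: it assumes non-overlapping 2x2 patches, a single hand-picked kernel named `diag`, and a ReLU nonlinearity, whereas real networks learn many kernels and usually use overlapping patches.

```python
def compose_patches(grid, kernel):
    """Apply a 2x2 kernel to each non-overlapping 2x2 patch of a square grid.

    Each output cell is the analysis of one patch, so the output grid has
    half the width and height of the input grid.
    """
    size = len(grid)
    out = []
    for r in range(0, size, 2):
        row = []
        for c in range(0, size, 2):
            total = sum(
                kernel[i][j] * grid[r + i][c + j]
                for i in range(2)
                for j in range(2)
            )
            row.append(max(0.0, total))  # ReLU nonlinearity
        out.append(row)
    return out

# Hypothetical 4x4 image containing a diagonal line
image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
diag = [[1.0, -1.0], [-1.0, 1.0]]  # hand-picked kernel that responds to diagonals

layer1 = compose_patches(image, diag)    # analyses of four 2x2 patches
layer2 = compose_patches(layer1, diag)   # analysis of the whole 4x4 image
print(layer1)  # [[2.0, 0.0], [0.0, 2.0]]
print(layer2)  # [[4.0]]
```

The second layer never sees the pixels directly. Its single output is composed entirely from the first layer's analyses of smaller patches.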
The CNN and Transformer architectures fall under first-generation (1G) neurocompositional computing. Microsoft’s work aims to incorporate the principle of compositionality more fully by giving networks explicit capabilities for constructing and processing general, abstract, compositionally structured activation-vector encodings, while remaining within neural computing and therefore in accordance with the principle of continuity. This is second-generation (2G) neurocompositional computing.