Doctoral thesis

Systematic generalization in connectionist models

  • 2023

PhD: Università della Svizzera italiana

English
In recent years, neural networks (NNs) have revolutionized computer science, solving many problems that were out of reach of classical methods. Thanks to their flexibility, they can process raw data such as images, audio, or text, and they can defeat humans in games. However, a critical challenge remains: they often fail on test data that follow the same underlying rules as the training data but differ superficially, for example through longer inputs or unseen word combinations. Generalization to such structurally related data is called systematic generalization. Analysis suggests that NNs often learn a smart interpolation between their training data points and rarely learn a generally applicable, rule-based solution. This limits both their applicability and their trustworthiness, which makes systematic generalization of utmost importance. This work consists of multiple parts. First, we improve the performance of differentiable neural computers on algorithmic and reasoning tasks. Then we analyze the implicit modularity of neural networks and show that it does not support compositionality. Motivated by compositionality, we introduce architectural changes to transformers that significantly boost generalization on multiple well-known datasets. Pushing this idea further, we introduce the purely connectionist NDR architecture, which can generalize to longer inputs on algorithmic tasks. Then we shift our focus to systematicity and propose a new dataset to analyze the behavior of the model. Finally, we focus on scaling up NDRs to real-world tasks and on improving Mixture of Experts models so that they match the performance of parameter-equivalent dense baselines. We hope that the high-level ideas outlined in this thesis can provide guidance for further research aiming to achieve compositional generalization.
Language
  • English
Classification
Computer science and technology
License
License undefined
Open access status
green
Identifiers
Persistent URL
https://n2t.net/ark:/12658/srd1326205
Statistics

Document views: 83
File downloads:
  • 2023INF013.pdf: 127