Towards unsupervised multi-object perception in neural networks

Greff, Klaus

Doctoral thesis

Schmidhuber, Juergen (Degree supervisor)

2022

PhD: Università della Svizzera italiana

By decomposing the world in terms of objects, humans are able to recombine their existing knowledge in a virtually unbounded number of ways to understand unfamiliar situations, make novel inferences, or generate new behavior. This ability to form meaningful entities from unstructured sensory information is of central importance for our impressive ability to generalize far beyond our direct experience. Contemporary neural networks still fall short of human-level generalization, which we argue is due to their inability to dynamically and flexibly bind information that is distributed throughout the network. This binding problem affects their capacity to acquire a compositional understanding of the world in terms of symbol-like entities (like objects), which is crucial for generalizing in predictable and systematic ways. We focus in particular on the process of perceptually grouping raw sensory inputs into meaningful objects. Importantly, we aim to enable neural networks to learn about objects in an unsupervised fashion, because the required scope and flexibility render adequate supervision or engineering infeasible. To that end, we propose a functional definition of objects in terms of predictive modularity, and use it to derive a formalization of perceptual grouping as a particular form of clustering. We demonstrate the feasibility of this approach by developing several neural network models that learn to segment and represent meaningful objects without supervision. Using simple synthetic datasets, we show that these representations are useful for prediction and semi-supervised classification tasks, and that they facilitate certain kinds of systematic generalization. The resulting representations are also more interpretable than non-object-centric representations.
We believe that a compositional approach to AI, in terms of grounded symbol-like representations, is of fundamental importance for realizing human-level generalization, and we hope that this thesis may contribute towards that goal.

Collections

USI Faculty of Informatics

Language

English

Classification

Computer science and technology

License

License undefined

Open access status

green

Identifiers

NDP-USI 2022INF014
URN urn:nbn:ch:rero-006-121044
ARK ark:/12658/srd1324956

Persistent URL

https://n2t.net/ark:/12658/srd1324956

Statistics

Document views: 210

File downloads (2022INF014): 527

Artificial neural networks

Unsupervised learning

Representation learning

Binding problem

Compositionality

Systematicity

Objects

Neuro-symbolic AI

Instance segmentation
