Some aspects of the world around us are captured in natural language and refer to high-level semantic variables, which often play a causal role (referring to agents, objects, and actions or intentions). These high-level variables also seem to satisfy very particular characteristics that low-level data (such as images or sounds) do not share, and our work aims to clarify these characteristics in the form of priors that can guide the design of machine learning systems. Because these priors concern not only the joint distribution of the high-level variables (e.g., that it has a sparse factor graph) but also how that distribution changes (typically through causal interventions), this analysis may also help build machine learning systems that generalize better out-of-distribution. There are fascinating connections between these priors and what is hypothesized about conscious processing in the brain, which allows us to reason (i.e., perform chains of inferences about the past and the future, as well as credit assignment) at the level of these high-level variables. This involves attention mechanisms and short-term memory, which form a bottleneck on the information broadcast between different parts of the brain as we focus on different high-level variables and some of their interactions. The presentation summarizes a few recent results that use some of these ideas to discover causal structure and to modularize recurrent neural networks with attention mechanisms, in order to obtain better out-of-distribution generalization and move deep learning towards capturing some of the functions associated with conscious processing.