
2023 | Book

Active Inference

Third International Workshop, IWAI 2022, Grenoble, France, September 19, 2022, Revised Selected Papers

Edited by: Christopher L. Buckley, Daniela Cialfi, Pablo Lanillos, Maxwell Ramstead, Noor Sajid, Hideaki Shimazaki, Tim Verbelen

Publisher: Springer Nature Switzerland

Book series: Communications in Computer and Information Science


About this book

This volume constitutes the papers of the 3rd International Workshop on Active Inference, IWAI 2022, held in Grenoble, France, in conjunction with ECML/PKDD, on September 19, 2022.
The 25 revised full papers presented in this book were carefully reviewed and selected from 31 submissions.

Table of Contents

Frontmatter
Preventing Deterioration of Classification Accuracy in Predictive Coding Networks
Abstract
Predictive Coding Networks (PCNs) aim to learn a generative model of the world. Given observations, this generative model can then be inverted to infer the causes of those observations. However, when training PCNs, a noticeable pathology is often observed where inference accuracy peaks and then declines with further training. This cannot be explained by overfitting since both training and test accuracy decrease simultaneously. Here we provide a thorough investigation of this phenomenon and show that it is caused by an imbalance between the speeds at which the various layers of the PCN converge. We demonstrate that this can be prevented by regularising the weight matrices at each layer: by restricting the relative size of matrix singular values, we allow the weight matrix to change but restrict the overall impact which a layer can have on its neighbours. We also demonstrate that a similar effect can be achieved through a more biologically plausible and simple scheme of just capping the weights.
Paul F. Kinghorn, Beren Millidge, Christopher L. Buckley
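The two regularisation schemes described in this abstract (restricting the relative spread of singular values, and simply capping the weights) can be illustrated with a minimal NumPy sketch; function names and thresholds are ours, not taken from the paper:

```python
import numpy as np

def cap_weights(W, w_max=1.0):
    """Biologically simple scheme: clip each weight to [-w_max, w_max]."""
    return np.clip(W, -w_max, w_max)

def regularise_singular_values(W, max_ratio=10.0):
    """Restrict the relative spread of the singular values of W.

    Singular values larger than max_ratio times the smallest are
    clipped down, so the weights can still change but the layer's
    overall influence on its neighbours stays bounded.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s = np.minimum(s, max_ratio * s.min())
    return U @ np.diag(s) @ Vt

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)) * 5.0
W_reg = regularise_singular_values(W)
s_reg = np.linalg.svd(W_reg, compute_uv=False)
```

Because a layer's gain on any input direction is bounded by its largest singular value, clipping the singular-value ratio directly limits how strongly one layer can dominate its neighbours during inference.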
Interpreting Systems as Solving POMDPs: A Step Towards a Formal Understanding of Agency
Abstract
Under what circumstances can a system be said to have beliefs and goals, and how do such agency-related features relate to its physical state? Recent work has proposed a notion of interpretation map, a function that maps the state of a system to a probability distribution representing its beliefs about an external world. Such a map is not completely arbitrary, as the beliefs it attributes to the system must evolve over time in a manner that is consistent with Bayes’ theorem, and consequently the dynamics of a system constrain its possible interpretations. Here we build on this approach, proposing a notion of interpretation not just in terms of beliefs but in terms of goals and actions. To do this we make use of the existing theory of partially observable Markov decision processes (POMDPs): we say that a system can be interpreted as a solution to a POMDP if it not only admits an interpretation map describing its beliefs about the hidden state of a POMDP but also takes actions that are optimal according to its belief state. An agent is then a system together with an interpretation of this system as a POMDP solution. Although POMDPs are not the only possible formulation of what it means to have a goal, this nevertheless represents a step towards a more general formal definition of what it means for a system to be an agent.
Martin Biehl, Nathaniel Virgo
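The beliefs attributed by an interpretation map must evolve consistently with Bayes' theorem. For a discrete POMDP this is the standard Bayesian filter, sketched here as a generic textbook update (not the authors' formalism):

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One Bayesian belief update for a discrete POMDP.

    belief: prior over hidden states, shape (S,)
    T[a]:   transition matrix with T[a][s_next, s] = P(s_next | s, a)
    O:      observation model with O[o, s_next] = P(o | s_next)
    """
    predicted = T[action] @ belief            # prediction step
    posterior = O[observation] * predicted    # likelihood weighting
    return posterior / posterior.sum()        # normalise (Bayes' theorem)

# Tiny 2-state example: stay-put dynamics, noisy observations.
T = {0: np.eye(2)}
O = np.array([[0.9, 0.2],
              [0.1, 0.8]])
b = belief_update(np.array([0.5, 0.5]), action=0, observation=0, T=T, O=O)
```

A system interpretable as solving the POMDP would, in addition, select the action that is optimal given the belief state `b`.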
Disentangling Shape and Pose for Object-Centric Deep Active Inference Models
Abstract
Active inference is a first principles approach for understanding the brain in particular, and sentient agents in general, with the single imperative of minimizing free energy. As such, it provides a computational account for modelling artificial intelligent agents, by defining the agent’s generative model and inferring the model parameters, actions and hidden state beliefs. However, the exact specification of the generative model and the hidden state space structure is left to the experimenter, whose design choices influence the resulting behaviour of the agent. Recently, deep learning methods have been proposed to learn a hidden state space structure purely from data, relieving the experimenter of this tedious design task, but resulting in an entangled, non-interpretable state space. In this paper, we hypothesize that such a learnt, entangled state space does not necessarily yield the best model in terms of free energy, and that enforcing different factors in the state space can yield a lower model complexity. In particular, we consider the problem of 3D object representation, and focus on different instances of the ShapeNet dataset. We propose a model that factorizes object shape, pose and category, while still learning a representation for each factor using a deep neural network. We show that the models with the best disentanglement properties perform best when used by an active agent to reach preferred observations.
Stefano Ferraro, Toon Van de Maele, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt
Object-Based Active Inference
Abstract
The world consists of objects: distinct entities possessing independent properties and dynamics. For agents to interact with the world intelligently, they must translate sensory inputs into the bound-together features that describe each object. These object-based representations form a natural basis for planning behavior. Active inference (AIF) is an influential unifying account of perception and action, but existing AIF models have not leveraged this important inductive bias. To remedy this, we introduce ‘object-based active inference’ (OBAI), marrying AIF with recent deep object-based neural networks. OBAI represents distinct objects with separate variational beliefs, and uses selective attention to route inputs to their corresponding object slots. Object representations are endowed with independent action-based dynamics. The dynamics and generative model are learned from experience with a simple environment (active multi-dSprites). We show that OBAI learns to correctly segment the action-perturbed objects from video input, and to manipulate these objects towards arbitrary goals.
Ruben S. van Bergen, Pablo Lanillos
Knitting a Markov Blanket is Hard When You are Out-of-Equilibrium: Two Examples in Canonical Nonequilibrium Models
Abstract
Bayesian theories of biological and brain function speculate that Markov blankets (a conditional independence structure separating a system from external states) play a key role in facilitating inference-like behaviour in living systems. Although it has been suggested that Markov blankets are commonplace in sparsely connected, nonequilibrium complex systems, this has not been studied in detail. Here, we show in two different examples (a pair of coupled Lorenz systems and a nonequilibrium Ising model) that sparse connectivity does not guarantee Markov blankets in the steady-state density of nonequilibrium systems. Moreover, in the nonequilibrium Ising model explored, the further the system is from equilibrium, the further it is from displaying a Markov blanket. These results suggest that further assumptions might be needed before assuming the presence of Markov blankets in the kinds of nonequilibrium processes that describe the activity of living systems.
Miguel Aguilera, Ángel Poc-López, Conor Heins, Christopher L. Buckley
Spin Glass Systems as Collective Active Inference
Abstract
An open question in the study of emergent behaviour in multi-agent Bayesian systems is the relationship, if any, between individual and collective inference. In this paper we explore the correspondence between generative models that exist at two distinct scales, using spin glass models as a sandbox system to investigate this question. We show that the collective dynamics of a specific type of active inference agent is equivalent to sampling from the stationary distribution of a spin glass system. A collective of specifically-designed active inference agents can thus be described as implementing a form of sampling-based inference (namely, from a Boltzmann machine) at the higher level. However, this equivalence is very fragile, breaking upon simple modifications to the generative models of the individual agents or the nature of their interactions. We discuss the implications of this correspondence and its fragility for the study of multiscale systems composed of Bayesian agents.
Conor Heins, Brennan Klein, Daphne Demekas, Miguel Aguilera, Christopher L. Buckley
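Sampling from the stationary (Boltzmann) distribution of a spin glass, the higher-level inference invoked in this abstract, is commonly done with Gibbs sampling. A minimal, generic sketch (not the authors' construction; couplings and temperature are illustrative):

```python
import numpy as np

def gibbs_sweep(spins, J, h, beta, rng):
    """One Gibbs-sampling sweep over a system of +/-1 spins.

    Each spin is resampled from its conditional Boltzmann distribution
    given the others; iterating sweeps draws samples from the stationary
    distribution p(s) proportional to exp(-beta * E(s)), with
    E(s) = -0.5 * s.J.s - h.s (J symmetric with zero diagonal).
    """
    for i in rng.permutation(len(spins)):
        field = J[i] @ spins + h[i]
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * field))
        spins[i] = 1 if rng.random() < p_up else -1
    return spins

rng = np.random.default_rng(0)
n = 4
J = np.ones((n, n)) - np.eye(n)   # ferromagnetic couplings
h = np.zeros(n)
spins = rng.choice([-1, 1], size=n)
for _ in range(20):
    spins = gibbs_sweep(spins, J, h, beta=2.0, rng=rng)
```

The same conditional update is what a Boltzmann machine performs per unit, which is why a collective whose agents implement it can be read as sampling-based inference at the higher scale.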
Mapping Husserlian Phenomenology onto Active Inference
Abstract
Phenomenology is the rigorous descriptive study of conscious experience. Recent attempts to formalize Husserlian phenomenology provide us with a mathematical model of perception as a function of prior knowledge and expectation. In this paper, we re-examine elements of Husserlian phenomenology through the lens of active inference. In doing so, we aim to advance the project of computational phenomenology, as recently outlined by proponents of active inference. We propose that key aspects of Husserl’s descriptions of consciousness can be mapped onto aspects of the generative models associated with the active inference approach. We first briefly review active inference. We then discuss Husserl’s phenomenology, with a focus on time consciousness. Finally, we present our mapping from Husserlian phenomenology to active inference.
Mahault Albarracin, Riddhi J. Pitliya, Maxwell J. D. Ramstead, Jeffrey Yoshimi
The Role of Valence and Meta-awareness in Mirror Self-recognition Using Hierarchical Active Inference
Abstract
The underlying processes that enable self-perception are crucial for understanding multisensory integration, body perception and action, and the development of the self. Previous computational models have overlooked an essential aspect: affective or emotional components cannot be uncoupled from the self-recognition process. Hence, here we propose a computational approach to study self-recognition that incorporates affect using state-of-the-art hierarchical active inference. We evaluated our model in a synthetic experiment inspired by the mirror self-recognition test, a benchmark for evaluating self-recognition in animals and humans alike. Results show that i) negative valence arises when the agent recognizes itself and learns something unexpected about its internal states. Furthermore, ii) the agent in the presence of strong prior expectations of a negative affective state will avoid the mirror altogether in anticipation of an undesired learning process. Both results are in line with current literature on human self-recognition.
Jonathan Bauermeister, Pablo Lanillos
World Model Learning from Demonstrations with Active Inference: Application to Driving Behavior
Abstract
Active inference proposes a unifying principle for perception and action as jointly minimizing the free energy of an agent’s internal world model. In the active inference literature, world models are typically pre-specified or learned through interacting with an environment. This paper explores the possibility of learning world models of active inference agents from recorded demonstrations, with an application to human driving behavior modeling. The results show that the presented method can create models that generate human-like driving behavior but the approach is sensitive to input features.
Ran Wei, Alfredo Garcia, Anthony McDonald, Gustav Markkula, Johan Engström, Isaac Supeene, Matthew O’Kelly
Active Blockference: cadCAD with Active Inference for Cognitive Systems Modeling
Abstract
Cognitive approaches to complex systems modeling are currently limited by the lack of flexible, composable, tractable simulation frameworks. Here, we present Active Blockference, an approach for cognitive modeling in complex cyberphysical systems that uses cadCAD to implement multiagent Active Inference simulations. First, we provide an account of the current state of Active Inference in cognitive modeling, with the Active Entity Ontology for Science (AEOS) as a particular example of Active Inference applied to decentralized science communities. We then give a brief overview of Active Blockference and the initial results of simulations of Active Inference agents in grid environments (Active Gridference). We conclude by sharing some preferences and expectations for further research, development, and applications. The open source package can be found at https://github.com/ActiveInferenceLab/ActiveBlockference.
Jakub Smékal, Arhan Choudhury, Amit Kumar Singh, Shady El Damaty, Daniel Ari Friedman
Active Inference Successor Representations
Abstract
Recent work has uncovered close links between classical reinforcement learning (RL) algorithms, Bayesian filtering, and Active Inference which lets us understand value functions in terms of Bayesian posteriors. An alternative, but less explored, model-free RL algorithm is the successor representation, which expresses the value function in terms of a successor matrix of average future state transitions. In this paper, we derive a probabilistic interpretation of the successor representation in terms of Bayesian filtering and thus design a novel active inference agent architecture utilizing successor representations instead of model-based planning. We demonstrate that active inference successor representations have significant advantages over current active inference agents in terms of planning horizon and computational cost. Moreover, we show how the successor representation agent can generalize to changing reward functions such as variants of the expected free energy.
Beren Millidge, Christopher L. Buckley
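The successor matrix mentioned in this abstract has a closed form for a fixed policy, and the value function is a linear readout of it. A minimal sketch of these standard successor-representation identities (not the paper's agent architecture):

```python
import numpy as np

def successor_matrix(T, gamma=0.95):
    """M = sum_t (gamma * T)^t = (I - gamma * T)^{-1}.

    M[s, s_next] is the discounted expected future occupancy of s_next
    when starting in s under transition matrix T.
    """
    return np.linalg.inv(np.eye(T.shape[0]) - gamma * T)

T = np.array([[0.9, 0.1],
              [0.2, 0.8]])   # policy-conditioned transitions
r = np.array([0.0, 1.0])     # reward (or preference) vector
M = successor_matrix(T)
V = M @ r                    # value function as a linear readout of reward
```

Because `V = M @ r` separates dynamics from reward, the same successor matrix can be reused when the reward (or expected free energy) function changes, which is the generalization property the abstract highlights.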
Learning Policies for Continuous Control via Transition Models
Abstract
It is doubtful that animals have perfect inverse models of their limbs (e.g., what muscle contraction must be applied to every joint to reach a particular location in space). However, in robot control, moving an arm’s end-effector to a target position or along a target trajectory requires accurate forward and inverse models. Here we show that by learning the transition (forward) model from interaction, we can use it to drive the learning of an amortized policy. Hence, we revisit policy optimization in relation to the deep active inference framework and describe a modular neural network architecture that simultaneously learns the system dynamics from prediction errors and the stochastic policy that generates suitable continuous control commands to reach a desired reference position. We evaluated the model by comparing it against the baseline of a linear quadratic regulator, and conclude with additional steps to take toward human-like motor control.
Justus Huebotter, Serge Thill, Marcel van Gerven, Pablo Lanillos
Attachment Theory in an Active Inference Framework: How Does Our Inner Model Take Shape?
Abstract
Starting from attachment theory as proposed by John Bowlby in 1969, and from the scientific literature on developmental processes that take place early in human life, the present work explores a possible approach to attachment that exploits aspects of Active Inference Theory. We describe how, from the prenatal stage until around the second year of life, sensory, relational, affective and emotional dynamics could interplay in the formation of priors that become rigidly integrated into the internal generative model and possibly affect the entire life of the individual. We conclude that the presented qualitative approach could be of interest for experimental studies aiming to provide evidence on how Active Inference could sustain attachment.
Erica Santaguida, Massimo Bergamasco
Capsule Networks as Generative Models
Abstract
Capsule networks are a neural network architecture specialized for visual scene recognition. Features and pose information are extracted from a scene and then dynamically routed through a hierarchy of vector-valued nodes called ‘capsules’ to create an implicit scene graph, with the ultimate aim of learning vision directly as inverse graphics. Despite these intuitions, however, capsule networks are not formulated as explicit probabilistic generative models; moreover, the routing algorithms typically used are ad-hoc and primarily motivated by algorithmic intuition. In this paper, we derive an alternative capsule routing algorithm utilizing iterative inference under sparsity constraints. We then introduce an explicit probabilistic generative model for capsule networks based on the self-attention operation in transformer networks and show how it is related to a variant of predictive coding networks using Von-Mises-Fisher (VMF) circular Gaussian distributions.
Alex B. Kiefer, Beren Millidge, Alexander Tschantz, Christopher L. Buckley
Home Run: Finding Your Way Home by Imagining Trajectories
Abstract
When studying unconstrained behaviour and allowing mice to leave their cage to navigate a complex labyrinth, the mice exhibit foraging behaviour, searching the labyrinth for rewards and returning to their home cage now and then, e.g. to drink. Surprisingly, when executing such a “home run”, the mice do not follow the exact reverse path; in fact, the entry path and home path have very little overlap. Recent work proposed a hierarchical active inference model for navigation, where the low-level model makes inferences about hidden states and poses that explain sensory inputs, whereas the high-level model makes inferences about moving between locations, effectively building a map of the environment. However, using this “map” for planning only allows the agent to find trajectories that it has previously explored, far from the behaviour observed in mice. In this paper, we explore ways of incorporating previously unvisited paths into the planning algorithm, by using the low-level generative model to imagine potential, yet undiscovered paths. We demonstrate a proof of concept in a grid-world environment, showing how an agent can accurately predict a new, shorter path in the map leading to its starting point, using a generative model learnt from pixel-based observations.
Daria de Tinguy, Pietro Mazzaglia, Tim Verbelen, Bart Dhoedt
A Novel Model for Novelty: Modeling the Emergence of Innovation from Cumulative Culture
Abstract
While the underlying dynamics of active inference communication and cumulative culture have already been formalized, the emergence of novel cultural information from these dynamics has not yet been understood. In this paper, we apply an active inference framework, informed by genetic speciation, to the emergence of innovation from a population of communicating agents in a cumulative culture. Our model is premised on the idea that innovation emerges from accumulated cultural information when a collective group of agents agrees on the legitimacy of an alternative belief to the existing (or status quo) belief.
Natalie Kastel, Guillaume Dumas
Active Inference and Psychology of Expectations: A Study of Formalizing ViolEx
Abstract
Expectations play a critical role in human perception, cognition, and decision-making. There has been a recent surge in modelling such expectations and the behaviour that results when they are violated. One recent psychological proposal is the ViolEx model. To move the model forward, we identified three areas of concern and addressed two in this study: lack of formalization and lack of implementation. Specifically, we provide the first implementation of ViolEx using the Active Inference formalism (ActInf) and successfully simulate all expectation violation coping strategies modelled in ViolEx. Furthermore, through this interdisciplinary exchange, we identify a novel connection between ActInf and Piaget’s psychology, engendering a convergence argument for improving the former’s structure/schema learning. Thus, this is the first step in developing a formal research framework to study expectation violations; we hope it will serve as a base for future ViolEx studies while yielding reciprocal insights into ActInf.
Dhanaraaj Raghuveer, Dominik Endres
AIXI, FEP-AI, and Integrated World Models: Towards a Unified Understanding of Intelligence and Consciousness
Abstract
Intelligence has been operationalized as both goal-pursuit capacity across a broad range of environments, and also as learning capacity above and beyond a foundational set of core priors. Within the normative framework of AIXI, intelligence may be understood as capacities for compressing (and thereby predicting) data and achieving goals via programs with minimal algorithmic complexity. Within the Free Energy Principle and Active Inference framework, intelligence may be understood as capacity for inference and learning of predictive models for goal-realization, with beliefs favored to the extent they fit novel data with minimal updating of priors. Most recently, consciousness has been proposed to enhance intelligent functioning by allowing for iterative state estimation of the essential variables of a system and its relationships to its environment, conditioned on a causal world model. This paper discusses machine learning architectures and principles by which all these views may be synergistically combined and contextualized with an Integrated World Modeling Theory of consciousness.
Adam Safron
Intention Modulation for Multi-step Tasks in Continuous Time Active Inference
Abstract
We extend an Active Inference theory in continuous time of how neural circuitry in the Dorsal Visual Stream (DVS) and the Posterior Parietal Cortex (PPC) implement visually guided goal-directed behavior, with a novel capacity to resolve multi-step tasks. According to the theory, the PPC maintains a high-level internal representation of the causes of the environment (belief), including bodily states and objects in the scene, and by generating sensory predictions and comparing them with observations it is able to learn and infer the causal relationships and latent states of the external world. We propose that multi-step goal-directed behavior may be achieved by decomposing the belief dynamics into a set of intention functions that independently pull the belief towards different goals; multi-step tasks could be solved by dynamically modulating these intentions within the PPC. This low-level solution in continuous time is applicable to multi-phase actions consisting of a priori defined steps, as an alternative to the more general hybrid discrete-continuous approach. As a demonstration, we emulated an agent embodying an actuated upper limb and proprioceptive, visual and tactile sensory systems. Visual information was obtained with the help of a Variational Autoencoder (VAE) simulating the DVS, which allows the agent to dynamically infer the current posture configuration through prediction error minimization and, importantly, an intended future posture corresponding to the visual targets. We assessed the approach on a task including two steps: reaching a target and returning to a home position. We show that by defining a functional that governs the activation of different intentions implementing the corresponding steps, the agent can easily solve the overall task.
Matteo Priorelli, Ivilin Peev Stoianov
Learning Generative Models for Active Inference Using Tensor Networks
Abstract
Active inference provides a general framework for behavior and learning in autonomous agents. It states that an agent will attempt to minimize its variational free energy, defined in terms of beliefs over observations, internal states and policies. Traditionally, every aspect of a discrete active inference model must be specified by hand, i.e. by manually defining the hidden state space structure, as well as the required distributions such as likelihood and transition probabilities. Recently, efforts have been made to learn state space representations automatically from observations using deep neural networks. In this paper, we present a novel approach of learning state spaces using quantum physics-inspired tensor networks. The ability of tensor networks to represent the probabilistic nature of quantum states as well as to reduce large state spaces makes tensor networks a natural candidate for active inference. We show how tensor networks can be used as a generative model for sequential data. Furthermore, we show how one can obtain beliefs from such a generative model and how an active inference agent can use these to compute the expected free energy. Finally, we demonstrate our method on the classic T-maze environment.
Samuel T. Wauthier, Bram Vanhecke, Tim Verbelen, Bart Dhoedt
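A matrix product state, the simplest tensor network for sequential data, assigns probabilities to sequences via the Born rule, which is the quantum-inspired mechanism this abstract alludes to. A minimal generic sketch (illustrative cores and boundary vectors, not the paper's model):

```python
import numpy as np

def mps_amplitude(sequence, cores, left, right):
    """Amplitude of a symbol sequence under a matrix product state.

    cores[t] has shape (num_symbols, D, D): selecting the observed symbol
    at each step leaves a chain of D x D matrices whose product, closed
    off by boundary vectors, yields the amplitude.
    """
    v = left
    for t, x in enumerate(sequence):
        v = v @ cores[t][x]
    return v @ right

def mps_probability(sequence, cores, left, right):
    """Born rule: the squared amplitude gives an (unnormalised) probability."""
    return mps_amplitude(sequence, cores, left, right) ** 2

# Toy MPS over binary symbols with bond dimension 2.
cores = [np.stack([np.eye(2), 2.0 * np.eye(2)]) for _ in range(3)]
left = right = np.array([1.0, 0.0])
p = mps_probability((0, 1, 0), cores, left, right)
```

The bond dimension D controls how much history the model can carry forward, which is how tensor networks compress large state spaces.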
A Worked Example of the Bayesian Mechanics of Classical Objects
Abstract
Bayesian mechanics is a new approach to studying the mathematics and physics of interacting stochastic processes. We provide a worked example of a physical mechanics for classical objects, which derives from a simple application thereof. We summarise the current state of the art of Bayesian mechanics in doing so.
Dalton A. R. Sakthivadivel
A Message Passing Perspective on Planning Under Active Inference
Abstract
We present a message passing interpretation of planning under Active Inference. Specifically, we show how the Active Inference planning procedure can be broken into a (partial) message passing sweep over a graph, followed by local computations of a cost functional (the Expected Free Energy). Using Forney-style Factor Graphs, we then proceed to show how one can derive novel planning schemes by local changes to the underlying graph and message passing schedule. We illustrate this by first isolating the “sophisticated” aspect of Sophisticated Inference and then proposing a novel planning algorithm that combines the sophisticated update mechanism with a different message passing schedule. Our main contribution is a modular view of planning under Active Inference that can serve as a framework for understanding existing algorithms, deriving new ones, and extending the class of models that are amenable to Active Inference. Approaching Active Inference from a message passing perspective also shows how it can be efficiently implemented using off-the-shelf probabilistic programming software, broadening the class of models available to researchers and practitioners.
Magnus Koudahl, Christopher L. Buckley, Bert de Vries
Efficient Search of Active Inference Policy Spaces Using k-Means
Abstract
We develop an approach to policy selection in active inference that allows us to efficiently search large policy spaces by mapping each policy to its embedding in a vector space. We sample the expected free energy of representative points in the space, then perform a more thorough policy search around the most promising point in this initial sample.
We consider various approaches to creating the policy embedding space, and propose using k-means clustering to select representative points. We apply our technique to a goal-oriented graph-traversal problem, for which naive policy selection is intractable for even moderately large graphs.
Alex B. Kiefer, Mahault Albarracin
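The coarse-to-fine search described in this abstract can be sketched as follows. All names are illustrative, and we use cluster medoids as the representative points, which is one plausible reading of the approach, not the paper's exact implementation:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means on the rows of X; returns centroids and labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

def coarse_to_fine_policy_search(policies, embed, efe, k=4):
    """Sample the expected free energy at one representative (medoid) per
    cluster of the policy embedding space, then exhaustively search the
    most promising cluster. Returns the index of the best policy found."""
    X = np.array([embed(p) for p in policies], dtype=float)
    centroids, labels = kmeans(X, k)
    reps = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        if len(idx):   # medoid: cluster member closest to its centroid
            reps.append(idx[np.linalg.norm(X[idx] - centroids[j], axis=1).argmin()])
    best_rep = min(reps, key=lambda i: efe(policies[i]))
    candidates = np.where(labels == labels[best_rep])[0]
    return min(candidates, key=lambda i: efe(policies[i]))

# Toy example: 100 "policies" embedded on a line, EFE minimised at 37.
policies = list(range(100))
best = coarse_to_fine_policy_search(
    policies, embed=lambda p: [float(p)], efe=lambda p: abs(p - 37))
```

Only k representatives plus one cluster are evaluated, rather than all policies, which is what makes the search tractable for large policy spaces.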
Value Cores for Inner and Outer Alignment: Simulating Personality Formation via Iterated Policy Selection and Preference Learning with Self-World Modeling Active Inference Agents
Abstract
Humanity faces multiple existential risks in the coming decades due to technological advances in AI, and the possibility of unintended behaviors emerging from such systems. We believe that better outcomes may be possible by rigorously exploring frameworks for intelligent (goal-oriented) behavior inspired by computational neuroscience. Here, we explore how the Free Energy Principle and Active Inference (FEP-AI) framework may provide solutions for these challenges via affording the realization of control systems operating according to principles of hierarchical Bayesian modeling and prediction-error (i.e., surprisal) minimization. Such FEP-AI agents are equipped with hierarchically-organized world models capable of counterfactual planning, realized by the kinds of reciprocal message passing performed by mammalian nervous systems, so allowing for the flexible construction of representations of self-world dynamics with varying degrees of temporal depth. We will describe how such systems can not only infer the abstract causal structure of their environment, but also develop capacities for “theory of mind” and collaborative (human-aligned) decision making. Such architectures could help to sidestep potentially dangerous combinations of systems with high intelligence and human-incompatible values, since such mental processes are entangled (rather than orthogonal) in FEP-AI agents. We will further describe how (meta-)learned deep goal hierarchies may also well-describe biological systems, suggesting that potential risks from “mesa-optimisers” may actually represent one of the most promising approaches to AI safety: minimizing prediction-error relative to causal self-world models can be used to cultivate modes of policy selection and agent personalities that robustly optimize for achieving goals that are consistently aligned with both individual and shared values. 
Finally, we will describe how iterative policy selection and preference learning can result in “value cores” or self-reinforcing, relatively stable attracting states that agents will seek to return to through their goal-oriented imaginings and actions.
Adam Safron, Zahra Sheikhbahaee, Nick Hay, Jeff Orchard, Jesse Hoey
Deriving Time-Averaged Active Inference from Control Principles
Abstract
Active inference offers a principled account of behavior as minimizing average sensory surprise over time. Applications of active inference to control problems have heretofore tended to focus on finite-horizon or discounted-surprise problems, despite deriving from the infinite-horizon, average-surprise imperative of the free-energy principle. Here we derive an infinite-horizon, average-surprise formulation of active inference from optimal control principles. Our formulation returns to the roots of active inference in neuroanatomy and neurophysiology, formally reconnecting active inference to optimal feedback control. Our formulation provides a unified objective functional for sensorimotor control and allows for reference states to vary over time.
Eli Sennesh, Jordan Theriault, Jan-Willem van de Meent, Lisa Feldman Barrett, Karen Quigley
Backmatter
Metadata
Title
Active Inference
Edited by
Christopher L. Buckley
Daniela Cialfi
Pablo Lanillos
Maxwell Ramstead
Noor Sajid
Hideaki Shimazaki
Tim Verbelen
Copyright year
2023
Electronic ISBN
978-3-031-28719-0
Print ISBN
978-3-031-28718-3
DOI
https://doi.org/10.1007/978-3-031-28719-0