I coauthored my first two papers! It’s work I’m very excited about, and together they tell a single story, which I’d like to put together here by sketching the bigger picture. Moreover, this is the same story behind my most recent talk, ‘Games with players’, so there’s lots to talk about!
The papers stand in a theory-applications relationship. ‘Towards foundations of categorical cybernetics’ lays the theoretical foundations for the second one, ‘Translating from extensive form games to open games with agency’, though both works stand on their own. I wrote them with a bunch of MSP people; you can look them up on the arXiv.
I started blogging about the papers two weeks ago, and the size of what was supposed to be one post spiralled out of hand. There are a lot of ideas going around, and eventually I resorted to splitting the post into three parts plus a spin-off (which is actually an old resurrected draft):
- Open cybernetic systems I: feedback systems as optics
- Open cybernetic systems II: parametrised optics and agency
- Open cybernetic systems III: control mechanisms and equilibria (coming soon)
- Bonus: Amazing backward induction (coming soon)
I’m going to introduce all the notions required to understand our key idea, namely that parametrised optics are a good framework for formalising agency in open cybernetic systems. It’s been known for a while that ‘vanilla’ optics do this for open dynamical systems (or more), so our contribution is really in the ‘parametrised = agency’ part. Truth be told, that’s also not completely new: notable precedents are ‘Backprop as a Functor’, in which the Para construction was first sketched, and Bruno Gavranović’s recent work. Also, Toby Smithe has been playing around with a similar construction for active inference, which is intimately related to the theory of cybernetic systems.
In these articles I’ll assume familiarity with string diagrams (which, I argue, everybody is born with) and some basic categorical concepts, mainly from the world of monoidal categories. Sometimes I’ll stray away, but those are asides for the ‘advanced reader’, which can be safely skipped by the uninitiated.
Feedback systems as optics
In categorical systems theory, various kinds of ‘bidirectional morphisms with state’ (which, as we are about to see, is what ‘optic’ means) have been used to represent dynamical systems. I argue a better word for what we’ve been studying is feedback systems, since the kind of dynamics encoded by optics is that of action-reaction: in addition to the unfolding of an action, an optic also models a subsequent ‘reaction propagation’ step, where some kind of reward/loss/nudge (in a word: feedback) is returned to the system.
Contrast this with traditional dynamical systems, whose dynamics is encoded by operators acting on a space, a mathematical model representing action but devoid of feedback. Nevertheless, considerable attention is devoted to the study of observables of a system, i.e. (usually scalar) quantities of interest which we monitor during the dynamics. In particular, one is often interested in how these quantities evolve as the system itself evolves, and thus a dynamical system equipped with a distinguished observable turns out to be very similar to a feedback system.
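As a toy illustration of the last point, here is a Python sketch (function names are mine, purely illustrative) of a discrete dynamical system with a distinguished observable: the step operator drives the dynamics, while the observable is merely read out along the trajectory.

```python
# A toy discrete dynamical system: a step operator acting on a space,
# plus a distinguished observable monitored along the trajectory.
# Illustrative sketch; `step`, `observe` and `run` are made-up names.

def run(step, observe, x, n):
    """Iterate `step` n times from x, recording the observable at each state."""
    trace = []
    for _ in range(n):
        trace.append(observe(x))
        x = step(x)
    return trace

# Example: halving dynamics on the reals, observing the absolute value.
run(lambda x: x * 0.5, abs, -8.0, 4)  # [8.0, 4.0, 2.0, 1.0]
```

Note that the observable here only reads the state; it does not feed anything back, which is exactly the gap a feedback system fills.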
Another common occurrence is that the evolution of the system itself is guided by one or more observables. Think of Hamiltonian mechanics, in which a functional $H$, defined on the phase space of a physical system, orchestrates the whole dynamics (together with a symplectic form on that space). In these cases ‘feedback’ is an even more apt terminology.
The kind of feedback systems my group and I are most interested in are games and machine learning models. In both fields, action and feedback are equally important parts of the dynamics. In games, the ‘action’ part is called play and the ‘feedback’ part is payoff distribution, often in the form of backward induction. In machine learning models, they are called the ‘forward’ and ‘backward’ pass. The algorithm implementing the backward pass is backpropagation. I’ve written about the similarity between backward induction and backpropagation in the last post of this series (coming soon).
Nevertheless, I’ve already blogged about how backpropagation is secretly powered by the algebra of lenses. These are gadgets which pack together bidirectional morphisms: a lens $(X, S) \to (Y, R)$ is a pair of maps $\mathrm{view} : X \to Y$ (or just ‘forward part’) and $\mathrm{update} : X \times R \to S$ (‘backward part’), which live in some category with products $\mathcal{C}$. The terminology surrounding them comes from the functional programming community, where lenses are a rudimentary abstraction for accessing and ‘mutating’ data structures.
One can see the forward part as bringing about an action and the backward part as propagating a feedback. This is very evident in backpropagation, where the forward part of a lens represents a function being computed and the backward part is a reverse derivative being pulled back in order to propagate the loss gradient. Hence, for us, ‘do’ and ‘propagate’ (sometimes abbreviated to prop) are better terms for ‘view’ and ‘update’.
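To make the ‘do’/‘propagate’ reading concrete, here is a minimal Python sketch of a lens as a pair of functions (names and class are mine, not from the papers), with the squaring function and its reverse derivative as the running backpropagation example.

```python
# A lens as a pair of functions: a forward 'do' part X -> Y and a
# backward 'propagate' part (X, R) -> S. Illustrative sketch only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Lens:
    do: Callable    # X -> Y: bring about an action
    prop: Callable  # (X, R) -> S: propagate feedback, using the state X

# Backpropagation example: the forward part computes x^2, and the backward
# part pulls a loss gradient dy back along the reverse derivative, dy * 2x.
square = Lens(do=lambda x: x * x,
              prop=lambda x, dy: dy * 2 * x)

square.do(3.0)         # 9.0
square.prop(3.0, 1.0)  # 6.0, the gradient of x^2 at x = 3
```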
What’s quite important in the definition of lenses is that ‘propagate’ has a dependency on $X$, the ‘state’. This fact (witnessed by the wire branching before $\mathrm{view}$ and going down to $\mathrm{update}$) is actually enforced by the composition law of lenses:
In practical terms, this means that the feedback a lens propagates pertains to the computation that actually happened, or that a lens, like the North, remembers.
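The composition law can be sketched in code (a hedged Python sketch, with a lens represented as a plain `(do, prop)` pair): notice how the composite’s backward part re-runs the first forward map, so the feedback really does pertain to the computation that happened.

```python
# Sequential composition of lenses, each given as a (do, prop) pair.
# The composite 'prop' recomputes do1(x): this is the dependency on the
# state enforced by the composition law.
def compose(l1, l2):
    do1, prop1 = l1
    do2, prop2 = l2
    do = lambda x: do2(do1(x))
    prop = lambda x, r: prop1(x, prop2(do1(x), r))
    return do, prop

# Chain rule example: squaring followed by doubling, i.e. f(x) = 2x^2.
square = (lambda x: x * x, lambda x, dy: dy * 2 * x)
double = (lambda y: 2 * y, lambda y, dr: 2 * dr)
do, prop = compose(square, double)

do(3.0)         # 18.0
prop(3.0, 1.0)  # 12.0, the derivative of 2x^2 at x = 3
```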
This is made even more explicit in optics, a wide generalization of lenses. The leap in generality amounts to making the memory mechanism more expressive. Lenses remember exactly what they received from the environment in the form of a state, which is copied and preserved for the backward pass. In an optic, state is remembered, transmitted, and read out using a middleman, the residual. It is usually denoted by $M$, and features prominently in the work we are doing, albeit on the sly. This generalization also allows one to drop the assumption that $\mathcal{C}$ is cartesian, and work with an arbitrary category instead. Still, we usually want to assume $\mathcal{C}$ is at least monoidal, because it should stand for a category of systems, and monoidal categories allow the two most basic kinds of system composition, sequential and parallel.
The memorization-transmission-readout mechanism is implemented through some clever mathematical machinery. First of all, residuals are assumed to live in their own category, the aptly named and denoted category of residuals $\mathcal{M}$. It is itself monoidal, and acts on the category our optics are made of ($\mathcal{C}$), meaning that we can multiply a given $M$ in $\mathcal{M}$ with a given $X$ in $\mathcal{C}$ (pretty much like scalar multiplication allows you to multiply numbers and vectors, i.e. objects of different sorts). We denote such a product $M \bullet X$.
A residual is attached to the codomain of the forward part and the domain of the backward part. An optic then looks like a pair of maps $f : X \to M \bullet Y$, $f^\sharp : M \bullet R \to S$. So the ‘do’ part computes, from a given state in $X$, something in $Y$ to give back to the environment and something in $M$ to keep private. Then, given the content of $M$ (ideally, the readout of what we memorized in the forward pass) and some feedback in $R$ coming from the environment, we can meaningfully propagate it back to the environment as something in $S$.
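In code, the residual is simply whatever the forward pass returns alongside its output. The Python sketch below (illustrative names, not from the papers) presents an optic as a pair of functions, and shows how a lens arises as the special case where the residual is a copy of the whole input state.

```python
# An optic as a pair of functions:
#   forward  : X -> (M, Y)   produce an output, keep a private residual M
#   backward : (M, R) -> S   read the residual out to propagate feedback
# Illustrative sketch; `lens_to_optic` is a made-up name.

def lens_to_optic(view, update):
    """A lens is the optic whose residual is a copy of the input state."""
    forward = lambda x: (x, view(x))       # remember the whole state x
    backward = lambda m, r: update(m, r)   # read it back in 'propagate'
    return forward, backward

forward, backward = lens_to_optic(lambda x: x * x, lambda x, dy: dy * 2 * x)
forward(3.0)        # (3.0, 9.0): residual 3.0, output 9.0
backward(3.0, 1.0)  # 6.0
```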
Notice that vertical wires now live in a different category than horizontal ones. I draw them blue for this reason. Ideally, these wires are not even drawn on the same plane: they live in a transverse dimension, going in and out of the page (this is also the reason why the residual wire takes that long detour). This dimension will be greatly exploited in the next post, when I’ll introduce parametrised optics.
All in all, given a monoidal category of residuals $\mathcal{M}$ acting on a monoidal category $\mathcal{C}$, we get a monoidal category whose objects are pairs of objects of $\mathcal{C}$ and whose morphisms are optics between them. Indeed, optics can be composed in sequence and in parallel:
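Continuing the function-pair sketch (still illustrative, and assuming residuals combine by plain tupling), sequential composition tensors the two residuals together, while parallel composition runs two optics side by side:

```python
# Optics as (forward, backward) function pairs; residuals combine by tupling.

def seq(o1, o2):
    """Sequential composition: run o1 then o2, remembering both residuals."""
    f1, b1 = o1
    f2, b2 = o2
    def forward(x):
        m1, y = f1(x)
        m2, z = f2(y)
        return (m1, m2), z
    def backward(ms, r):
        m1, m2 = ms
        return b1(m1, b2(m2, r))
    return forward, backward

def par(o1, o2):
    """Parallel composition: run two optics side by side on a pair."""
    f1, b1 = o1
    f2, b2 = o2
    def forward(xs):
        m1, y1 = f1(xs[0])
        m2, y2 = f2(xs[1])
        return (m1, m2), (y1, y2)
    def backward(ms, rs):
        return b1(ms[0], rs[0]), b2(ms[1], rs[1])
    return forward, backward

# Example: squaring composed with itself is x^4, and both residuals survive.
square = (lambda x: (x, x * x), lambda m, dy: dy * 2 * m)
fwd, bwd = seq(square, square)
fwd(2.0)              # ((2.0, 4.0), 16.0)
bwd((2.0, 4.0), 1.0)  # 32.0, the derivative of x^4 at x = 2
```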
The unit for $\otimes$ is given by the pair $(I, I)$, an aptly invisible object in the diagrammatic language. This language can be thought of as ‘living inside’ the diagrammatic language of $\mathcal{C}$, though this is not completely true, as we see from the fact that there are wires coming from another category. String diagrams for optics are diagrams for so-called teleological categories.
So far, I’ve spoken informally of ‘environment’, though its mathematical nature is of utmost importance. For a system, ‘environment’ is everything that happens outside of its boundaries. More suggestively, everything is environment, and to specify a system we cut a piece out. This mereological point of view will be greatly expounded in the next post, where we’ll see that agents, too, arise by cutting a boundary.
For now, we limit ourselves to using this intuition to understand what a context for an optic is. A context is something you surround an open system with to yield a closed system, i.e. something contained and finished in itself, whose dynamics can unfold without appealing to external parts.
This means that a closed system is necessarily of type $(I, I) \to (I, I)$, a fact that manifests diagrammatically as an absence of wires going in or out:
Thus a context has to provide at least (a) an initial state and (b) a continuation, that is, something turning actions into feedbacks. These are respectively morphisms $(I, I) \to (X, S)$ and $(Y, R) \to (I, I)$, also known as states and costates:
The graphical depiction of costates makes it very obvious why they can be considered ‘continuations’: they turn around the information flow, switching from ‘action’ mode to ‘feedback’ mode. While a costate amounts to two morphisms $Y \to M$ and $M \to R$, you see how it can easily be converted into a single morphism $Y \to R$ by composition. In some situations the two things are equivalent (that is, any such morphism can be made into a costate), but in general they are not: costates are only those morphisms that can be obtained by squeezing through a given residual, since this is the way the two parts of an optic can store and communicate information.
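Concretely, closing up a system with a state and a continuation looks like this in the running Python sketch (illustrative names; the continuation is given as a single morphism for simplicity): feed the state forward, hand the result to the continuation, and propagate the resulting feedback back.

```python
# Closing up a system: an initial state x0, an optic (forward, backward),
# and a continuation k : Y -> R turning actions into feedback. Sketch only.
def close(x0, optic, k):
    forward, backward = optic
    m, y = forward(x0)     # 'do': act on the initial state, keep residual m
    r = k(y)               # the environment turns the action into feedback
    return backward(m, r)  # 'propagate': the feedback flows back

# Example: the squaring optic under a constant-gradient continuation.
square = (lambda x: (x, x * x), lambda m, dy: dy * 2 * m)
close(3.0, square, lambda y: 1.0)  # 6.0
```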
It’d seem that a state and a costate are enough to account for the environment, but there’s still a subtlety. At the moment, the environment ‘ceases to exist’ as soon as the system dynamics kicks in. That is, there’s no way for the environment to store state independently from the system, while when a system is doing its thing, usually the rest of the world still exists. Hence we are missing something like this:
When we put everything together, we realize that the data of a context is given exactly by a comb in the category of optics on $\mathcal{C}$. A comb is a U-shaped diagram with a hole in the middle. The red piece above is a comb, whose hole is filled by the system.
Hence a compact way to define contexts for a system is as ‘states in optics of optics’ (!!), i.e. combs whose external boundaries are trivial (the unit) and whose internal boundaries are the ones of the system.
This fits beautifully into the mereological picture of system and environment: a system is a hole in an environment, which ‘wraps’ the system itself. Putting them together yields an inscrutable closed system. Also, let me stress again how the boundaries of a system are a modelling choice. This is quite clear when we consider the composite of two systems: to each of the two, the other one is part of the environment.
Variants & technological horizons
I can’t refrain from mentioning that, at the moment, two separate generalizations of ‘lenses’ are present in the literature. One is what I described above, known in its most general form as mixed optics or profunctor optics (these are equivalent presentations of the same objects). The other one is F-lenses, which are themselves a generalization of dependent lenses, aka containers, aka polynomial functors.
This latter framework is quite important, especially as used in the work of Myers, Spivak, Libkind and others. Its strength lies in the fact that it features dependent types, which are very expressive and arguably the right way of doing certain things (e.g. mode-dependent dynamics). It also generalizes further in the direction of indexed containers, which in turn form the mathematical matter of Hancock’s interaction structures, perhaps the most conceptually sharp treatment of feedback systems around.
Dependently-typed mixed optics are thus the holy grail in this area, and something Bruno, Jules (who blogged about it last year), Eigil and I have been actively working on in the last few months. They would allow the flexibility of optics, especially their indifference towards cartesian structure (very uncommon in resource theories), and at the same time the expressive power of dependent types. I hope we’ll soon have good news on this front!
Finally, there’s a pretty important bit that I swept under the rug in this article, which is that usually residuals are not kept explicit in optics. Optics are in fact defined as a quotient, using a coend indexed by residuals. The equivalence relation is generated by ‘slidings’:
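For the record, the coend formula looks as follows (a sketch in the spirit of Riley’s paper, in the notation of this post, rather than the exact definition used in our papers):

```latex
\mathbf{Optic}\big((X, S), (Y, R)\big)
  \;=\; \int^{M \in \mathcal{M}}
        \mathcal{C}(X,\; M \bullet Y) \times \mathcal{C}(M \bullet R,\; S)
```

The coend is what makes residuals implicit: it identifies any two representatives that differ by sliding a morphism of $\mathcal{M}$ from one side to the other.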
My impression is that something more should be said about this point. For example, there’s merit in keeping the ‘hidden dynamics’ of a context explicit. On the other hand, equivalence under sliding is a very reasonable condition. A way to resolve this tension is to turn the quotient into a groupoid, i.e. remember slidings as invertible 2-cells between optics. This fits very well with the philosophy behind the construction I’ll describe in the next post, Para.
I hope I managed to convey my intuition of feedback systems, namely as bidirectional morphisms whose mathematical incarnation is some flavour of optics. Residuals memorize information from the actions executed in the forward pass in order to effectively elaborate feedback in the backward pass. When a system is paired up with a context, it yields a closed system.
Next time, we are going to see how parametrised optics model agency in feedback systems. This will be a first step toward modelling cybernetic systems themselves, which are feedback systems with agency in a control loop.
A list of further resources on this topic. It’s probably gonna grow as things to add come to my mind.
- Towards categorical foundations of learning, blog post by Bruno Gavranović, featuring more nice animations of optics and some insights I didn’t cover here.
- A general definition of open dynamical system, talk by David Jaz Myers, featuring a lot of very cool mathematical insights.
- Bayesian open games, paper by Jules Hedges and Jon Bolt, featuring full-blown optics-as-feedback-systems in the wild.
 Yeah, I went from zero to two in one shot, which has resulted in a pretty hectic writing spree.
 Truth be told, ‘lenses’ in FP are usually limited to what I’d call ‘monomorphic lawful lenses’… There’s a bunch of conflicting terminology around here. Here’s some historical/etymological background.
 There’s a bit of a fight around here: usually residuals are ‘quotiented out’ and thus become implicit. I make the case residuals should be explicit. More on this in the part about agency.
 To be fair, this happens if the action is ‘multiplicative’, an informal term meaning that $M \bullet X$ is sort of a combination of $M$ with $X$, and not some other weird thing. These ‘other weird things’ are actually quite interesting and totally deserve to be considered optics, though the dynamical intuition falters a bit there.
 The missing bit of math in this description is a coend, dealing with equivalence of optics. A good reference about optics is Riley’s paper. The full definition of mixed optics can be found here. You can read more about coends in Fosco’s amazing book on the subject; I won’t go down this rabbit hole here.
 A curious fact is that combs do indeed model object permanence only if the resource theory we are using to represent the world is not semicartesian. In fact, in that case the definition of a context would collapse and be equivalent to the sole data of an initial state (see Proposition 2.0.7 here), thereby trivializing whatever ‘hidden dynamics’ the world would have. Indeed, if the unit is terminal, there’s only one closed system and it is trivial.
 Notice, moreover, that double optics now provide a ‘theory of open contexts’ for a given system. An open context is one whose domain as a double optic (or external boundary as a comb) is not the unit, so it actually acts as a middleman between a system and its environment, without closing it up. It can be considered a blanket, borrowing Pearl’s terminology.
One can make wonderful use of this to model sequential games with imperfect information. The ‘system’ we are considering now is a single decision of this game: it receives the state of the game and outputs a move, and its feedback is given by the final payoff of this decision. Open contexts can be used to beautifully manage state in this setting. They filter the incoming state of the game in order to hide information which is not available to the player (but existing nevertheless) at that time, e.g. cards other players have in their hand. Then we use the move chosen by the player, together with the (hidden) state of the game, to update the overall state of the game. These ‘wrapped decisions’ can then be composed in sequence to get the desired game.
A setting in which open contexts shine even more is Bayesian games. In this case, you really see how contexts do not collapse down to state-costate pairs, because Bayesian games make crucial use of non-lenticular optics. I speculate in this setting contexts really amount to Markov blankets.