Toward collaborative modeling with categorical logics

The early design of CatColab

Evan Patterson

Topos Institute

Topos Institute Berkeley Seminar

2024-10-15

Why applied category theory?

A gulf I’ve noticed:

Insiders find it obvious that category theory could dramatically enhance our ability to model and make sense of the world.

Outsiders find it impossible to imagine what this might look like.

This illegibility is a risk:

Inside view: our community will lack legitimacy and resources.

Outside view: our ideas won’t achieve the impact they are capable of!

Category theory as applied mathematics

Contrast with your favorite standard branch of applied math, say optimization.

  • Optimization is a technical field only understood deeply by specialists
  • But everyone knows what it’s for: it makes the number go up!
  • Moreover, it’s used every day by people across a range of fields


As applied category theorists, we should be asking ourselves:

  1. What is category theory for?
  2. Who is going to use it?

What is category theory for?

My working answer:

Category theory is the mathematics to represent diverse things yet still see connections between them, both within and across domains.

This suggests:

Hypothesis

Category theory will be most useful at a scale beyond that of a single individual.

Also, a communication challenge: this answer is pretty different from that given by other branches of applied math!

Hypothesis

Category theory will become legible and useful when it is embodied by usable technologies.

Introducing CatColab

This talk describes work in progress on CatColab, a collaborative environment for formal, interoperable, conceptual modeling.

  • formal: models are mathematical objects that can be critiqued with clarity
  • interoperable: models, and modeling languages, can interoperate with each other
  • conceptual: modeling languages are well adapted to concepts used by practitioners

Outline

  1. Demo
  2. Mathematics behind CatColab
  3. Design of the CatColab system
  4. Outlook

Credits

The development team for CatColab v0.1: Hummingbird was:

  • Kris Brown
  • Kevin Carlson
  • Owen Lynch
  • Evan Patterson

I am also grateful for support from other colleagues at Topos Institute and our collaborators and funders.

Demo

Two of the logics currently available in CatColab:

  1. Causal loop diagrams
  2. Database schemas


Try it yourself: https://catcolab.org

Warning

CatColab is pre-alpha software under active development.

Mathematics

The mathematical foundation for CatColab is the framework of double theories:

  1. “Cartesian double theories: A double-categorical framework for categorical doctrines” (Lambert and Patterson 2024)
  2. “Products in double categories, revisited,” Section 10: “Finite-product double theories” (Patterson 2024)

This talk

  • gestures at more expressive double theories
  • focuses on simple special cases already implemented

Motivation

Formal languages are the syntactic counterpart to categorical structures:

| Logic/language | Categorical structure |
| --- | --- |
| Algebraic theories | Cartesian categories |
| Typed lambda theories | Cartesian closed categories |
| Resource theories | Symmetric monoidal categories |
| Statistical theories | Markov categories |

  • Many (though not all) of these categorical structures are known to be models of double theories
  • But we’re starting with simpler examples, often not regarded as logics at all

Simple double theories

Definition

A simple double theory is a small, strict double category.

This is a concept with an attitude, understood as a categorified theory:

  • object = “object type”
  • proarrow (horizontal morphism) = “morphism type”
  • arrow (vertical morphism) = “operation on objects”
  • cell = “operation on morphisms”
  • \(f\) = object operation with input type \(x\) and output type \(w\)
  • \(\alpha\) = morphism operation with input type \(m\) and output type \(n\)

Models of simple double theories

Definition

A model of a simple double theory \(\mathbb{T}\) is

  • a lax double functor \(M: \mathbb{T} \to \mathbb{S}\mathsf{pan}\), or equivalently
  • a normal lax double functor \(M: \mathbb{T} \to \mathbb{P}\mathsf{rof}\)

Such models are categorified copresheaves:

| Idea | 1-dimensional | 2-dimensional |
| --- | --- | --- |
| Schema/theory | Category \(\mathsf{C}\) | Double category \(\mathbb{D}\) |
| Semantics | \(\mathsf{Set}\) | \(\mathbb{S}\mathsf{pan}\ (= \mathbb{S}\mathsf{et})\) |
| Instance/model | Functor \(\mathsf{C} \to \mathsf{Set}\) | Lax functor \(\mathbb{D} \to \mathbb{S}\mathsf{pan}\) |
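
To unpack the definition, here is the standard reading of the data of a lax double functor into spans, restricted to object and morphism types. For morphism types \(m: x \to y\) and \(n: y \to z\) (proarrows), a model \(M\) provides a span, a laxator, and a unitor:

```latex
\[
M(x) \leftarrow M(m) \rightarrow M(y),
\qquad
M_{m,n}\colon M(m) \times_{M(y)} M(n) \to M(m \odot n),
\qquad
M_x\colon M(x) \to M(\mathrm{id}_x).
\]
```

Elements of \(M(x)\) are the objects of type \(x\) and elements of \(M(m)\) are the morphisms of type \(m\). The laxators supply composition and the unitors supply identities. In particular, when the theory has a single object type and only identities, a model is exactly a small category.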

Discrete double theories

Simplifying still further:

Definition

A discrete double theory is

  • a simple double theory having only trivial arrows and cells, or equivalently
  • a double category of the form \(\mathbb{D}\mathsf{isc}(\mathsf{B}) := \mathbb{H}(\mathsf{B})\), where \(\mathsf{B}\) is a small category

Such a double theory has only object and morphism types, no operations.

Fact

A model of a discrete double theory \(\mathbb{D}\mathsf{isc}(\mathsf{B})\) is equivalent to a category sliced over \(\mathsf{B}\):

\[ \mathsf{Lax}(\mathbb{D}\mathsf{isc}(\mathsf{B}), \mathbb{S}\mathsf{pan}) \simeq \mathsf{Cat}/\mathsf{B}. \]
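
A minimal Python sketch of what "a category sliced over \(\mathsf{B}\)" means at the level of generators: every object is assigned an object type, every generating morphism is assigned a morphism type, and domains and codomains must be typed compatibly. The names `DiscreteTheory`, `Presentation`, and `typing_errors` are hypothetical and not taken from CatColab. The example theory is an assumed two-object category with one non-identity arrow `a`. Only generators are checked, which is enough when the category is free on them.

```python
from dataclasses import dataclass, field


@dataclass
class DiscreteTheory:
    """Generators of a small category B (hypothetical encoding)."""
    ob_types: set[str]
    mor_types: dict[str, tuple[str, str]]  # name -> (domain type, codomain type)


@dataclass
class Presentation:
    """Generators of a category C together with a typing over B."""
    obs: dict[str, str] = field(default_factory=dict)  # object -> object type
    # morphism -> (morphism type, domain object, codomain object)
    mors: dict[str, tuple[str, str, str]] = field(default_factory=dict)


def typing_errors(theory: DiscreteTheory, pres: Presentation) -> list[str]:
    """Return the ways in which the typing fails to respect domains and codomains."""
    errors = []
    for ob, ob_type in pres.obs.items():
        if ob_type not in theory.ob_types:
            errors.append(f"object {ob}: unknown object type {ob_type}")
    for mor, (mor_type, dom, cod) in pres.mors.items():
        if mor_type not in theory.mor_types:
            errors.append(f"morphism {mor}: unknown morphism type {mor_type}")
            continue
        dom_type, cod_type = theory.mor_types[mor_type]
        if pres.obs.get(dom) != dom_type:
            errors.append(f"morphism {mor}: domain {dom} is not of type {dom_type}")
        if pres.obs.get(cod) != cod_type:
            errors.append(f"morphism {mor}: codomain {cod} is not of type {cod_type}")
    return errors


# Assumed example theory: two object types and one non-identity morphism type.
theory = DiscreteTheory(
    ob_types={"0", "1"},
    mor_types={"id_0": ("0", "0"), "id_1": ("1", "1"), "a": ("0", "1")},
)
good = Presentation(obs={"u": "0", "v": "1"}, mors={"f": ("a", "u", "v")})
bad = Presentation(obs={"u": "0", "v": "1"}, mors={"f": ("a", "v", "u")})
```

Here `typing_errors(theory, good)` is empty, while `bad` is rejected because `f` points the wrong way relative to its morphism type.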

Examples

The mathematics behind our two examples:

  1. Causal loop diagrams
  2. Database schemas

Causal loop diagrams

Causal loop diagrams (and also regulatory networks) are

  • signed graphs, or
  • free signed categories


Let \(\mathsf{Sgn}:= \{\pm 1\} \cong \mathbb{Z}_2\) be the group of nonzero signs.

Definition

A signed category is a category \(\mathsf{C}\) equipped with a functor \(\mathsf{C} \to \mathsf{Sgn}\), where the group \(\mathsf{Sgn}\) is regarded as a one-object category.

So, the category of signed categories is the slice

\[ \mathsf{SgnCat} := \mathsf{Cat}/\mathsf{Sgn}. \]
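
In the free signed category on a signed graph, the functor to \(\mathsf{Sgn}\) sends a path to the product of the signs of its edges. A minimal Python sketch, using an invented predator-prey diagram (the edge names and the `path_sign` helper are illustrative assumptions):

```python
from math import prod

# Hypothetical causal loop diagram: edge name -> (source, target, sign).
EDGES = {
    "feeds": ("prey", "predators", +1),
    "eats": ("predators", "prey", -1),
    "breeds": ("prey", "prey", +1),
}


def path_sign(edges, path):
    """Sign of a composable path of edge names: the product of the edge signs."""
    for first, second in zip(path, path[1:]):
        if edges[first][1] != edges[second][0]:
            raise ValueError(f"{first} and {second} are not composable")
    # The empty path is an identity, whose sign is +1 (the empty product).
    return prod(edges[name][2] for name in path)


p = ["breeds", "feeds"]  # prey -> prey -> predators
q = ["eats"]             # predators -> prey
```

Functoriality is the statement that signs multiply under composition: `path_sign(EDGES, p + q)` equals `path_sign(EDGES, p) * path_sign(EDGES, q)`.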

Causal loop diagrams via double theories

Theory

The theory of signed categories is the discrete double theory generated by

  • an object type \(x\)
  • a morphism type \(n: x \mathrel{\mkern 3mu\vcenter{\hbox{$\scriptstyle+$}}\mkern-13mu{\to}}x\) (for “negative”)

subject to the equation \(n \odot n = \mathrm{id}_x\) (where \(\mathrm{id}_x: x \mathrel{\mkern 3mu\vcenter{\hbox{$\scriptstyle+$}}\mkern-13mu{\to}}x\) is “positive”).

Equivalently,

\[ \mathbb{T}_{\mathsf{SgnCat}} := \mathbb{D}\mathsf{isc}(\mathsf{Sgn}), \]

so

\[ \mathsf{Lax}(\mathbb{T}_{\mathsf{SgnCat}}, \mathbb{S}\mathsf{pan}) \simeq \mathsf{Cat}/\mathsf{Sgn} = \mathsf{SgnCat}. \]

Free signed categories

What’s the point of thinking of signed graphs as free signed categories?

Motifs are morphisms between free signed categories, e.g.,

  • Positive/reinforcing feedback loops are morphisms out of:

  • Negative/balancing feedback loops are morphisms out of:
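
Assuming the two motifs are the signed graphs with a single vertex and a single positive (respectively negative) loop, a morphism out of one of them into a free signed category picks out a closed path of that sign. The Python sketch below lists only the simple cycles of a small invented diagram and records their signs; the edge data and the `feedback_loops` helper are hypothetical.

```python
# Hypothetical causal loop diagram: edge name -> (source, target, sign).
EDGES = {
    "feeds": ("prey", "predators", +1),
    "eats": ("predators", "prey", -1),
    "breeds": ("prey", "prey", +1),
}


def feedback_loops(edges):
    """Return (cycle, sign) pairs, one per simple directed cycle.

    Each cycle is a list of edge names, rooted at its smallest vertex so
    that it is found exactly once.
    """
    out = {}
    for name, (src, tgt, sign) in edges.items():
        out.setdefault(src, []).append((name, tgt, sign))
    vertices = sorted({v for src, tgt, _ in edges.values() for v in (src, tgt)})
    loops = []

    def extend(start, vertex, path, sign, visited):
        for name, tgt, edge_sign in out.get(vertex, []):
            if tgt == start:
                loops.append((path + [name], sign * edge_sign))
            elif tgt > start and tgt not in visited:
                extend(start, tgt, path + [name], sign * edge_sign, visited | {tgt})

    for start in vertices:
        extend(start, start, [], 1, {start})
    return loops


loops = feedback_loops(EDGES)
reinforcing = [cycle for cycle, sign in loops if sign == +1]
balancing = [cycle for cycle, sign in loops if sign == -1]
```

In this example the self-loop `breeds` is reinforcing and the two-edge cycle through `eats` and `feeds` is balancing.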

Diagrams as free categorical structures

This phenomenon is generic:

Slogan

Diagrammatic languages are free categorical structures whenever it makes sense for arrows to compose.

A famous example:

Example

Petri nets are free symmetric/commutative monoidal categories.

Coming in a future version of CatColab!

Database schemas

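A basic notion of database schema is a finitely presented profunctor \(\mathrm{Attr}: \mathrm{Entity} \mathrel{\mkern 3mu\vcenter{\hbox{$\scriptstyle+$}}\mkern-13mu{\to}}\mathrm{AttrType}\),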

where \(\mathrm{Mapping} := \mathrm{Hom}_{\mathrm{Entity}}\) and \(\mathrm{AttrOp} := \mathrm{Hom}_{\mathrm{AttrType}}\).

In SQL jargon:

  • “Entity” = table
  • “Mapping” = foreign key
  • “Attr” = data attribute = column that is not a foreign key
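
The SQL dictionary above can be made concrete with a small Python sketch that turns the generators of a schema into `CREATE TABLE` statements. Everything here is an illustrative assumption rather than CatColab's export format: the `Schema` class, the convention that every table gets an integer `id` key, and the use of attribute type names directly as SQL column types. Operations on attribute types (AttrOp) are left out.

```python
from dataclasses import dataclass


@dataclass
class Schema:
    """Generators of a schema (hypothetical encoding)."""
    entities: list[str]
    attr_types: list[str]
    mappings: dict[str, tuple[str, str]]  # name -> (source entity, target entity)
    attrs: dict[str, tuple[str, str]]     # name -> (entity, attribute type)


def to_sql(schema: Schema) -> list[str]:
    """One CREATE TABLE statement per entity."""
    statements = []
    for entity in schema.entities:
        columns = ["id INTEGER PRIMARY KEY"]
        # Mappings out of this entity become foreign key columns.
        for name, (src, tgt) in schema.mappings.items():
            if src == entity:
                columns.append(f"{name} INTEGER REFERENCES {tgt}(id)")
        # Attributes of this entity become ordinary data columns.
        for name, (src, attr_type) in schema.attrs.items():
            if src == entity:
                columns.append(f"{name} {attr_type}")
        statements.append(f"CREATE TABLE {entity} ({', '.join(columns)});")
    return statements


schema = Schema(
    entities=["person", "city"],
    attr_types=["TEXT", "INTEGER"],
    mappings={"lives_in": ("person", "city")},
    attrs={"name": ("person", "TEXT"), "population": ("city", "INTEGER")},
)
ddl = to_sql(schema)
```

The mapping `lives_in` becomes a foreign key column on `person`, while `name` and `population` become plain columns.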

Schemas via double theories

Theory

The theory of profunctors is the “walking proarrow” \(\mathbb{D}\mathsf{isc}(\mathsf{2})\), a discrete double theory freely generated by

  • two objects \(0,1\)
  • one proarrow \(0 \mathrel{\mkern 3mu\vcenter{\hbox{$\scriptstyle+$}}\mkern-13mu{\to}}1\).

A model is a profunctor, either directly or indirectly via “barrels”:

\[ \mathsf{Lax}(\mathbb{D}\mathsf{isc}(\mathsf{2}), \mathbb{S}\mathsf{pan}) \simeq \mathsf{Cat}/\mathsf{2}. \]

In future versions of CatColab:

  • Database instances as modules over models of double theories
  • Algebraic databases à la Schultz et al. (2017)

Design of CatColab

  • Going from math to technology is more than just implementing a spec!
  • The embodiment of mathematics in technology requires as much creative input as the math itself

Desiderata

The system should enable formal, interoperable, conceptual modeling in domain-specific logics

  • assuming that the user knows something about a domain of interest
  • not assuming that the user knows about the meta-logical foundation

Intended users have variable levels of technical expertise and might be…

  • scientists
  • engineers
  • policy analysts
  • local experts/community leaders

Structure editing

CatColab is a structure editor for categorical structures:

  • content being edited is not just a string
  • but rather a structured object (such as a model of a double theory)

Interpolates between text editors and fully graphical editors:

  • Unlike a text editor, can provide heavy scaffolding and enforce correct syntax by construction
  • Unlike a graphical editor, there seems to be hope of doing it generically across structures!

Hypothesis

It is possible and practical to build a structure editor for collaborative modeling that is ergonomic, yet parametric over the logic.
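
A toy Python sketch of what "parametric over the logic" could look like: one editor class, configured by the object and morphism types of a discrete theory, that only offers and accepts edits keeping the model well typed. The class `ModelEditor`, its methods, and the type names (`variable`, `positive`, `entity`, `attr` and so on) are hypothetical and do not describe CatColab's actual implementation.

```python
class ModelEditor:
    """Edits a model of a theory; ill-typed states cannot be constructed."""

    def __init__(self, ob_types, mor_types):
        self.ob_types = set(ob_types)
        self.mor_types = dict(mor_types)  # mor type -> (dom ob type, cod ob type)
        self.obs = {}   # object name -> object type
        self.mors = {}  # morphism name -> (morphism type, domain, codomain)

    def add_object(self, name, ob_type):
        if ob_type not in self.ob_types:
            raise ValueError(f"unknown object type: {ob_type}")
        self.obs[name] = ob_type

    def choices(self, mor_type):
        """Objects a UI would offer as domain and codomain of a new morphism."""
        dom_type, cod_type = self.mor_types[mor_type]
        doms = sorted(ob for ob, t in self.obs.items() if t == dom_type)
        cods = sorted(ob for ob, t in self.obs.items() if t == cod_type)
        return doms, cods

    def add_morphism(self, name, mor_type, dom, cod):
        doms, cods = self.choices(mor_type)
        if dom not in doms or cod not in cods:
            raise ValueError(f"{name}: {dom} -> {cod} is not a valid {mor_type}")
        self.mors[name] = (mor_type, dom, cod)


# Two logics, one editor (type names are illustrative).
SIGNED = (["variable"],
          {"positive": ("variable", "variable"), "negative": ("variable", "variable")})
SCHEMA = (["entity", "attr_type"],
          {"mapping": ("entity", "entity"), "attr": ("entity", "attr_type")})

schema_editor = ModelEditor(*SCHEMA)
schema_editor.add_object("person", "entity")
schema_editor.add_object("text", "attr_type")
schema_editor.add_morphism("name", "attr", "person", "text")

loop_editor = ModelEditor(*SIGNED)
loop_editor.add_object("prey", "variable")
loop_editor.add_morphism("breeds", "positive", "prey", "prey")
```

The same code serves both logics because nothing in `ModelEditor` mentions signs or tables. Attempting `schema_editor.add_morphism("bad", "attr", "text", "person")` raises a `ValueError` instead of producing an ill-typed model.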

User interface

We emphasize formal modeling, but informal narrative is indispensable.

CatColab has a notebook-style interface, mixing

  • formal elements handled by the structure editor
  • natural language text handled by a rich text editor

Note

The interface is familiar from computational notebooks like Jupyter, but the execution model is very different:

  • In a Jupyter notebook, cells execute individually and produce side effects
  • A CatColab notebook declaratively specifies a single structure
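
The contrast can be sketched in Python. In this toy encoding, which is an assumption for illustration and not CatColab's data model, a notebook is a list of cells and the whole list is read as one model. Prose cells contribute nothing formal, and no cell is ever "run".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RichText:
    text: str


@dataclass(frozen=True)
class ObjectDecl:
    name: str
    ob_type: str


@dataclass(frozen=True)
class MorphismDecl:
    name: str
    mor_type: str
    dom: str
    cod: str


def elaborate(cells):
    """Read the whole notebook as a single model, with no side effects."""
    objects = {c.name: c.ob_type for c in cells if isinstance(c, ObjectDecl)}
    morphisms = {
        c.name: (c.mor_type, c.dom, c.cod)
        for c in cells
        if isinstance(c, MorphismDecl)
    }
    return {"objects": objects, "morphisms": morphisms}


cells = [
    RichText("Rabbits breed quickly."),
    ObjectDecl("prey", "variable"),
    MorphismDecl("breeds", "positive", "prey", "prey"),
]
model = elaborate(cells)
```

Because the result depends only on which declarations are present, `elaborate(list(reversed(cells)))` gives the same model. This is unlike a Jupyter notebook, where the result can depend on the order in which cells are executed.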

Levels in the system

| Dimension | Name | Objects are | Editable by users |
| --- | --- | --- | --- |
| 3 | Doctrine | [not systematized] | |
| 2 | Theory | Double categories with structure | ? [maybe by power users] |
| 1 | Model | Categories with structure | |
| 0 | Instance | Sets with structure | |

Plus:

  • morphisms between all of these!
  • analyses of models and of instances (currently unsystematized)

Levels in the system as an olog

Components and programming languages

CatColab comprises three major components:

  1. Core of double-categorical logic
    • Written in Rust
    • Compiled to WebAssembly to run in the browser
  2. Frontend
  3. Backend
    • Mostly in Rust, with a bit of TypeScript

Architecture

Outlook

The field of computer science is driven by dreams of universality:

  • universal Turing machines
  • Unified Modeling Language (UML)
  • artificial general intelligence (AGI)

Our aim is not to universalize, but to

  • create small languages/logics perfectly fit to purpose
  • integrate them and their models into webs of shared understanding (or legible disagreement!)

Thanks for listening!

References

Aduddell, Rebekah, James Fairbanks, Amit Kumar, Pablo S. Ocal, Evan Patterson, and Brandon T. Shapiro. 2024. “A Compositional Account of Motifs, Mechanisms, and Dynamics in Biochemical Regulatory Networks.” Compositionality 6 (2). https://doi.org/10.32408/compositionality-6-2.
Ahrens, Benedikt, and Peter LeFanu Lumsdaine. 2019. “Displayed Categories.” Logical Methods in Computer Science 15 (1). https://doi.org/10.23638/LMCS-15(1:20)2019.
Baez, John C., Fabrizio Genovese, Jade Master, and Michael Shulman. 2021. “Categories of Nets.” In 2021 36th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–13. https://doi.org/10.1109/LICS52264.2021.9470566.
Carlson, Kevin. 2024. “Introducing CatColab.” Topos Institute. 2024. https://topos.site/blog/2024-10-02-introducing-catcolab/.
Lambert, Michael, and Evan Patterson. 2024. “Cartesian Double Theories: A Double-Categorical Framework for Categorical Doctrines.” Advances in Mathematics 444: 109630. https://doi.org/10.1016/j.aim.2024.109630.
Omar, Cyrus, Ian Voysey, Michael Hilton, Joshua Sunshine, Claire Le Goues, Jonathan Aldrich, and Matthew A. Hammer. 2017. “Toward Semantic Foundations for Program Editors.” In Summit on Advances in Programming Languages (SNAPL), 71:11:1–12. LIPIcs. https://arxiv.org/abs/1703.08694.
Patterson, Evan. 2024. “Products in Double Categories, Revisited.” https://arxiv.org/abs/2401.08990.
Patterson, Evan, Owen Lynch, and James Fairbanks. 2022. “Categorical Data Structures for Technical Computing.” Compositionality 4 (5). https://doi.org/10.32408/compositionality-4-5.
Roig, Gabriel Goren, Joshua Meyers, and Emilio Minichiello. 2024. “Presenting Profunctors.” https://arxiv.org/abs/2404.01406.
Schultz, Patrick, David I. Spivak, Christina Vasilakopoulou, and Ryan Wisnesky. 2017. “Algebraic Databases.” Theory and Applications of Categories 32 (16): 547–619. http://www.tac.mta.ca/tac/volumes/32/16/32-16abs.html.