Zum Inhalt
Erschienen in:

Open Access 2025 | OriginalPaper | Buchkapitel

Coverage Semantics for Dependent Pattern Matching

verfasst von : Joseph Eremondi, Ohad Kammar

Erschienen in: Programming Languages and Systems

Verlag: Springer Nature Switzerland

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Das Kapitel befasst sich mit der Trennung zwischen Theorie und Praxis des abhängigen Pattern Matching in Programmiersprachen und konzentriert sich auf das Kernmerkmal der abhängigen Programmierung. Es stellt CoverTT vor, eine Variante der Martin-Löf-Typentheorie, die über eine Abdeckung parametrisiert wird, was vielfältige Muster-Matching-Abstraktionen ermöglicht. Der Text untersucht die Anatomie von Datentypen, trennt Konzepte wie Koprodukte, abhängige Felder, Indexierung und Rekursion und stellt ein allgemein ausreichendes Kriterium für gut definierte Musterübereinstimmungen dar, ohne die logische Konsistenz zu beeinträchtigen. Das Kapitel vertieft sich auch in kategorische Modelle von CoverTT, indem es Categories with Families (CwFs) verwendet, um syntaktische Konstrukte in eine semantische Domäne zu übersetzen. Sie führt das Konzept der Garben und Standorte ein und stellt eine ausreichende Bedingung dafür dar, wann eine Abdeckung semantische Musteranpassungen zulässt. Das Kapitel schließt mit einer Diskussion über verwandte Arbeiten und zukünftige Richtungen, wobei das Potenzial für die Implementierung verbesserter Mustersynonyme und die Unterstützung sich überschneidender Musterübereinstimmungen hervorgehoben wird. Die detaillierte Erforschung dieser Themen macht dieses Kapitel zu einer fesselnden Lektüre für diejenigen, die sich für die theoretischen Grundlagen von Programmiersprachen und Typentheorie interessieren.

1 Introduction

Pattern matching is a core feature in dependently typed programming. With pattern matching one can specify a function consuming an input by giving functions for every possible way that input might have been constructed. For dependent types, the defined function can be a universally quantified proof, giving a Curry-Howard analogue of proof-by-cases.
However, there is a disconnect between the theory and practice of dependent pattern matching. Many dependently typed languages take pattern matching as a built-in user-facing construct: Coq [4], Agda [22], and Idris [5] all contain a form of dependent pattern matching in their core calculi. However, most theoretical treatments of dependent types deal with eliminators [19]: primitive recursors with result types dependent on a value of the eliminated type. Eliminators and pattern matching are equally expressive, with or without with Axiom K [6, 12].
While it is possible to express all pattern matches using eliminators, it is not always convenient. Some pattern matching features require lengthy translations when converting to eliminators, such as overlapping patterns, catch-all branches, or matching on multiple values at once. Moreover, languages differ in which pattern matches they allow, so every variant of pattern matching requires a new eliminator-translation to prove consistency. Even the implementations of dependently typed languages are restricted by eliminators, since most pattern matches are elaborated into case trees with a 1:1 correspondence between branches and constructors of an inductive type.
The contribution of this paper is to narrow the theory-practice divide with a highly general syntax (Section 2) and categorical semantics for dependent pattern matching (Section 3). The semantics is direct and generic: pattern matches are translated directly into semantic objects without desugaring to eliminators or case trees, and the semantics is parameterized over an abstract coverage specifying which sets of patterns one can match against. We investigate the vision set forth by Epigram [20, 21] to enable diverse pattern-matching abstractions that go beyond the list of constructors declared by a datatype. We focus on non-overlapping patterns, but give a potential road map to supporting overlap.
In constructing our generic semantics, we present a general sufficient criterion for when a coverage leads to well-defined pattern matches without compromising logical consistency (Section 4), mechanized in Lean [11]. We define this criterion, drawing on parallels between dependent pattern matching and the theory of sheaves on a site, discovering that it is sufficient for each allowed set of patterns to correspond to a cover in the canonical coverage for the semantic category. Moreover, we use elementary results from sheaf theory to describe a group of closure operations which preserve the canonicity of a coverage (Section 5). These give a simple and direct way to model common features like multi-value matches, nested patterns, and matching on propositional equality proofs. With the exception of recursion, we achieve feature parity with the original presentation of dependent pattern matching by Coquand [9]. We conclude with an illustrative example (Section 6) and related and future work (Section 7).
A major contribution of our work is expressing pattern matching in the language of categories and sheaves. Our approach is semantic: instead of elaborating pattern matching syntax, we take pattern matches as primitive and work directly in the semantic domain, avoiding the need to consider eliminators as a syntactic primitive. The connection has been implicit for many decades, but we make it formal. That said, we only assume basic knowledge of functors, pullbacks, and slice categories, and we present all the required sheaf theory.

2 CoverTT: The Source Language

We begin with a variant of Martin Löf Type Theory, called CoverTT, parameterized over which sets of patterns can be matched against. Drawing from sheaf-theory terminology, when a set of patterns is permissible on the left-hand side of a pattern match we call it a cover, and say it is covering. The set of all covers together is called a coverage. The main distinct feature of CoverTT is that it is parameterized over a coverage.

2.1 The Anatomy of a Datatype

As they are presented in most dependently typed languages, inductively defined datatypes conflate four different concepts. Consider the quintessential inductive family of length-indexed vectors:
$$\begin{aligned} &\texttt{data}\ \textsf{Vec}\ (A : \mathcal {U} ) : (m : \mathbb {N}) \rightarrow \mathcal {U} \ \texttt{where}\\ &\qquad \textsf{nil} : \textsf{Vec}\ A\ 0\\ &\qquad \textsf{cons} : (n : \mathbb {N}) \rightarrow A \rightarrow \textsf{Vec}\ A\ n \rightarrow \textsf{Vec}\ A\ (n+1) \end{aligned}$$
This definition implicitly relies on the following concepts, the separation of which motivates the design CoverTT (as exhibited in Section 2.3).
  • Coproducts: An inductive type behaves like the sum of its constructor types. For \(\textsf{Vec}\), it behaves like \(\mathbb {1}+ (A \times \textsf{Vec}\ A\ n)\). In CoverTT, each inductive I is declared at the top level to have a finite collection of constructors \(D^{I}_{1}\ldots D^{I}_{n}\), each of which has a type ending in I.
  • Dependent fields: each constructor has a dependent product type, so the arguments to a constructor are a curried dependent record, where the types of later fields can depend on the values of earlier ones. For \(\textsf{Vec}\), in \(\textsf{cons}\) the return type and the type of the tail of the list depend on the earlier parameter n. In CoverTT, the type of a constructor is given as a telescope, where the types of later entries are allowed to depend on the values of previous entries.
  • Indexing: each constructor implies a specific equation about the index values. For \(\textsf{Vec}\), \(\textsf{nil}\) restricts that \(m = 0\) and \(\textsf{cons}\) restricts that \(m = n+1\). In CoverTT, we use Fording [18], where each constructor has an identical return type, but may constrain type parameters using equality-proof fields. We treat equality as primitive and make it the only way to constrain indices of a constructor, simplifying the models of CoverTT.
  • Recursion: inductive types can refer to themselves in fields, except to the left of a function arrow, a condition known as strict positivity. Typically, one can define structurally recursive functions over \(\textsf{Vec}\). Here, \(\textsf{Vec}\) occurs as a field type for \(\textsf{cons}\), which is allowed because it is not to the left of an arrow type. We omit recursion from CoverTT, as we believe it requires a separate toolkit of abstractions. In Section 7.2 we discuss its possible addition. In examples, we refer to some inductive types defined using self-reference, such as vectors or natural numbers.
Restricting to Top-Level Datatypes CoverTT only allows data types and pattern matches to be declared at the top level. The parameter and constructor types of datatypes must be typeable in the empty context, though they can refer to other data types. Likewise, the branches and motive of a pattern match must be typeable in the empty context. These assumptions simplify the presentation of CoverTT while reflecting how datatypes are implemented in languages like Agda, where declarations in a non-empty context are desugared into top level declarations with extra parameters. Treating nested patterns and arbitrary coverages is an interesting problem even with top-level matches, and requires substantial technical developments even to handle ordinary matching.

2.2 Syntax and Typing

Fig. 1.
CoverTT: Term Typing
The syntax for CoverTT below is standard, except for the separation of concerns from Section 2.1. Figure 1 gives the syntax, along with typing rules for CoverTT.
Overline arrows denote sequences, while bold metavariables denote dependent sequences, e.g. substitutions. Variables are assigned types from the context. Typing for dependent functions and equality is standard, with rules for their types, introduction, and elimination. The equality type is intensional, and there is no reflection rule by which propositional equalities can be made judgmental equalities. Nevertheless, CoverTT is consistent with models that identify all propositionally-equal terms. We have no J-axiom, instead using the coverage to specify how to match on \({\texttt{refl}_{t}}\). We have one universe which does not have a type, leaving a universe hierarchy for future work.
Contexts and Substitutions Figure 2 also specifies well-formedness rules for contexts and substitutions. Contexts are sequences of typed variables, where later types may refer to variables from earlier in the context. Substitutions are the inhabitants of contexts. The rules are like an iterated version of dependent pairs: the type of the later values in the substitutions may depend on the earlier values. We borrow the notation \(\varGamma \vdash \textbf{t} => \varDelta \) from Hofmann [14], since such a substitution corresponds to a morphism \(\varGamma -\rightarrow \varDelta \) in the models we define. We use the notation \([\textbf{s}/\varDelta ]t\) to denote the simultaneous substitution of the variables bound in \(\varDelta \) by the terms of \(\textbf{s}\) in t. If \(\varGamma \vdash \textbf{s} => \varDelta \) and \(\varDelta \vdash t : T\), then \(\varGamma \vdash [\textbf{s}/\varDelta ] t : [\textbf{s}/\varDelta ]T\).
Datatype and Pattern Matching Syntax We assume a fixed collection of inductive type constructors, along with data constructors. For each datatype there is a fixed context of parameters \(\textsf{Params}(I)\) and, for each constructor, fields \(\textsf{Fields}(D^{I})\), such that \(\vdash \textsf{Params}(I), \textsf{Fields}(D^{I})\), i.e., both are well-formed, and the fields may depend on the parameters. The type \(I\ \textbf{t}\) denotes the type constructor I applied to parameters \(\textbf{t}\). \(D^{I}\ \textbf{t}\) is the data constructor D for the type I, given \(\textbf{t}\) for its fields. We omit I in \(D^{I}\) when it is clear from context.
The pattern matching form is a nameless version of defining functions by multiple pattern matching clauses, as in Agda or Idris. The term
denotes a match on scrutinees \(\textbf{t}\) of context-type \(\varXi \), producing a result of the motive type T, where T may refer to the variables bound in \(\varXi \). The branches, indexed by i, each have a left–side pattern \(\mathbf {s_{i}}\), which may contain pattern variables from the context \(\varDelta _{i}\), and which produces a result \(t_{i}\). The scrutinee is a substitution because pattern matching functions can take multiple arguments, and the types of later arguments can depend on the values of earlier ones.
Fig. 2.
Typing: Contexts and Substitutions
Typing Pattern Matches The TyCase typing rule (Fig. 1) is the most important rule. To type https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figb_HTML.gif , the scrutinee inhabits some \(\varXi \). Because pattern matching is dependent, the motive \(T_{motive}\) is indexed over the scrutinee type. The pattern \(\mathbf {s_{i}}\) for each branch inhabit the scrutinee type \(\varXi \) in the context of its pattern variables \(\varDelta _{i}\). Each branch is typed against the context of its pattern variables, and inhabits the motive for the scrutinee value given by that branch’s pattern, e.g., \(\mathbf {s_{i}}\). Finally, the entire match inhabits the motive type, instantiated to the scrutinee. Critically, the motive \(T_{motive}\) and each branch \(t_{i}\) must be typeable in the closed context \(\varDelta _{i}\), making no reference to \(\varGamma \). This restriction matches practice: in Agda and Idris, pattern matching elaborates to top level declarations. We give an example in Section 2.3.
The Generic Coverage Relation In a match, the patterns \(\overline{\mathbf {s_{i}}}\) must be covering. At no point do we require the patterns of a match to correspond to constructors of an inductive type, or even that the matched upon type be inductive. Instead, we appeal to an arbitrary judgment https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figc_HTML.gif , relating sets of substitutions to contexts. This relation is to be read as “the patterns \(\mathbf {s_{1}} \ldots \mathbf {s_{n}}\) are a total decomposition of the context \(\varXi \)”. This relation replaces the usual condition that there must be a case for each constructor of the datatype. It is the parameter by which we tune CoverTT, and can take many forms, from requiring a single scrutinee with exactly one branch per constructor, to allowing multiple scrutinees with arbitrary nested patterns and absurd branches omitted. Section 4.2 explores the conditions a coverage must satisfy to result in a well-behaved type theory. In general, we expect that a coverage will consist of a basis set of coverings, containing at least variables and constructors for each datatype, which can be closed under composition, concatenation, etc. For generality, we do not require closure conditions, but we show in Section 5 they are always permissible.
Computational Rules Figure 3 gives definitional equality for CoverTT, omitting structural (reflexivity, symmetry, transitivity) and congruence rules. Rule EqApp is the usual \(\beta \)-reduction. In EqMatch, we reduce a match when the scrutinee is \([\mathbf {t_{mat}}/\varDelta _j]\textbf{s}_j\) for some \(\varGamma \vdash \mathbf {t_{mat}} => \varDelta _{j}\), i.e., it is a pattern applied to values for the pattern variables of \(\mathbf {s_{j}}\). This substitution \(\mathbf {t_{mat}}\) instantiates the pattern variables in \(t_{j}\), which is the value of the entire match.
Fig. 3.
Definitional Equality: Computational Rules

2.3 Example: Vectors

To see these constructs concretely, we show how length-indexed vectors from Section 2.1 would be represented in our system, along with a type safe head function. In CoverTT, vectors can be defined using labelled sums. Notice the equality-type fields which encode the index constraints via Fording.
In Agda-style notation, a safe head function for vectors can be written:
$$\begin{aligned} & \textsf{head} : (A : \mathcal {U} ) \rightarrow (n : \mathbb {N}) \rightarrow \textsf{Vec}\ A \ (1 + n) \rightarrow A&\\ & \textsf{head}\ A\ n\ (\textsf{Cons}\ h\ t) = h & \end{aligned}$$
In CoverTT this would be defined as:
That is, the function takes A, n, and x as type, number, and vector parameters. It then passes those parameters as the scrutinee of the pattern match, which is annotated with their types. The annotation is a telescope of types, so it introduces new names for the scrutinees. It happens that here we are passing names as the scrutinees, but this need not be the case, which is why we need new names for them. The result type is annotated as A. The match has a single branch, with a telescope of pattern variables m, h and t with their types. The pattern for the branch has \(m+1\) as the value for scrutinee n and \(\textsf{Cons}\) applied to its arguments for scrutinee x. Finally, to the right of \(=>\) is the result for this case, which is h.
Implicit in this example is the need for the following to hold:
That is, the pattern for the single branch needs to cover the scrutinee type. Deducing that these patterns are a valid coverage involves seeing that \({\texttt{refl}_{m+1}}\) constrains the value of n and that \(\textsf{Nil}\) has an absurd type.
The goal of this paper is to define direct semantics that justify such deductions, and give a broad framework to define valid coverages.

3 Categorical Models of CoverTT

In this section, we translate the syntactic constructs of CoverTT into the language of Categories with Families (CwFs). CwFs correspond almost exactly to the syntactic structure of dependent type theory, but with syntactic substitution replaced by a semantic operation, and with implicit liftings between contexts made explicit.

3.1 Background: Categories with Families

We recapitulate the definition and notation of CwFs, a categorical model for dependent type theory that follows the syntax fairly closely. See Hofmann [14] for more details.
Recall that a family X is a pair \(\left( -,I_X\right) {\underline{X} }\) consisting of a set \(I_X\) and an I-indexed sequence of sets \(\left( -\right) [i \in I]{\underline{X}_i}\). A map of families \(f : X \rightarrow Y\) is a pair \(\left( -,f_I\right) {\underline{f}}\) consisting of a function \(f_I : I_X \rightarrow I_Y\) and an I-indexed sequence of functions \(\left( -\right) [i \in I]{f_i : \underline{X}_i \rightarrow \underline{Y}_{f i}}\). The category \(\textbf{Fam}\) has families as objects and maps of families as morphisms, with componentwise identity and composition structure.
A basic CwF \(\mathcal {C}\) is a pair \(\left( -,\mathcal {C}_{\textrm{o}}\right) {F}\) consisting of a category \(\mathcal {C}_{\textrm{o}}\) and a functor \(F : \mathcal {C}_{\textrm{o}}^{\textsf {op}} \rightarrow \textbf{Fam}\). The functor F packs four pieces of structure which we’ll unpack using the following notation:
  • Objects \(\varGamma \in \mathcal {C}_{\textrm{o}}\) are contexts, and morphisms \(\theta : \varGamma \rightarrow \varDelta \) are substitutions.
  • For every context \(\varGamma \in \mathcal {C}_{\textrm{o}}\), we denote the family \(F \varGamma \) by \(\left( -,\textsf{Ty}({\varGamma })\right) {{\textsf{Tm}_{\varGamma }({ \_})}}\). We call the elements of its indexing set \(\textsf{Ty}({\varGamma })\) the types in context \(\varGamma \). For each type in \(T \in \textsf{Ty}({\varGamma })\), we call the element of the component \(\textsf{Tm}_{\varGamma }({ T})\) the terms of type T in context \(\varGamma \). We omit \(\varGamma \) when it is clear from context.
  • For every substitution \(\theta : \varGamma \rightarrow \varDelta \), we have a map of families \(F\theta : F\varGamma \leftarrow F\varDelta \), and we use the same notation for both its components \(F\theta = \left( \_\{\theta \},\_\{\theta \}\right) \) and call both substitution functions. The first component is the substitution function on types \(\_\{\theta \}: \textsf{Ty}({\varGamma })\leftarrow \textsf{Ty}({\varDelta })\). The second component is a sequence of substitution functions on the terms of each type \(\_\{\theta \}: \textsf{Tm}_{\varGamma }({ T\{\theta \}}) \leftarrow \textsf{Tm}_{\varDelta }({ T})\).
  • The functoriality of F amounts to the following four properties, which we call the substitution lemma for this CwF, where T ranges over \(\textsf{Ty}({\varGamma })\), t ranges over \(\textsf{Tm}_{\varGamma }({ T})\):
A basic CwF includes only the bare bones of semantic models for a dependent type theory. In order to model dependent type theories of interests we need to equip them with additional structure.
We start with context extension. Let \(\mathcal {C}\) be a basic CwF, and assume \(\mathcal {C}_{\textrm{o}}\) has a terminal object \(\cdot \). A comprehension structure \(\left( -,\triangleright ,\textsf{p}\right) {\textsf{v}}\) over \(\mathcal {C}\) consists of, for each context \(\varGamma \in \mathcal {C}_{\textrm{o}}\) and type \(T \in \textsf{Ty}({\varGamma })\):
  • A context \(\varGamma \triangleright T\), the context \(\varGamma \) extended by T.
  • A substitution \(\textsf{p}_T : \varGamma \triangleright T \rightarrow \varGamma \), the weakening of \(\varGamma \) by T. We say that we weaken a type or a term by T when we apply the corresponding substitution function for the weakening \(\textsf{p}_T\).
  • A term \(\textsf{v}_T \in \textsf{Tm}_{\varGamma \triangleright T}({T})\), the variable we extend the context \(\varGamma \) with.
  • Moreover, for every substitution \(\theta : \varDelta \rightarrow \varGamma \) and term \(t \in \textsf{Tm}_{\varDelta }({T\{\theta \}})\) there is a unique substitution \(\langle \theta , t: T\rangle : \varDelta \rightarrow \varGamma \triangleright T\) satisfying:
    $$ \textsf{p}_{T} \mathbin {\circ }\langle \theta , t\rangle = \theta \qquad \textsf{v}_{T}\{\langle \theta , t\rangle \} = t $$
    We call this substitution the extension of the substitution \(\theta \) by t. In the sequel we omit the type ascription and write \(\langle \theta , t\rangle \) for \(\langle \theta , t: T\rangle \) when T is clear from context. Likewise, we omit the subscript on \(\textsf{p}\) and \(\textsf{v}\) when clear.
A CwF is a basic CwF together with comprehension structure. For a CwF \(\mathcal {C}= \left( -,\mathcal {C}_{\textrm{o}}\right) {F}\), we refer to \({\mathcal {C}_{\textrm{o}}}\) as \(\mathcal {C}\) when doing so will cause no ambiguity.

3.2 Sections, Slices, and Dependent Types

Let \(\mathcal {C}\) be a category and \(\varGamma \in \mathcal {C}\) an object in it. Recall the slice category \(\mathcal {C}/ \varGamma \) whose objects \(\left( T,d\right) \) consist of an object \(T \in \mathcal {C}\) and a morphism \(d : T \rightarrow \varGamma \). Morphisms \(f : \left( T,d\right) \rightarrow \left( S,e\right) \) in the slice are morphisms \(f : T \rightarrow S\) that lift d through e, i.e.: \(e\circ f = d\). For example, a section of \(\varGamma \) is a morphism out of \(\left( \varGamma ,\textrm{id} \right) \) in the slice \(\mathcal {C}/ \varGamma \), i.e., an object \(T \in \mathcal {C}\), and a pair of morphisms \(f : \varGamma \rightarrow T\) and \(d : T \rightarrow \varGamma \) such that \(d \circ f = id\).
We can use the slices of a category \(\mathcal {C}\) to model dependent type theory by requiring \(\mathcal {C}\) to be locally Cartesian closed (LCCC). The intuition behind this structure is that the object \(\left( T,d\right) \) in the slice \(\mathcal {C}/ \varGamma \) represent types T in context \(\varGamma \), and d represents the dependency of terms of this type on their context. We will not recapitulate the LCCC conditions explicitly. We will, however, spell the induced LCCC structure needed for a CwF, for two reasons. First, we indicate what we have formalized in lean. Second, we rely on this relationship between type families and slice objects in Section 4. It lets us use core results of sheaf theory to model dependencies in pattern matching.
Lemma 1
\(\mathsf {({\checkmark }\textsc {lean})}\) Let \(\mathcal {C}\) be a CwF.
  • The weakening \(\textsf{p}_{T} : (\varGamma \triangleright T) \rightarrow \varGamma \) encodes the type T as the slice object \(\left( (\varGamma \triangleright T),\textsf{p}_T\right) \), in the sense that sections of \(\varGamma \) correspond to terms in context \(\varGamma \):
  • The morphisms in \(\mathcal {C}\) encode indexed types—for all \(\varGamma , \varDelta \in \mathcal {C}\), \(T \in \textsf{Ty}({\varGamma })\), and \(\theta : \varDelta \rightarrow \varGamma \):
For \(t \in \textsf{Tm}_{\varGamma }({T})\) we write \(\overline{t} : \varGamma \rightarrow \varGamma \triangleright T\) for \(\langle id, t\rangle \), the corresponding \(\textsf{p}\)-section.

3.3 Semantic Type Formers and Closedness

CwFs give the core structure of type dependency, but give no indication of what types a model supports. Here we give semantic closedness conditions which postulate the existence of type and term constructors corresponding to common features of a dependent type theory.
Dependent Functions We say that a CwF \(\mathcal {C}\) supports dependent functions if it has:
  • For each \(S \in \textsf{Ty}({\varGamma })\) and \(T \in \textsf{Ty}({\varGamma \triangleright S})\), a type \(\varPi (S,T) \in \textsf{Ty}({\varGamma })\);
  • For each \(T \in \textsf{Ty}({\varGamma \triangleright S}) \) and \(t \in \textsf{Ty}({T})\), a term \(\lambda (t) \in \textsf{Tm} ({\varPi (S,T)})\);
  • For each \(T \in \textsf{Ty}({\varGamma \triangleright S}) \), \(s \in \textsf{Tm} ({S})\) and \(t \in \textsf{Tm} ({\varPi (S,T)})\), a term \(\textsf{App}(t,s) \in \textsf{Tm} ({T\{\langle id, s\rangle \}})\);
such that the usual structural substitution rules hold, as well as the \(\beta \)-reduction equality \( \textsf{App}(\lambda (t),s) = t\{\langle id, s\rangle \}\). We could impose a version of an \(\eta \) rule, but this is not required for our results.
Equality A CwF \(\mathcal {C}\) supports propositional equality types [14] if, for every type \(T \in \textsf{Ty}({\varGamma })\), there exists substitution-stable terms as follows:
  • A type \(\textsf{Id}(T) \in \textsf{Ty}({\varGamma \triangleright T \triangleright T\{\textsf{p}\}})\);
  • A morphism \(\textsf{Refl}_{T} : \varGamma \triangleright T \rightarrow \varGamma \triangleright T \triangleright T\{p\} \triangleright \textsf{Id}(T)\), such that \(\textsf{p}\circ \textsf{Refl} = \langle id, \textsf{v}_{T}\rangle \);
  • For each \(S \in \textsf{Ty}({\varGamma \triangleright T \triangleright T\{\textsf{p}\} \triangleright \textsf{Id}(T)})\), a function (on sets) \({\textsf{J}_{T,S} } \) which is in \({ \textsf{Tm} ({T\{\textsf{Refl}_{T}\}}) \rightarrow \textsf{Tm} ({S})}\), where, for any \(t \in \textsf{Tm} ({S\{\textsf{Refl}_{T}\}})\), \(\textsf{J}(t)\{\textsf{Refl}_{T}\} = t\).
We have analogues of CoverTT terms using substitution. For \(T \in \textsf{Ty}({\varGamma })\) and \(s,t \in \textsf{Tm} ({T})\):
  • \(\textsf{Id}(T,s,t) := \textsf{Id}(T)\{ \langle \langle id, s\rangle , t\rangle \}\in \textsf{Tm} ({\varGamma })\);
  • \(\textsf{Refl}_{T}(t) := \textsf{v}\{\textsf{Refl}_{T}\circ \overline{t}\} \in \textsf{Tm} ({\textsf{Id}(T)\{\textsf{p}\circ \textsf{Refl}_{T} \circ \overline{t}\}}) = \textsf{Tm} ({\textsf{Id}(T,t,t)})\).
We say that a CwF supports extensional equality when \(\textsf{Tm} ({\textsf{Id}(T,s,t)}) \ne \emptyset \) iff \(s = t\), such as in presheaf or set-theoretic models. Note that this condition does not require CoverTT to soundly model an equality reflection rule. Intensional equality in CoverTT can be modelled extensionally, so long as one does not augment CoverTT with axioms like univalence, which are inconsistent with equality reflection. We focus on extensional models, discussing alternatives in Section 7.2.
Labelled Variants Next we define what it means for a category to support labelled variants, so that we can interpret datatypes as coproducts of their constructor types. Labelled variants are simply coproducts with extra structure to mediate the type-context relationship. Assume:
  • a fixed set of type constructors \(\textsc {TyCon}\);
  • for each \(I \in \textsc {TyCon}\), a set of data constructors \(\textsc {DataCon}_{I}\);
  • for each \(I \in \textsc {TyCon}\), an object \(\textsf{Params}_{I} \in \mathcal {C}\) specifying types of arguments to the type constructor;
  • for each \(D^{I} \in \textsc {DataCon}_{I}\) a type \(\textsf{Fields}_{D^{I}} \in \textsf{Ty}({\textsf{Params}_{I}}) \) specifying the types of arguments to the data constructor.
The last condition will usually rely on having some sort of dependent pair to encode multiple fields. Then a CwF supports the labelled variants for I if there exists some \(\textsf{TyCon}_{I} \in \textsf{Ty}({\textsf{Params}_{I}})\), such that for each \(\theta : \varGamma \rightarrow \textsf{Params}_{I}\), there exists an isomorphism
That is, projecting out the context of from the fields’ type is the same as converting into \(\textsf{TyCon}_{I}\) with \(\iota \) and then projecting the context. Then, for each \(t \in \textsf{Tm} ({\textsf{Fields}_{D^{I}_{i}}\{\theta \}})\), the arrow \( \iota ^{-1} \circ \textsf{inj}_{i} \circ \overline{t}\) is a section of \(\textsf{p}_{\textsf{TyCon}_{I}\{\theta \}}\), and hence denotes a term in \(\textsf{Tm} ({\textsf{TyCon}_{I}\{\theta \}})\). We call this term \(\textsf{DataCon}_{D^{I}_{i}}(t)\), since it denotes the ith data constructor applied to field values t.

3.4 Pattern Matching

Here we give a semantic presentation of CoverTT-style pattern matching. In the CwF framework, we can follow the definition of CoverTT. This serves as a statement of what we need to define pattern matching semantically. We show how to fulfill those requirements in Section 4.
Consider a semantic coverage relation https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figk_HTML.gif whose elements are sets containing arrows into \(\varDelta \). We say \(\mathcal {C}\) supports matching over the semantic coverage https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figl_HTML.gif if, for every, \(\varDelta \in \mathcal {C}\), \(T \in \textsf{Ty}({\varDelta })\), index set \(\mathcal {I}\), covering https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figm_HTML.gif for \(i \in \mathcal {I}\), branch results \(t_{i} \in \textsf{Tm} ({T\{\theta _{i}\}}) \) for \(i \in \mathcal {I}\), and scrutinee \(\theta : \varGamma \rightarrow \varDelta \), there exists a term https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Fign_HTML.gif such that:
  • for any \(\sigma _{i} : \varGamma \rightarrow \varDelta _{i} \), if \(\theta = \theta _{i} \circ \sigma _{i}\), then https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figo_HTML.gif ;
  • The above choice commutes with substitution, i.e., for \(\theta _{1} : \varGamma _{1} \rightarrow \varGamma _{2}\) and \(\theta _{2} : \varGamma _{2} \rightarrow \varDelta \), we have https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figp_HTML.gif ;
To see this concretely, consider the \(\textsf{head}\) function from Section 2.3 in the CwF structure of CoverTT. The scrutinee type \(\varDelta \) is \(\cdot , (A : \mathcal {U} ), (n : \mathbb {N}), (x : \textsf{Vec}\ A\ n)\). The cover is the singleton https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figq_HTML.gif , where syntactic extension https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figr_HTML.gif corresponds to \(\langle \_, \_\rangle \). This pattern corresponds to \(\theta _{1}\). The pattern context \(\varDelta _{1}:= \cdot , (m : \mathbb {N}), (h : A), (t : \textsf{Vec}\ A\ m)\), so the pattern in the cover corresponds to an arrow \(\cdot , (m : \mathbb {N}), (h : A), (t : \textsf{Vec}\ A\ m) \rightarrow \cdot ,(A : \mathcal {U} ), (n : \mathbb {N}), (x : \textsf{Vec}\ A\ n)\). There is one branch, whose result \(t_{1}\) is h : A (which matches the overall result because the result type is not dependent). Finally, the scrutinees are the variables https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figs_HTML.gif , which have context-type \(\varDelta \) in the empty context.
The pattern matching condition says that, if https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figt_HTML.gif is supported by a model of \(\textsc {CoverTT} \), and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figu_HTML.gif , then there exists a term \(\textsf{match}_{\varDelta _{1}, \theta _{1}, t_{1}}\) such that, for any mht, applying the substitution yields t, i.e. https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figv_HTML.gif . In the case of our term model, this is given by the pattern match:
However, in other models, there may not be an obvious way to form \(\textsf{match}_{\varDelta _{1}, \theta _{1}, t_{1}}\). We provide a general way of forming such a term in Section 4.

3.5 Model compatibility and soundness

The correspondence between the type formers we have introduced and the constructs of CoverTT is fairly direct, but we must account for some technical details to make the connection formal.
Not every possible syntactic cover of CoverTT leads to a non-trivial model. For example, if https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figx_HTML.gif , then we can write an inhabitant for the empty type:
If we have a CwF structure on a category \(\mathcal {C}\) supporting functions, labelled variants, and pattern matching in the sense of Section 3.4 with a coverage relation https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figz_HTML.gif , then we can soundly model CoverTT in \(\mathcal {C}\) so long as the syntactic https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figaa_HTML.gif is compatible with the semantic https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figab_HTML.gif .
In Fig. 4 we define partial translations \(\llbracket \varGamma \rrbracket \in \mathcal {C}\), \(\llbracket \varGamma \vdash T \rrbracket \in \textsf{Ty}({\varGamma })\), \(\llbracket \varGamma \vdash t : T \rrbracket \in \textsf{Tm}_{\varGamma }({T})\). We omit the cases other than pattern matching, as they are standard. The translation for pattern matches checks if the patterns translate to a semantic cover, which is well founded because each pattern is syntactically smaller than the entire match.
Fig. 4.
Model of Pattern Matching in CoverTT
We have the following soundness result, characterizing how the syntactic coverage must correspond to a semantic coverage supporting pattern matching.
Theorem 1 (soundness)
If every CoverTT syntactic cover https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figac_HTML.gif has \(\llbracket \varDelta _{i} \rrbracket \) and \(\llbracket \textbf{s}_{i} \rrbracket \) defined, and https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figad_HTML.gif , then \(\llbracket \_ \rrbracket \) is a total mapping on terms/types/environments/contexts that are well typed with respect to https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figae_HTML.gif . Moreover, the model is sound with respect to definitional equality.
Proof
If all covers are compatible, then straightforward induction shows that the undefined case never arises on well-typed terms. The argument follows the standard CwF model of type theory, except for pattern matching, where the equations from Section 3.4 directly satisfy EqMatch.

4 Coverages and Sheaves to Model CoverTT

Theorem 1 lists conditions that ensure we can soundly interpret CoverTT. What categories and coverages can we find that fulfill our criteria?
In this section we connect some core concepts of sheaf theory to pattern matching. Specifically, we provide a sufficient condition for when a coverage on a category \(\mathcal {C}\) admits semantic pattern matching as in Section 3.4.

4.1 Coverages and Sheaves

Alongside our contribution we provide a brief introduction to sheaves and sites for completeness. Systematic overviews of sheaves are given, for example, by Johnstone [15, C2] or MacLane and Moerdijk [17].
Coverages and Sites
A sheaf-theoretic coverage on a category \(\mathcal {C}\) is, for each \(\varDelta \in \mathcal {C}\), a set of subsets of https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figag_HTML.gif , called covers, which fulfill the following closure condition [15, C2.1.1]: For each cover \({\{f_{i} : \varGamma _{i} \rightarrow \varDelta \}_{i \in 1 \ldots n}}\), and other morphism \(g : \varXi \rightarrow \varDelta \) there exists a \(\varXi \)-cover \({\{h_{j} : \varXi _{j} \rightarrow \varXi \}_{j \in 1 \ldots m}}\) such that, https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figah_HTML.gif an \(f_{i}\) such \(g \circ h_{j}\) lifts along \(f_{i}\). That is, there exists a \(k_{j}\) making the diagram to the right commute.
This condition is weaker than requiring covers to be closed under pullback by any arrow. In particular, we do not require \(\mathcal {C}\) to have all pullbacks. If \(\mathcal {C}\) does have all pullbacks, one can saturate the coverage so that the property of being a cover is preserved under pullbacks.
A sheaf-theoretic coverage on \(\mathcal {C}\) gives a coverage J for each object \(\varDelta \in \mathcal {C}\). Arrows in a J-cover share a common codomain, so it is clear to which object a cover belongs. A site \((\mathcal {C},J)\) is a category \(\mathcal {C}\) equipped with a coverage J.
We refer to “sheaf theoretic coverages” specifically to distinguish them from coverages in the sense of Sections 2 and 3, i.e. the sets of patterns that we allow. Both denote the sets of morphisms/patterns that cover a given context, but we don’t require pattern coverages to fulfill the sheaf-theoretic closure conditions. In Section 7.2 we show that a constructive syntactic model cannot fulfill them.
Sheaves The next conceptual tools we need are those of a presheaf and a sheaf. A presheaf P is a functor \(P : \mathcal {C}^{\textsf {op}} \rightarrow \textbf{Set}\). Sheaf theorists think of presheaves as abstract collection of functions/terms, with \(P\varDelta \in \textbf{Set}\) being the set of functions out of \(\varDelta \)/terms in context \(\varDelta \). A sheaf is a collection that ‘thinks’ all pattern matches over every cover uniquely defines a term. Formulating sheaves precisely involves multiple nested quantifiers, and we break it down in stages.
For any presheaf P and cover \(\overline{\theta _{i} : \varDelta _{i} \rightarrow \varDelta } \in J\), a matching family is a collection \(x_{i} \in P(\varDelta _{i})\) such that for every \(\sigma : \varXi \rightarrow \varDelta _{i}\) and \(\sigma ' : \varXi \rightarrow \varDelta _{j}\), if \(\theta _{i} \circ \sigma = \theta _{j} \circ \sigma '\), then \(P(\sigma )(x_{i}) = P(\sigma ')(x_{j})\). I.e., a matching family assigns a P value for all arrows in a cover, while agreeing in overlapping cases.
An amalgamation of a matching family https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figai_HTML.gif over \(\overline{\theta _{i} : \varDelta _{i} \rightarrow \varDelta } \in J\) is an \(x \in P\varDelta \) such that \(P(\theta _{i})(x) = x_{i}\), i.e., it is a value in the covered object that is compatible with the matching family.
A sheaf on \((\mathcal {C},J)\) for a cover \(\overline{\theta _{i} : \varDelta _{i} \rightarrow \varDelta } \in J\) is a presheaf P such that every matching family has a unique amalgamation. A presheaf is a J-sheaf, or just a sheaf, when it is a sheaf for each cover in J. It is in this way that a J-sheaf is a presheaf that ‘thinks’ all covers admit pattern-matching.
Let \(\textbf{y}: {\mathcal {C}\rightarrow (\mathcal {C}^{\textsf {op}} \rightarrow \textbf{Set})}\) denote the Yoneda embedding that maps each \(\varGamma \in \mathcal {C}\) to the presheaf https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figaj_HTML.gif . A coverage is subcanonical when for every \(\varDelta \in \mathcal {C}\), \(\textbf{y}\varDelta \) is a sheaf. There is a largest such coverage \(J_{canonical}\)—the canonical coverage. We say a cover is canonical when it is in the canonical coverage. Every representable is a sheaf for a canonical cover, though in Section 7.2 we disprove the converse: there are models which support pattern matching, so every representable is a sheaf for each allowed pattern, but where the allowed patterns do not fulfill the necessary conditions to be a sheaf-theoretic coverage.

4.2 Pattern Matching via Sheaves

The similarity between amalgamation and pattern matching is apparent, and was informally established by Coquand [9]: since morphisms in \(\mathcal {C}\) correspond to substitutions (sequences of terms) in CoverTT, the sheaf condition gives a way to merge arrows (branches) with the same codomain (return type). However, to model dependent pattern matching, we need to handle the dependency of the branch result type on the scrutinee’s value. Thankfully, slices give us the tools to model type dependency, and sheaf theory lets us convert subcanonical coverages on a category to coverages on a slice. The key properties, which we have mechanized in Lean [11], are as follows (see, e.g. Johnstone [15, C2.2.17]):
Theorem 2
\(\mathsf {({\checkmark }\textsc {lean})}\) If \((\mathcal {C},J)\) is a subcanonical site, then for \(\varGamma \in \mathcal {C}\), the site \((\mathcal {C}/\varGamma , J_{\varGamma })\) is subcanonical, where we define \({\{f_{i} : (\varDelta _{i}, \theta _{i}) \rightarrow (\varXi , \sigma )\}_{i}} \in J_{\varGamma }\) if and only if \({\{f_{i} : \varDelta _{i} \rightarrow \varXi \}_{i}} \in J\). In particular, if \({\{f_{i} : \varDelta _{i} \rightarrow \varGamma \}_{i}}\) is canonical, then \({\{f_{i} : (\varDelta _{i}, f_{i}) \rightarrow (\varGamma , id) \}_{i}}\) is too.
These properties are related to the fundamental theorem of topos theory, which says that a slice of a sheaf category is equivalent to a category of sheaves over the slice.
We now have what we need to state and prove the main result of this section: a criterion ensuring that a coverage can model pattern matching. The following theorem has been mechanized in the Lean 4 theorem prover [11]; work is underway to mechanize the model’s soundness and the coverage building rules of Section 5.
Theorem 3
\(\mathsf {({\checkmark }\textsc {lean})}\) Consider a CwF \(\mathcal {C}\) and, for each \(\varDelta \in \mathcal {C}\), a relation https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figak_HTML.gif where the \( \theta _{i}\) are disjoint monomorphisms into \(\varDelta \). If all covers in https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figal_HTML.gif are canonical, then \(\mathcal {C}\) supports pattern matching (in the sense of Section 3.4).
Proof
Let \((C, J_{canonical})\) be a canonical site with a CwF structure and a relation https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figam_HTML.gif . Consider a scrutinee type \(\varDelta : \mathcal {C}\), dependent result type \(T \in \textsf{Ty}({\varDelta })\), canonical cover https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figan_HTML.gif of non-overlapping monos, branch results \(t_{i} \in \textsf{Tm} ({T\{\theta _{i}\}})\), and scrutinee \(\theta : \varGamma \rightarrow \varDelta \). To construct https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figao_HTML.gif , we build \(t'_{match} \in \textsf{Tm}_{\varDelta }({T\{id\}})\) with which we compose the scrutinee \(\theta \). We will show how the sheaf condition corresponds to the pattern match.
Matches as Arrows Recall from Lemma 1 that there is a \(\textbf{Set}\)-isomorphism https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figap_HTML.gif . To find a term in \(\textsf{Tm}_{\varDelta }({T\{id\}})\), we use an arrow in https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figaq_HTML.gif .
Pattern Sets as Slice Covers The patterns \({\{{\theta _{i} : \varDelta _{i} \rightarrow \varDelta }\}_{i}}\) correspond to a canonical \(\mathcal {C}\)-cover by our premise, so by Thm. 2 the cover \({\{\theta '_{i} : (\varDelta _{i}, \theta _{i}) \rightarrow (\varDelta , id) \}_{i}}\) is canonical in \(\mathcal {C}/\varDelta \).
Branches as Matching Families The branch results of the pattern match form a matching family for \(\textbf{y}((\varGamma \triangleright T, \textsf{p}))\). Our branches are https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figar_HTML.gif . By Lem. 1, this family yields a sequence https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figas_HTML.gif , i.e., https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figat_HTML.gif , which is a matching family for the presheaf \(\textbf{y}(\varDelta \triangleright T, \textsf{p})\) and the cover \({\{{\theta _{i} : (\varDelta _{i}, \theta _{i}) \rightarrow (\varDelta , id)}\}_{i}}\).
Amalgamating Branches Because the cover is canonical, then \(\textbf{y}(\varDelta \triangleright T, \textsf{p})\) is a sheaf for it. The sheaf condition states that the above matching family has an amalgamation \(x \in \textbf{y}(\varDelta \triangleright T, \textsf{p})(\varDelta ,id)\), such that \(\theta _{i} \circ x = x_{i}\). So Lem. 1 yields a term \(t'_{match} \in \textsf{Tm} ({T\{id\}})\) such that \(t'_{match}\{\theta _{i}\} = t_{i}\).
Equations and the Scrutinee Finally, given a scrutinee \(\theta : \varGamma \rightarrow \varDelta \), we choose \(t'_{match}\{\theta \}\) as https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figau_HTML.gif . It is in \(\textsf{Tm}_{\varGamma }({T\{\theta \}})\), so it has the correct type. It satisfies the requisite equations. Indeed, since \(t'_{match}\{\theta _{i}\} = t_{i}\), whenever \(\theta = \theta _{i} \circ \sigma _{i}\) for some \(\sigma _{i}\), we have \( {t'_{match}\{\theta \}} = t'_{match}\{\theta _{i} \circ \sigma _{i}\} = (t'_{match}\{\theta _{i}\})\{\sigma _{i}\} = t_{i}\{\sigma _{i}\} \) just as Section 3.4 requires. For substitution, given \(\theta = \theta _{2} \circ \theta _{1}\), we have https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figav_HTML.gif .
The above construction lets us model pattern matching for any canonical cover, where the motive type corresponds to a representable sheaf. If J is subcanonical, then every representable is a sheaf, so we can define dependent pattern matching for any motive type. Moreover, the canonical coverage contains the covers from every subcanonical coverage, so it suffices that each allowed pattern set is a canonical cover. The disjointness and injectivity conditions ensure that the branches of a match follow the sheaf-theoretic definition of a matching family. One could instead require that branches agree on their overlap [7], which is suited to matching on real numbers [25].
We conclude this section by recalling a property characterizing subcanonical covers [15, C2.1.11], which we will utilize in Section 5.
Theorem 4
A set of arrows \({\{\theta _{i} : U_{i} \rightarrow U\}_{i}}\) is canonical for \(\mathcal {C}\) if and only if, for every \(\sigma : V \rightarrow U\) and every object \(T \in \mathcal {C}\), the presheaf \(\textbf{y}(T)\) is a sheaf for the pulled back family \({\{\sigma ^{*} \theta _{i} : \sigma ^{*} U_{i} \rightarrow V \}_{i}}\).
We can use this theorem to form basic subcanonical coverages. To build new coverages from old ones, we employ saturation conditions: operations on coverages which do not change which presheaves are sheaves for that coverage. This is of interest to us because the canonical coverage is invariant under every saturation: it is already the largest possible subcanonical coverage, so no covers can be added without changing its notion of sheaf. The next section gives several examples of useful saturation conditions.

5 Tools for Building Coverages

In this section, we take the abstract canonicity condition and derive concrete rules for forming canonical covers. This section justifies the idea that CoverTT can begin with a basic set of covers for each type and obtain a language of patterns by allowing nesting, variables, and multiple scrutinees. We provide base coverages, along with composition rules for building complex coverages from simpler ones. This recreates commonly supported features of dependent pattern matching: variables, constructors for labelled variants, pruning absurd branches, and matching on \(\texttt{refl}\) with inaccessible (dot) patterns. In Section 7.2 we discuss potential novel coverages that can be supported using coverage semantics.

5.1 Identity and Isomorphism

The most basic canonical covers are singletons consisting of an isomorphism. If two contexts are isomorphic, then moving from one to the other covers all cases.
Lemma 2
A presheaf is a sheaf for a singleton cover containing an isomorphism \(\iota : \varGamma \cong \varDelta \), so every isomorphism is a canonical singleton cover.1
As a consequence, identity arrows are in singleton canonical covers:
Corollary 1
The singleton cover \({\{id : \varGamma \rightarrow \varGamma \}}\) is canonical for any category, so a pattern consisting entirely of variables \(x_{1}, x_{2}, \ldots x_{n}\) can be safely included in any coverage for CoverTT.
Supporting identity arrows is the bare minimum we need for pattern matching. They are the base case out of which other patterns are built, where no discrimination or computation happens at all. Likewise, a catch-all pattern, commonly written as an underscore ‘\(\_\)’, is just an unnamed variable that does not occur in the right-hand side of the branch.
More generally, the canonical coverage contains all isomorphisms. So we can devise a sound semantics for CoverTT where any isomorphism is a valid cover of a type. This allows for operations such as rearranging variables in a dependency-respecting way or re-bracketing nested sums and products.
Functions written by the programmer can even be used as patterns if they are isomorphisms, opening the door for user-defined views into a type. Of course, the existence of a sound semantics does not guarantee a language we can actually implement. Checking whether a term is a definitional isomorphism is undecidable without being explicitly given its inverse. Moreover, depending on how extensional the model is, there may be terms that are isomorphisms in the model, but are not definitional isomorphisms in CoverTT.

5.2 Coproducts and Wadler Views

With variables as patterns, the next primitive patterns we need are constructors for a datatype. As we saw in Section 3.3, one way to model datatypes is with labelled variants. So if \(\mathcal {C}\) has coproducts, we can model datatype constructors as injections into a coproduct context, and we can amalgamate branches that match on all the constructors of a datatype using the universal property of a coproduct.
Labelled variants are only coproducts up to isomorphism, but we have seen that isomorphisms are always singleton covers, and below in Section 5.3 we see that composition preserves canonical covers. So it suffices to consider coproducts directly. Unfortunately, in an arbitrary category, coproduct injections are not guaranteed to form a canonical cover. Thm. 4 requires each representable to be a sheaf for the pullback of every cover, so we need coproducts to be stable under pullback, i.e. the pullback of a coproduct is the coproduct of pullbacks. Thankfully, pullback stability of coproducts holds if \(\mathcal {C}\) is \(\textbf{Set}\), a presheaf category, a topos, or any other locally cartesian closed category (LCCC). We already want \(\mathcal {C}\) to be LCCC in order to support dependent functions.
Theorem 5
Suppose \(\mathcal {C}\) has all pullbacks and that coproducts are disjoint and stable under pullback. Then \({\{ \textsf{inj}_{j} : \varDelta _{i} \rightarrow \coprod _{i \in I}\varDelta _{i} \}_{j}}\) canonically cover \(\coprod _{i \in I}\varDelta _{i}\).
The immediate result of this theorem is that for every inductive type, the constructors are covering for that type, so long as the inductive type is modelled as the labelled variants of its constructor types. However, it is important to realize that this theorem applies for any decomposition of a type into the coproduct of other types, regardless of whether the injections correspond to constructors or not. Such a coverage introduces the possibility for views as introduced by Wadler [27], which act as first class pattern synonyms:
Corollary 2
Consider a finite \(I : \mathcal {U} \), a family \(S : I \rightarrow \mathcal {U} \) and an indexed function \(f : (i : I) \rightarrow S\ i \rightarrow T\) in CoverTT. Suppose we have a category \(\mathcal {C}\) with a CwF model of CoverTT supporting pattern matching as in Section 3.4 over a coverage https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figaw_HTML.gif . If https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figax_HTML.gif , then there is also a model of CoverTT with coverage https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figay_HTML.gif
That is, if T is isomorphic to the sum of some types \(S\ {i}\) over finite i, we can safely match on a value from T, where the ith pattern is the \(S\ i\) value that it corresponds to, regardless of whether T is defined as an inductive type or the \(S\ i\) are its constructor types.

5.3 Nesting and Composition

With variables and constructors as primitive covers, we now need a way to combine them to build more complex covers. Adding the composition of different coverages does not change their sheaves. Every cover has the same sheaves as the sieve it generates, i.e., the closure of the cover under precomposition with any arrow in \(\mathcal {C}\) [15, C2.1.3]. So canonical covers are closed under composition:
Theorem 6
For a cover J of \(\mathcal {C}\), If \({\{f_{i} : \varDelta _{i} \rightarrow \varDelta \}_{i}} \in J\), and for each i, we have \({\{g_{ij} : \varDelta _{ij} \rightarrow \varDelta _{i}\}_{ij}} \in J\), then the sheaves of \((\mathcal {C},J)\) are identical to the sheaves of \((\mathcal {C}, J \cup {\{f_{i} \circ g_{ij} : \varDelta _{ij} \rightarrow \varDelta \}_{i}})\). So if \({\{f_{i} : \varDelta _{i} \rightarrow \varDelta \}_{i}}\) is canonical, and for each i, \({\{ g_{ij} : \varDelta _{ij} \rightarrow \varDelta _{i}\}_{ij}} \) is canonical, then \( {\{f_{i} \circ g_{ij} : \varDelta _{ij} \rightarrow \varDelta \}_{ij}}\) is canonical.
This allows patterns to be nested: if \({\{f_{i}\}_{i}}\) are covering for \(\varDelta \), and for each set of variables in those covering patterns, \({\{g_{ij}\}_{ij}}\) is covering, then we can case split each variable in the \(f_{i}\) into j cases corresponding to the \(g_{ij}\) patterns, and the entire resulting set can still be covering. This property gives semantic justification for the case-split operation of the Agda and Idris editor modes, where the programmer selects a variable in a pattern match, and the pattern containing the variable is replaced by the sequence of patterns that has each possible constructor application in place of the variable.
When we combine the closure of the canonical coverage under identity (isomorphisms), sum injections, and composition, we can recreate dependent pattern matching on non-indexed datatypes. However, we see in the next sections that the language of coverages also gives us the tools to handle indexing.

5.4 Pruning Absurd Contexts

If we allow ourselves some extensionality, then we can use sheaves to model absurd branches and empty cases. Suppose that \(\mathcal {C}\) has an initial object \(\mathbb {0}\), with a unique arrow \(0_{\varGamma } : \mathbb {0}\rightarrow \varGamma \) for every \(\varGamma \). The initial context denotes an empty or absurd context, since we can derive a term of any type from it. It turns out that we do not ever need to include branches for patterns whose contexts are empty. If we can amalgamate for a cover where one pattern has an empty context, we can amalgamate for the same cover with that arrow deleted, since any matching family for the smaller cover can be turned into one for the larger cover by adding the unique arrow out of the initial context.
Theorem 7
Let c be a canonical cover with \(\theta : \varDelta \rightarrow \varGamma \in c\). If there exists an arrow \(\iota : \varDelta \rightarrow \mathbb {0}\), then \(c \setminus {\{ \theta \}}\) is canonical.
This property mirrors how Agda and Idris allow for the omission of empty cases. In some cases, these languages only allow branch right-hand sides to be removed after the programmer specifies an empty pattern, marking which part of the scrutinee has an impossible type. We view this empty pattern as a syntactic aid to tell the type checker when a context is isomorphic to the initial context, so we do not directly model the empty pattern.
Like our assumption about equality, the condition of having a (strong) initial object does not hold in the term model, since not all eliminations of the empty type are definitionally equal. So long as CoverTT does not contain any axioms that specifically distinguish empty eliminations, our model is still sound.

5.5 Propositional Equality

Since isomorphisms are always canonical singleton covers (Lem. 2), we can create a coverage for a sufficiently-extensional equality type.
Corollary 3
If \( { \langle \textsf{v}, \langle \textsf{v}, \textsf{Refl}_{A}\rangle \rangle : \varGamma \triangleright A \rightarrow \varGamma \triangleright A\triangleright A\{\textsf{p}\} \triangleright \textsf{Id}(A\{\textsf{p}^{2}\},\textsf{v},\textsf{v}\{\textsf{p}\}) }\) is an isomorphism, then \({\{\langle \textsf{v}, \langle \textsf{v}, \textsf{Refl}_{A}\rangle \rangle \}}\) is canonical.
In more readable, non CwF notation: if A is isomorphic to https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figaz_HTML.gif in the model via the projections, then \({\{ (x,x,{\texttt{refl}_{x}}) \}}\) can be a singleton cover for https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figba_HTML.gif . Such an isomorphism holds if \(\mathcal {C}\) has extensional equality, since it asserts that there is a unique, internally constructible proof of equality between two equal terms.
For a dependent match targeting \({(x \!:\! A),(y\!:\!A), (pf \!:\! x \!=_{A}\! y) \vdash P(x,y,pf) \!:\! \mathcal {U} }\), the above cover only requires we provide a branch result with type \(P(x,x,\textsf{Refl}_{A})\). The variable y was replaced by x in the goal type. This captures “inaccessible” or “forced” patterns [22], known as dot-patterns in Agda and Idris. By matching on the propositional equality, we work with refined information about the context. Here, we match y against the variable x rather than a constructor. No branching or discrimination happening, since the cover is a singleton. Rather, x is the only possible value for y given the equality proof. Agda writes .x to express this pattern. Section 5.6 extends this to equality proof between arbitrary terms, rather than variables.

5.6 Pullbacks and Unification

We have seen that the canonical coverage includes isomorphisms and injections and that it allows for composition. However, the definition of a sheaf-theoretic coverage enables stability under pullback: for any coverage, a sheaf for that coverage is still a sheaf if we add the pullback of any cover by any arrow. Closure under pullback is the key condition that separates a coverage from a set of arrows. The definition in Section 4.1 is presented in the style of Johnstone [15, C2], in a general way that does not assume the existence of pullbacks. However, when each arrow in a cover has a pullback along some morphism, we get the following saturation condition:
Theorem 8
For a cover J of \(\mathcal {C}\), if \({\{\theta _{i} : \varDelta _{i} \rightarrow \varDelta \}_{i}} \in J\), and \(g : \varGamma \rightarrow \varDelta \), where the pullback of each \(\theta _{i}\) along g exists, then \(J \cup {\{{\{g^{*}\theta _{i} : g^{*} \varDelta _{i} \rightarrow \varGamma \}_{i}}\}}\) has the same sheaves as J. So if \({\{\theta _{i} : \varDelta _{i} \rightarrow \varDelta \}_{i}} \) is canonical, then so is \( {\{g^{*}\theta _{i} : g^{*} \varDelta _{i} \rightarrow \varGamma \}_{i}} \).
This abstract property, known as stability under base change, can be exploited to build interesting covers.
Context Extension Base change lets us add new scrutinees to a pattern match. In any CwF, for \(\theta : \varGamma \rightarrow \varDelta \) and \(T \in \textsf{Ty}({\varDelta })\), pulling back by \(\textsf{p}: \varDelta \triangleright T \rightarrow \varDelta \) yields a morphism \(\langle \theta \circ \textsf{p}, \textsf{v}\rangle : \varGamma \triangleright T\{\theta \} \rightarrow \varDelta \triangleright T\) [14]. Combining this with Thm. 8 gives:
Theorem 9
For canonical \({\{ \theta _{i} : \varDelta _{i} \rightarrow \varDelta \}_{i}} \) and a type \(T \in \textsf{Ty}({\varDelta })\), there is also a canonical cover \({{\{ \langle \theta _{i} \circ \textsf{p}, \textsf{v}\rangle : \varDelta _{i} \triangleright T\{\theta _{i}\} \rightarrow \varDelta \triangleright T \}_{i}} }\).
So we can build a covering pattern for a context \(\varDelta \) and immediately obtain a covering context on \(\varDelta \triangleright T\) by appending a new variable \(\textsf{v}\in \textsf{Tm} ({T\{\theta _{i}\}})\) to each pattern in the cover. In a dependent match the new variable might have a different type in each branch: T may be indexed by variables in \(\varDelta \), but each \(\theta _{i}\) is a value for \(\varDelta \) with variables from \(\varDelta _{i}\). Further case-splitting on the newly introduced variable can be achieved using composition à la Section 5.3.
Matching on Equality Suppose that for some \(\varGamma \in \mathcal {C}\) and \(T \in \textsf{Ty}({\varGamma })\), and that: \( {\{\textsf{Refl}_{T} : \varGamma \triangleright T \rightarrow \varGamma \triangleright T \triangleright T\{\textsf{p}\} \triangleright \textsf{Id}(T\{\textsf{p}^{2}\}, \textsf{v}, \textsf{v}\{\textsf{p}\})\}} \) is canonical. As in Section 5.5, such a property holds for an extensional model. With such a cover on equality, we can apply the base change theorem, the CwF laws, and the properties of equality (Section 3.3) to obtain a cover on contexts containing equalities by matching:
Theorem 10
Suppose \(\mathcal {C}\) has all pullbacks. Consider \(\varGamma \in \mathcal {C}\) with \(T \in \textsf{Ty}({\varGamma })\), \(t_{1},t_{2} \in \textsf{Tm} ({T})\). Then pulling back \(\textsf{Refl}_{T}\) by \(\langle \langle \overline{t_{1}\{\textsf{p}\}}, t_{2}\{\textsf{p}\}\rangle , \textsf{v}\rangle \) yield a context \(\varDelta \) and an arrow \(\langle \theta , \textsf{Refl}_{T}(t_{12})\rangle \), where \(\theta : \varDelta \rightarrow \varGamma \), \(t_{12} \in \textsf{Tm} ({T\{\theta \}})\), and \(t_{1}\{\theta \} = t_{2}\{\theta \} = t_{12}\). Moreover, if \({\{\textsf{Refl}_{T}\}} \) is canonical, then so is the cover \({\{\langle \theta , \textsf{Refl}_{T}(t_{12})\rangle : \varDelta \rightarrow \varGamma \triangleright \textsf{Id}(T,t_{1},t_{2})\}} \).
For the intuition behind this, consider the pullback square to the right, translated to CoverTT-style notation for clarity. First, because the square commutes, we know that \(\theta \) is a substitution that equates \((t_{1}, t_{2}, pf)\) and \((x, x, {\texttt{refl}_{x}})\), i.e., it is a unifier of \(t_{1}\) and \(t_{2}\). Since a pullback is a limit, it is universal, so any other unifiers for \(t_{1}\) and \(t_{2}\) necessarily factor through \(\theta \). Thus, it is the most general unifier for \(t_{1}\) and \(t_{2}\). This is precisely what the usual rule for pattern matching on equality uses: it unifies the two sides of the equality, treating syntactic variables as unification variables, and generates a substitution that is then applied to the goal. The context \(\varDelta \) consists of the variables that were in common between \(t_{1}\) and \(t_{2}\) which remain free in the unification \(t_{12}\). In the case that \(t_{1}\) and \(t_{2}\) do not unify, then the pullback is an arrow out of an initial context, and the branch can be omitted completely (because absurd covers can be omitted, as in Section 5.4).

6 Example: Folding Without a Starting Value

We now have specified everything we need (sans recursion) for feature-parity with the original presentation of dependent pattern matching by Coquand [9]. Our sheaf-centric view generalizes the elaboration process of Goguen et al. [12], but directly within the model instead of as a syntactic elaboration. Constructors for a coproduct form a cover and variables form a singleton cover, acting as the basis from which other covers are generated. Covers can be composed, extended, refined by matching on an equality, or pared down by pruning absurd branches. Indexed data types can be handled using fording and matching on equality.
To see a non-trivial example of how to build a cover in the canonical coverage, consider the \(\textsf{foldr}_{1}\) function found in the Agda standard library [10].
$$\begin{aligned} & \mathsf {\textsf{foldr}}_{1} : (A \rightarrow A \rightarrow A) \rightarrow \textsf{Vec}\,A\,(\textsf{suc}\,n) \rightarrow A &\\ &\mathsf {\textsf{foldr}}_{1}\,f\,(\textsf{cons}\,x\,\textsf{nil})\, = x \ \mid \ \mathsf {\textsf{foldr}}_{1}\,f\,(\textsf{cons}\,x\, (\textsf{cons}\,y\, ys)\, = f\,x\,(\textsf{foldr}_{1}\,f\,(\textsf{cons}\,y\,ys)) & \end{aligned}$$
Because the argument vector has length at least one, the case for \(\textsf{nil}\) can be omitted. The base case is then a vector of length one, and the inductive case is a vector of length two or more.
Assume an extensional CwF model of CoverTT in a LCCC \(\mathcal {C}\) where inductive types are labelled variants. We show how the patterns for \(\textsf{foldr}_{1}\) are in the canonical coverage for \(\mathcal {C}\), and hence \(\textsf{foldr}_{1}\) can be modelled. Note that because labelled variants are defined in terms of isomorphism, we do not preclude initial algebra semantics for modelling the self-reference part of inductive types.

6.1 Translating to CoverTT

First, we translate the function to CoverTT-style by making the length argument explicit and replacing the indexed constructors with ones taking explicit equality proofs. The datatype becomes:
$$\begin{aligned} &\texttt{data}\ \textsf{Vec}\ (A : \mathcal {U} ) : (n : \mathbb {N}) \rightarrow \mathcal {U} \ \texttt{where}\\ & \qquad \textsf{nil} : (n = 0) \rightarrow \textsf{Vec}\ A\ n\\ & \qquad \textsf{cons} : (m : \mathbb {N}) \rightarrow A \rightarrow \textsf{Vec}\ A\ m \rightarrow n = m + 1 \rightarrow \textsf{Vec}\ A\ n. \end{aligned}$$
We also abstract out the recursive calls, since we have not included them in CoverTT and have not required that our model category \(\mathcal {C}\) support them. Despite using recursion, \(\textsf{foldr}_{1}\) is an ideal example because it is not contrived, and uses all the main saturation conditions we developed in Section 5. Since \(\mathsf {foldr_{1}}\) is decreasing in the length of the lists, its recursion can be modelled with well-known techniques orthogonal to our contribution.
$$\begin{aligned} & \mathsf {\textsf{foldr}}_{1} : (n : \mathbb {N}) \rightarrow (A \rightarrow A \rightarrow A) \rightarrow \textsf{Vec}\,A\,(\textsf{suc}\,n) \\ & \qquad \qquad \rightarrow (\textsf{self} : (A \rightarrow A \rightarrow A) \rightarrow \textsf{Vec}\,A\,(\textsf{suc}\,n) \rightarrow A ) \rightarrow A &\\ &\mathsf {\textsf{foldr}}_{1}\,0\,f\,(\textsf{cons}\,0\, x\,(\textsf{nil}\,\textsf{Refl})\,\textsf{Refl})\,\textsf{self} = x &\\ &\mathsf {\textsf{foldr}}_{1}\,(m + 1)\,f\,(\textsf{cons}\,(m + 1)\,x\, (\textsf{cons}\,m\,y\, ys\,\textsf{Refl})\,\textsf{self} \\ & \qquad \qquad = f\,x\,(\textsf{self}\,f\,(\textsf{cons}\,y\,ys\,\textsf{Refl})\,\textsf{Refl})& \end{aligned}$$

6.2 Building the Coverage

Using CoverTT notation rather than CwF notation for clarity and space reasons, we now show how the rules of Section 5 can be used to build a cover:
  • Identity (Cor. 1) has variables \((n) (f) (v) (\textsf{self})\) covering the scrutinee type;
  • Coproduct (Cor. 2) has https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figbd_HTML.gif for any q;
  • Composition (Thm. 6) allows us to construct the canonical cover \({\{ (n\,f\,(\textsf{nil}\,eq_{nil})\,\textsf{self}), (n\,f\,(\textsf{cons}\,m'\,x\,xs\,eq_{cons})\,\textsf{self}) \}}\);
  • We have \(eq_{nil} : n + 1 = 0\), but this type is empty, so applying the absurd rule (Thm. 7) gives a singleton cover \({\{ (n\,f\,(\textsf{cons}\,m'\,x\,xs\,eq_{cons})\,\textsf{self}) \}}\);
  • \(eq_{cons} : n + 1 = m' + 1\), so we get a pullback substitution mapping \(m'\) to n and all other variables to themselves. Applying the \(\textsf{Refl}\) rule (Cor. 3), \(n\,n\,\textsf{Refl}\) is a cover of \({(m' : \mathbb {N}) \triangleright (n : \mathbb {N}) \triangleright (n+1 = m'+1) }\). By composition, the scrutinee context has singleton cover \({\{ (n\,f\,(\textsf{cons}\,n\,x\,xs\,\textsf{Refl})\,\textsf{self}) \}}\) in the canonical coverage;
  • The coproduct property for \(\textsf{Vec}\) and composition yield a cover \({\{ (n\,f\,(\textsf{cons}\,n\,x\,(\textsf{nil}\,eq_{nil})\,\textsf{Refl})\,\textsf{self}), (n\,f\,(\textsf{cons}\,n\,x\,(\textsf{cons}\,m\,y\,ys\,eq_{cons})\,\textsf{Refl})\,\textsf{self}) \}}\);
  • Finally, since \(eq_{nil} : n = 0\) and \(eq_{cons} : n = m + 1\), we apply the \(\textsf{Refl}\) rule for each proof, along with composition, to obtain the desired cover above.
Then, we can use the sheaf condition model how \(f\ x\ (\textsf{self}\ f\ (cons\ y\ ys\ \textsf{Refl})\ \textsf{Refl})\) and x are amalgamated into a denotation for the entire function.

7 Discussion

Dependent pattern matching was first proposed by Coquand [9]. While this work contains no explicit mentions of sheaf theory, it originated the idea that patterns could be thought of in terms of coverings and partitions of a space, which greatly inspired our work. The theory and practice of both pattern matching and eliminators foundational developments in proof assistants: McBride [19] developed elimination for Lego, and later Epigram [20]. This was extended to views and with-clauses by Mcbride and Mckinna [21].
Goguen et al. [12] show how pattern matching can be elaborated to primitive eliminators, and hence given semantics in any model that had semantics for eliminators. These are in turn given semantics using initial algebras [1, 2]. Cockx et al. [6] extend this to work with univalent theories. Elaboration is similar to amalgamation using the sheaf condition, but amalgamation occurs strictly in the model. Elaborating to eliminators also handles recursion, which is not yet explicitly included in our sheaf semantics.
To our knowledge, the first explicit connection between the sheaves and pattern matching was by Sherman et al. [25], which gave a framework for pattern-matching on real numbers using topological spaces. The thesis version of this work [24] generalizes the approach from topological spaces to Grothendieck topologies. Cockx et al. [7] give similar semantics to overlapping patterns by treating them as definitional equalities, using confluence rather than sheaves. Using equalizers or pullbacks to represent unification was originated by Rydeheard and Burstall [23], as well as Goguen [13].

7.2 Future Work

First-Class Pattern Synonyms An immediate application of this work would be to implement an enhanced version of pattern synonyms in a language like Agda. Currently, Agda lets the programmer declare pattern synonyms, but each name must map to a syntactic pattern i.e., a set of nested constructor applications. Agda checks if a definition is covering by elaborating to these patterns. Our framework could be used to build direct coverage checking for pattern synonyms, so the programmer could build their own alternate, extensible covers of a type, using sheaf theory to justify their coverage. This would provide direct semantics for the user-defined views of Wadler [27] and Mcbride and Mckinna [21]. Further research is needed to extract a constructive procedure for amalgamating branches that can be implemented in practice.
Overlapping Pattern Matches Our semantics require non-overlapping, injective patterns, but our framework suggests a way to lift this restriction. Recall that the sheaf condition only requires that a matching family agrees on the overlap between covering patterns. This suggests two ways to give semantics to overlapping patterns: by ensuring that the right-hand sides of each pattern match agree on the overlap of their left-hand sides, or by adding information to each pattern to ensure they are actually non-overlapping.
The latter approach matches current implementations: catch-all patterns are elaborated into multiple branches whose left-hand patterns are the constructors that have not yet been used. Unfortunately, to prove anything about a function defined this way, the programmer needs a proof case for each branch in the elaboration, even if they correspond to a single branch in the function as written. Our framework may support canonical covers which contain extra information preventing overlap, such as proofs that previous branches had not matched. These could be used to develop covers for matches with overlapping patterns that do not require creating additional cases during elaboration, enabling more succinct proofs about overlapping cases.
Inductive Datatypes and Termination Checking When patterns are not restricted to constructors, it is not immediately apparent which recursive pattern matching functions can be soundly modelled, since pattern variables may not be structurally smaller than the patterns in which they occur. Further study is needed to devise criteria for which recursive definitions are well founded with non-constructor patterns.
Beyond Top-Level Matches Our semantics only support top level pattern matches. Many of the results we used, such as the fundamental theorem of topos theory, are well suited to top-level matches but do not directly translate to terms in an arbitrary context. Additionally, if the scrutinee type of a pattern match is in a non-empty context, then matching affects not only the motive, but may refine the values or types of variables in the context on the left. These technical issues suggest a semantic theory of telescopes, which are objects representing extensions to a given context by some number of types, and environments, which extend substitutions by some number of terms.
With Clauses Our approach to matching on equality proofs gives an intuition for modelling Agda-style with-clauses and views [21], though a full account is beyond the scope of this paper. Suppose we are defining a pattern match with scrutinees of type \(\varDelta \) and result type \(\varDelta \vdash T : \mathcal {U} \), and we want to match on some intermediate expression \(\varDelta \vdash s : S\). There is an isomorphism https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figbe_HTML.gif . So if we have a cover of \({\{s_{i} : \varDelta _{i} \rightarrow \varDelta \}_{i}} \) to match s against, we can use composition and extension to obtain a cover \({\{s_{i} : \varDelta , (pf : s_{i} = s) \rightarrow \varDelta \}_{i}}\). In the case that \(s_{i}\) and s unify, the \(\textsf{Refl}\) rule from above can be used to match on the equality, and in an extensional model, the goal type can be safely rewritten due to the existence of the equality proof.
Intensional Models Our current approach relies on extensional models, where equality proofs correspond with equality in the model. We can define models of CoverTT in terms of canonical coverages and non-syntactic equality, and we can give a CwF term model for CoverTT because syntactic pattern matching fulfills the criteria of Section 3.4, but the term model for CoverTT cannot be described in terms of sheaf-theoretic coverages.
To show the issue, we show that \({\{ \textsf{true},\textsf{false} \}}\) is not a canonical cover of \(\textsf{Bool}\) for the CwF given by well-typed \(\textsc {CoverTT} \) terms quotiented by definitional equality. Consider a function \(\textsf{haltsInN} : \textsf{SyntaxTree} \rightarrow \mathbb {N}\rightarrow \textsf{Bool}\), that looks at a syntax tree of an untyped lambda calculus term and checks whether it halts in n or fewer steps. Consider also \(\varOmega : \textsf{SyntaxTree}\), a representation of https://static-content.springer.com/image/chp%3A10.1007%2F978-3-031-91118-7_11/MediaObjects/648518_1_En_11_Figbf_HTML.gif . If \({\{ \textsf{true},\textsf{false} \}}\) is a canonical cover, it is also a sieve in the canonical topology [15, C2.1.8], so pulling the sieve back by \(\textsf{haltsInN}\ \varOmega \) produces the set of arrows \({\{ h_{j} \circ !_{V_{j}} : V_{j} \rightarrow \mathbb {N}\mid h_{j} : \mathbb {1}\rightarrow \mathbb {N} \}}\). This contains each arrow in \(\mathbb {1}\rightarrow \mathbb {N}\), i.e., each natural number. For the cover to be canonical, for any type T there must be a way to amalgamate \({\{ t_{j} : T \mid h_{j} : \mathbb {1}\rightarrow \mathbb {N} \}}\) into \(\mathbb {N}\rightarrow T\). Then all set-theoretic infinite sequences of natural numbers could be amalgamated into type-theoretic functions \(\mathbb {N}\rightarrow \mathbb {N}\), which is impossible.
The above example relies on the existence of an infinite cover. While infinite covers are allowed in sheaf theory, they do not correspond directly to pattern matches that a programmer can write down. So further exploration of finite and infinite covers may resolve the issue.
Another issue is that extensional models typically imply that all equality proofs of a given type are equal. As such our approach is incompatible with univalent theories like Homotopy or Cubical Type Theory [8, 26]. Both of these issues might be addressed by replacing sheaves with stacks or, even more generally, \(\infty \)-stacks [16]. These replace the strict equality of the sheaf condition with higher structure. However, the technical and theoretical overhead of switching to stacks is considerable, and utilizing them for pattern matching will be a significant undertaking.
Toposes and Quasi-toposes Apart from pattern matching, the theory of sheaves plays a central role in categorical logic, since categories of sheaves over a coverage form Grothendieck toposes, which serve as models of constructive logic. Quasi-toposes relax the sheaf conditions to only require uniqueness of amalgamations. Future work should search for deeper connections to toposes or quasi-toposes. Two-level type theories [3] may yield some answers, since they describe the interactions between a model of a type theory and the category of presheaves over that model.

7.3 Conclusion

This work formalizes the connection between dependent pattern matching and the notion of sheaves over a site. We have provided a framework which is expressive enough to capture the semantics of current pattern matching implementations, while laying the groundwork for future enhancements. Our work demonstrates that elaboration to eliminators is not the only feasible semantics for dependent pattern matching, and that there is perspective to be gained from treating pattern matching as a core feature and using the lens of sheaf theory.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://​creativecommons.​org/​licenses/​by/​4.​0/​), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Fußnoten
1
In a category with pullbacks, sheaves may be defined for Grothendieck pretopologies, in which all isomorphisms definitionally yield singleton covers.
 
Literatur
4.
Zurück zum Zitat Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development. Springer-Verlag (2004) Bertot, Y., Castéran, P.: Interactive Theorem Proving and Program Development. Springer-Verlag (2004)
8.
Zurück zum Zitat Cohen, C., Coquand, T., Huber, S., Mörtberg, A.: Cubical type theory: A constructive interpretation of the univalence axiom. In: Uustalu, T. (ed.) 21st Int. Conf. Types Proofs Programs TYPES 2015, Leibniz International Proceedings in Informatics (LIPIcs), vol. 69, pp. 5:1–5:34, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2018), ISBN 978-3-95977-030-9, ISSN 1868-8969, https://doi.org/10.4230/LIPIcs.TYPES.2015.5 Cohen, C., Coquand, T., Huber, S., Mörtberg, A.: Cubical type theory: A constructive interpretation of the univalence axiom. In: Uustalu, T. (ed.) 21st Int. Conf. Types Proofs Programs TYPES 2015, Leibniz International Proceedings in Informatics (LIPIcs), vol. 69, pp. 5:1–5:34, Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany (2018), ISBN 978-3-95977-030-9, ISSN 1868-8969, https://​doi.​org/​10.​4230/​LIPIcs.​TYPES.​2015.​5
9.
Zurück zum Zitat Coquand, T.: Pattern matching with dependent types. In: Informal Proc. Log. Framew., vol. 92, pp. 66–79 (1992) Coquand, T.: Pattern matching with dependent types. In: Informal Proc. Log. Framew., vol. 92, pp. 66–79 (1992)
12.
Zurück zum Zitat Goguen, H., McBride, C., McKinna, J.: Eliminating dependent pattern matching. In: Futatsugi, K., Jouannaud, J.P., Meseguer, J. (eds.) Algebra, Meaning, and Computation: Essays Dedicated to Joseph A. Goguen on the Occasion of His 65th Birthday, pp. 521–540, Springer Berlin Heidelberg, Berlin, Heidelberg (2006), ISBN 978-3-540-35464-2, https://doi.org/10.1007/11780274_27 Goguen, H., McBride, C., McKinna, J.: Eliminating dependent pattern matching. In: Futatsugi, K., Jouannaud, J.P., Meseguer, J. (eds.) Algebra, Meaning, and Computation: Essays Dedicated to Joseph A. Goguen on the Occasion of His 65th Birthday, pp. 521–540, Springer Berlin Heidelberg, Berlin, Heidelberg (2006), ISBN 978-3-540-35464-2, https://​doi.​org/​10.​1007/​11780274_​27
13.
Zurück zum Zitat Goguen, J.A.: What is unification?: A categorical view of substitution, equation and solution. In: Algebraic Techniques, pp. 217–261, Elsevier (1989) Goguen, J.A.: What is unification?: A categorical view of substitution, equation and solution. In: Algebraic Techniques, pp. 217–261, Elsevier (1989)
14.
15.
Zurück zum Zitat Johnstone, P.T.: Sketches of an Elephant: A Topos Theory Compendium. Oxford Logic Guides, Oxford University Press, Oxford, New York (Jul 2003), ISBN 978-0-19-852496-0 Johnstone, P.T.: Sketches of an Elephant: A Topos Theory Compendium. Oxford Logic Guides, Oxford University Press, Oxford, New York (Jul 2003), ISBN 978-0-19-852496-0
16.
Zurück zum Zitat Lurie, J.: Higher Topos Theory. No. no. 170 in Annals of Mathematics Studies, Princeton University Press, Princeton, N.J (2009), ISBN 978-0-691-14048-3 978-0-691-14049-0 Lurie, J.: Higher Topos Theory. No. no. 170 in Annals of Mathematics Studies, Princeton University Press, Princeton, N.J (2009), ISBN 978-0-691-14048-3 978-0-691-14049-0
17.
Zurück zum Zitat MacLane, S., Moerdijk, I.: Sheaves in Geometry and Logic: A First Introduction to Topos Theory. Springer, New York, NY, UNITED STATES (1992), ISBN 978-1-4612-0927-0 MacLane, S., Moerdijk, I.: Sheaves in Geometry and Logic: A First Introduction to Topos Theory. Springer, New York, NY, UNITED STATES (1992), ISBN 978-1-4612-0927-0
18.
Zurück zum Zitat McBride, C.: Dependently Typed Functional Programs and Their Proofs. Ph.D. thesis, University of Edinburgh, UK (2000) McBride, C.: Dependently Typed Functional Programs and Their Proofs. Ph.D. thesis, University of Edinburgh, UK (2000)
19.
Zurück zum Zitat McBride, C.: Elimination with a Motive. In: Callaghan, P., Luo, Z., McKinna, J., Pollack, R., Pollack, R. (eds.) Types Proofs Programs, pp. 197–216, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (2002), ISBN 978-3-540-45842-5, https://doi.org/10.1007/3-540-45842-5_13 McBride, C.: Elimination with a Motive. In: Callaghan, P., Luo, Z., McKinna, J., Pollack, R., Pollack, R. (eds.) Types Proofs Programs, pp. 197–216, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg (2002), ISBN 978-3-540-45842-5, https://​doi.​org/​10.​1007/​3-540-45842-5_​13
20.
23.
Zurück zum Zitat Rydeheard, D.E., Burstall, R.M.: Computational category theory. Prentice Hall International (UK) Ltd., GBR (1988), ISBN 0131627368 Rydeheard, D.E., Burstall, R.M.: Computational category theory. Prentice Hall International (UK) Ltd., GBR (1988), ISBN 0131627368
24.
Zurück zum Zitat Sherman, B.: Making Discrete Decisions Based on Continuous Values. Master of Science, Massachusetts Institute of Technology (2017) Sherman, B.: Making Discrete Decisions Based on Continuous Values. Master of Science, Massachusetts Institute of Technology (2017)
25.
Zurück zum Zitat Sherman, B., Sciarappa, L., Chlipala, A., Carbin, M.: Computable decision making on the reals and other spaces: Via partiality and nondeterminism. In: Proc. 33rd Annu. ACMIEEE Symp. Log. Comput. Sci., pp. 859–868, ACM, Oxford United Kingdom (Jul 2018), ISBN 978-1-4503-5583-4, https://doi.org/10.1145/3209108.3209193 Sherman, B., Sciarappa, L., Chlipala, A., Carbin, M.: Computable decision making on the reals and other spaces: Via partiality and nondeterminism. In: Proc. 33rd Annu. ACMIEEE Symp. Log. Comput. Sci., pp. 859–868, ACM, Oxford United Kingdom (Jul 2018), ISBN 978-1-4503-5583-4, https://​doi.​org/​10.​1145/​3209108.​3209193
27.
Zurück zum Zitat Wadler, P.: Views: A way for pattern matching to cohabit with data abstraction. In: Proc. 14th ACM SIGACT-SIGPLAN Symp. Princ. Program. Lang., pp. 307–313, POPL ’87, Association for Computing Machinery, New York, NY, USA (Oct 1987), ISBN 978-0-89791-215-0, https://doi.org/10.1145/41625.41653 Wadler, P.: Views: A way for pattern matching to cohabit with data abstraction. In: Proc. 14th ACM SIGACT-SIGPLAN Symp. Princ. Program. Lang., pp. 307–313, POPL ’87, Association for Computing Machinery, New York, NY, USA (Oct 1987), ISBN 978-0-89791-215-0, https://​doi.​org/​10.​1145/​41625.​41653
Metadaten
Titel
Coverage Semantics for Dependent Pattern Matching
verfasst von
Joseph Eremondi
Ohad Kammar
Copyright-Jahr
2025
DOI
https://doi.org/10.1007/978-3-031-91118-7_11