Top

2019 | Book

Read chapter Read first chapter

Intelligent Computer Mathematics

12th International Conference, CICM 2019, Prague, Czech Republic, July 8–12, 2019, Proceedings

Editors: Cezary Kaliszyk, Edwin Brady, Andrea Kohlhase, Claudio Sacerdoti Coen

Publisher: Springer International Publishing

Book Series : Lecture Notes in Computer Science

Part of: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

About this book

This book constitutes the refereed proceedings of the 12th International Conference on Intelligent Computer Mathematics, CICM 2019, held in Prague, Czech Republic, in July 2019. The 19 full papers presented were carefully reviewed and selected from a total of 41 submissions. The papers focus on digital and computational solutions which are becoming the prevalent means for the generation, communication, processing, storage and curation of mathematical information. Separate communities have developed to investigate and build computer based systems for computer algebra, automated deduction, and mathematical publishing as well as novel user interfaces. While all of these systems excel in their own right, their integration can lead to synergies offering significant added value.

Frontmatter

Interaction with Formal Mathematical Documents in Isabelle/PIDE

Abstract

Isabelle/PIDE has emerged over more than 10 years as the standard Prover IDE for interactive theorem proving in Isabelle. The well-established Archive of Formal Proofs (AFP) testifies the success of such applications of formalized mathematics in Isabelle/HOL. More recently, the scope of PIDE has widened towards languages that are not connected to logic and proof in Isabelle, but taken from a broader repertoire of mathematics on the computer. The present paper provides a general overview of the PIDE project and its underlying document model, with built-in parallel evaluation and asynchronous interaction. There is also some discussion of original aims and approaches, successes and failures, later changes to the plan, and ideas for the future.

Makarius Wenzel

Beginners’ Quest to Formalize Mathematics: A Feasibility Study in Isabelle

Abstract

How difficult are interactive theorem provers to use? We respond by reviewing the formalization of Hilbert’s tenth problem in Isabelle/HOL carried out by an undergraduate research group at Jacobs University Bremen. We argue that, as demonstrated by our example, proof assistants are feasible for beginners to formalize mathematics. With the aim to make the field more accessible, we also survey hurdles that arise when learning an interactive theorem prover. Broadly, we advocate for an increased adoption of interactive theorem provers in mathematical research and curricula.

Jonas Bayer, Marco David, Abhik Pal, Benedikt Stock

Towards a Unified Mathematical Data Infrastructure: Database and Interface Generation

Abstract

Data plays an increasing role in applied and even pure mathematics: datasets of concrete mathematical objects proliferate and increase in size, reaching up to 1 TB of uncompressed data and millions of objects. Most of the datasets, especially the many smaller ones, are maintained and shared in an ad hoc manner. This approach, while easy to implement, suffers from scalability and sustainability problems as well as a lack of interoperability both among datasets and with computation systems.

In this paper we present another substantial step towards a unified infrastructure for mathematical data: a storage and sharing system with math-level APIs and UIs that makes the various collections findable, accessible, interoperable, and re-usable. Concretely, we provide a high-level data description framework from which database infrastructure and user interfaces can be generated automatically. We instantiate this infrastructure with several datasets previously collected by mathematicians. The infrastructure makes it relatively easy to add new datasets.

Katja Berčič, Michael Kohlhase, Florian Rabe

A Tale of Two Set Theories

Abstract

We describe the relationship between two versions of Tarski-Grothendieck set theory: the first-order set theory of Mizar and the higher-order set theory of Egal. We show how certain higher-order terms and propositions in Egal have equivalent first-order presentations. We then prove Tarski’s Axiom A (an axiom in Mizar) in Egal and construct a Grothendieck Universe operator (a primitive with axioms in Egal) in Mizar.

Chad E. Brown, Karol Pąk

Relational Data Across Mathematical Libraries

Abstract

Formal libraries are treasure troves of detailed mathematical knowledge, but this treasure is usually locked into system- and logic-specific representations that can only be understood by the respective theorem prover system. In this paper we present an ontology for using relational information on mathematical knowledge and a corresponding data set generated from the Isabelle and Coq libraries. We show the utility of the generated data by setting a relational query engine that provides easy access to certain library information that was previously hard or impossible to determine.

Andrea Condoluci, Michael Kohlhase, Dennis Müller, Florian Rabe, Claudio Sacerdoti Coen, Makarius Wenzel

Variadic Equational Matching

Abstract

In this paper we study matching in equational theories that specify counterparts of associativity and commutativity for variadic function symbols. We design a procedure to solve a system of matching equations and prove its soundness and completeness. The complete set of incomparable matchers for such a system can be infinite. From the practical side, we identify two finitary cases and impose restrictions on the procedure to get an incomplete terminating algorithm, which, in our opinion, describes the semantics for associative and commutative matching implemented in the symbolic computation system Mathematica.

Besik Dundua, Temur Kutsia, Mircea Marin

Comparing Machine Learning Models to Choose the Variable Ordering for Cylindrical Algebraic Decomposition

Abstract

There has been recent interest in the use of machine learning (ML) approaches within mathematical software to make choices that impact on the computing performance without affecting the mathematical correctness of the result. We address the problem of selecting the variable ordering for cylindrical algebraic decomposition (CAD), an important algorithm in Symbolic Computation. Prior work to apply ML on this problem implemented a Support Vector Machine (SVM) to select between three existing human-made heuristics, which did better than anyone heuristic alone. Here we extend this result by training ML models to select the variable ordering directly, and by trying out a wider variety of ML techniques.

We experimented with the NLSAT dataset and the Regular Chains Library CAD function for Maple 2018. For each problem, the variable ordering leading to the shortest computing time was selected as the target class for ML. Features were generated from the polynomial input and used to train the following ML models: k-nearest neighbours (KNN) classifier, multi-layer perceptron (MLP), decision tree (DT) and SVM, as implemented in the Python scikit-learn package. We also compared these with the two leading human-made heuristics for the problem: the Brown heuristic and sotd. On this dataset all of the ML approaches outperformed the human-made heuristics, some by a large margin.

Matthew England, Dorian Florescu

Towards Specifying Symbolic Computation

Abstract

Many interesting and useful symbolic computation algorithms manipulate mathematical expressions in mathematically meaningful ways. Although these algorithms are commonplace in computer algebra systems, they can be surprisingly difficult to specify in a formal logic since they involve an interplay of syntax and semantics. In this paper we discuss several examples of syntax-based mathematical algorithms, and we show how to specify them in a formal logic with undefinedness, quotation, and evaluation.

Jacques Carette, William M. Farmer

Lemma Discovery for Induction

A Survey

Abstract

Automating proofs by induction can be challenging, not least because proofs might need auxiliary lemmas, which themselves need to be proved by induction. In this paper we survey various techniques for automating the discovery of such lemmas, including both top-down techniques attempting to generate a lemma from an ongoing proof attempt, as well as bottom-up theory exploration techniques trying to construct interesting lemmas about available functions and datatypes, thus constructing a richer background theory.

Moa Johansson

Experiments on Automatic Inclusion of Some Non-degeneracy Conditions Among the Hypotheses in Locus Equation Computations

Abstract

In automated reasoning in geometry, in particular in computing locus equations, degenerate components usually play an important role. Although degeneracy may have multiple meanings by considering different mathematical traditions, avoiding degenerate components is useful from the geometrical point of view.

Computation of locus equations is usually based on elimination of variables. In most cases the graphical output is checked after the computations and then the degenerate components will be removed manually. In our experiments we prescribe non-degeneracy before starting any computations and expect disappearing of the degenerate components automatically.

In this paper we investigate if such assumptions may be automatized, and if they can help in improving the output by getting the degenerate components automatically removed, and whether the calculation is still feasible due to the higher amount of computations. Our experiments have already been tried in an implementation of our algorithm in GeoGebra 5.0.524.0.

Zoltán Kovács, Pavel Pech

Formalization of Dubé’s Degree Bounds for Gröbner Bases in Isabelle/HOL

Abstract

We present an Isabelle/HOL formalization of certain upper bounds on the degrees of Gröbner bases in multivariate polynomial rings over fields, due to Dubé. These bounds are not only of theoretical interest, but can also be used for computing Gröbner bases by row-reducing Macaulay matrices.

The formalization covers the whole theory developed by Dubé for obtaining the bounds, building upon an extensive existing library of multivariate polynomials and Gröbner bases in Isabelle/HOL. To the best of our knowledge, this is the first thorough formalization of degree bounds for Gröbner bases in any proof assistant.

Alexander Maletzky

The Coq Library as a Theory Graph

Abstract

Representing proof assistant libraries in a way that allows further processing in other systems is becoming increasingly important. It is a critical missing link for integrating proof assistants both with each other or with peripheral tools such as IDEs or proof checkers. Such representations cannot be generated from library source files because they lack semantic enrichment (inferred types, etc.) and only the original proof assistant is able to process them. But even when using the proof assistant’s internal data structures, the complexities of logic, implementation, and library still make this very difficult.

We describe one such representation, namely for the library of Coq, using OMDoc theory graphs as the target format. Coq is arguably the most formidable of all proof assistant libraries to tackle, and our work makes a significant step forward.

On the theoretical side, our main contribution is a translation of the Coq module system into theory graphs. This greatly reduces the complexity of the library as the more arcane module system features are eliminated while preserving most of the structure. On the practical side, our main contribution is an implementation of this translation. It takes the entire Coq library, which is split over hundreds of decentralized repositories, and produces easily-reusable OMDoc files as output.

Dennis Müller, Florian Rabe, Claudio Sacerdoti Coen

BNF-Style Notation as It Is Actually Used

Abstract

The famous BNF grammar notation, as introduced and used in the Algol 60 report, was subsequently followed by numerous notational variants (EBNF, ABNF, RBNF, etc.), and later by a new formal “grammars” metalanguage used for discussing structured objects in Computer Science and Mathematical Logic. We refer to this latter offspring of BNF as MBNF (Math-BNF), and to aspects common to MBNF, BNF, and its notational variants as BNF-style. MBNF is sometimes called “abstract syntax”, but we avoid that name because MBNF is in fact a concrete form and there is a more abstract form. What all BNF-style notations share is the use of production rules like (P) below which state that “every instance of \(\circ _i\) for \(i \in \{1,...,n\}\) is also an instance of \(\bullet \)”.

However, MBNF is distinct from all variants of BNF in the entities and operations it allows. Instead of strings, MBNF builds arrangements of symbols that we call math-text and allows “syntax” to be built by interleaving MBNF production rules and other mathematical definitions that can contain chunks of math-text. The differences between BNF (or its variant forms) and MBNF have not been clearly presented before. (There is also no clear definition of MBNF anywhere but this is beyond the scope of this paper.)

This paper reviews BNF and some of its variant forms as well as MBNF, highlighting the differences between BNF (including its variant forms) and MBNF. We show via a series of detailed examples that MBNF, while superficially similar to BNF, differs substantially from BNF and its variants in how it is written, the operations it allows, and the sets of entities it defines. We also argue that the entities MBNF handles may extend far outside the scope of rewriting relations on strings and syntax trees derived from such rewriting sequences, which are often used to express the meaning of BNF and its notational variants.

Dee Quinlan, Joe B. Wells, Fairouz Kamareddine

MMTTeX: Connecting Content and Narration-Oriented Document Formats

Abstract

Narrative, presentation-oriented assistant systems for mathematics such as

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Figa_HTML.gif

on the one hand and formal, content-oriented ones such as proof assistants and computer algebra systems on the other hand have so far been developed and used largely independently. The former excel at communicating mathematical knowledge and the latter at certifying its correctness.

MMTTeX aims at combining the advantages of the two paradigms. Concretely, we use

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Figb_HTML.gif

for the narrative and Mmt for the content-oriented representation. Formal objects may be written in MMT and imported into

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Figc_HTML.gif

documents or written in the

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Figd_HTML.gif

document directly. In the latter case, Mmt parses and checks the formal content during

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Fige_HTML.gif

compilation and substitutes it with

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Figf_HTML.gif

presentation macros.

Besides checking the formal objects, this allows generating higher-quality

https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-23250-4_14/MediaObjects/478766_1_En_14_Figg_HTML.gif

than could easily be produced by hand, e.g., by inserting hyperlinks and tooltips into formulas. Moreover, it allows reusing formalizations across narrative documents as well as between formal and narrative ones. As a case study, the present document was already written with MMTTeX.

Florian Rabe

Diagram Combinators in MMT

Abstract

Formal libraries, especially large ones, usually employ modularity to build and maintain large theories efficiently. Although the techniques used to achieve modularity vary between systems, most of them can be understood as operations in the category of theories and theory morphisms. This yields a representation of libraries as diagrams in this category, with all theory-forming operations extending the diagram.

However, as libraries grow even bigger, it is starting to become important to build these diagrams modularly as well, i.e., we need languages and tools that support computing entire diagrams at once. A simple example would be to systematically apply the same operation to all theories in a diagram and return both the new diagram and the morphisms that relate the old one to the new one.

In this work, we introduce such diagram combinators as an extension to the MMT language and tool. We provide many important combinators, and our extension allows library developers to define arbitrary new ones. We evaluate our framework by building a library of algebraic theories in an extremely compact way.

Florian Rabe, Yasmine Sharoda

Inspection and Selection of Representations

Abstract

We present a novel framework for inspecting representations and encoding their formal properties. This enables us to assess and compare the informational and cognitive value of different representations for reasoning. The purpose of our framework is to automate the process of representation selection, taking into account the candidate representation’s match to the problem at hand and to the user’s specific cognitive profile. This requires a language for talking about representations, and methods for analysing their relative advantages. This foundational work is first to devise a computational end-to-end framework where problems, representations, and user’s profiles can be described and analysed. As AI systems become ubiquitous, it is important for them to be more compatible with human reasoning, and our framework enables just that.

Daniel Raggi, Aaron Stockdill, Mateja Jamnik, Grecia Garcia Garcia, Holly E. A. Sutherland, Peter C.-H. Cheng

A Plugin to Export Coq Libraries to XML

Abstract

We introduce a plugin for the interactive prover Coq to export its libraries to a machine readable XML format. The information exported is the one checked by Coq’s kernel after the input is elaborated, augmented with additional data coming from the elaboration itself.

The plugin has been applied to the 49 Coq libraries that have an opam package and that currently compile with the latest version of Coq (8.9.0), generating a large dataset of 1,235,934 compressed XML files organized in 18,780 directories that require 17 GB on disk.

Claudio Sacerdoti Coen

Forms of Plagiarism in Digital Mathematical Libraries

Abstract

We report on an exploratory analysis of the forms of plagiarism observable in mathematical publications, which we identified by investigating editorial notes from zbMATH. While most cases we encountered were simple copies of earlier work, we also identified several forms of disguised plagiarism. We investigated 11 cases in detail and evaluate how current plagiarism detection systems perform in identifying these cases. Moreover, we describe the steps required to discover these and potentially undiscovered cases in the future.

Moritz Schubotz, Olaf Teschke, Vincent Stange, Norman Meuschke, Bela Gipp

Integrating Semantic Mathematical Documents and Dynamic Notebooks

Abstract

Mathematical software systems offer two major paradigms for interacting with mathematical knowledge. One is static files with semantically annotated representations that define mathematical knowledge and can be compiled into documents (PDF, html, etc.), and the other dynamically build mathematical objects in interactive read-eval-print loops (REPL) such as notebooks. Many author-facing interfaces offer both features in some way. However, reader-facing interfaces usually show only one or the other.

In this paper we present an integration of the approaches in the context of the MMT system. Firstly, we present a Jupyter kernel for MMT which provides web-ready REPL functionality for MMT. Secondly, we integrate the resulting Jupyter notebooks into MathHub, a web-based frontend for mathematical documents. This allows users to context-sensitively open a Jupyter notebook as a dynamic subdocument anywhere inside a static MathHub document. Vice versa, any such highly interactive and often ephemeral notebook can be saved persistently in the MathHub backend at which point it becomes available as a static document. We also show how Jupyter widgets can be deeply integrated with the MMT knowledge management facilities to give semantics-aware interaction facilities.

Kai Amann, Michael Kohlhase, Florian Rabe, Tom Wiesing

Explorations into the Use of Word Embedding in Math Search and Math Semantics

Abstract

Word embedding, which represents individual words with semantically rich numerical vectors, has made it possible to successfully apply deep learning to NLP tasks such as semantic role modeling, question answering, and machine translation. As math text consists of natural text as well as math expressions that similarly exhibit linear correlation and contextual characteristics, word embedding can be applied to math documents as well. On the other hand, math terms also show characteristics (e.g., abstractions) that are different from textual words. Accordingly, it is worthwhile to explore the use and effectiveness of word embedding in math language processing and MKM.

In this paper, we present exploratory investigations of math embedding by testing it on some basic tasks such as (1) math-term similarity, (2) analogy, (3) basic numerical concept-modeling using a novel approach based on computing the (weighted) centroid of the keywords that characterize a concept, and (4) math search, especially query expansion using the weighted centroid of the query keywords and then expanding the query with new keywords that are most similar to the centroid. Due to lack of benchmarks, our investigations were done using carefully selected illustrations on the DLMF. We draw from our investigations some general observations and lessons that form a trajectory for future statistically significant testing on large benchmarks. Our preliminary results and observations show that math embedding holds much promise but also point to the need for more robust embedding.

Abdou Youssef, Bruce R. Miller

Backmatter

Title: Intelligent Computer Mathematics
Editors: Cezary Kaliszyk
Edwin Brady
Andrea Kohlhase
Claudio Sacerdoti Coen
Publisher: Springer International Publishing
Electronic ISBN: 978-3-030-23250-4
Print ISBN: 978-3-030-23249-8
DOI: https://doi.org/10.1007/978-3-030-23250-4

Springer Professional

Intelligent Computer Mathematics

12th International Conference, CICM 2019, Prague, Czech Republic, July 8–12, 2019, Proceedings

About this book

Table of Contents

Frontmatter

Interaction with Formal Mathematical Documents in Isabelle/PIDE

Beginners’ Quest to Formalize Mathematics: A Feasibility Study in Isabelle

Towards a Unified Mathematical Data Infrastructure: Database and Interface Generation

A Tale of Two Set Theories

Relational Data Across Mathematical Libraries

Variadic Equational Matching

Comparing Machine Learning Models to Choose the Variable Ordering for Cylindrical Algebraic Decomposition

Towards Specifying Symbolic Computation

Lemma Discovery for Induction

Experiments on Automatic Inclusion of Some Non-degeneracy Conditions Among the Hypotheses in Locus Equation Computations

Formalization of Dubé’s Degree Bounds for Gröbner Bases in Isabelle/HOL

The Coq Library as a Theory Graph

BNF-Style Notation as It Is Actually Used

MMTTeX: Connecting Content and Narration-Oriented Document Formats

Diagram Combinators in MMT

Inspection and Selection of Representations

A Plugin to Export Coq Libraries to XML

Forms of Plagiarism in Digital Mathematical Libraries

Integrating Semantic Mathematical Documents and Dynamic Notebooks

Explorations into the Use of Word Embedding in Math Search and Math Semantics

Backmatter

Premium Partner