
2017 | Book

Intelligent Computer Mathematics

10th International Conference, CICM 2017, Edinburgh, UK, July 17-21, 2017, Proceedings

Edited by: Herman Geuvers, Matthew England, Osman Hasan, Florian Rabe, Olaf Teschke

Publisher: Springer International Publishing

Book series: Lecture Notes in Computer Science


About this book

This book constitutes the refereed proceedings of the 10th International Conference on Intelligent Computer Mathematics, CICM 2017, held in Edinburgh, Scotland, in July 2017.

The 22 full papers and 3 abstracts of invited papers presented were carefully reviewed and selected from a total of 40 submissions. The papers are organized in three tracks: the Calculemus track, examining the integration of symbolic computation and mechanized reasoning; the Digital Mathematics Libraries track, dealing with math-aware technologies, standards, algorithms, and processes; and the Mathematical Knowledge Management track, concerned with all aspects of managing mathematical knowledge in informal, semi-formal, and formal settings. An additional track, Systems and Projects, contains descriptions of systems and relevant projects, both of which are key to a research topic where theory and practice interact on explicitly represented knowledge.

Table of contents

Frontmatter
DeepAlgebra - An Outline of a Program
Abstract
We outline a program in the area of formalization of mathematics to automate theorem proving in algebra and algebraic geometry. We propose a construction of a dictionary between automated theorem provers and (La)TeX, exploiting syntactic parsers. We describe its application to a repository of human-written facts and definitions in algebraic geometry (The Stacks Project), using deep learning techniques.
Przemysław Chojecki
Formalizing Mathematical Knowledge as a Biform Theory Graph: A Case Study
Abstract
A biform theory is a combination of an axiomatic theory and an algorithmic theory that supports the integration of reasoning and computation. Biform theories are ideal for formalizing algorithms that manipulate mathematical expressions. A theory graph is a network of theories connected by meaning-preserving theory morphisms that map the formulas of one theory to the formulas of another. Theory graphs are in turn well suited for formalizing mathematical knowledge at the most convenient level of abstraction using the most convenient vocabulary. We are interested in the problem of whether a body of mathematical knowledge can be effectively formalized as a theory graph of biform theories. As a test case, we look at the graph of theories encoding natural number arithmetic. We used two different formalisms to do this, which we describe and compare. The first is realized in CTT_uqe, a version of Church's type theory with quotation and evaluation, and the second is realized in Agda, a dependently typed programming language.
Jacques Carette, William M. Farmer
The Formalization of Vickrey Auctions: A Comparison of Two Approaches in Isabelle and Theorema
Abstract
In earlier work presented at CICM, four theorem provers (Isabelle, Mizar, Hets/CASL/TPTP, and Theorema) were compared based on a case study in theoretical economics: the formalization of the landmark theorem of Vickrey in auction theory. At the time of this comparison, the Theorema system was in a state of transition: the original Theorema system (Theorema 1) had been shut down by the Theorema group, and the successor system Theorema 2.0 was just about to be launched. Theorema 2.0 participated in the comparison, but only parts of the system were ready for use. In particular, the new reasoning engines had not been set up, so some of the results in the system comparison had to be extrapolated from experience with Theorema 1. In this paper, we compare a complete formalization of Vickrey's theorem in Theorema 2.0 with the original formalization in Isabelle. On the one hand, we compare the mathematical setup of the two theories; on the other hand, we give an overview of statistical indicators, such as the number of auxiliary lemmas and the total number of proof steps needed for all proofs in the theory. Last but not least, we present a shorter version of the proof of the main theorem in Isabelle.
Alexander Maletzky, Wolfgang Windsteiger
Automatically Proving Equivalence by Type-Safe Reflection
Abstract
One difficulty with reasoning and programming with dependent types is that proof obligations arise naturally once programs become even moderately sized. For example, implementing an adder for binary numbers indexed over their natural number equivalents naturally leads to proof obligations for equalities of expressions over natural numbers. The need for these equality proofs comes, in intensional type theories, from the fact that propositional equality allows us to prove that terms are equal even when they are not judgementally equal, which means that the typechecker cannot always obtain equalities by reduction. As far as possible, we would like to solve such proof obligations automatically. In this paper, we show one way to automate these proofs by reflection in the dependently typed programming language Idris. We show how defining reflected terms indexed by the original Idris expression allows us to construct and manipulate proofs. We build a hierarchy of tactics for proving equivalences in semi-groups, monoids, commutative monoids, groups, commutative groups, semi-rings and rings. We also show how each tactic reuses those from simpler structures, thus avoiding duplication of code and proofs.
Franck Slama, Edwin Brady
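The normalization idea behind such reflection-based tactics can be sketched outside Idris. The following Python illustration (names and representation invented, not the authors' code) flattens monoid expressions so that syntactically different but provably equal expressions share a normal form:

```python
# Minimal sketch of the normalization idea behind reflection-based
# equivalence tactics: reflect monoid expressions into syntax trees,
# flatten them modulo associativity and identity, and compare the
# resulting normal forms. Illustrative only, not the authors' Idris code.
from dataclasses import dataclass

class Expr: pass

@dataclass
class Var(Expr):
    name: str

@dataclass
class Neutral(Expr):      # the monoid identity element
    pass

@dataclass
class Op(Expr):           # the binary monoid operation
    left: Expr
    right: Expr

def normalize(e: Expr) -> list[str]:
    """Flatten an expression to its list of variables, dropping
    identity elements; flattening erases associativity."""
    if isinstance(e, Neutral):
        return []
    if isinstance(e, Var):
        return [e.name]
    return normalize(e.left) + normalize(e.right)

# (x * e) * (y * z)  and  x * (y * z)  share the normal form
# ['x', 'y', 'z'], so a reflection tactic would emit an equality proof.
lhs = Op(Op(Var("x"), Neutral()), Op(Var("y"), Var("z")))
rhs = Op(Var("x"), Op(Var("y"), Var("z")))
assert normalize(lhs) == normalize(rhs)
```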
The Global Digital Mathematics Library and the International Mathematical Knowledge Trust
Abstract
We recall some of the reasons why we want and do not yet have a Global Digital Mathematics Library (GDML), both before and after the setting up of a GDML working group at the 2014 ICM in Seoul. The recent founding of an International Mathematical Knowledge Trust (IMKT) in Waterloo, Ontario, Canada is an important move in the right direction. The IMKT's form and initial efforts are described, with attention to why the project is the way it is.
Patrick D. F. Ion, Stephen M. Watt
The New Numdam Platform
Abstract
The Numdam French digital mathematics library has now been in operation for more than 15 years with no major upgrade. It holds more than 57,000 documents, either scanned or born digital, spanning over one million pages. The information system has recently been completely redesigned. In this article, we present the new Numdam ecosystem. A metadata factory is used to store metadata from a variety of sources, normalize it under the JATS (articles) or BITS (books) XML formats, and enhance it through manual editing or automated agents (tagging math formulas, matching to external databases and interlinking, etc.). The data model supports the main types of documents currently expected to populate the DML: journals, seminars, conference proceedings, multivolume works, books, book parts, and doctoral theses. All documents are in collections that can belong to one or more corpora. The workflow has been simplified and allows easy deployment on test and production web sites. A platform holds all the data in one place, can generate multiple web sites, each with a different view on the data, and provides an OAI-PMH server to the outside world. Finally, the article presents future plans to create a DML-ready platform based on the new Numdam platform.
Thierry Bouche, Olivier Labbe
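The OAI-PMH server mentioned above speaks a standard harvesting protocol; a generic client might page through its records as in the following sketch. The endpoint URL is a placeholder, not Numdam's actual address; the verbs and resumption-token mechanism are part of the OAI-PMH specification:

```python
# Sketch of harvesting metadata from an OAI-PMH server such as the one
# the new Numdam platform exposes. The endpoint is hypothetical.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"
ENDPOINT = "https://example.org/oai"  # placeholder endpoint

def list_records(metadata_prefix="oai_dc"):
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        url = ENDPOINT + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            root = ET.fromstring(resp.read())
        for record in root.iter(OAI + "record"):
            yield record
        # OAI-PMH pages large result sets via resumption tokens.
        token = root.find(f"{OAI}ListRecords/{OAI}resumptionToken")
        if token is None or not (token.text or "").strip():
            return
        params = {"verb": "ListRecords",
                  "resumptionToken": token.text.strip()}
```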
Classification of Alignments Between Concepts of Formal Mathematical Systems
Abstract
Mathematical knowledge is publicly available in dozens of different formats and languages, ranging from informal (e.g., Wikipedia) to formal corpora (e.g., Mizar). Despite an enormous amount of overlap between these corpora, only a few machine-actionable connections exist. We speak of an alignment if the same concept occurs in different libraries, possibly with slightly different names, notations, or formal definitions. Leveraging these alignments creates a huge potential for knowledge sharing and transfer, e.g., integrating theorem provers or reusing services across systems. Notably, even imperfect alignments, i.e., concepts that are very similar rather than identical, can play important roles: in machine-learning techniques for theorem proving and the automation built on them, they allow a prover to take inspiration from proofs in other formal or semi-formal libraries, even when those libraries are based on a different mathematical foundation. We present a classification of alignments and design a simple format for describing alignments, as well as an infrastructure for sharing them. We propose these as a centralized standard for the community. Finally, we present an initial collection of approximately 12,000 alignments from different kinds of mathematical corpora, including proof assistant libraries and semi-formal corpora, as a public resource.
Dennis Müller, Thibault Gauthier, Cezary Kaliszyk, Michael Kohlhase, Florian Rabe
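The paper designs its own alignment format, which is not reproduced here. As a loose illustration of what an alignment record must carry, consider the following hypothetical sketch; the field names and concept identifiers are invented:

```python
# Hypothetical sketch of an alignment record between two formal
# libraries. This is NOT the paper's format, only an illustration of
# the kind of data such a record must carry.
from dataclasses import dataclass

@dataclass(frozen=True)
class Alignment:
    source: str        # concept identifier in the source library
    target: str        # concept identifier in the target library
    kind: str          # e.g. "identical", "similar", "alias"
    confidence: float  # 1.0 for exact alignments, lower for imperfect

nat_plus = Alignment(
    source="mizar:XREAL_0:plus",   # illustrative identifiers, not
    target="hol-light:arith:ADD",  # actual library URIs
    kind="similar",
    confidence=0.9,
)
```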
Software Citations, Information Systems, and Beyond
Abstract
Even though software plays an ever-increasing role in today's research and engineering processes, the scholarly publication process has not quite caught up with this. In particular, referencing and citing software remains problematic. Citations for publications are well standardized but do not immediately apply to software because, for instance, (a) software information is extremely heterogeneous, (b) software code is not persistent, and (c) the level of software information is often too coarse-grained.
Current initiatives try to solve (a) by postulating "landing pages" for software that aggregate standardized metadata and can be used as targets for citations, and (b) by version-specific sub-landing pages. However, no information services that provide such landing pages currently exist, making these proposals ineffective in practice.
After an overview of the state of the art, we propose to use swMATH's information system for mathematical software as a source of landing pages, show an approach for version-specific sub-pages, and discuss approaches to cope with problem (c), granularity.
Michael Kohlhase, Wolfram Sperber
Semantic Preserving Bijective Mappings of Mathematical Formulae Between Document Preparation Systems and Computer Algebra Systems
Abstract
Document preparation systems like LaTeX offer the ability to render mathematical expressions as one would write them on paper. Using LaTeX, LaTeXML, and tools generated for use in the National Institute of Standards and Technology (NIST) Digital Library of Mathematical Functions, semantically enhanced mathematical LaTeX markup (semantic LaTeX) is achieved by using a semantic macro set. Computer algebra systems (CAS) such as Maple and Mathematica use alternative markup to represent mathematical expressions. By taking advantage of Youssef's Part-of-Math tagger and CAS internal representations, we develop algorithms to translate mathematical expressions represented in semantic LaTeX to corresponding CAS representations and vice versa. We have also developed tools for translating the entire Wolfram Encoding Continued Fraction Knowledge and University of Antwerp Continued Fractions for Special Functions datasets, for use in the NIST Digital Repository of Mathematical Formulae. The overall goal of these efforts is to provide semantically enriched, standard-conforming MathML representations to the public for formulae in digital mathematics libraries. These representations include presentation MathML, content MathML, generic LaTeX, semantic LaTeX, and now CAS representations as well.
Howard S. Cohl, Moritz Schubotz, Abdou Youssef, André Greiner-Petter, Jürgen Gerhard, Bonita V. Saunders, Marjorie A. McClain, Joon Bang, Kevin Chen
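As a concrete illustration of such a translation, the following hedged example uses a DLMF/DRMF-style semantic macro for the Jacobi polynomial. The macro name, argument order, and CAS conventions below are assumptions modeled on DLMF/DRMF practice, not the paper's exact mapping tables:

```latex
% Hedged illustration of the kind of mapping described above.
% Semantic LaTeX source (DRMF-style macro, argument order assumed):
%   \JacobiP{\alpha}{\beta}{n}@{x}
% which renders as the Jacobi polynomial
\[
  P^{(\alpha,\beta)}_{n}(x),
\]
% and would map to CAS representations along the lines of
%   Maple:       JacobiP(n, alpha, beta, x)
%   Mathematica: JacobiP[n, \[Alpha], \[Beta], x]
% with the inverse translation recovering the semantic macro.
```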
Towards Mathematical AI via a Model of the Content and Process of Mathematical Question and Answer Dialogues
Abstract
This paper outlines a strategy for building semantically meaningful representations and carrying out effective reasoning in technical knowledge domains such as mathematics. Our central assertion is that the semi-structured Q&A format, as used on the popular Stack Exchange network of websites, exposes domain knowledge in a form that is already reasonably close to the structured knowledge formats that computers can reason about. The knowledge in question is not only facts, but discursive, dialectical argument for the purposes of proof and pedagogy. We therefore assert that modelling the Q&A process computationally provides a route to domain understanding that is compatible with the day-to-day practices of mathematicians and students. This position is supported by a small case study that analyses one question from MathOverflow in detail, using concepts from argumentation theory. A programme of future work, including a rigorous evaluation strategy, is then advanced.
Joseph Corneli, Ursula Martin, Dave Murray-Rust, Alison Pease
Theory Morphisms in Church’s Type Theory with Quotation and Evaluation
Abstract
CTT_qe is a version of Church's type theory with global quotation and evaluation operators that is engineered to reason about the interplay of syntax and semantics and to formalize syntax-based mathematical algorithms. CTT_uqe is a variant of CTT_qe that admits undefined expressions, partial functions, and multiple base types of individuals. It is better suited than CTT_qe as a logic for building networks of theories connected by theory morphisms. This paper presents the syntax and semantics of CTT_uqe, defines a notion of a theory morphism from one CTT_uqe theory to another, and gives two simple examples involving monoids that illustrate the use of theory morphisms in CTT_uqe.
William M. Farmer
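The monoid examples mentioned in the abstract can be illustrated by a textbook-style theory morphism, shown here in ordinary mathematical notation rather than actual CTT_uqe syntax:

```latex
% A textbook-style example of a theory morphism, in ordinary notation
% rather than actual CTT_uqe syntax. Source theory: monoids, with
% symbols M, \cdot, e and axioms
\[
  (x \cdot y) \cdot z = x \cdot (y \cdot z), \qquad
  e \cdot x = x = x \cdot e.
\]
% A morphism \varphi into natural-number arithmetic maps
\[
  \varphi(M) = \mathbb{N}, \qquad \varphi(\cdot) = +, \qquad
  \varphi(e) = 0,
\]
% and is meaning-preserving because each monoid axiom translates to a
% theorem of arithmetic (associativity of + and 0 as its identity).
% Mapping \varphi(\cdot) = \times, \varphi(e) = 1 instead gives a
% second, different morphism into the same target theory.
```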
Semantic Representation of General Topology in the Wolfram Language
Abstract
The Wolfram Knowledgebase, powered by the entity framework, contains expertly curated data from thousands of diverse domains. We have begun expanding this framework to include mathematical knowledge, making significant strides in the representation of results pertaining to continued fractions, function spaces, and, most recently, topology. This paper focuses on our progress in the representation of general topology. We have curated over 700 entities representing concept definitions, theorem statements, and concrete topological spaces, together with their corresponding properties, including formal representations, references, computed properties, and other metadata. Virtually every formal representation in this project required extensions to the Wolfram Language, mostly for basic set theory. We outline these design choices by way of examples and present additional functionality for querying, usage messages, formatting, and other computations.
Ian Ford
Zeta Types and Tannakian Symbols as a Method for Representing Mathematical Knowledge
Abstract
We present two closely related notions called (1) a zeta type and (2) a Tannakian symbol. These are data types for representing information about number-theoretic objects, and we argue that a database built out of zeta types and Tannakian symbols could lead to interesting discoveries, similar to what has been achieved for example by the OEIS, the LMFDB, and other existing databases of mathematical objects. We give several examples illustrating what database records would look like, and we describe a tiny prototype database which has already been used to find and automatically prove new theorems about multiplicative functions.
Andreas Holmstrom, Torstein Vik
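As a worked illustration of the idea, under conventions common in this literature (treat the details as an assumption, not a quote from the paper): a multiplicative function is determined by its Bell series at each prime, and a Tannakian symbol packages that series as a ratio of polynomials read off from two multisets:

```latex
% A multiplicative function f is determined by its Bell series at each
% prime p; a Tannakian symbol {alpha_i}/{beta_j} encodes it via
\[
  \sum_{e \ge 0} f(p^e)\, x^e
    = \frac{\prod_j \bigl(1 - \beta_j x\bigr)}
           {\prod_i \bigl(1 - \alpha_i x\bigr)}.
\]
% Examples: the divisor function d(n) has d(p^e) = e + 1, Bell series
% 1/(1-x)^2, hence symbol {1, 1}/{}; the Moebius function has Bell
% series 1 - x, hence symbol {}/{1}.
```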
Presentation and Manipulation of Mizar Properties in an Isabelle Object Logic
Abstract
One of the crucial factors enabling efficient use of a logical framework is the convenience of entering, manipulating, and presenting object-logic constants, statements, and proofs. In this paper, we discuss various elements of the Mizar language and the possible ways in which these can be represented in the Isabelle framework in order to allow a suitable way of working in typed set theory. We explain the interpretation of the various components declared in each Mizar article environment and create Isabelle attributes and outer syntax that allow simulating them. We further discuss introducing notations for symbols defined in the Mizar Mathematical Library, as well as synonyms and redefinitions of such symbols. We also compare the language elements corresponding to the actual proofs, with special care for implicit proof expansions not present in Isabelle. We finally discuss Mizar's hidden arguments and demonstrate that some of them are not necessary in an Isabelle representation.
Cezary Kaliszyk, Karol Pąk
Visual Structure in Mathematical Expressions
Abstract
Mathematics uses formulae to express knowledge about objects concisely and economically. Mathematical formulae are at the same time an indispensable tool for the initiated and a formidable barrier to novices. Surprisingly little is known about the cognitive basis of this practice. In this paper we start to rectify this situation with an investigation of how humans read (and understand) mathematical expressions.
A previous exploratory study suggested the interplay of visual patterns and content structure as a key ingredient of decoding and understanding math expressions, one that differentiates between math-literate and math-illiterate subjects. The main contribution of this paper is an eye-tracking study of mathematically trained researchers, conducted to verify the mathematical practices suggested by the first study and to refine our understanding of the mechanisms.
Andrea Kohlhase, Michael Kohlhase, Michael Fürsich
Mathematical Models as Research Data via Flexiformal Theory Graphs
Abstract
Mathematical modeling and simulation (MMS) has now been established as an essential part of the scientific work in many disciplines. It is common to categorize the involved numerical data, and to some extent the corresponding scientific software, as research data. But both have their origin in mathematical models; therefore, any holistic approach to research data in MMS should cover all three aspects: data, software, and models. While the problems of classifying, archiving, and making accessible are largely solved for data, and first frameworks and systems are emerging for software, the question of how to deal with mathematical models is completely open.
In this paper we propose a solution: covering all aspects of mathematical models (the underlying mathematical knowledge, the equations, boundary conditions, numeric approximations, and documents) in a flexiformal framework that has enough structure to support the various uses of models in scientific and technology workflows.
Concretely, we propose to use the OMDoc/MMT framework to formalize mathematical models, and we show the adequacy of this approach by formalizing a simple but non-trivial example: van Roosbroeck's drift-diffusion model for one-dimensional devices. This formalization, and future extensions of it, allows us to support the modeler by, e.g., flexibly composing models, visualizing Model Pathway Diagrams, and annotating model equations in documents as induced from the formalized documents by flattening. This directly solves some of the problems in treating mathematical models as "research data" and opens the way towards more MKM services for models.
Michael Kohlhase, Thomas Koprucki, Dennis Müller, Karsten Tabelow
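For reference, a common statement of the stationary one-dimensional van Roosbroeck system reads as follows; sign and scaling conventions vary across the literature, so treat this particular form as one standard choice rather than the paper's formalized version:

```latex
% Poisson equation for the electrostatic potential:
\[
  -\partial_x\!\left(\varepsilon\,\partial_x \psi\right)
     = q\,(C + p - n),
\]
% stationary continuity equations for electrons and holes:
\[
  \partial_x j_n = q\,R, \qquad \partial_x j_p = -q\,R,
\]
% drift-diffusion current densities:
\[
  j_n = -q\,\mu_n\, n\,\partial_x \psi + q\,D_n\,\partial_x n, \qquad
  j_p = -q\,\mu_p\, p\,\partial_x \psi - q\,D_p\,\partial_x p,
\]
% with potential \psi, carrier densities n and p, doping profile C,
% recombination rate R, mobilities \mu_{n,p}, diffusion constants
% D_{n,p}, elementary charge q, and permittivity \varepsilon.
```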
A Verified Algorithm Enumerating Event Structures
Abstract
An event structure is a mathematical abstraction modeling concepts such as causality, conflict, and concurrency between events. While many other mathematical structures, including groups, topological spaces, and rings, abound with algorithms and formulas to generate, enumerate, and count particular sets of their members, no algorithms or formulas are known to generate or count all the possible event structures over a finite set of events. We present an algorithm to generate such a family, along with a functional implementation verified using Isabelle/HOL. As byproducts, we obtain a verified enumeration of all possible preorders and partial orders. While the integer sequences counting preorders and partial orders are already listed in the OEIS (On-Line Encyclopedia of Integer Sequences), the one counting event structures is not. We therefore used our algorithm to submit a formally verified addition, which has been successfully reviewed and is now part of the OEIS.
Juliana Bowles, Marco B. Caminati
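The preorder and partial-order counts mentioned above are easy to cross-check for small n with naive brute-force enumeration. The following Python sketch is only an illustration, entirely unrelated to the paper's verified Isabelle/HOL development:

```python
# Enumerate all reflexive relations on {0,...,n-1} and filter for
# transitivity (preorders) and additionally antisymmetry (partial
# orders). Exponential in n^2, so only viable for tiny n.
from itertools import product

def count_orders(n):
    elems = range(n)
    off_diag = [(i, j) for i in elems for j in elems if i != j]
    preorders = posets = 0
    # every subset of off-diagonal pairs, together with the diagonal,
    # is a candidate reflexive relation
    for bits in product([False, True], repeat=len(off_diag)):
        rel = {(i, i) for i in elems}
        rel |= {p for p, b in zip(off_diag, bits) if b}
        transitive = all((a, d) in rel
                         for (a, b) in rel for (c, d) in rel if b == c)
        if not transitive:
            continue
        preorders += 1
        if all(not ((i, j) in rel and (j, i) in rel)
               for i, j in off_diag):
            posets += 1
    return preorders, posets

# Matches OEIS A000798 (preorders) and A001035 (partial orders):
# n = 0..3 gives (1, 1), (1, 1), (4, 3), (29, 19).
for n in range(4):
    print(n, count_orders(n))
```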
Reasoning with Concept Diagrams About Antipatterns in Ontologies
Abstract
Ontologies are notoriously hard to define, express, and reason about. Many tools have been developed to ease ontology debugging and reasoning; however, they often lack accessibility and formalisation. Concept diagrams, a visual representation language for expressing ontologies, have been empirically proven to be more cognitively accessible to ontology users. In this paper we answer the question: how can concept diagrams be used to reason about inconsistencies and incoherence of ontologies? We do so by formalising a set of inference rules for concept diagrams that enables stepwise verification of the inconsistency and incoherence of a set of ontology axioms. The design of the inference rules is driven by empirical evidence that concise (merged) diagrams are easier for users to comprehend than a set of lower-level diagrams translated one-to-one from OWL ontology axioms. We prove that our inference rules are sound, and we exemplify how they can be used to reason about inconsistencies and incoherence.
Zohreh Shams, Mateja Jamnik, Gem Stapleton, Yuri Sato
A Web-Based Toolkit for Mathematical Word Processing Applications with Semantics
Abstract
Lurch is an open-source word processor that can check the steps in students’ mathematical proofs. Users write in a natural language, but mark portions of a document as meaningful, so the software can distinguish content for human readers from content it should analyze.
This paper describes the Lurch Web Platform, a system of tools the authors have created as part of a project to upgrade Lurch from a desktop application to a web application. That system of tools is available on GitHub for other mathematical software developers to use in their own projects. It includes a web editor with mathematical typesetting, an interface for marking up documents with mathematical (or other structured) meaning, OpenMath support, meaning visualization tools, and document dependence and sharing features, among others.
We conclude with design plans for ongoing development of the web version of Lurch that will be built on the Lurch Web Platform.
Nathan C. Carter, Kenneth G. Monks
ENIGMA: Efficient Learning-Based Inference Guiding Machine
Abstract
ENIGMA is a learning-based method for guiding given-clause selection in saturation-based theorem provers. Clauses from many previous proof searches are classified as positive or negative based on their participation in the proofs. An efficient classification model is trained on this data to classify a clause as useful or useless for the proof search. The learned classification is then used to guide subsequent proof searches, prioritizing useful clauses among the generated clauses. The approach is evaluated on the E prover and the CASC 2016 AIM benchmark, showing a large increase in E's performance.
Jan Jakubův, Josef Urban
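The training loop described above can be illustrated schematically. The sketch below uses scikit-learn with toy character-n-gram features, which merely stand in for (and differ substantially from) ENIGMA's actual term-walk feature extraction and classifier:

```python
# Toy illustration of learning-based clause classification: clauses
# that appeared in proofs are positive examples, others negative; a
# linear model then scores new clauses for given-clause selection.
# Schematic stand-in only, not ENIGMA's feature extraction.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# clause strings from previous proof searches (illustrative data)
clauses = ["plus(X,0)=X", "plus(X,s(Y))=s(plus(X,Y))",
           "times(X,0)=0", "p(X)|q(X)", "q(a)|r(b)", "r(X)|~p(X)"]
labels  = [1, 1, 1, 0, 0, 0]   # 1 = participated in a proof

model = make_pipeline(
    CountVectorizer(analyzer="char", ngram_range=(2, 3)),
    LogisticRegression(),
)
model.fit(clauses, labels)

# In a new proof search, generated clauses would be ranked by this
# score and promising ones prioritized in given-clause selection.
for c in ["times(X,s(Y))=plus(times(X,Y),X)", "q(X)|~r(X)"]:
    print(c, model.predict_proba([c])[0][1])
```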
Proof Mining with Dependent Types
Abstract
Several approaches exist for data-mining large corpora of formal proofs. Some of these approaches are based on statistical machine learning, and some on theory exploration. However, most are developed for either untyped or simply typed theorem provers. In this paper, we present a method that combines statistical data mining and theory exploration in order to analyse and automate proofs in the dependently typed language of Coq.
Ekaterina Komendantskaya, Jónathan Heras
Formalization of Transform Methods Using HOL Light
Abstract
Transform methods, like the Laplace and Fourier transforms, are frequently used for analyzing the dynamical behaviour of engineering and physical systems based on their transfer functions, frequency responses, or the solutions of their corresponding differential equations. In this paper, we present an ongoing project which focuses on the higher-order-logic formalization of transform methods using the HOL Light theorem prover. In particular, we present the motivation for the formalization, followed by related work. Next, we present the tasks completed so far, while highlighting some of the challenges faced during the formalization. Finally, we present a roadmap to achieve our objectives, the current status, and the future goals of this project.
Adnan Rashid, Osman Hasan
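For reference, the two transforms targeted by the project, in their standard (convention-dependent) definitions:

```latex
\[
  \mathcal{L}[f](s) = \int_0^{\infty} f(t)\, e^{-st}\, dt, \qquad
  \mathcal{F}[f](\omega)
    = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt.
\]
% A higher-order-logic formalization must also capture the side
% conditions under which these improper integrals exist, e.g.
% exponential order for the Laplace transform and integrability for
% the Fourier transform.
```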
Combining Refinement and Signal-Temporal Logic for Biological Systems
Abstract
System-level modeling and analysis of biological phenomena have become an important research topic across different fields, including mathematics, computer science, and electrical and systems engineering. This is a consequence of recent developments in these fields, which can be utilized to understand the dynamics of complex biological processes such as cancer, malaria, and diabetes. However, the concept of model refinement (i.e., the transformation of an abstract model into a detailed model) is largely unexplored in biology. In this paper, we describe our ongoing project, which aims at combining the concepts of model refinement and signal-temporal logic for the analysis of a wide class of biological systems.
Usman Sanwal, Umair Siddique
VMEXT: A Visualization Tool for Mathematical Expression Trees
Abstract
Mathematical expressions can be represented as a tree consisting of terminal symbols, such as identifiers or numbers (leaf nodes), and functions or operators (non-leaf nodes). Expression trees are an important mechanism for storing and processing mathematical expressions, as well as the most frequently used visualization of the structure of mathematical expressions. Typically, researchers and practitioners manually visualize expression trees using general-purpose tools. This approach is laborious, redundant, and error-prone. A manual visualization represents a user's notion of what the markup of an expression should be, but not necessarily what the actual markup is. This paper presents VMEXT, a free and open-source tool to directly visualize expression trees from parallel MathML. VMEXT simultaneously visualizes the presentation elements and the semantic structure of mathematical expressions to enable users to quickly spot deficiencies in Content MathML markup that do not affect the presentation of the expression. Identifying such discrepancies previously required reading the verbose and complex MathML markup. VMEXT also allows one to visualize similar and identical elements of two expressions. Visualizing expression similarity can support developers in designing retrieval approaches and enable improved interaction concepts for users of mathematical information retrieval systems. We demonstrate VMEXT's visualizations in two web-based applications. The first application presents the visualizations alone. The second application shows a possible integration of the visualizations in systems for mathematical knowledge management and mathematical information retrieval. The application converts LaTeX input to parallel MathML, computes basic similarity measures for mathematical expressions, and visualizes the results using VMEXT.
Moritz Schubotz, Norman Meuschke, Thomas Hepp, Howard S. Cohl, Bela Gipp
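The tree representation described above can be sketched in a few lines. The following Python illustration mirrors the leaf/non-leaf distinction but is not part of VMEXT:

```python
# Minimal sketch of an expression tree: leaves are identifiers or
# numbers, inner nodes are operators. Illustration only.
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                      # operator, identifier, or number
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children

# the expression a + b * 2 as a tree
tree = Node("+", [Node("a"), Node("*", [Node("b"), Node("2")])])

def render(node, depth=0):
    """Print an indented outline of the tree, marking leaves."""
    kind = "leaf" if node.is_leaf() else "op"
    print("  " * depth + f"{node.label}  ({kind})")
    for child in node.children:
        render(child, depth + 1)

render(tree)
```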
Part-of-Math Tagging and Applications
Abstract
Nearly all of the recent mathematical literature, and much of the old literature, are online and mostly in natural-language form. Therefore, math content processing presents some of the same challenges faced in natural language processing (NLP), such as math disambiguation and math semantics determination. These challenges must be surmounted to enable more effective math knowledge management, math knowledge discovery, automated presentation-to-computation (P2C) conversion, and automated math reasoning. To meet this goal, considerable math language processing (MLP) technology is needed.
This project aims to advance MLP by developing (1) a sophisticated part-of-math (POM) tagger, (2) math-sense disambiguation techniques along with supporting machine-learning (ML) based MLP algorithms, and (3) semantics extraction from, and enrichment of, math expressions. Specifically, the project first created an evolving tagset for math terms and expressions, and is developing a general-purpose POM tagger. The tagger works in several scans and interacts with other MLP algorithms that will be developed in this project. In the first scan of an input math document, each math term and some sub-expressions are tagged with two kinds of tags. The first kind consists of definite tags (such as operation, relation, numerator, etc.) that the tagger is certain of. The second kind consists of alternative, tentative features (including alternative roles and meanings) drawn from a knowledge base that has been developed for this project. The second and third scans will, in conjunction with some NLP/ML-based algorithms, select the right features from among those alternatives, disambiguate the terms, group subsequences of terms into unambiguous sub-expressions and tag them, and thus derive definite, unambiguous semantics for math terms and expressions. The NLP/ML-based algorithms needed for this work form another part of the project. These include math topic modeling, math context modeling, math document classification (into the standard areas of math), and definition-harvesting algorithms.
The project will create significant new concepts and techniques that will advance knowledge in two respects. First, the tagger, math disambiguation techniques, and NLP/ML-based algorithms, though they correspond to NLP and ML counterparts, will be quite novel because math expressions are radically different from natural language. Second, the project outcomes will enable the development of new advanced applications such as: (1) techniques for computer-aided semantic enrichment of digital math libraries; (2) automated P2C conversion of math expressions from natural form to (i) a machine-computable form and (ii) a formal form suitable for automated reasoning; (3) math question-answering capabilities at the manuscript level and collection level; (4) richer math UIs; and (5) more accurate math optical character recognition.
Abdou Youssef
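The first scan described above can be illustrated schematically. In the sketch below, the tagset and knowledge base are tiny invented placeholders, not the project's actual resources:

```python
# Schematic illustration of a first tagging scan: each token receives
# either a definite tag or a list of tentative alternatives drawn from
# a knowledge base. Tagset and knowledge base are placeholders.
DEFINITE = {"+": "operation", "=": "relation",
            "(": "open-delimiter", ")": "close-delimiter"}
TENTATIVE = {
    "d": ["differential", "variable", "function"],
    "e": ["Euler-number", "variable"],
    "f": ["function", "variable"],
}

def first_scan(tokens):
    tagged = []
    for tok in tokens:
        if tok in DEFINITE:
            tagged.append((tok, DEFINITE[tok]))    # definite tag
        elif tok in TENTATIVE:
            tagged.append((tok, TENTATIVE[tok]))   # tentative features
        elif tok.isdigit():
            tagged.append((tok, "number"))
        else:
            tagged.append((tok, ["variable"]))     # fallback
    return tagged

# later scans would use context to pick one alternative per token
print(first_scan(["f", "(", "e", ")", "+", "d", "=", "2"]))
```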
Backmatter
Metadata
Title
Intelligent Computer Mathematics
Edited by
Herman Geuvers
Matthew England
Osman Hasan
Florian Rabe
Olaf Teschke
Copyright year
2017
Electronic ISBN
978-3-319-62075-6
Print ISBN
978-3-319-62074-9
DOI
https://doi.org/10.1007/978-3-319-62075-6