skip to main content
10.1145/1060590.1060645acmconferencesArticle/Chapter ViewAbstractPublication PagesstocConference Proceedingsconference-collections
Article

Learning nonsingular phylogenies and hidden Markov models

Published:22 May 2005Publication History

ABSTRACT

In this paper, we study the problem of learning phylogenies and hidden Markov models. We call a Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.

References

  1. N. Abe and N. K. Warmuth, On the Computational Complexity of Approximating Probability Distributions by Probabilistic Automata, Machine Learning, 9(2--3), 205--260, 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. L. Addario-Berry, B. Chor, M. Hallett, J. Lagergren, A. Panconesi and Todd Wareham, Ancestral Maximum Likelihood of Evolutionary Trees Is Hard, Journal of Bioinformatics and Computational Biology, 2, 257--271, 2004.]]Google ScholarGoogle ScholarCross RefCross Ref
  3. A. Ambainis, R. Desper, M. Farach, and S. Kannan, Nearly tight bounds on the learnability of evolution, FOCS 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. J. Cavender, Taxonomy with confidence, Mathematical Biosciences, 40, 271--280, 1978.]]Google ScholarGoogle ScholarCross RefCross Ref
  5. J. T. Chang, Full Reconstruction of Markov Models on Evolutionary Trees: Identifiability and Consistency, Mathematical Biosciences, 137, 51--73, 1996.]]Google ScholarGoogle ScholarCross RefCross Ref
  6. R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter, Probabilistic Networks and Expert Systems, Springer, New York, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. M. Cryan, L. A. Goldberg, and P. W. Goldberg, Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model, SIAM Journal on Computing, 31(2), 375--397, 2002.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.]]Google ScholarGoogle ScholarCross RefCross Ref
  9. R. Durrett, Probability: Theory and Examples, Duxbury, 1996.]]Google ScholarGoogle Scholar
  10. P. L. Erdos, M. Steel, L. Szekely, and T. Warnow, A Few Logs Suffice to Build (Almost) all Trees (I), Random Structures and Algorithms, 14, 153--184, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. P. L. Erdos, M. A. Steel, L. A. Szekely, and T. J. Warnow, A Few Logs Suffice to Build (Almost) all Trees (II), Theoretical Computer Science, 221(1--2), 77--118, 1999.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. J. Felsenstein. Inferring Phylogenies. Sinauer, New York, New York, 2004.]]Google ScholarGoogle Scholar
  13. M. Farach and S. Kannan, Efficient algorithms for inverting evolution, STOC 1996.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. J. Feldman and R. O'Donnell and R. Servedio. Learning mixtures of product distributions, preprint, 2004.]]Google ScholarGoogle Scholar
  15. J. Felsenstein, Cases in which parsimony or compatibility methods will be positively misleading, Systematic Zoology, 27, 410--410, 1978.]]Google ScholarGoogle ScholarCross RefCross Ref
  16. G. H. Golub and C. H. Van Loan, Matrix Computations, Johns Hopkins University Press, 1996.]]Google ScholarGoogle Scholar
  17. R. L. Graham and L. R. Foulds, Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time, Mathematical Biosciences, 60, 133--142, 1982.]]Google ScholarGoogle ScholarCross RefCross Ref
  18. R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1985.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. M. Kearns, Efficient noise-tolerant learning from statistical queries, STOC 1993.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. M. J. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, R. E. Schapire, and L. Sellie, On the learnability of discrete distributions, STOC 1994.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. R. B. Lyngsø and C. N. S. Pedersen, Complexity of Comparing Hidden Markov Models, ISAAC 2001.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. E. Mossel, Distorted metrics on trees and phylogenetic forests, preprint, 2004 (available at http://arxiv.org/math.CO/0403508).]]Google ScholarGoogle Scholar
  23. E. Mossel and S.Roch, Learning Nonsingular Phylogenies and Hidden Markov Models, preprint, 2005 (available at http://arxiv.org/cs.LG/0502076).]]Google ScholarGoogle Scholar
  24. L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 77(2), 257--286, 1989.]]Google ScholarGoogle ScholarCross RefCross Ref
  25. K. Rice and T. Warnow, Parsimony is Hard to Beat, Proceedings of the Third Annual International Conference on Computing and Combinatorics, Lecture Notes In Computer Science, Springer, 1276, 124 -- 133, 1997.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. D. Ron, Y. Singer, and N. Tishby, On the Learnability and Usage of Acyclic Probabilistic Finite Automata, J. Comput. Syst. Sci., 56(2), 133--152, 1998.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. C. Semple and M. Steel. Phylogenetics, volume 22 of Mathematics and its Applications series. Oxford University Press, 2003.]]Google ScholarGoogle Scholar
  28. M. Steel, Recovering a tree from the leaf colourations it generates under a Markov model, Appl. Math. Lett. 7(2), 19--24, 1994.]]Google ScholarGoogle ScholarCross RefCross Ref
  29. M. A. Steel, L. A. Szekely, and M. D. Hendy, Reconstructing trees when sequence sites evolve at variable rates, J. Comput. Biol. 1(2), 153--63, 1994.]]Google ScholarGoogle ScholarCross RefCross Ref
  30. L. G. Valiant, A theory of the learnable, Communications of the ACM, 27(11), 1134--1142, 1984.]] Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Learning nonsingular phylogenies and hidden Markov models

            Recommendations

            Comments

            Login options

            Check if you have access through your login credentials or your institution to get full access on this article.

            Sign in
            • Published in

              cover image ACM Conferences
              STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
              May 2005
              778 pages
              ISBN:1581139608
              DOI:10.1145/1060590

              Copyright © 2005 ACM

              Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 22 May 2005

              Permissions

              Request permissions about this article.

              Request Permissions

              Check for updates

              Qualifiers

              • Article

              Acceptance Rates

              Overall Acceptance Rate1,469of4,586submissions,32%

              Upcoming Conference

              STOC '24
              56th Annual ACM Symposium on Theory of Computing (STOC 2024)
              June 24 - 28, 2024
              Vancouver , BC , Canada

            PDF Format

            View or Download as a PDF file.

            PDF

            eReader

            View online with eReader.

            eReader