Article

Learning nonsingular phylogenies and hidden Markov models

Authors:
Elchanan Mossel

University of California - Berkeley, Berkeley, CA

University of California - Berkeley, Berkeley, CA
View Profile

,
Sébastien Roch

University of California - Berkeley, Berkeley, CA

University of California - Berkeley, Berkeley, CA
View Profile

STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computingMay 2005Pages 366–375https://doi.org/10.1145/1060590.1060645

Published:22 May 2005Publication History

STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing

Pages 366–375

ABSTRACT

In this paper, we study the problem of learning phylogenies and hidden Markov models. We call a Markov model nonsingular if all transition matrices have determinants bounded away from 0 (and 1). We highlight the role of the nonsingularity condition for the learning problem. Learning hidden Markov models without the nonsingularity condition is at least as hard as learning parity with noise. On the other hand, we give a polynomial-time algorithm for learning nonsingular phylogenies and hidden Markov models.

References

N. Abe and N. K. Warmuth, On the Computational Complexity of Approximating Probability Distributions by Probabilistic Automata, Machine Learning, 9(2--3), 205--260, 1992.]] Google ScholarDigital Library
L. Addario-Berry, B. Chor, M. Hallett, J. Lagergren, A. Panconesi and Todd Wareham, Ancestral Maximum Likelihood of Evolutionary Trees Is Hard, Journal of Bioinformatics and Computational Biology, 2, 257--271, 2004.]]Google ScholarCross Ref
A. Ambainis, R. Desper, M. Farach, and S. Kannan, Nearly tight bounds on the learnability of evolution, FOCS 1997.]] Google ScholarDigital Library
J. Cavender, Taxonomy with confidence, Mathematical Biosciences, 40, 271--280, 1978.]]Google ScholarCross Ref
J. T. Chang, Full Reconstruction of Markov Models on Evolutionary Trees: Identifiability and Consistency, Mathematical Biosciences, 137, 51--73, 1996.]]Google ScholarCross Ref
R. G. Cowell, A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter, Probabilistic Networks and Expert Systems, Springer, New York, 1999.]] Google ScholarDigital Library
M. Cryan, L. A. Goldberg, and P. W. Goldberg, Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model, SIAM Journal on Computing, 31(2), 375--397, 2002.]] Google ScholarDigital Library
R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.]]Google ScholarCross Ref
R. Durrett, Probability: Theory and Examples, Duxbury, 1996.]]Google Scholar
P. L. Erdos, M. Steel, L. Szekely, and T. Warnow, A Few Logs Suffice to Build (Almost) all Trees (I), Random Structures and Algorithms, 14, 153--184, 1997.]] Google ScholarDigital Library
P. L. Erdos, M. A. Steel, L. A. Szekely, and T. J. Warnow, A Few Logs Suffice to Build (Almost) all Trees (II), Theoretical Computer Science, 221(1--2), 77--118, 1999.]] Google ScholarDigital Library
J. Felsenstein. Inferring Phylogenies. Sinauer, New York, New York, 2004.]]Google Scholar
M. Farach and S. Kannan, Efficient algorithms for inverting evolution, STOC 1996.]] Google ScholarDigital Library
J. Feldman and R. O'Donnell and R. Servedio. Learning mixtures of product distributions, preprint, 2004.]]Google Scholar
J. Felsenstein, Cases in which parsimony or compatibility methods will be positively misleading, Systematic Zoology, 27, 410--410, 1978.]]Google ScholarCross Ref
G. H. Golub and C. H. Van Loan, Matrix Computations, Johns Hopkins University Press, 1996.]]Google Scholar
R. L. Graham and L. R. Foulds, Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time, Mathematical Biosciences, 60, 133--142, 1982.]]Google ScholarCross Ref
R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1985.]] Google ScholarDigital Library
M. Kearns, Efficient noise-tolerant learning from statistical queries, STOC 1993.]] Google ScholarDigital Library
M. J. Kearns, Y. Mansour, D. Ron, R. Rubinfeld, R. E. Schapire, and L. Sellie, On the learnability of discrete distributions, STOC 1994.]] Google ScholarDigital Library
R. B. Lyngsø and C. N. S. Pedersen, Complexity of Comparing Hidden Markov Models, ISAAC 2001.]] Google ScholarDigital Library
E. Mossel, Distorted metrics on trees and phylogenetic forests, preprint, 2004 (available at http://arxiv.org/math.CO/0403508).]]Google Scholar
E. Mossel and S.Roch, Learning Nonsingular Phylogenies and Hidden Markov Models, preprint, 2005 (available at http://arxiv.org/cs.LG/0502076).]]Google Scholar
L. R. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 77(2), 257--286, 1989.]]Google ScholarCross Ref
K. Rice and T. Warnow, Parsimony is Hard to Beat, Proceedings of the Third Annual International Conference on Computing and Combinatorics, Lecture Notes In Computer Science, Springer, 1276, 124 -- 133, 1997.]] Google ScholarDigital Library
D. Ron, Y. Singer, and N. Tishby, On the Learnability and Usage of Acyclic Probabilistic Finite Automata, J. Comput. Syst. Sci., 56(2), 133--152, 1998.]] Google ScholarDigital Library
C. Semple and M. Steel. Phylogenetics, volume 22 of Mathematics and its Applications series. Oxford University Press, 2003.]]Google Scholar
M. Steel, Recovering a tree from the leaf colourations it generates under a Markov model, Appl. Math. Lett. 7(2), 19--24, 1994.]]Google ScholarCross Ref
M. A. Steel, L. A. Szekely, and M. D. Hendy, Reconstructing trees when sequence sites evolve at variable rates, J. Comput. Biol. 1(2), 153--63, 1994.]]Google ScholarCross Ref
L. G. Valiant, A theory of the learnable, Communications of the ACM, 27(11), 1134--1142, 1984.]] Google ScholarDigital Library

Index Terms

Learning nonsingular phylogenies and hidden Markov models

Recommendations

Coding with partially hidden Markov models
DCC '95: Proceedings of the Conference on Data Compression

Partially hidden Markov models (PHMM) are introduced. They are a variation of the hidden Markov models (HMM) combining the power of explicit conditioning on past observations and the power of using hidden states. (P)HMM may be combined with arithmetic ...
Read More
Structural hidden Markov models: An application to handwritten numeral recognition

We introduce in this paper a generalization of the widely used hidden Markov models (HMM's), which we name "structural hidden Markov models" (SHMM). Our approach is motivated by the need of modeling complex structures which are encountered in many ...
Read More
Joint semi-supervised learning of Hidden Conditional Random Fields and Hidden Markov Models

Although semi-supervised learning has generated great interest for designing classifiers on static patterns, there has been comparatively fewer works on semi-supervised learning for structured outputs and in particular for sequences. We investigate semi-...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Published in
STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
May 2005
778 pages
ISBN:1581139608
DOI:10.1145/1060590
General Chair:
Hal Gabow
University of Colorado
,
Program Chair:
Ronald Fagin
IBM Almaden Research Center
Copyright © 2005 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 22 May 2005
Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
PAC learning
evolutionary trees
hidden Markov models
phylogenetic reconstruction
Qualifiers
- Article
Conference

Acceptance Rates
Overall Acceptance Rate1,469of4,586submissions,32%
Upcoming Conference
STOC '24

Sponsor:

sigact

56th Annual ACM Symposium on Theory of Computing (STOC 2024)

June 24 - 28, 2024

Vancouver , BC , Canada
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 61
  Total Citations
  View Citations
- 494
  Total Downloads
- Downloads (Last 12 months)17
- Downloads (Last 6 weeks)5
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Learning nonsingular phylogenies and hidden Markov models

STOC '05: Proceedings of the thirty-seventh annual ACM symposium on Theory of computing

ABSTRACT

References

Cited By

Index Terms

Recommendations

Coding with partially hidden Markov models

Structural hidden Markov models: An application to handwritten numeral recognition

Joint semi-supervised learning of Hidden Conditional Random Fields and Hidden Markov Models