Abstract
In 1975, Valiant showed that Boolean matrix multiplication can be used for parsing context-free grammars (CFGs), yielding the asympotically fastest (although not practical) CFG parsing algorithm known. We prove a dual result: any CFG parser with time complexity O(gn3-∈), where g is the size of the grammar and n is the length of the input string, can be efficiently converted into an algorithm to multiply m × m Boolean matrices in time O(m3-∈/3). Given that practical, substantially subcubic Boolean matrix multiplication algorithms have been quite difficult to find, we thus explain why there has been little progress in developing practical, substantially subcubic general CFG parsers. In proving this result, we also develop a formalization of the notion of parsing.
- AHO, A. V., SETHI, R., AND ULLMAN, J. D. 1986. Compilers: Principles, Techniques and Tools. Addison-Wesley, Reading, Mass. Google Scholar
- ARLAZAROV, V. L., DINIC, E. A., KRONROD, M. A., AND FARADZEV, I. A. 1970. On economical construction of the transitive closure of an oriented graph. Soviet Math. Dokl. 11, 1209-1210. (English translation of Russian article in Dokl. Akad. Nauk SSSR 194 (1970).)Google Scholar
- BAILEY, D. 1988. Extra high speed matrix multiplication on the Cray-2. SIAM J. Sci. Stat. Comput. 9, 3, 603-607. Google Scholar
- CHOMSKY, N. 1956. Three models for the description of language. IRE Trans. Inf. Theory 2, 3, 113-124.Google Scholar
- COPPERSMITH, D., AND WINOGRAD, S. 1987. Matrix multiplication via arithmetic progression. In Proceedings of the ACM Annual Symposium on the Theory of Computing. ACM, New York, pp. 1-6. Google Scholar
- COPPERSMITH, D., AND WINOGRAD, S. 1990. Matrix multiplication via arithmetic progression. J. Symb. Comput. 9, 3, 251-280 (Special Issue on Computational Algebraic Complexity). Google Scholar
- DURBIN, R., EDDY, S., KROGH, A., AND MITCHISON, G. 1998. Biological Sequence Analysis. Cambridge University Press, Cambridge, Mass.Google Scholar
- EARLEY, J. 1970. An efficient context-free parsing algorithm. Communi. ACM 13, 2 (Feb.), 94-102. Google Scholar
- GALLAIRE, H. 1969. Recognition time of context-free languages by on-line Turing machines. Inf. Cont. 15, 3 (Sept.), 288-295.Google Scholar
- GRAHAM, S. L., HARRISON, M. A., AND RUZZO, W. L. 1980. An improved context-free recognizer. ACM Trans. Prog. Lang. Syst. 2, 3, 415-462. Google Scholar
- HARRISON, M., AND HAVEL, I. 1974. On the parsing of deterministic languages. J. ACM 21, 4 (Oct.), 525-548. Google Scholar
- HARRISON, M. A. 1978. Introduction to Formal Language Theory. Addison-Wesley, Reading, Mass. Google Scholar
- HARTMANIS, J., AND STEARNS, R. E. 1965. On the computational complexity of algorithms. Trans. AMS 117, 285-306.Google Scholar
- HOPCROFT, J. E., AND ULLMAN, J. D. 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass. Google Scholar
- JOHNSON, M. 1998. PCFG models of linguistic tree representations. Computat. Ling. 24, 4, 613-632. Google Scholar
- JOSHI, A. 1997. Parsing techniques. In Survey of the State of the Art in Human Language Technology, Ronald Cole, Joseph Mariani, Hans Uszkoreit, Giovanni Battista Varile, Annie. Zaenen, Antonio. Zampolli, and Victor Zue, Eds. Studies in Natural Language Processing. Cambridge University Press, Cambridge, Mass, pp. 351-356. Google Scholar
- JOSHI, A. K., LEVY, L. S., AND TAKAHASHI, M. 1975. Tree adjunct grammars. J. Comput. Syst. Sci. 10, 1, 136-163.Google Scholar
- JURAFSKY, D., AND MARTIN, J. H. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall series in Artificial Intelligence. Prentice-Hall, Englewood Cliffs, N. J. Google Scholar
- KASAMI, T. 1965. An efficient recognition and syntax algorithm for context-free languages. Scientific Rep. AFCRL-65-758. Air Force Cambridge Research Lab, Bedford, Mass.Google Scholar
- NEDERHOF, M. J., AND SATTA, G. 1996. Efficient tabular LR parsing. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics. pp. 239-246. Google Scholar
- RAJASEKARAN, S., AND YOOSEPH, S. 1998. TAL recognition in O(M(n 2 )) time. J. Comput. Syst. Sci. 56, 1, 83-89. Google Scholar
- RUZZO, W. L. 1979. On the complexity of general context-free language parsing and recognition. In Proceedings of the 6th Colloquium on Automata, Languages and Programming (ICALP), Lecture Notes in Computer Science, vol. 71. Springer-Verlag, New York, pp. 489-497. Google Scholar
- RYTTER, W. 1985. Fast recognition of pushdown automaton and context-free languages. Inf. Cont. 67, 12-22. Google Scholar
- RYTTER, W. 1995. Context-free recognition via shortest paths computation: A version of Valiant's algorithm. Theoret. Comput. Sci. 143, 2, 343-352. Google Scholar
- SATTA, G. 1994. Tree-adjoining grammar parsing and Boolean matrix multiplication. Comput. Ling. 20, 2, 173-191. Google Scholar
- SEIFERAS, J. 1986. A simplified lower bound for context-free-language recognition. Inf. Cont. 69: 255-260. Google Scholar
- STRASSEN, V. 1969. Gaussian elimination is not optimal. Num. Math. 14, 3, 354-356.Google Scholar
- STRASSEN, V. 1990. Algebraic complexity theory. In Handbook of Theoretical Computer Science, vol. A, Jan van Leeuwen, Ed. Elsevier Science Publishers, Amsterdam, The Netherlands, pp. 633-672. Google Scholar
- THOTTETHODI, MITHUNA, T., CHATTERJEE, S., AND LEBECK, A. R. 1998. Tuning Strassen's matrix multiplication for memory efficiency. In SC98: High Performance Networking and Computing Conference. Google Scholar
- VALIANT, L. G. 1975. General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10, 308-315.Google Scholar
- YOUNGER, D. H. 1967. Recognition and parsing of context-free languages in time n 3 . Inf. Cont. 10, 2, 189-208.Google Scholar
Index Terms
- Fast context-free grammar parsing requires fast boolean matrix multiplication
Recommendations
Fast context-free parsing requires fast Boolean matrix multiplication
ACL '98/EACL '98: Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational LinguisticsValiant showed that Boolean matrix multiplication (BMM) can be used for CFG parsing. We prove a dual result: CFG parsers running in time O(|G||w|3-e) on a grammar G and a string w can be used to multiply m x m Boolean matrices in tmie O(m3-e/3). In the ...
Parsing linear context-free rewriting systems with fast matrix multiplication
We describe a recognition algorithm for a subset of binary linear context-free rewriting systems LCFRS with running time Onωd where Mm = Omω is the running time for m × m matrix multiplication and d is the "contact rank" of the LCFRS-the maximal number ...
Parsing Arabic using induced probabilistic context free grammar
The importance of the parsing task for NLP applications is well understood. However developing parsers remains difficult because of the complexity of the Arabic language. Most parsers are based on syntactic grammars that describe the syntactic ...
Comments