skip to main content
article

Fast context-free grammar parsing requires fast boolean matrix multiplication

Published:01 January 2002Publication History
Skip Abstract Section

Abstract

In 1975, Valiant showed that Boolean matrix multiplication can be used for parsing context-free grammars (CFGs), yielding the asympotically fastest (although not practical) CFG parsing algorithm known. We prove a dual result: any CFG parser with time complexity O(gn3-∈), where g is the size of the grammar and n is the length of the input string, can be efficiently converted into an algorithm to multiply m × m Boolean matrices in time O(m3-∈/3). Given that practical, substantially subcubic Boolean matrix multiplication algorithms have been quite difficult to find, we thus explain why there has been little progress in developing practical, substantially subcubic general CFG parsers. In proving this result, we also develop a formalization of the notion of parsing.

References

  1. AHO, A. V., SETHI, R., AND ULLMAN, J. D. 1986. Compilers: Principles, Techniques and Tools. Addison-Wesley, Reading, Mass. Google ScholarGoogle Scholar
  2. ARLAZAROV, V. L., DINIC, E. A., KRONROD, M. A., AND FARADZEV, I. A. 1970. On economical construction of the transitive closure of an oriented graph. Soviet Math. Dokl. 11, 1209-1210. (English translation of Russian article in Dokl. Akad. Nauk SSSR 194 (1970).)Google ScholarGoogle Scholar
  3. BAILEY, D. 1988. Extra high speed matrix multiplication on the Cray-2. SIAM J. Sci. Stat. Comput. 9, 3, 603-607. Google ScholarGoogle Scholar
  4. CHOMSKY, N. 1956. Three models for the description of language. IRE Trans. Inf. Theory 2, 3, 113-124.Google ScholarGoogle Scholar
  5. COPPERSMITH, D., AND WINOGRAD, S. 1987. Matrix multiplication via arithmetic progression. In Proceedings of the ACM Annual Symposium on the Theory of Computing. ACM, New York, pp. 1-6. Google ScholarGoogle Scholar
  6. COPPERSMITH, D., AND WINOGRAD, S. 1990. Matrix multiplication via arithmetic progression. J. Symb. Comput. 9, 3, 251-280 (Special Issue on Computational Algebraic Complexity). Google ScholarGoogle Scholar
  7. DURBIN, R., EDDY, S., KROGH, A., AND MITCHISON, G. 1998. Biological Sequence Analysis. Cambridge University Press, Cambridge, Mass.Google ScholarGoogle Scholar
  8. EARLEY, J. 1970. An efficient context-free parsing algorithm. Communi. ACM 13, 2 (Feb.), 94-102. Google ScholarGoogle Scholar
  9. GALLAIRE, H. 1969. Recognition time of context-free languages by on-line Turing machines. Inf. Cont. 15, 3 (Sept.), 288-295.Google ScholarGoogle Scholar
  10. GRAHAM, S. L., HARRISON, M. A., AND RUZZO, W. L. 1980. An improved context-free recognizer. ACM Trans. Prog. Lang. Syst. 2, 3, 415-462. Google ScholarGoogle Scholar
  11. HARRISON, M., AND HAVEL, I. 1974. On the parsing of deterministic languages. J. ACM 21, 4 (Oct.), 525-548. Google ScholarGoogle Scholar
  12. HARRISON, M. A. 1978. Introduction to Formal Language Theory. Addison-Wesley, Reading, Mass. Google ScholarGoogle Scholar
  13. HARTMANIS, J., AND STEARNS, R. E. 1965. On the computational complexity of algorithms. Trans. AMS 117, 285-306.Google ScholarGoogle Scholar
  14. HOPCROFT, J. E., AND ULLMAN, J. D. 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass. Google ScholarGoogle Scholar
  15. JOHNSON, M. 1998. PCFG models of linguistic tree representations. Computat. Ling. 24, 4, 613-632. Google ScholarGoogle Scholar
  16. JOSHI, A. 1997. Parsing techniques. In Survey of the State of the Art in Human Language Technology, Ronald Cole, Joseph Mariani, Hans Uszkoreit, Giovanni Battista Varile, Annie. Zaenen, Antonio. Zampolli, and Victor Zue, Eds. Studies in Natural Language Processing. Cambridge University Press, Cambridge, Mass, pp. 351-356. Google ScholarGoogle Scholar
  17. JOSHI, A. K., LEVY, L. S., AND TAKAHASHI, M. 1975. Tree adjunct grammars. J. Comput. Syst. Sci. 10, 1, 136-163.Google ScholarGoogle Scholar
  18. JURAFSKY, D., AND MARTIN, J. H. 2000. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Prentice Hall series in Artificial Intelligence. Prentice-Hall, Englewood Cliffs, N. J. Google ScholarGoogle Scholar
  19. KASAMI, T. 1965. An efficient recognition and syntax algorithm for context-free languages. Scientific Rep. AFCRL-65-758. Air Force Cambridge Research Lab, Bedford, Mass.Google ScholarGoogle Scholar
  20. NEDERHOF, M. J., AND SATTA, G. 1996. Efficient tabular LR parsing. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics. pp. 239-246. Google ScholarGoogle Scholar
  21. RAJASEKARAN, S., AND YOOSEPH, S. 1998. TAL recognition in O(M(n 2 )) time. J. Comput. Syst. Sci. 56, 1, 83-89. Google ScholarGoogle Scholar
  22. RUZZO, W. L. 1979. On the complexity of general context-free language parsing and recognition. In Proceedings of the 6th Colloquium on Automata, Languages and Programming (ICALP), Lecture Notes in Computer Science, vol. 71. Springer-Verlag, New York, pp. 489-497. Google ScholarGoogle Scholar
  23. RYTTER, W. 1985. Fast recognition of pushdown automaton and context-free languages. Inf. Cont. 67, 12-22. Google ScholarGoogle Scholar
  24. RYTTER, W. 1995. Context-free recognition via shortest paths computation: A version of Valiant's algorithm. Theoret. Comput. Sci. 143, 2, 343-352. Google ScholarGoogle Scholar
  25. SATTA, G. 1994. Tree-adjoining grammar parsing and Boolean matrix multiplication. Comput. Ling. 20, 2, 173-191. Google ScholarGoogle Scholar
  26. SEIFERAS, J. 1986. A simplified lower bound for context-free-language recognition. Inf. Cont. 69: 255-260. Google ScholarGoogle Scholar
  27. STRASSEN, V. 1969. Gaussian elimination is not optimal. Num. Math. 14, 3, 354-356.Google ScholarGoogle Scholar
  28. STRASSEN, V. 1990. Algebraic complexity theory. In Handbook of Theoretical Computer Science, vol. A, Jan van Leeuwen, Ed. Elsevier Science Publishers, Amsterdam, The Netherlands, pp. 633-672. Google ScholarGoogle Scholar
  29. THOTTETHODI, MITHUNA, T., CHATTERJEE, S., AND LEBECK, A. R. 1998. Tuning Strassen's matrix multiplication for memory efficiency. In SC98: High Performance Networking and Computing Conference. Google ScholarGoogle Scholar
  30. VALIANT, L. G. 1975. General context-free recognition in less than cubic time. J. Comput. Syst. Sci. 10, 308-315.Google ScholarGoogle Scholar
  31. YOUNGER, D. H. 1967. Recognition and parsing of context-free languages in time n 3 . Inf. Cont. 10, 2, 189-208.Google ScholarGoogle Scholar

Index Terms

  1. Fast context-free grammar parsing requires fast boolean matrix multiplication

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader