Abstract
A method is presented for calculating the amount of information conveyed to a hearer by a speaker emitting a sentence generated by a probabilistic grammar known to both parties. The method applies the work of Grenander (1967) to the intermediate states of a top-down parser. This allows the uncertainty about structural ambiguity to be calculated at each point in a sentence. Subtracting these values at successive points gives the information conveyed by a word in a sentence. Word-by-word information conveyed is calculated for several small probabilistic grammars, and it is suggested that the number of bits conveyed per word is a determinant of reading times and other measures of cognitive load.
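The subtraction step can be illustrated with a minimal sketch. It is not the method of the paper: instead of applying Grenander's (1967) results to the states of a top-down parser, it enumerates every derivation of a small non-recursive grammar by brute force, so it only works when the language is finite. The grammar, its rule probabilities, and the function names are hypothetical and chosen for illustration only.

```python
import math

# Hypothetical toy grammar (an assumption, not taken from the paper):
# nonterminal -> list of (rule probability, right-hand side).
# It is non-recursive, so the set of derivations is finite.
GRAMMAR = {
    "S": [(1.0, ("NP", "VP"))],
    "NP": [(0.5, ("the", "dog")), (0.5, ("the", "cat"))],
    "VP": [(0.5, ("sleeps",)), (0.25, ("sees", "NP")), (0.25, ("barks",))],
}


def derivations(symbols):
    """Yield (probability, words, rules) for each leftmost derivation of `symbols`."""
    if not symbols:
        yield 1.0, (), ()
        return
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:  # terminal symbol: emit it and continue
        for p, words, rules in derivations(rest):
            yield p, (first,) + words, rules
        return
    for rule_p, rhs in GRAMMAR[first]:  # nonterminal: try every rewrite
        for p, words, rules in derivations(rhs + rest):
            yield rule_p * p, words, ((first, rhs),) + rules


def entropy(probs):
    """Shannon entropy in bits of the distribution obtained by renormalizing `probs`."""
    total = sum(probs)
    return -sum((p / total) * math.log2(p / total) for p in probs if p > 0)


def prefix_entropies(sentence):
    """Uncertainty over derivations after each prefix of `sentence` (index 0 = no words)."""
    all_derivs = list(derivations(("S",)))
    result = []
    for i in range(len(sentence) + 1):
        prefix = tuple(sentence[:i])
        compatible = [p for p, words, _ in all_derivs if words[:i] == prefix]
        if not compatible:
            raise ValueError(f"prefix {prefix!r} is not generated by the grammar")
        result.append(entropy(compatible))
    return result


def information_per_word(sentence):
    """Difference between the uncertainty before and after each word, in bits."""
    h = prefix_entropies(sentence)
    return [h[i] - h[i + 1] for i in range(len(sentence))]


print(information_per_word(["the", "dog", "sleeps"]))  # [0.0, 1.0, 1.75]
```

In this toy grammar every sentence begins with "the", so that word conveys no information, while "dog" settles a one-bit choice of subject and "sleeps" resolves the remaining 1.75 bits of uncertainty about the verb phrase.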
References
Bever, T. G. (1970). The cognitive basis for linguistic structures. In J. Hayes (Ed.), Cognition and the development of language (pp. 279-362). New York: Wiley.
Billot, S., & Lang, B. (1989). The structure of shared forests in ambiguous parsing. In Proceedings of the 1989 Meeting of the Association for Computational Linguistics. Vancouver.
Charniak, E. (1993). Statistical language learning. MIT Press.
Charniak, E., & Goldman, R. P. (1993). A Bayesian model of plan recognition. Artificial Intelligence, 64, 53-79.
Chomsky, N. (1956). Three models for the description of language. IRE Transactions on Information Theory, 2, 113-124.
Chomsky, N. (1965). Aspects of the theory of syntax. Cambridge MA: MIT Press.
Cover, T. M., & Thomas, J. A. (1991). Elements of information theory. John Wiley and Sons.
Crain, S., & Fodor, J. D. (1985). How can grammars help parsers? In D. R. Dowty, L. Karttunen, & A. M. Zwicky (Eds.), Natural language parsing: Psychological, computational and theoretical perspectives (pp. 94-127). Cambridge: Cambridge University Press.
Den, Y., & Inoue, M. (1997). Disambiguation with verb-predictability: Evidence from Japanese garden-path phenomena. In Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society (pp. 179-184). Lawrence Erlbaum.
Ellis, C. A. (1969). Probabilistic languages and automata. Unpublished doctoral dissertation, University of Illinois, Urbana.
Elman, J. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Feldman, J. A., & Ballard, D. H. (1982). Connectionist models and their properties. Cognitive Science, 6, 205-254.
Frazier, L., & Clifton, C., Jr. (1996). Construal. Cambridge, MA: MIT Press.
Garnsey, S. M., Pearlmutter, N. J., Myers, E., & Lotocky, M. A. (1997). The contributions of verb bias and plausibility to the comprehension of temporarily ambiguous sentences. Journal of Memory and Language, 37, 58-93.
Gazdar, G., Klein, E., Pullum, G., & Sag, I. (1985). Generalized phrase structure grammar. Cambridge, MA: Harvard University Press.
Gibson, E. (1991). A computational theory of human linguistic processing: Memory limitations and processing breakdown. Unpublished doctoral dissertation, Pittsburgh, PA: Carnegie Mellon University.
Gibson, E. (1998). Linguistic complexity: Locality of syntactic dependencies. Cognition, 68, 1-76.
Gibson, E. (2000). The dependency locality theory: A distance-based theory of linguistic complexity. In Y. Miyashita, A. Marantz, & W. O'Neil (Eds.), Image, language, brain. Cambridge, MA: MIT Press.
Gibson, E., & Pearlmutter, N. J. (1998). Constraints on sentence processing. Trends in Cognitive Sciences, 2, 262-268.
Grenander, U. (1967). Syntax-controlled probabilities (Tech. Rep.). Providence, RI: Brown University Division of Applied Mathematics.
Grodner, D., Watson, D., & Gibson, E. (2000). Locality effects on sentence processing. In Thirteenth Annual CUNY Conference on Human Sentence Processing. San Diego, CA. (Talk presented at CUNY 2000).
Harris, T. (1963). The theory of branching processes. New York: Springer-Verlag.
Huang, T., & Fu, K. (1971). On stochastic context-free languages. Information Sciences, 3, 201-224.
Inoue, M., & Den, Y. (1999). Influence of verb-predictability on ambiguity resolution in Japanese. Poster presented at the 16th Annual Meeting of the Japanese Cognitive Science Society.
Jensen, F. V. (1996). An introduction to Bayesian Networks. London: University College London Press.
Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science, 20, 137-194.
Jurafsky, D., & Martin, J. H. (2000). Speech and language processing: An introduction to natural language processing, computational linguistics and speech recognition. Upper Saddle River, NJ: Prentice-Hall.
Kurtzman, H. S. (1985). Studies in syntactic ambiguity resolution. Unpublished doctoral dissertation, MIT.
Lang, B. (1974). Deterministic techniques for efficient non-deterministic parsers. In J. Loeckx (Ed.), Proceedings of the 2nd Colloquium on Automata, Languages and Programming (pp. 255-269). Saarbrücken.
Lang, B. (1988). Parsing incomplete sentences. In Proceedings of the 12th International Conference on Computational Linguistics (pp. 365-371). Budapest.
Legendre, G., Miyata, Y., & Smolensky, P. (1990a). Harmonic grammar—a formal multilevel connectionist theory of linguistic well-formedness: An application. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 884-891). Cambridge, MA: Erlbaum.
Legendre, G., Miyata, Y., & Smolensky, P. (1990b). Harmonic grammar—a formal multilevel connectionist theory of linguistic well-formedness: Theoretical foundations. In Proceedings of the Twelfth Annual Conference of the Cognitive Science Society (pp. 388-395). Cambridge, MA: Erlbaum.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). Lexical nature of syntactic ambiguity resolution. Psychological Review, 101, 676-703.
Manning, C. D., & Schütze, H. (2000). Foundations of statistical natural language processing. Cambridge, MA: MIT Press.
McClelland, J., & St. John, M. (1989). Sentence comprehension: A PDP approach. Language and Cognitive Processes, 4, 287-336.
McClelland, J. L., & Kawamoto, A. H. (1986). Mechanisms of sentence processing: Assigning roles to constituents of sentences. In Parallel distributed processing: Explorations in the microstructure of cognition (pp. 272-325). Cambridge, MA: MIT Press.
Mitchell, D. C., Cuetos, F., Corley, M. M., & Brysbaert, M. (1995). Exposure-based models of human parsing: Evidence for the use of coarse-grained (nonlexical) statistical records. Journal of Psycholinguistic Research, 24, 469-488.
Narayanan, S., & Jurafsky, D. (1998). Bayesian models of human sentence processing. In Proceedings of the 19th Annual Conference of the Cognitive Science Society. University of Wisconsin-Madison.
Narayanan, S., & Jurafsky, D. (2002). A Bayesian model predicts human parse preference and reading time in sentence processing. In T. G. Dietterich, S. Becker, & Z. Ghahramani (Eds.), Advances in neural information processing systems 14. Cambridge, MA: MIT Press.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann.
Pritchett, B. (1988). The grammatical basis of language processing. Language, 64, 539-576.
Rodriguez, P. F. (1999). Mathematical foundations of simple recurrent networks in language processing. Unpublished doctoral dissertation, University of California-San Diego.
Rohde, D. L. (2002). A connectionist model of sentence comprehension and production. Unpublished doctoral dissertation, Carnegie Mellon University.
Rumelhart, D. E., McClelland, J., & PDP Research Group. (1986). Parallel distributed processing: Explorations in the microstructure of cognition. Cambridge, MA: MIT Press.
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423, 623-656.
Smolensky, P. (1986). Information processing in dynamical systems: Foundations of harmony theory. In Parallel distributed processing: Explorations in the microstructure of cognition. Cambridge, MA: MIT Press.
Steedman, M. (1999). Connectionist sentence processing in perspective. Cognitive Science, 23, 615-634.
Steedman, M. (2000). The syntactic process. Cambridge, MA: MIT Press.
St. John, M. F., & McClelland, J. L. (1990). Learning and applying contextual constraints in sentence comprehension. Artificial Intelligence, 46, 217-257.
Sturt, P., Pickering, M. J., & Crocker, M. W. (1999). Structural change and reanalysis difficulty in language comprehension. Journal of Memory and Language, 40, 136-150.
Suppes, P. (1970). Probabilistic grammars for natural language. Synthese, 22, 95-116.
Tabor, W. (2000). Fractal encoding of context free grammars in connectionist networks. Expert Systems: The International Journal of Knowledge Engineering and Neural Networks, 17, 41-56.
Tabor, W., Juliano, C., & Tanenhaus, M. (1997). Parsing in a dynamical system: An attractor-based account of the interaction of lexical and structural constraints in sentence processing. Language and Cognitive Processes, 12, 211-271.
Tabor, W., & Tanenhaus, M. K. (2001). Dynamical systems for sentence processing. In Connectionist psycholinguistics: Capturing the empirical data. Westport, CT: Ablex.
Tanenhaus, M., Spivey-Knowlton, M., Eberhard, K., & Sedivy, J. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268, 1632-1634.
Tanenhaus, M. K., & Trueswell, J. C. (1995). Sentence comprehension. In J. L. Miller & P. D. Eimas (Eds.), Speech, language and communication (2nd ed., Vol. 11, pp. 217-262). San Diego, CA: Academic Press.
Trueswell, J. C. (1996). The role of lexical frequency in syntactic ambiguity resolution. Journal of Memory and Language, 35, 566-585.
Trueswell, J. C., Tanenhaus, M. K., & Garnsey, S. M. (1994). Semantic influences on parsing: Use of thematic role information in syntactic disambiguation. Journal of Memory and Language, 33, 285-318.
Yngve, V. H. (1960). A model and an hypothesis for language structure. In Proceedings of the American Philosophical Society (Vol. 104, pp. 444-466). Philadelphia.
Cite this article
Hale, J. The Information Conveyed by Words in Sentences. J Psycholinguist Res 32, 101–123 (2003). https://doi.org/10.1023/A:1022492123056