Cognition

Volume 164, July 2017, Pages 116-143

Bootstrapping language acquisition

Parts of this work were previously presented at EMNLP 2010 and EACL 2012, and appeared in Kwiatkowski’s doctoral dissertation.
https://doi.org/10.1016/j.cognition.2017.02.009

Highlights

  • Computational implementation of the Semantic Bootstrapping Hypothesis.

  • Joint Bayesian modeling of the acquisition of word meanings and syntax.

  • The model performs incremental learning over naturalistic child-directed utterances.

  • The model provides a unified account for a range of developmental phenomena.

  • Simulations predict experimental results on syntactic bootstrapping.

Abstract

The semantic bootstrapping hypothesis proposes that children acquire their native language through exposure to sentences of the language paired with structured representations of their meaning, whose component substructures can be associated with words and syntactic structures used to express these concepts. The child’s task is then to learn a language-specific grammar and lexicon based on (probably contextually ambiguous, possibly somewhat noisy) pairs of sentences and their meaning representations (logical forms).

Starting from these assumptions, we develop a Bayesian probabilistic account of semantically bootstrapped first-language acquisition in the child, based on techniques from computational parsing and interpretation of unrestricted text. Our learner jointly models (a) word learning: the mapping between components of the given sentential meaning and lexical words (or phrases) of the language, and (b) syntax learning: the projection of lexical elements onto sentences by universal construction-free syntactic rules. Using an incremental learning algorithm, we apply the model to a dataset of real syntactically complex child-directed utterances and (pseudo) logical forms, the latter including contextually plausible but irrelevant distractors. Taking the Eve section of the CHILDES corpus as input, the model simulates several well-documented phenomena from the developmental literature. In particular, the model exhibits syntactic bootstrapping effects (in which previously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the “vocabulary spurt”), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings. The learner thus demonstrates how statistical learning over structured representations can provide a unified account for these seemingly disparate phenomena.

Introduction

One of the fundamental challenges facing a child language learner is the problem of generalizing beyond the input. Using various social and other extralinguistic cues, a child may be able to work out the meaning of particular utterances they hear, like “you read the book” or “Eve will read Lassie”, if these are encountered in the appropriate contexts. But merely memorizing and reproducing earlier utterances is not enough: children must also somehow use these experiences to learn to produce and interpret novel utterances, like “you read Lassie” and “show me the book”. There are many proposals for how this might be achieved, but abstractly speaking it seems to require the ability to explicitly or implicitly (a) decompose the utterance’s form into syntactic units, (b) decompose the utterance’s meaning into semantic units, (c) learn lexical mappings between these syntactic and semantic units, and (d) learn the language-specific patterns that guide their recombination (so that e.g. “Eve will read Lassie to Fraser”, “will Eve read Fraser Lassie?”, and “will Fraser read Eve Lassie?” have different meanings, despite using the same or nearly the same words). A further challenge is that even in child-directed speech, many sentences are more complex than “you read Lassie”; the child’s input consists of a mixture of high- and low-frequency words falling into a variety of syntactic categories and arranged into a variety of more or less complex syntactic constructions.
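
To make requirements (a) through (d) concrete, the following toy sketch (ours, not the model developed in this paper; the lexicon and the interpret helper are hypothetical) shows how word-level pairings recovered from utterances like "you read the book" can be recombined to interpret the novel string "you read Lassie":

```python
# Schematic illustration only: suppose the learner has already factored earlier
# utterances into word-level form/meaning pairs. Those pairs can then be
# recombined to interpret a novel utterance.

# Hypothetical toy lexicon: word -> (syntactic category, semantic constant)
LEXICON = {
    "you":    ("NP",          "you'"),
    "Eve":    ("NP",          "eve'"),
    "Lassie": ("NP",          "lassie'"),
    "the":    ("NP/N",        "def'"),
    "book":   ("N",           "book'"),
    "read":   ("(S\\NP)/NP",  "read'"),
}

def interpret(words):
    """Tiny combinator for subject-verb-object strings (object optionally 'the' + noun)."""
    subj, verb, *obj = words
    _, subj_sem = LEXICON[subj]
    _, verb_sem = LEXICON[verb]
    if len(obj) == 2:                      # e.g. "the book": apply determiner to noun
        obj_sem = f"{LEXICON[obj[0]][1]}({LEXICON[obj[1]][1]})"
    else:                                  # bare name, e.g. "Lassie"
        obj_sem = LEXICON[obj[0]][1]
    # A transitive verb (S\NP)/NP applies first to its object, then to its subject.
    return f"{verb_sem}({obj_sem})({subj_sem})"

print(interpret("you read the book".split()))  # read'(def'(book'))(you')
print(interpret("you read Lassie".split()))    # read'(lassie')(you')  (a novel combination)
```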

In this work, we present a Bayesian language-learning model focused on the acquisition of compositional syntax and semantics in an incremental, naturalistic setting. That is, our model receives training examples consisting of whole utterances paired with noisy representations of the whole utterance’s meaning, and from these it learns probabilistic representations of the semantics and syntax of individual words, in such a way that it becomes able to recombine these words to understand novel utterances and express novel meanings. This requires that the model simultaneously learn how to parse syntactic constructions, assign meaning to specific words, and use syntactic regularities (for example, in verb argument structure) to guide interpretation of ambiguous input. Our training data consists of real, syntactically complex child-directed utterances drawn from a single child in the CHILDES corpus, and our training is incremental in the sense that the model is presented with each utterance exactly once, in the same order that the child actually encountered them.

The work described here represents an advance over previous models that focused on learning either word meanings or syntax given the other (see below for a review). By developing a joint learning model we are able to explore how these phenomena interact during learning. A handful of other joint learning models have been presented in the literature, but these have either worked from synthetic input with varying degrees of realism (Beekhuizen, 2015, Maurits et al., 2009) or have not yet been evaluated on specific phenomena known from child language acquisition, as we do here (Chrupała et al., 2015, Jones, 2015). In particular, we show in a series of simulations that our model exhibits syntactic bootstrapping effects (in which previously learned constructions facilitate the learning of novel words), sudden jumps in learning without explicit parameter setting, acceleration of word-learning (the “vocabulary spurt”), an initial bias favoring the learning of nouns over verbs, and one-shot learning of words and their meanings. These results suggest that there is no need to postulate distinct learning mechanisms to explain these various phenomena; rather they can all be explained through a single mechanism of statistical learning over structured representations.

Our model falls under the general umbrella of “Semantic Bootstrapping” theory, which assumes that the child can access a structural representation of the intended semantics or conceptual content of the utterance, and that such representations are sufficiently homomorphic to the syntax of the adult language for a mapping from sentences to meanings to be determined (Bowerman, 1973, Brown, 1973, Clark, 1973, Grimshaw, 1981, Pinker, 1979, Schlesinger, 1971; cf. Wexler & Culicover, 1980:78–84; Berwick, 1985:22–24). By “homomorphic”, we simply mean that meaning representation and syntax stand in a “type-to-type” relation, according to which every syntactic type (such as the English intransitive verb) corresponds to a semantic type (such as the predicate), and every rule (such as English S → NP VP) corresponds to a semantic operation (such as function application of the predicate to the subject).
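
The following minimal sketch, using our own illustrative category and type names rather than anything defined in the paper, spells out what the type-to-type assumption amounts to: each syntactic category is paired with a semantic type, and the rule S → NP VP is paired with function application of the predicate to the subject.

```python
# Illustrative pairing of syntactic categories with semantic types.
SEMANTIC_TYPE = {
    "NP":          "e",           # entities
    "S":           "t",           # propositions / truth values
    "S\\NP":       "<e,t>",       # intransitive verb phrase: a one-place predicate
    "(S\\NP)/NP":  "<e,<e,t>>",   # transitive verb: a two-place predicate
}

def combine(subject, predicate):
    """Syntactic rule S -> NP (S\\NP), paired with semantic function application."""
    subj_cat, subj_sem = subject
    pred_cat, pred_sem = predicate
    assert subj_cat == "NP" and pred_cat == "S\\NP"
    return ("S", f"{pred_sem}({subj_sem})")

print(combine(("NP", "eve'"), ("S\\NP", "sleep'")))   # ('S', "sleep'(eve')")
```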

Early accounts of semantic bootstrapping (e.g. Berwick, 1985, Wexler and Culicover, 1980) assumed perfect access to a single meaning representation in the form of an Aspects-style Deep Structure already aligned to the words of the language. Yet, as we shall see, semantic bootstrapping is sufficiently powerful that such strong assumptions are unnecessary.

Since, on the surface, languages differ in many ways—for example with respect to the order of heads and complements, and in whether such aspects of meaning as tense, causality, evidentiality, and information structure are explicitly marked—the meaning representations must be expressed in a universal prelinguistic conceptual representation, in whose terms all such distinctions are expressible. The mapping must further be learned by general principles that apply to all languages. These general principles are often referred to as “universal grammar”, although the term is somewhat misleading in the present context since the model we develop is agnostic as to whether these principles are unique to language or apply more generally in cognition.

A number of specific instantiations of the semantic bootstrapping theory have been proposed over the years. For example, “parameter setting” accounts of language acquisition assume, following Chomsky (1981), that grammars for each natural language can be described by a finite number of finitely-valued parameters, such as head-position, pro-drop, or polysynthesis (Hyams, 1986 and much subsequent work). Language acquisition then takes a form that has been likened to a game of Twenty-Questions (Yang, 2006, Ch. 7), whereby parameters can be set when the child encounters “triggers”, or sentences that can only be analyzed under one setting of a parameter. For example, for Hyams (1986), the fact that English has lexical expletive subjects (e.g., it in it rained) is unequivocal evidence that the pro-drop parameter is negative, while for others the position of the verb in simple intransitive sentences in Welsh is evidence for head-initiality. Such triggers are usually discussed in purely syntactic terms. However, in both examples, the child needs to know which of the words is the verb, which requires a prior stage of semantic bootstrapping at the level of the lexicon (Hyams, 1986:132–133).

Unfortunately, parameter setting seems to raise as many questions as it answers. First, there are a number of uncertainties concerning the way the learner initially identifies the syntactic categories of the words, the specific inventory of parameters that are needed, and the aspects of the data that “trigger” their setting (Fodor, 1998, Gibson and Wexler, 1994, Niyogi and Berwick, 1996). Second, several combinatoric problems arise from simplistic search strategies in this parameter space (Fodor & Sakas, 2005). Here, we will demonstrate that step-like learning curves used to argue for parameter-setting approaches (Thornton & Tesan, 2007) can be explained by a statistical model without explicit linguistic parameters.

A further variant of the semantic bootstrapping theory to be discussed below postulates a second, later, stage of “syntactic bootstrapping” (Braine, 1992, Gleitman, 1990, Landau and Gleitman, 1985, Trueswell and Gleitman, 2007), during which the existence of early semantically bootstrapped syntax allows rapid or even “one-shot” learning of lexical items, including ones for which the situation of utterance offers little or no direct evidence. Early discussions of syntactic bootstrapping implied that it is a learning mechanism in its own right, distinct from semantic bootstrapping. However, we will demonstrate that these effects attributed to syntactic bootstrapping emerge naturally under the theory presented here. That is, our learner exhibits syntactic bootstrapping effects (using syntax to accelerate word learning) without the need for a distinct mechanism: the mechanism of semantic bootstrapping is sufficient to engender the effects.

Although varieties of semantic bootstrapping carry considerable currency, some researchers have pursued an alternative distributional approach (Redington, Chater, & Finch, 1998), which assumes that grammatical structure can be inferred from statistical properties of strings alone. Many proponents of this approach invoke Artificial Neural Network (ANN) computational models as an explanation for how this could be done—see Elman et al. (1996) for examples—while others in both cognitive science and computer science have proposed methods using structured probabilistic models (Cohn et al., 2010, Klein and Manning, 2004, Perfors et al., 2011). The distributional approach is appealing to some because it avoids the assumption that the child can access meanings expressed in a language of mind that is homomorphic to spoken language in the sense defined above, but inaccessible to adult introspection and whose detailed character is otherwise unknown.

There has been some success in using this kind of meaning-free approach to learn non-syntactic structure such as word- and syllable-level boundaries (Goldwater et al., 2009, Johnson and Goldwater, 2009, Phillips and Pearl, 2014). However, attempts to infer syntactic structures such as dependency or constituency structure, and even syntactic categories, have been notably less successful despite considerable effort (e.g. Abend et al., 2010, Christodoulopoulos et al., 2010, Cohn et al., 2010, Klein and Manning, 2004, Klein and Manning, 2005). Within the context of state-of-the-art natural language processing (NLP) applications, ANN models that have no explicit structured representations have yielded excellent language modeling performance (i.e., prediction of probable vs. improbable word sequences) (e.g., Mikolov et al., 2010, Sundermeyer et al., 2012). They have also been used to learn distributed word representations that capture some important semantic and syntactic information (Mikolov, Yih, & Zweig, 2013). Yet the reason these models work as well as they do in NLP tasks arises from the way they mix sentence-internal syntax and semantics with pragmatics and frequency of collocation. Thus, they often learn to map antonyms, as well as synonyms, to similar representations (Turney & Pantel, 2010). Such representations create problems for the compositional semantics of logical operators (such as negation) of a kind that the child never exhibits.

In this work, we take a close relation between syntax and compositional semantics as a given. We also follow the basic premise of semantic bootstrapping that the learner is able to infer the meaning of at least some of the language she hears on the basis of nonlinguistic context. However, unlike some other versions of semantic bootstrapping, we assume that the available meanings are at the level of utterances rather than individual words, and that word meanings (i.e., the mapping from words to parts of the utterance meaning) are learned from such data.

We represent the meaning of an utterance as a sentential logical form. Thus, the aim of the learner is to generalize from the input pairs of an observed sentence and a possible meaning in order to interpret new sentences whose meaning is unavailable contextually, and to generate new sentences that express an intended meaning. Because of the limitations of available corpus annotations, the logical forms we consider are restricted to predicate-argument relations, and lack the interpersonal content whose importance in language acquisition is generally recognized. We examine the nature of such content in Section 4.4, where we argue that our model can be expected to generalize to more realistic meaning representations.

Recent work suggests that the infant’s physically limited view of the world, combined with social, gestural, prosodic, and other cues, may lead to considerably less ambiguity in inferring utterance meanings than has previously been supposed (Yu and Smith, 2012, Yu and Smith, 2013, Yurovsky et al., 2013). Nevertheless, most contexts of utterance are likely to support more than one possible meaning, so it is likely that the child will have to cope with a number of distracting spurious meaning candidates. We model propositional ambiguity in the input available to the learner by assuming each input utterance s is paired with several contextually plausible logical forms {m1, …, mk}, of which only one is correct and the rest serve as distractors. While these meaning representations are assumed to reflect an internal language of mind, our model is general enough to allow the inclusion of various types of content in the logical forms, including social, information-structural, and perceptual.
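
As an illustration of this input assumption (our sketch; the field names and logical-form strings are invented for exposition, not taken from the corpus), a single training datum might look as follows:

```python
from dataclasses import dataclass

# One training datum under the propositional-ambiguity assumption: the
# utterance is paired with several contextually plausible logical forms,
# only one of which is correct, and the learner is never told which.

@dataclass
class TrainingInstance:
    utterance: list[str]          # observed word string s
    candidate_lfs: list[str]      # contextually plausible meanings {m1, ..., mk}

example = TrainingInstance(
    utterance="you read the book".split(),
    candidate_lfs=[
        "read'(you', def'(book'))",    # intended meaning
        "want'(eve', def'(book'))",    # plausible distractor from the same scene
        "give'(you', eve', ball')",    # another distractor
    ],
)
```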

Within this general framework, we develop a statistical learner that jointly models both (a) the mapping between components of the given sentential meaning and words (or phrases) of the language, and (b) the projection of lexical elements onto constituents and sentences by syntactic rules. In earlier work, we defined the learner and gave preliminary simulation results (Kwiatkowski, Goldwater, Zettlemoyer, & Steedman, 2012); here we expand considerably on the description of the learner, the range of simulations, and the discussion in relation to human language acquisition.

There has been considerable previous work by others on both word learning and syntactic acquisition, but the two have until very recently been treated separately. Thus, models of cross-situational word learning have generally focused on learning either word-meaning mappings (mainly object referents) in the absence of syntax (Alishahi et al., 2008, Frank et al., 2009, McMurray et al., 2012, Plunkett et al., 1992, Regier, 2005, Siskind, 1996, Yu and Ballard, 2007), or learning verb-argument structures assuming nouns and/or syntax are known (Alishahi and Stevenson, 2008, Alishahi and Stevenson, 2010, Barak et al., 2013, Beekhuizen et al., 2014, Chang, 2008, Morris et al., 2000, Niyogi, 2002). Conversely, most models of syntactic acquisition have considered learning from meaning-free word sequences alone (see discussion above), or have treated word-meaning mapping and syntactic learning as distinct stages of learning, with word meanings learned first followed by syntax (Buttery, 2006, Dominey and Boucher, 2005, Villavicencio, 2002).

Several previous researchers have demonstrated that correct (adult-like) knowledge of syntax (e.g., known part-of-speech categories or syntactic parses) can help with word-learning (Mellish, 1989, Göksun et al., 2008, Ural et al., 2009, Fazly et al., 2010, Thomforde and Steedman, 2011, Yu and Siskind, 2013), and a few (Alishahi and Chrupała, 2012, Yu, 2006) have gone further in showing that learned (and therefore imperfect) knowledge of POS categories can help with word learning. However, these models are not truly joint learners, since the learned semantics does not feed back into further refinement of POS categories.

By treating word learning and syntactic acquisition jointly, our proposal provides a working model of how these two aspects of language can be learned simultaneously in a mutually reinforcing way. And unlike models such as those of Maurits et al. (2009) and Beekhuizen (2015), our model learns from real corpus data, meaning it needs to handle variable-length sentences and predicates with differing numbers of arguments, as well as phenomena such as multiple predicates per sentence (including hierarchical relationships, e.g., want to go) and logical operators, such as negation and conjunction. To tackle this challenging scenario, we adopt techniques originally developed for the task of “semantic parsing” (more properly, semantic parser induction) in computational linguistics (Kwiatkowski et al., 2010, Kwiatkowski et al., 2011, Thompson and Mooney, 2003, Zettlemoyer and Collins, 2005, Zettlemoyer and Collins, 2007).

Our model rests on two key features that we believe to be critical to early syntactic and semantic acquisition in children. The first, shared by most of the models above, is statistical learning. Our model uses a probabilistic grammar and lexicon whose model parameters are updated using an incremental learning algorithm. (By parameters here, we mean the probabilities in the model; our model does not include linguistic parameters in the sense of Chomsky (1981) and Hyams (1986) noted earlier.) This statistical framework allows the model to take advantage of incomplete information while being robust to noise. Although some statistical learners have been criticized for showing learning curves that are too gradual—unlike the sudden jumps in performance sometimes seen in children (Thornton & Tesan, 2007)—we show that our model does not suffer from this problem.
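
The following drastically simplified sketch, ours rather than the paper's actual Bayesian model, illustrates what incremental statistical updating of lexical parameters can look like: each utterance's candidate analyses are weighted by the current model, and fractional counts are added accordingly, with no second pass over earlier data.

```python
from collections import defaultdict

# Lexical "parameters" here are just smoothed counts over word/meaning pairings.
# The real model scores full CCG derivations; in this toy version each
# "analysis" is only a bag of word-meaning pairs.

counts = defaultdict(lambda: defaultdict(float))  # word -> meaning -> pseudo-count
ALPHA = 0.1                                       # smoothing (prior) pseudo-count

def score(analysis):
    """Product of current smoothed probabilities of the word-meaning pairs."""
    p = 1.0
    for word, meaning in analysis:
        total = sum(counts[word].values())
        p *= (counts[word][meaning] + ALPHA) / (total + ALPHA * 10)  # 10: assumed meaning inventory size
    return p

def update(candidate_analyses):
    """One incremental step: weight candidates by the current model, then add
    fractional counts; earlier utterances are never revisited."""
    weights = [score(a) for a in candidate_analyses]
    z = sum(weights)
    for analysis, w in zip(candidate_analyses, weights):
        for word, meaning in analysis:
            counts[word][meaning] += w / z

# Example: two candidate pairings for "read Lassie", one using a distractor meaning.
update([[("read", "read'"), ("Lassie", "lassie'")],
        [("read", "want'"), ("Lassie", "book'")]])
```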

The second key feature of our model is its use of syntactically guided semantic compositionality. This concept lies at the heart of most linguistic theories, yet has rarely featured in previous computational models of acquisition. As noted above, many models have focused either on syntax or semantics alone, with another large group considering the syntax-semantics interface only as it applies to verb learning. Of those models that have considered both syntax and semantics for full sentences, many have assumed that the meaning of a sentence is simply the set of meanings of the words in that sentence (Alishahi and Chrupała, 2012, Allen and Seidenberg, 1999, Fazly et al., 2010, Yu, 2006). Connor, Fisher, and Roth (2012) addressed the acquisition of shallow semantic structures (predicate-argument structures and their semantic roles) from sentential meaning representations consisting of the set of semantic roles evoked by the sentence. Villavicencio (2002) and Buttery (2006) make similar assumptions to our own about semantic representation and composition, but also assume a separate stage of word learning prior to syntactic learning, with no flow of information from syntax back to word learning as in our joint model. Thus, their models are unable to capture syntactic bootstrapping effects. Chrupała et al. (2015) have a joint learning model, but no explicit syntactic or semantic structure. The model most similar to our own in this respect is that of Jones (2015), but as noted above, the simulations in that work are more limited than those we include here.

Much of the power of our model comes from this assumption that syntactic and semantic composition are closely coupled. To implement this assumption, we have based our model on Combinatory Categorial Grammar (CCG, Steedman, 1996b, Steedman, 2000, Steedman, 2012). CCG has been extensively applied to parsing and interpretation of unrestricted text (Auli and Lopez, 2011, Clark and Curran, 2004, Hockenmaier, 2003, Lewis and Steedman, 2014), and has received considerable attention recently in the computational literature on semantic parser induction (e.g., Artzi et al., 2014, Krishnamurthy and Mitchell, 2014, Kwiatkowski et al., 2010, Kwiatkowski et al., 2011, Matuszek et al., 2012, Zettlemoyer and Collins, 2005, Zettlemoyer and Collins, 2007).

This attention stems from two essential properties of CCG. First, unlike Lexicalized Tree-Adjoining Grammar (Joshi & Schabes, 1997), Generalized Phrase Structure Grammar (Gazdar, Klein, Pullum, & Sag, 1985), and Head-driven Phrase Structure Grammar (Pollard & Sag, 1994)—but like LFG (Bresnan, 1982) and the Minimalist Program (Chomsky, 1995)—a single nondisjunctive lexical entry governs both in situ and extracted arguments of the verb. Second, dependencies of the latter kind are established without the overhead for the learner of empty categories, functional uncertainty, or movement, of the kind used in LFG and the Minimalist Program.

These properties of CCG, together with its low near-context-free expressive power and simplicity of expression, have made it attractive both for semantic parsing, and for our purposes here. Nevertheless, the presented framework can in principle be implemented with any compositional grammar formalism as long as (1) it allows for the effective enumeration of all possible syntactic/semantic derivations given a sentence paired with its meaning representation; and (2) it is associated with a probabilistic model that decomposes over these derivations.
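
Stated schematically (our formulation; the function names are hypothetical and no particular formalism is assumed), any grammar plugged into the framework would need to supply something like the following two operations:

```python
# Requirement (1): enumerate all joint syntactic/semantic derivations for a
# sentence paired with a candidate meaning representation.
def enumerate_derivations(sentence, logical_form, grammar):
    """Yield every derivation licensed by `grammar` that spans `sentence`
    and whose root semantics equals `logical_form`."""
    raise NotImplementedError  # supplied by the chosen grammar formalism

# Requirement (2): a probabilistic score that decomposes over a derivation's
# parts (lexical entries and rule applications), so that each observation can
# update the parameters of exactly those parts it uses.
def derivation_score(derivation, params):
    """Sum of per-part log weights; `derivation` is treated as a list of parts."""
    return sum(params.get(part, 0.0) for part in derivation)
```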

Using our incremental learning algorithm, the learner is trained on utterances from the Eve corpus (Brown, 1973, Brown and Bellugi, 1964) in the CHILDES database (MacWhinney, 2000), with meaning representations produced automatically from an existing dependency annotation of the corpus (Sagae, Davis, Lavie, MacWhinney, & Wintner, 2010). We use these automatically produced meaning representations as a proxy for the child’s actual meaning representation in the hidden conceptual language of mind. (Crucially, our model entirely ignores the alignment in these data between logical constants and English words, as if the sentences were in a completely unknown language.)
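
For illustration only, a toy conversion from a dependency-style annotation to a logical form might look as follows; this is our own simplification, not the actual procedure applied to the Eve data by Sagae et al. (2010) or by our pipeline.

```python
# Toy converter for a single verb with SUBJ/OBJ dependents.
def deps_to_lf(predicate, dependents):
    """dependents: (relation, filler) pairs attached to one predicate."""
    args = dict(dependents)
    filled = [args[rel] for rel in ("SUBJ", "OBJ") if rel in args]
    return f"{predicate}'(" + ", ".join(filled) + ")"

print(deps_to_lf("read", [("SUBJ", "you'"), ("OBJ", "book'")]))
# read'(you', book')
```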

We evaluate our model in several ways. First, we test the learner’s ability to correctly produce the meaning representations of sentences in the final Eve session (not included in the training data). This is a very harsh evaluation, since the learner is trained on only a small sample of the data the actual child Eve was exposed to by the relevant date. Nevertheless, we show a consistent increase in performance throughout learning and robustness to the presence of distractor meaning representations during training. Next, we perform simulations showing that a number of disparate phenomena from the language acquisition literature fall out naturally from our approach. These phenomena include sudden jumps in learning without explicit parameter setting; acceleration of word-learning (the “vocabulary spurt”); an initial bias favoring the learning of nouns over verbs; and one-shot learning of words and their meanings, including simulations of several experiments previously used to illustrate syntactic bootstrapping effects. The success of the model in replicating these findings argues that separate accounts of semantic and syntactic bootstrapping are unnecessary; rather, a joint learner employing statistical learning over structured representations provides a single unified account of both.
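
Concretely, the first evaluation amounts to something like the following sketch (ours; the best_parse method is a hypothetical stand-in for whatever returns the model's highest-scoring derivation):

```python
# After incremental training on the earlier sessions, each held-out sentence
# from the final Eve session counts as correct if the top-scoring parse
# reproduces its gold logical form.
def evaluate(model, held_out_pairs):
    correct = 0
    for sentence, gold_lf in held_out_pairs:
        predicted_lf = model.best_parse(sentence)   # semantics of the best derivation
        correct += int(predicted_lf == gold_lf)
    return correct / len(held_out_pairs)
```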


Semantic bootstrapping for grammar acquisition

The premise of this work is that the child at the onset of language acquisition enjoys direct access to some form of pre-linguistic conceptual representations. We are not committed to these representations taking any particular symbolic or non-symbolic form, but for the compositional learning process to proceed, there must be some kind of structure in which complex concepts can be decomposed into more primitive concepts, and here we will abstractly represent this conceptual compositionality…

Simulations

We conduct a range of simulations with our model, looking at four main types of effects: (1) learning curves, both in terms of the model’s overall generalization ability, and its learning of specific grammatical phenomena; (2) syntactic bootstrapping effects, where previously acquired constructions accelerate the pace at which words are learned (we show overall trends and simulate specific behavioral experiments); (3) one-shot learning effects, showing that the model is able to infer the…

General discussion

We have presented an incremental model of language acquisition that learns a probabilistic CCG grammar from utterances paired with one or more potential meanings. The learning model assumes no knowledge specific to the target language, but does assume that the learner has access to a universal functional mapping from syntactic to semantic types (Klein & Sag, 1985), as well as a Bayesian model favoring grammars with heavy reuse of existing rules and lexical types. It is one of only a few…

Conclusion

This paper has presented, to our knowledge, the first model of child language acquisition that jointly learns both word meanings and syntax and is evaluated on naturalistic child-directed sentences paired with structured representations of their meaning. We have demonstrated that the model reproduces several important characteristics of child language acquisition, including rapid learning of word order, an acceleration in vocabulary learning, one-shot learning of novel words, and syntactic…

Acknowledgements

We thank Julia Hockenmaier and Mark Johnson for guidance, and Inbal Arnon, Jennifer Culbertson, and Ida Szubert for their feedback on a draft of this article.

References (172)

  • Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and disambiguation. Cognitive Science.
  • Klein, D., & Manning, C. (2005). Natural language grammar induction with a generative constituent-context model. Pattern Recognition.
  • Abend, O., et al. (2010). Improved unsupervised POS induction through prototype discovery.
  • Alishahi, A., & Chrupała, G. (2012). Concurrent acquisition of word meaning and lexical categories. In Proceedings of...
  • Alishahi, A., Fazly, A., & Stevenson, S. (2008). Fast mapping in word learning: What probabilities tell us. In...
  • Alishahi, A., & Stevenson, S. (2008). A computational model of early argument structure acquisition. Cognitive Science.
  • Alishahi, A., & Stevenson, S. (2010). A computational model of learning semantic roles from child-directed language. Language and Cognitive Processes.
  • Allen, J., & Seidenberg, M. (1999). The emergence of grammaticality in connectionist networks. The Emergence of Language.
  • Ambati, B. R., et al. (2017). Hindi CCGbank: A CCG treebank from the Hindi dependency treebank. Language Resources and Evaluation.
  • Ambridge, B., et al. (2014). Child language acquisition: Why universal grammar doesn’t help. Language.
  • Artzi, Y., Das, D., & Petrov, S. (2014). Learning compact lexicons for CCG semantic parsing. In Proceedings of the 2014...
  • Atkinson, R., et al. (1965). Introduction to mathematical learning theory.
  • Auli, M., & Lopez, A. (2011). A comparison of loopy belief propagation and dual decomposition for integrated CCG supertagging and parsing.
  • Baldridge, J. (2002). Lexically specified derivational control in Combinatory Categorial Grammar (Unpublished doctoral...
  • Barak, L., Fazly, A., & Stevenson, S. (2013). Modeling the emergence of an exemplar verb in construction learning. In...
  • Beal, M. J. (2003). Variational algorithms for approximate Bayesian inference (Unpublished doctoral dissertation)....
  • Becker, M. (2005). Raising, control, and the subset principle. In Proceedings of the 24th West Coast conference on...
  • Beekhuizen, B. (2015). Constructions emerging: A usage-based model of the acquisition of grammar (Unpublished doctoral...
  • Beekhuizen, B., Bod, R., Fazly, A., Stevenson, S., & Verhagen, A. (2014). A usage-based model of early grammatical...
  • Berwick, R. (1985). The acquisition of syntactic knowledge.
  • Boersma, P., et al. (2001). Empirical tests of the gradual learning algorithm. Linguistic Inquiry.
  • Bolinger, D. (1965). Forms of English.
  • Bowerman, M. (1973). Structural relationships in children’s utterances: Syntactic or semantic?
  • Bresnan, J., & Nikitina, T. (2003). On the gradience of the dative alternation. Unpublished manuscript. Stanford...
  • Brown, R. (1973). A first language: The early stages.
  • Brown, R., & Bellugi, U. (1964). Three processes in the child’s acquisition of syntax.
  • Buttery, P. (2006). Computational models for first language acquisition (Unpublished doctoral dissertation). University...
  • Calhoun, S. (2010). The centrality of metrical structure in signaling information structure: A probabilistic perspective. Language.
  • Calhoun, S., et al. (2010). The NXT-format Switchboard corpus: A rich resource for investigating the syntax, semantics, pragmatics, and prosody of dialog. Language Resources and Evaluation.
  • Cauvet, E., et al. (2014). Function words constrain on-line recognition of verbs and nouns in French 18-month-olds. Language Learning and Development.
  • Çakıcı, R. Automatic induction of a CCG grammar for Turkish.
  • Chang, N. C.-L. (2008). Constructing grammar: A computational model of the emergence of early constructions....
  • Charniak, E. (1997). Statistical parsing with a context-free grammar and word statistics. In Proceedings of the 14th...
  • Chomsky, N. (1965). Aspects of the theory of syntax.
  • Chomsky, N. (1981). Lectures on government and binding.
  • Chomsky, N. (1995). The minimalist program.
  • Christodoulopoulos, C., et al. (2010). Two decades of unsupervised POS tagging—How far have we come?
  • Chrupała, G., Kádár, Á., & Alishahi, A. (2015). Learning language through pictures. In Proceedings of the 53rd annual...
  • Clark, S., & Curran, J. (2004). Parsing the WSJ using CCG and log-linear models.

The work was supported in part by ERC Advanced Fellowship 249520 GRAMPLUS, EU IST Cognitive Systems IP EC-FP7-270273 Xperience, ARC Discovery grants DP 110102506 and 160102156, and a James S McDonnell Foundation Scholar Award.

1. Now at the Departments of Computer Science & Cognitive Science, The Hebrew University of Jerusalem.

2. Now at Google Research.

3. Now at the Berkeley Institute of Data Science, University of California, Berkeley.
