A large-scale sentence pattern dictionary (SP-dictionary) for Japanese compound and complex sentences has been developed. The dictionary has been compiled based on the
non-compositional language model
. Sentences with 2 or 3 predicates are extracted from a Japanese-to-English parallel corpus of 1 million sentences, and the compositional constituents contained within them are generalized to produce a SP-dictionary containing a total of 215,000 pattern pairs. In evaluation tests, the SP-dictionary achieved a syntactic coverage of 92% and a semantic coverage of 70%.