2017 | OriginalPaper | Buchkapitel
Patent Classification on Subgroup Level Using Balanced Winnow
verfasst von : Eva D’hondt, Suzan Verberne, Nelleke Oostdijk, Lou Boves
Erschienen in: Current Challenges in Patent Information Retrieval
Verlag: Springer Berlin Heidelberg
Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.
Wählen Sie Textabschnitte aus um mit Künstlicher Intelligenz passenden Patente zu finden. powered by
Markieren Sie Textabschnitte, um KI-gestützt weitere passende Inhalte zu finden. powered by
In the past decade research into automated patent classification has mainly focused on the higher levels of International Patent Classification (IPC) hierarchy. The patent community has expressed a need for more precise classification to better aid current pre-classification and retrieval efforts (Benzineb and Guyot, Current challenges in patent information retrieval. Springer, New York, pp 239–261, 2011). In this chapter we investigate the three main difficulties associated with automated classification on the lowest level in the IPC, i.e. subgroup level. In an effort to improve classification accuracy on this level, we (1) compare flat classification with a two-step hierarchical system which models the IPC hierarchy and (2) examine the impact of combining unigrams with PoS-filtered skipgrams on both the subclass and subgroup levels. We present experiments on English patent abstracts from the well-known WIPO-alpha benchmark data set, as well as from the more realistic CLEF-IP 2010 data set. We find that the flat and hierarchical classification approaches achieve similar performance on a small data set but that the latter is much more feasible under real-life conditions. Additionally, we find that combining unigram and skipgram features leads to similar and highly significant improvements in classification performance (over unigram-only features) on both the subclass and subgroup levels, but only if sufficient training data is available.