
2020 | Original Paper | Book Chapter

PatentTransformer-1.5: Measuring Patent Claim Generation by Span Relevancy

Authors: Jieh-Sheng Lee, Jieh Hsiang

Published in: New Frontiers in Artificial Intelligence

Publisher: Springer International Publishing


Abstract

PatentTransformer is our codename for patent text generation based on Transformer-based models. Our long-term goal in patent claim generation is to realize "augmented inventing" for inventors by leveraging new deep learning techniques. We envision the possibility of building an "auto-complete" function that helps inventors conceive better inventions in the era of artificial intelligence. To generate patent claims of reasonable quality, a fundamental question is how to measure that quality. In PatentTransformer-1.5, we tackle this problem from the perspective of claim span relevancy as a proof of concept. Patent claim language has rarely been explored in the NLP field. In this work, we propose a span-based approach and a generic framework for measuring patent claim generation quantitatively. To study the effectiveness of patent claim generation, we define a metric that measures whether two consecutive spans in a generated patent claim are relevant. We treat such relevancy measurement as a span-pair classification problem, following the concept of natural language inference. Technically, the span-pair classifier is implemented by fine-tuning one pre-trained language model, and the patent claim generation is implemented by fine-tuning another. Specifically, we fine-tune a pre-trained Google BERT model to measure the patent claim spans generated by a fine-tuned OpenAI GPT-2 model. In this way, we reuse two state-of-the-art pre-trained models from the NLP field. Our results show the effectiveness of the span-pair classifier after fine-tuning the pre-trained model and further validate the quantitative metric of span relevancy in patent claim generation. In particular, we find that the span relevancy ratio measured by BERT becomes lower when the diversity of GPT-2 text generation becomes higher.
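As a rough illustration of the two-model setup described in the abstract, the sketch below generates a claim with a GPT-2 model, splits it into consecutive spans, and scores each span pair with a BERT sequence-pair classifier to compute a span relevancy ratio. This is a minimal sketch, not the authors' implementation: it assumes the Hugging Face transformers library, uses base checkpoints ("gpt2", "bert-base-uncased") as stand-ins for the fine-tuned patent models, splits spans naively on punctuation, and assumes label 1 means "relevant".

```python
import torch
from transformers import (
    GPT2LMHeadModel,
    GPT2Tokenizer,
    BertForSequenceClassification,
    BertTokenizer,
)

# --- Claim generation (GPT-2 side) ------------------------------------------
# "gpt2" is a placeholder; the paper fine-tunes GPT-2 on patent claim text.
gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")
gpt2.eval()

prompt = "1. A method for"
input_ids = gpt2_tok.encode(prompt, return_tensors="pt")
output_ids = gpt2.generate(
    input_ids,
    max_length=128,
    do_sample=True,
    temperature=1.0,  # raising temperature/top_k increases diversity of the output
    top_k=40,
    pad_token_id=gpt2_tok.eos_token_id,
)
claim_text = gpt2_tok.decode(output_ids[0], skip_special_tokens=True)

# --- Span splitting (naive placeholder for the paper's span definition) -----
spans = [s.strip() for s in claim_text.replace(";", ",").split(",") if s.strip()]

# --- Span-pair relevancy scoring (BERT side) --------------------------------
# "bert-base-uncased" is a placeholder; the paper fine-tunes BERT as a
# span-pair classifier in the style of natural language inference.
bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
bert.eval()

pairs = list(zip(spans, spans[1:]))
relevant = 0
for span_a, span_b in pairs:
    enc = bert_tok(span_a, span_b, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = bert(**enc).logits
    relevant += int(logits.argmax(dim=-1).item() == 1)  # assumed convention: label 1 = "relevant"

span_relevancy_ratio = relevant / max(len(pairs), 1)
print(f"span relevancy ratio: {span_relevancy_ratio:.2f}")
```

With the actual fine-tuned checkpoints in place, sweeping the sampling temperature or top-k and recording the resulting ratio would reproduce the kind of diversity-versus-relevancy comparison the abstract reports.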


Metadata
Title
PatentTransformer-1.5: Measuring Patent Claim Generation by Span Relevancy
Authors
Jieh-Sheng Lee
Jieh Hsiang
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-58790-1_2