Artificial Life and Robotics

05-04-2024 | Original Article

Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations

Authors: Kazuteru Miyazaki, Masaaki Ida

Published in: Artificial Life and Robotics | Issue 2/2024

Abstract

Character-level convolutional neural networks (CLCNNs) are commonly used to classify textual data and serve as a versatile tool. For natural language recognition, a sentence is decomposed into character units, each unit is converted into its corresponding character code (e.g., its Unicode value), and the codes are input into the CLCNN. Thus, sentences can be treated like images. We have previously applied a CLCNN to verify whether a university’s diploma and/or curriculum policies are well written. In this study, we experimentally confirm the effectiveness of CLCNNs using tweet data. In particular, we focus on the effect of the number of units on performance using two types of data: a real, public tweet dataset on the reputation of a cell phone, and the NTCIR-13 MedWeb task, which consists of pseudo-tweet data and is a well-known test collection for multi-label problems. Experiments that varied the number of units in the fully connected layer gave results that agree with the theorem introduced in Amari’s book (Amari in Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., 2014). Furthermore, for the NTCIR-13 MedWeb task, we analyze two kinds of experiments: the effect of kernel size and the effect of weight perturbation. The kernel-size results suggest the existence of an optimal kernel size for sentence comprehension. The results of perturbations to the convolutional layer and pooling layer indicate a possible relationship between the number of degrees of freedom and the network parameters.
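
The character-to-code step described in the abstract can be illustrated with a minimal sketch. The function name `encode_text`, the maximum length of 140 characters, and the use of zero as a padding value are illustrative assumptions, not details taken from the paper; the authors' actual preprocessing may differ.

```python
from typing import List


def encode_text(text: str, max_len: int = 140, pad_value: int = 0) -> List[int]:
    """Convert a string into a fixed-length list of Unicode code points.

    max_len and pad_value are hypothetical choices for illustration only.
    """
    # Split the sentence into characters and map each one to its code point.
    codes = [ord(ch) for ch in text[:max_len]]
    # Pad short inputs so every sentence has the same length, like a fixed-size image row.
    codes.extend([pad_value] * (max_len - len(codes)))
    return codes


if __name__ == "__main__":
    print(encode_text("Hi!", max_len=5))  # [72, 105, 33, 0, 0]
```

A fixed-length sequence of this kind is what allows a convolutional network to treat a sentence in the same way as an image, sliding kernels over neighboring character codes.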

The three policies are: diploma policy, which concerns graduation certification; curriculum policy, which concerns course contents and their organization; and admission policy, which concerns enrollment acceptance.

These figures are recreated based on Refs. [12, 13].

This figure is recreated based on Ref. [12].

Although it is possible to increase the percentage of exact matches by also performing dropout after batch normalization, it is not used in this paper to ascertain the effect of perturbations more accurately.

This table is reprinted from Ref. [15].

1. Amari S (2014) Mathematical Science New Development of Information Geometry, For Senior & Graduate Courses. SAIENSU-SHA Co., Ltd (in Japanese)

2. Belkin M, Hsu D, Ma S, Mandal S (2019) Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc Natl Acad Sci 116(32):15849–15854

3. Hastie T, Montanari A, Rosset S, Tibshirani RJ (2019) Surprises in high-dimensional least squares interpolation. arXiv:1903.08560

4. Keskar NS, Nocedal J, Tang PTP, Mudigere D, Smelyanskiy M (2019) On large-batch training for deep learning: generalization gap and sharp minima. In: International Conference on Learning Representations

5. Kim Y (2014) Convolutional neural networks for sentence classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, pp 1746–1751

6. Miyazaki K, Ida M (2012) Proposal and evaluation of the active course classification support system with exploitation-oriented learning, the 9th European workshop on reinforcement learning (EWRL-9), Sept. 9, 2011. Athens Royal Olympic Hotel, Lecture Notes in Computer Science 7188:333–344

7. Miyazaki K, Ida M, Yoshikane F, Nozawa T, Kita H (2005) Development of a course classification support system for the awarding of degrees using syllabus data. IPSJ J 46(3):782–791 (in Japanese)

8. Mikolov T, Chen K, Corrado G, Dean J (2013) Efficient estimation of word representations in vector space. arXiv:1301.3781

9. Miyazaki K, Takahashi N, Mori R (2019) Research on Consistency between Diploma Policies and Nomenclature of Major Disciplines: Deep Learning Approach, Proc. of 2019 7th International Conference on Information and Education Technology (ICIET2019)

10. Miyazaki K, Ida M (2020) Construction of Consistency Judgment System of Diploma Policy and Curriculum Policy using Character-level CNN. Electronics and Communications in Japan 102(12):30–39

11. Miyazaki K (2020) Classification of Medical Data using Character-level CNN, The 3rd International Conference on Information Science and System, pp. 43–47

12. Miyazaki K, Ida M (2021) Evaluation of Character-level CNN using NTCIR-13 MedWeb Task, 2021 Annual Conference on Electronics, Information and Systems Institute of Electrical Engineers of Japan (IEEJ), 6 pages (in Japanese)

13. Miyazaki K, Ida M (2021) Evaluation of Character-Level CNNs using the NTCIR-13 MedWeb Task, the 22nd International Symposium on Advanced Intelligent Systems (ISIS2021), 6 pages

14. Miyazaki K, Yamaguchi S, Mori R, Yoshikawa Y, Saito T, Suzuki T (2022) Proposal and evaluation of a course classification support system emphasizing communication with the sub-committees within the Committee of Validation and Examination for Degrees, Preliminary Soft-Proceedings 4th EAI International Conference on Artificial Intelligence for Communications and Networks, pp. 122–129

15. Miyazaki K, Ida M (2023) Effectiveness of Character-level CNN and its Examination of Perturbation for Weights, 28th International Symposium on Artificial Life and Robotics (AROB 28th 2023), 5 pages

16. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention Is All You Need, Neural Information Processing Systems (NIPS)

17. Wakamiya S, Morita M, Kano Y, Ohkuma T, Aramaki E (2017) Overview of the NTCIR-13 MedWeb Task, In Proceedings of the 13th NTCIR Conference on Evaluation of Information Access Technologies (NTCIR-13), pp. 40–49

18. http://research.nii.ac.jp/ntcir/permission/ntcir-13/perm-en-MedWeb.html [accessed: 2023-12-21]

19. Yanaka H, Mineshima K (2022) Compositional Evaluation on Japanese Textual Entailment and Similarity (arXiv, data), Transactions of the Association for Computational Linguistics (TACL), Vol. 10, pp. 1266–1284

20. Yang Y, Zhang Y, Tar C, Baldridge J (2019) PAWS-X: A cross-lingual adversarial dataset for paraphrase identification, In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3687–3692

21. Zhang X, Zhao J, LeCun Y (2015) Character-level Convolutional Networks for Text Classification, arXiv:1509.01626

22. https://www.db.info.gifu-u.ac.jp/sentiment_analysis/ [accessed: 2023-12-21]

23. https://retty.me [accessed: 2023-12-21]

Title: Performance evaluation of character-level CNNs using tweet data and analysis for weight perturbations
Authors: Kazuteru Miyazaki, Masaaki Ida
Publication date: 05-04-2024
Publisher: Springer Japan
Published in: Artificial Life and Robotics / Issue 2/2024
Print ISSN: 1433-5298
Electronic ISSN: 1614-7456
DOI: https://doi.org/10.1007/s10015-024-00944-9
