A model for multi-label classification and ranking of learning objects

https://doi.org/10.1016/j.eswa.2012.02.021

Abstract

This paper describes an approach that uses multi-label classification methods to search learning objects (LOs) tagged with Learning Object Metadata (LOM). Specifically, the model offers a methodology that illustrates the task of multi-label mapping of LOs into types of queries through an emergent multi-label space, and that can improve the first choice of learners or teachers. To build the model, the paper also proposes and preliminarily investigates the use of a multi-label classification algorithm that relies only on the LO features. Because many LOs include textual material that can be indexed, and such indexes can also be used to filter the objects by matching them against user-provided keywords, we then ran experiments using web classification with text features and compared their accuracy with the results obtained from metadata (LO features).

Highlights

► We use multi-label classification methods to search tagged learning objects (LOs).
► The methodology illustrates the task of multi-label mapping of LOs into types of queries.
► We use a multi-label classification algorithm that relies only on the LO features.
► We also ran experiments using web classification with text features.
► Multi-label classifiers such as RAKEL were very effective.

Introduction

The concept of reusable learning objects has evolved into a central component of the current e-learning context. Chiappe, Segovia, and Rincon (2007) described a learning object (LO) as a digital, self-contained and reusable entity with clearly instructional content, containing at least three internal and editable components: content, learning activities, and elements of context. Additionally, LOs should have an external information structure, the metadata, which can facilitate their identification, storage and retrieval. Given this definition, it is possible to arrive at a certain consensus regarding LOs: they must be a minimal content unit (self-contained) that intends to teach something (instructional purpose) and can be reused (reusability) on different platforms without any compatibility problems.

A study by Bauer and Stefan (2008) pointed out that for administration and exchange of LOs, meaningful metadata are required. Typically, learning material is not limited to text, but includes multimedia content, such as images, audio and video. Metadata not only describe the content, but also refer to e.g. didactical methods, domain of usage and relationships to other LOs (Motelet, Baloian, & Pino, 2006). “The feasibility of the LO paradigm strongly depends on having efficient mechanisms for retrieving relevant LOs for each application context. This can be achieved by tagging LOs with metadata, which will allow for cataloging and classifying them” (Sierra & Fernández-Valmayor, 2008).

Web sites have recently introduced a number of innovative techniques, known as Web 2.0, that allow users to interact with one another and exchange content, in contrast to non-interactive web sites where users are limited to passively viewing information. These techniques have changed the way people create, share and organize information on the Web, encouraging the active involvement of end users. The advent of Web technologies that allow large numbers of users to participate in content production, with what O'Reilly (2008) terms “collective intelligence” emerging from the contributions of many, has been discussed as a promising phenomenon that requires further investigation. The acquisition, retrieval and classification of LOs for each application context can be achieved by tagging LOs with metadata: the annotation of LOs could be moved from a few authors to a potentially much larger number of users through what has come to be called “collaborative tagging” (Bauer & Stefan, 2008).

This paper describes an approach that uses multi-label classification methods to search LOs tagged with Learning Object Metadata (LOM) (IEEE-LTSC, 2002). Specifically, the model offers a methodology that illustrates the task of multi-label mapping of LOs into types of queries through an emergent multi-label space, and that can improve the first choice of learners or teachers. To build a model that classifies and catalogs the LOs into types of queries, the paper also proposes and preliminarily investigates the use of a multi-label classification algorithm that relies only on the LO features. Because many LOs include textual material that can be indexed, and such indexes can also be used to filter the objects by matching them against user-provided keywords, we then ran experiments using web classification with text features and compared their accuracy with the results obtained from metadata (LO features).

This paper is structured as follows: Section 2 explains the main concepts and characteristics that establish LOs as the fundamental base within the current context of web-based e-learning. In Section 3, we present tagging for LOs. Section 4 provides some background information on the problem of multi-label classification, the details of the dataset used in this paper, and experimental results comparing the two multi-label classification algorithms. In Section 5 (Initial experiments of web classification using text features) and Section 6 (Initial experiments of LO classification using LO feature extraction), we present the results of experiments on two datasets, one using web classification with text features and one using metadata (LO features), to compare the accuracy of the results. We conclude with Section 7, which explains some of the more relevant aspects, and Section 8, which outlines future work.

Section snippets

Current context of e-learning

Existing LO standards and specifications focus on facilitating the search, evaluation, acquisition, and reuse of LOs so that they can be shared and exchanged across different learning systems. The most notable metadata standards used for LOs are Dublin Core (DCMI, 2007) and, most importantly, the IEEE-LOM (IEEE-LTSC, 2002). Since 2002, Learning Object Metadata (LOM) has been the standard for specifying the syntax and semantics of learning object metadata. It uses a hierarchical structure that is commonly

Using tagging for LOs

A tag is simply a word used to describe a Web resource; tags are the most popular terms with which users describe these resources. Tags therefore promise to be a unique source of information about the similarity between resources, as well as a common means of navigating and organizing them. Tagging emerged with social software applications such as Delicious and Flickr. A study by Begelman et al. (2006) pointed out that tagging is a great collaboration tool.

Multi-label classification

This research aims to demonstrate that multi-label classification can be applied to the organization of LOs, illustrating the idea of using collaborative tagging to find an LO among the learning materials of different heterogeneous learning object repositories (LORs). According to Tsoumakas, Katakis, and Vlahavas (2010), learning from multi-label data has recently attracted significant attention, motivated by an increasing number of new applications, such as social networks (Mika,

Data set

The data set for the initial experimental study contained 1000 examples from two repositories: Lornet and Merlot. For each example, the title and description were used to extract the text features for the classification, and the class labels came from the keyword field.
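As a rough illustration of how such a data set can be laid out for multi-label learning, the sketch below turns a few records into a term-count matrix X and a binary label matrix Y. The records, the tiny stop-word list, and the function names are assumptions made for this example; they are not the Lornet or Merlot data or the authors' code, and stemming is omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical records: the field names mirror the title, description and
# keyword fields mentioned above, but the values are invented.
RECORDS: List[Dict] = [
    {"title": "Intro to fractions",
     "description": "Adding fractions with pictures",
     "keywords": ["math", "primary"]},
    {"title": "Cell biology basics",
     "description": "Structure of the cell",
     "keywords": ["biology"]},
    {"title": "Fractions in biology",
     "description": "Ratios of cell counts",
     "keywords": ["math", "biology"]},
]

# Tiny assumed stop-word list; a real experiment would use a fuller one.
STOP_WORDS = {"of", "the", "to", "in", "with", "a", "an", "and"}


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and drop stop words (no stemming)."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def build_dataset(records: List[Dict]) -> Tuple[list, list, list, list]:
    """Return (X, Y, vocab, labels) for a list of LO records.

    X[i][j] counts term vocab[j] in the title + description of record i.
    Y[i][k] is 1 if record i carries keyword labels[k], else 0.
    """
    docs = [tokenize(r["title"] + " " + r["description"]) for r in records]
    vocab = sorted({t for d in docs for t in d})
    labels = sorted({k for r in records for k in r["keywords"]})
    X = [[d.count(term) for term in vocab] for d in docs]
    Y = [[1 if lab in r["keywords"] else 0 for lab in labels] for r in records]
    return X, Y, vocab, labels


X, Y, vocab, labels = build_dataset(RECORDS)
```

Each row of Y may contain several ones, which is what distinguishes this setting from ordinary single-label classification.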

Text feature extraction

From the title and description of the 1000 examples, we extracted 997 terms (features) after removing stop words and stemming, which are two widely used text preprocessing methods in text mining and information retrieval. We used

Data set

For the experiment, we used the same metadata (LO features) as in Section 5.1, on the same 1000 examples, to perform classification using MULAN and compare the accuracy with the results from pure text features.
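To make such a comparison concrete, the following sketch computes four measures that are commonly used for multi-label classifiers: Hamming loss, example-based accuracy, one-error, and average precision. It follows the usual textbook definitions on invented toy data; it is not MULAN's API, and MULAN's handling of ties or empty label sets may differ.

```python
from typing import List, Sequence

Matrix = List[Sequence[float]]


def hamming_loss(Y_true: Matrix, Y_pred: Matrix) -> float:
    """Fraction of (example, label) cells where the prediction is wrong."""
    n, q = len(Y_true), len(Y_true[0])
    wrong = sum(t != p for yt, yp in zip(Y_true, Y_pred) for t, p in zip(yt, yp))
    return wrong / (n * q)


def accuracy(Y_true: Matrix, Y_pred: Matrix) -> float:
    """Mean Jaccard overlap between true and predicted label sets."""
    total = 0.0
    for yt, yp in zip(Y_true, Y_pred):
        inter = sum(1 for t, p in zip(yt, yp) if t and p)
        union = sum(1 for t, p in zip(yt, yp) if t or p)
        total += inter / union if union else 1.0  # assumption: empty vs empty = 1
    return total / len(Y_true)


def one_error(Y_true: Matrix, scores: Matrix) -> float:
    """Fraction of examples whose top-ranked label is not a true label."""
    errors = 0
    for yt, s in zip(Y_true, scores):
        top = max(range(len(s)), key=lambda j: s[j])
        errors += 0 if yt[top] else 1
    return errors / len(Y_true)


def average_precision(Y_true: Matrix, scores: Matrix) -> float:
    """Mean, over examples, of the precision at each true label's rank."""
    total, counted = 0.0, 0
    for yt, s in zip(Y_true, scores):
        order = sorted(range(len(s)), key=lambda j: -s[j])
        hits, precisions = 0, []
        for rank, j in enumerate(order, start=1):
            if yt[j]:
                hits += 1
                precisions.append(hits / rank)
        if precisions:  # examples without true labels are skipped (assumption)
            total += sum(precisions) / len(precisions)
            counted += 1
    return total / counted if counted else 0.0


# Invented toy data: 2 examples, 3 labels.
Y_TRUE = [[1, 0, 1], [0, 1, 0]]
Y_PRED = [[1, 0, 0], [0, 1, 1]]
SCORES = [[0.9, 0.2, 0.4], [0.1, 0.3, 0.8]]
```

Lower is better for Hamming loss and one-error, while higher is better for accuracy and average precision, so a table of results has to be read with the direction of each measure in mind.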

Classification

Table 2 shows the experiment’s classification results using the LO features, which indicate that the RAKEL algorithm is more efficient than MLkNN in four measures (Accuracy, Hamming Loss, One-Error, Average Precision). Furthermore, the MLkNN algorithm significantly outperforms RAKEL in

Conclusions

The search and classification of services for educational content, and specifically LOs, presented in this report constitute the core of the development of distributed, open computer-based educational systems. For this reason, research in this area has been very active in recent years.

We also tried to use a multi-label classification algorithm to build a model that classifies and catalogs the LOs into types of queries.

The RAKEL algorithm was very effective; it significantly outperforms all

Future work

For future work we want to experiment with:

  1. From a data collection point of view:

    • Obtain more examples with a relatively smaller number of labels, and see whether we can obtain multi-label data where applicable.

    • Obtain more external text information, such as tags and comments.

    • Find examples whose keywords have a hierarchical structure, if possible.

  2. Combining these two kinds of features in various ways, for example, combining features, optimization-based integration. If

References (41)

  • Boutell, M., et al. (2004). Learning multi-label scene classification. Pattern Recognition.
  • Zhang, M. L., et al. (2007). ML-kNN: A lazy learning approach to multi-label learning. Pattern Recognition.
  • Bauer, M., et al. (2008). Metadata generation for learning objects: An experimental comparison of automatic and collaborative solutions.
  • Begelman, G., Keller, P., & Smadja, F. (2006). Automated tag clustering: Improving search and exploration in the tag...
  • Brinker, K., Fürnkranz, J., & Hüllermeier, E. (2006). A unified model for multi-label classification and ranking. In...
  • Cattuto, C., et al. (2007). Semiotic dynamics and collaborative tagging. PNAS.
  • Chiappe, A., et al. (2007). Toward an instructional design model based on learning objects. Educational Technology Research and Development.
  • DublinCore Metadata Initiative (DCMI) (2007)....
  • Del.icio.us....
  • Diplaris, S., Tsoumakas, G., Mitkas, P., & Vlahavas, I. (2005). Protein classification with multiple algorithms. In...
  • Flickr....
  • Goarany, K., Kulczycki, G., & Blake, M.B. (2010). Mining social tags to predict mashup patterns, SMUC10, ACM...
  • Golder, S. A., et al. (2005). The structure of collaborative tagging systems.
  • Hassan-Montero, Y., & Herrero-Solana, V. (2006). Improving tag-clouds as visual information retrieval interfaces. In...
  • IEEE Learning Technology Standard Committee (IEEE-LTSC) (2002). WG12 Learning Object Metadata....
  • Katakis, I., Tsoumakas, G., & Vlahavas, I. (2008). Multilabel text classification for automated tag suggestion. In...
  • Li, L., & Ogihara, M. (2003). Detecting emotion in music. In Proceedings of the international symposium on music...
  • Mathes, A. (2004). Folksonomies cooperative classification and communication through shared metadata. Computer Mediated...
  • McCallum, A. (1999). Multi-label text classification with a mixture model trained by EM. In Proceedings of the AAAI 99...
  • Mika, P. (2005). Ontologies are us: A unified model of social networks and semantics. In Proceedings of...