21st CS is often used as an umbrella term to describe a wide range of competencies, “habits of mind”, and attributes considered central to citizenship in the 21st century (Voogt et al.
2013). Theoretical constructs commonly employed and studied under this perspective include, for instance, creativity and critical thinking, complex problem solving, communication and collaboration, self-regulation, and computer and information literacy (e.g., see Geisinger
2016; Griffin and Care
2014 for conceptual clarification). These 21st CS are considered to be of growing importance in the context of current and future societal challenges, in particular with regard to the changing nature of learning and work as the job market evolves under the influence of automation, digitalisation and globalisation. The discussion around 21st CS also emphasises competencies which enable responsible action when faced with complex and often unpredictable problems of societal relevance. Increasingly, this shift of focus towards complex competencies relating to authentic, complex problems is also being called for in current psychological research on problem solving, where the past emphasis on primarily cognitive and rational forms of learning has been criticised (cf. Dörner and Funke
2017). Here, we first discuss the challenges of clarifying the constructs of 21st-century learning in order to consider how they might be measured. Then, we examine how IT may enable such assessment.
4.1 Challenges of Clarifying Constructs for Assessment of 21st-Century Skills
21st CS are complex constructs comprising multiple elements, attitudes, behaviours or modes of action and thought that are transferable across situations and contexts. Many of these constructs lack a sharp definition and/or display varying degrees of theoretical overlap in their definitions or the meaning of their sub-concepts (Shute et al.
2016). To give an example, the construct Collaborative Problem Solving (ColPS), described in Care et al. (
2016), includes sub-concepts (e.g., critical thinking) which also appear in other constructs such as creativity (Lucas
2016). Certain skills defined in the construct “Computer and Information Literacy”, for example “evaluating information” (Ainley et al.
2016), form a part of the ColPS construct, and so forth. This overlap at the level of theoretical constructs becomes even more pronounced at the level of concept operationalisation, i.e., in the form of certain behavioural, cognitive and emotional patterns (Shute et al.
2016). Many of the 21st CS constructs, such as collaborative and complex problem solving and computer and information literacy, have recently been studied more closely in comprehensive research projects such as those associated with the PISA surveys and the “Assessment and Teaching of 21st Century Skills” (ATC21S) project (see, for example, Griffin and Care
2014). However, the incorporation of 21st CS into curricula and the integration of formative and summative assessment practices in schools often lag behind (Erstad and Voogt
2018). On the one hand, typical barriers might be attributed to certain social and educational policies, such as the traditional organisation of the curriculum by subjects, or accountability structures which prioritise typical indicators of academic success such as mathematics, science or language literacy. On the other hand, the complexity of 21st CS constructs presents a further significant challenge to their assessment, one which the classic repertoire of methods, e.g., multiple-choice questions or self-report measures, can address only insufficiently (Ercikan and Oliveri
2016; Shute and Rahimi
2017). Furthermore, 21st CS comprise an assortment of diverse but interconnected skills and competencies, which are latent constructs and thus not directly measurable. We argue, therefore, that they must first be linked via a theoretical model to specific, complex and context-dependent, and therefore possibly dynamic, behavioural patterns. If, for example, the aim is to assess the quality of collaboration in a group, a number of questions arise: What would constitute a good measure? The quality of the end product, the creativity of the solution, or the satisfaction of the team members with the social interactions in the group? Normative considerations enter the equation here as well. Furthermore, how do different patterns of learning activities relate to a (latent) trait, e.g., creativity? And how stable are these patterns across different types of problems, or across the social and cultural contexts of the learning situation? The translation of theoretical (and normative) considerations into an adequate measurement model, and the derivation of meaningful interpretations of learners’ performances which then enable possible adjustments of learning processes, is important not only for summative measurement. When making use of the new possibilities for tracking and analysing learning activities in digital environments, it is crucial to explicitly state and theoretically justify the ascriptions of meaning and the possibilities for interpretation when analysing these data in the context of formative assessment.
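To make this linkage concrete, consider a deliberately simple illustration: under a Rasch-type measurement model, a set of binary behavioural indicators can be tied to a single latent trait whose value is then estimated from the observed pattern. The following Python sketch is purely illustrative; the indicator pattern, the difficulty values and the interpretation of the trait as “collaboration quality” are invented assumptions, and a real measurement model would be calibrated and validated empirically.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical binary behavioural indicators for one learner
# (1 = observed in the log data, 0 = not observed).
observed = np.array([1, 1, 0, 1, 0])

# Assumed "difficulty" of showing each indicator; invented for illustration.
# In practice these parameters would be calibrated from empirical data.
difficulty = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

def neg_log_likelihood(theta: float) -> float:
    """Negative log-likelihood of the observed pattern under a Rasch model."""
    p = 1.0 / (1.0 + np.exp(-(theta - difficulty)))  # P(indicator | theta)
    return -np.sum(observed * np.log(p) + (1 - observed) * np.log(1 - p))

# Maximum-likelihood estimate of the latent trait, e.g. "collaboration quality".
result = minimize_scalar(neg_log_likelihood, bounds=(-4, 4), method="bounded")
print(f"Estimated latent trait: {result.x:.2f}")
```

Even this toy example makes the theoretical commitments visible: which behaviours count as indicators of the construct, and how they are assumed to relate to the underlying trait.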
4.2 New Opportunities Provided by IT for Assessing 21st Century Skills
Considering the challenges for formative assessment of 21st CS that go hand in hand with the endeavour to capture, visualise and feed back these complex cognitive, emotional and behavioural patterns, IT-based developments raise high hopes for new opportunities (Shute and Rahimi
2017; Webb and Gibson
2015). An example is the assessment of multidimensional learner characteristics, i.e., cognitive, metacognitive and affective characteristics, using authentic digital tasks such as games and simulations (Shute and Rahimi
2017). Working in digital learning environments also brings with it a set of expanded possibilities with respect to documentation and analysis of large and highly complex sets of data on learning processes, including log-file and multichannel data, in varying learning scenarios (Ifenthaler et al.
2017). For example, capturing the timing, the context and the sequence in which different behaviours occur, such as the use of certain strategies, the posting of certain comments or the retrieval of specific learning content at given points in the problem-solving process, allows these “traces of learning” to be analysed digitally through sequence analysis or social network analysis. Furthermore, behavioural patterns of interest can be combined with data derived through more “traditional” methods, such as test scores for digital literacy, self-report measures for motivation, self-efficacy or personality, or information obtained from data in open language-based formats, e.g., reflective thoughts in chats, blogs or essays, which can be put through digitally assisted analysis, e.g., natural language processing.
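As a simple illustration of what sequence analysis of such “traces of learning” can look like, the following Python sketch counts transitions between consecutive action types in a hypothetical timestamped log. The event names and data are invented; real analyses would work on far richer logs and would typically compare patterns across learners or link them to outcomes.

```python
from collections import Counter
from itertools import pairwise  # Python 3.10+

# Hypothetical timestamped log events for one learner:
# (seconds since session start, action type).
log = [
    (12, "open_resource"),
    (45, "post_comment"),
    (80, "open_resource"),
    (95, "apply_strategy"),
    (130, "post_comment"),
]

# Sequence analysis in its simplest form: count transitions between
# consecutive action types to characterise a behavioural pattern.
actions = [action for _, action in log]
transitions = Counter(pairwise(actions))

for (src, dst), count in transitions.most_common():
    print(f"{src} -> {dst}: {count}")
```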
Some of the current research on digitally assisted assessment explicitly focuses on the “theory-driven measurement” of 21st CS. Examples are recently designed tests for collaborative problem solving (Herde and Greiff
2016), complex problem solving (Greiff et al.
2016) or ICT-literacy (Ainley et al.
2016; Siddiq et al.
2017). In tests for collaborative problem solving, as developed in the international ATC21S project (Griffin and Care
2014), as well as in the PISA assessments (Herde et al.
2016), learners interact with peers (ATC21S) or an intelligent virtual agent (PISA) to solve problems of varying complexity. These assessments use (more or less) controlled digital learning scenarios to capture and analyse a variety of behavioural process data in order to create indicators which form scale values, competence levels or prototypes. A game-based example is the learning environment “Use Your Brainz”, in which four areas of problem-solving competence can be assessed: analysing; planning; using tools and resources; and monitoring and evaluating (Shute et al.
2016). The development of these tests provides a good illustration of the complexity of the design process, starting with theory-based modelling of analytic categories, the development of a learning environment in which the heterogeneous data sources can be captured, and the design of supportive tools for automated analysis and feedback. Feedback, in these test environments, is usually designed for teachers, researchers or other stakeholders in educational administration, who can identify areas of development for learners or classrooms. The challenge remains to identify the types of information and the feedback format that will provide effective learning impulses directly to learners, as discussed earlier.
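Stealth assessments of this kind commonly build on probabilistic evidence models such as Bayesian networks. The following minimal Python sketch conveys the core idea with a single sub-skill and two indicators: each observed in-game indicator updates, via Bayes’ rule, the belief that the learner has mastered the sub-skill. All indicator names and probabilities are invented for illustration and are not values from the cited work.

```python
# Minimal sketch of the Bayesian-update idea behind stealth assessment:
# each observed in-game indicator shifts the belief that the learner has
# mastered a sub-skill. Names and probabilities are invented assumptions.

prior = 0.5  # initial belief that the "planning" sub-skill is mastered

# (indicator, P(indicator | mastered), P(indicator | not mastered))
evidence = [
    ("prepared resources before the task escalated", 0.8, 0.3),
    ("repeated a previously failing strategy", 0.2, 0.6),
]

belief = prior
for name, p_if_mastered, p_if_not in evidence:
    # Bayes' rule: P(mastered | indicator observed)
    numerator = p_if_mastered * belief
    belief = numerator / (numerator + p_if_not * (1 - belief))
    print(f"After '{name}': P(mastered) = {belief:.2f}")
```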
In addition to the body of research focusing on theory-driven measurement, other studies take what might be characterised as a more “data-driven” approach. Here, the new possibilities for continuous “quiet” capture and analysis of rich process data in digital learning environments, such as learning management systems, blogs, wikis, etc., can be used to explore and identify behavioural patterns in relation to 21st CS. For example, specific performance outcomes may be measured, or certain learning patterns or “error patterns” may be correlated with a large volume of other user data, to allow predictions regarding effective next steps towards acquiring specific skills, such as critical thinking. Greiff et al. (2016), for instance, analysed log-files of performance data from a computer-based assessment of complex problem solving using the “MicroDYN approach” and found that certain behavioural patterns were associated with better performance. Similarly, particular decision patterns occurring during a digital game can be identified as typical of pupils with differing creative thinking skills. In addition, with regard to automated assessment of collaborative processes, the knowledge contributions and interaction patterns of different learners can be analysed in real time and compared with ideal-typical interaction patterns in order to derive recommendations for effective cooperation strategies for learners or for effective group compositions for teamwork (Berland et al.
2015; Fidalgo-Blanco et al.
2015). Going beyond the data analysis process to provide a tool to enable learners to engage in peer support, Lin and Lai (
2013) used Social Network Analysis, as discussed earlier.
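To illustrate the kind of analysis involved, the following Python sketch applies a basic social network metric to a hypothetical peer-interaction network using the networkx library. This is not a reconstruction of Lin and Lai’s tool; the data and the use of in-degree centrality to flag peripheral learners are illustrative assumptions.

```python
import networkx as nx

# Hypothetical peer-interaction data: who replied to whom in a discussion forum.
interactions = [
    ("ana", "ben"), ("ben", "ana"), ("ana", "cara"),
    ("cara", "ben"), ("dan", "ana"),
]

G = nx.DiGraph()
G.add_edges_from(interactions)

# In-degree centrality: learners who receive many replies act as hubs, while
# learners who receive none may be candidates for peer-support interventions.
centrality = nx.in_degree_centrality(G)
for learner, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{learner}: {score:.2f}")

peripheral = [n for n, c in centrality.items() if c == 0]
print("Possible targets for peer support:", peripheral)
```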
In both the theory-driven and the data-driven approach, the focus is often on the identification of meaningful information from which recommendations for the next steps of the learning process can be derived. Although these steps are not always fully automated, the results of the data analysis guide and structure the decisions of learners and teachers to a large extent. If, instead, one focuses on the processing of data by the learners themselves, real-time feedback can be seen as a trigger for self-regulating, cognitive and metacognitive learning processes, and thus contributes to the development of competences in this area. Generally speaking, this applies to most 21st CS, which all include reflexive, metacognitive processes in some form, whether adopting differing perspectives in collaborative problem solving, weighing up diverse lines of argumentation and reflecting on one’s personal attitudes in critical thinking, or using particular problem-solving heuristics in creative thinking. Research here focuses on the development of tools for the visualisation and presentation of data for learners, and for the pedagogical scaffolding of learning processes, in order to initiate effective cognitive and metacognitive processes. Computer-based assessments for learning which include such tools, e.g., articulating the rationale for making a specific response, stating confidence, and adding and recommending answer notes, may support the development of self-regulated learning skills (see for example Chen
2014; Mahroeian and Chin
2013; Marzouk et al.
2016). In the context of self-regulated learning, the question of what kind of feedback will actually engage individual learners and motivate them to become self-regulated learners becomes critical (Tempelaar et al.
2013). A brief illustration of such a confidence-based tool is sketched below; the remaining research challenges for formative assessment of 21st-century learning are then summarised in the closing paragraph.
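As a concrete, deliberately simplified illustration, the following Python sketch compares a learner’s stated confidence with the correctness of their response and returns a metacognitive prompt. The thresholds and the wording of the prompts are invented assumptions and do not describe any of the systems cited above.

```python
def calibration_feedback(confidence: float, correct: bool) -> str:
    """Return a metacognitive prompt based on the learner's stated
    confidence (0..1) and the correctness of their response."""
    if correct and confidence < 0.4:
        return ("Correct, although you were unsure: "
                "which part of your reasoning turned out to be sound?")
    if not correct and confidence > 0.7:
        return ("You were confident but incorrect: "
                "which assumption behind your answer needs revisiting?")
    if correct:
        return "Correct and well calibrated: try a harder variant next."
    return "Incorrect, and you sensed it: what information would have helped?"

# Example: an overconfident error triggers a prompt to re-examine assumptions.
print(calibration_feedback(confidence=0.9, correct=False))
```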
Learning analytics and educational data mining generate high hopes for a renewed focus on formative assessment of 21st CS (Gibson and Webb
2015; Spector et al.
2016). In addition, continuous, unobtrusive background measurement of performance (Shute
2011; Webb et al.
2013) enables minimal disruption of learning processes and immediate feedback, which is very important for the automation and routinisation of self-regulatory learning strategies. Furthermore, progress in automated, real-time natural language processing opens new possibilities in the area of reflective and critical thinking. However, meaningful analysis of the data collected is often very difficult and requires strong theoretical grounding and modelling, as well as verification of validity, gained for instance through complex evidence-centred design processes. Due to the complexity of the 21st CS constructs, the validity of detected behavioural patterns should be investigated in a comprehensive manner, i.e., not only via correlations with certain outcome measures, but also by identifying the causal chains that lead to such outcomes. Case studies using think-aloud protocols might be a promising approach here (Siddiq and Scherer
2017). With regard to validity issues of complex and collaborative problem solving, formative assessment of 21st CS should aim to address authentic and complex learning opportunities and, when possible, not limit itself to “simpler” problems for ease of measurement (Dörner and Funke
2017). In this context, game environments and virtual worlds have great potential for development, but require a concerted interdisciplinary effort by a variety of stakeholder groups.