Abstract
Course quality is multifaceted: it is determined by the instructor, the students, and external conditions. Any attempt to measure it should therefore reflect this diversity, so that stable evaluations capture both personal (instructor) and situational (student and external conditions) variables. This study extends previous research by examining the stability of both dimensions across different courses, student populations, and universities. The sample (N = 692 courses) was drawn from six traditional and technical German universities, where the ethos of student interaction with academic staff differs from that in many other Western countries. Using the Heidelberg Inventory, the study found that instructor variables were reliable across courses given by the same instructor, whereas student scales and background variables were less consistent across courses with identical content. It was concluded that the instrument is both reliable and valid for student evaluations of teaching performance and course quality within a European context.
Cite this article
Rindermann, H., Schofield, N. Generalizability of Multidimensional Student Ratings of University Instruction Across Courses and Teachers. Research in Higher Education 42, 377–399 (2001). https://doi.org/10.1023/A:1011050724796
DOI: https://doi.org/10.1023/A:1011050724796