Peer assessment

Peer assessment, or self-assessment, is a process whereby students or their peers grade assignments or tests based on a teacher's benchmarks. [1] The practice is employed to save teachers time and to improve students' understanding of course materials as well as their metacognitive skills. Rubrics are often used in conjunction with self- and peer-assessment. [2]

Advantages

Saves teachers' time

Student-graded assignments can save teachers' time [3] because an entire class's work can be graded in the time it would take a teacher to grade one paper. Moreover, rather than a teacher rushing through every paper, each student has only one paper to grade and can therefore take the time to do a more thorough job. [4]

Faster feedback

Having students grade papers in class or assess their peers' oral presentations decreases the time it takes for students to receive feedback. Self- and peer-assessment allow assignments to be graded soon after completion, so students are not left waiting until they have moved on to new material and the information is no longer fresh in their minds. [1]

A faster turnaround time for feedback has also been shown to increase the likelihood that the recipient acts on it. A controlled experiment conducted in a Massive Open Online Course (MOOC) setting found that students' final grades improved when feedback was delivered quickly, but not when it was delayed by 24 hours. [5]

Pedagogical

When teachers alone evaluate work, students tend to focus on the grade rather than on seeking feedback. [6] Students can learn from grading the papers [1] or assessing the oral presentations of others. Often, teachers do not go over test answers or give students the chance to learn what they did wrong; self- and peer-assessment allow teachers to help students understand the mistakes they have made. This improves subsequent work, gives students time to digest information, and may lead to better understanding. [7] A study by Sadler and Good found that students who self-graded their tests did better on later tests: they could see what they had done wrong and were able to correct those errors in later assignments. After peer grading, students did not necessarily achieve higher results. [8]

Peer feedback can also enhance learners' audience awareness, promote collaborative learning, and develop a sense of ownership of the text. In a study by Fan and Xu, text analysis of survey responses showed that students provided both content-focused and form-focused evaluative feedback on the manuscripts and oral presentations of their peers in class. [9]

Experiments by Kristen M. Turpin found that peer feedback is important for engaging learners in the learning process, and that the quality of that feedback is also crucial: with systematic training, peers can provide high-quality, actionable feedback. [10]

Metacognitive

Through self- and peer-assessment students are able to see mistakes in their thinking and can correct any problems in future assignments. By grading assignments, students may learn how to complete assignments more accurately and how to improve their test results. [1]

Professors Lin-Agler, Moore, and Zabrucky conducted an experiment in which they found "that students are able to use their previous experience from preparing for and taking a test to help them build a link between their study time allocation." [11] By participating in self- and peer-assessment, students can not only improve their ability to study for a test but also enhance their ability to evaluate others through improved metacognitive thinking. [12]

However, Christensen and Hobel found in their experiments that students' evaluations concerned only content, structure, and form; feedback on the sentences themselves was almost non-existent. [13]

Attitude

If self- and peer-assessment are implemented, students can come to see tests not as punishments but as useful feedback. [12] Hal Malehorn says that by using peer evaluation, classmates can work together for "common intellectual welfare" and that it can create a "cooperative atmosphere" for students instead of one in which students compete for grades. [2] In addition, when students assess the work of their fellow students, they also reflect on their own work; this reflective process stimulates action for improvement. [14]

However, in the Supreme Court case Owasso Independent School District v. Falvo, the school district was sued after a student was victimized when other students learned that he had received a low test score. [15] Malehorn describes what an idealized version of peer-assessment can do for classroom attitude; in practice, situations in which students are victimized can result, as seen in that case.

An experimental investigation by Roger Yallop, Piia Taremaa, and Djuddah Leijen found that the feedback process can induce strong negative emotions in both reviewers (e.g., frustration) and feedback recipients (e.g., anxiety). The feedback process therefore strongly shapes participants' attitudes, and a positive attitude in turn encourages greater participation in the feedback process. [16]

Defects

Research by Joe Garner and Oliver Hadingham found serious flaws: students take their peer group very seriously, rarely give negative comments, and offer little constructive criticism of other people's writing. In the study, anonymous evaluation was used to address this problem, allowing students to evaluate others without social pressure, but at the cost of losing the ability to interact. [17]

Teacher grading agreement

One concern about self- and peer-assessment is that students may award higher grades than teachers would. Teachers want to reduce grading time, but not at the cost of accuracy. [18]

Support

A study by Sadler and Good showed a high level of agreement between grades assigned by teachers and by students, as long as students understand the teacher's quality requirements. They also report that teacher grading can become more accurate as a result of using self- and peer-assessment: if teachers look at how students grade themselves, they have more information from which to assign an accurate grade. [19]

Opposition

However, Sadler and Good warn that some disagreement remains, and they suggest that teachers implement systems to moderate students' grading in order to catch unsatisfactory work. [19] Another study reported that grade inflation did occur, as students tended to grade themselves higher than a teacher would have; such divergent results suggest that self- and peer-assessment may not be an accurate method of grading. [20]

Comparison

According to the study by Sadler and Good, students who peer-grade tend to undergrade, while students who self-grade tend to overgrade. However, a large majority of students come within 5% of the teacher's grade; relatively few self-graders undergrade and relatively few peer-graders overgrade. [18]

Perhaps one of the most prominent models of peer-assessment can be found in design studios. [21] [22] One benefit of such studios comes from structured contrasts, which can help novices notice differences that might otherwise be accessible only to experts. [23] Using comparisons as a source of inspiration is, in fact, a well-known strategy among designers. [24] [25] Some researchers have built systems that surface helpful comparative examples in educational settings. [26] [27] [28] What makes a good comparison, however, remains unclear. Sadler's general guidance on good feedback describes three characteristics, specific, actionable, and justified, [29] and has been widely adopted in feedback research. Yet because each piece of work to be evaluated differs so vastly in content, the path toward those qualities in any specific piece of feedback remains largely unknown. Effective feedback is not only written actionably, specifically, and in a justified manner; more importantly, it contains good content, in the sense that it points out relevant things, brings in new insights, and leads its recipients to consider the problem from a different angle or to re-represent it completely. This requires content-specific customization.

Rubrics

Purpose

Students need guidelines to follow before they are able to grade more open-ended questions. These often come in the form of rubrics, which lay out the different objectives and how much each is worth in grading. [12] Rubrics are often used for writing assignments. [30]
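
As a concrete illustration of how rubric weights translate into a grade, the following sketch scores a set of per-criterion ratings. The criteria, weights, and 0-4 point scale are hypothetical choices made only for illustration; they are not taken from any of the cited studies.

    # A minimal rubric-scoring sketch. The criteria, weights, and 0-4 scale
    # are hypothetical, chosen only to illustrate how a weighted rubric works.
    RUBRIC = {
        # criterion: (weight, maximum points)
        "thesis clarity":  (0.30, 4),
        "use of evidence": (0.40, 4),
        "organization":    (0.20, 4),
        "mechanics":       (0.10, 4),
    }

    def rubric_score(ratings):
        """Combine per-criterion ratings into a single weighted percentage."""
        total = 0.0
        for criterion, (weight, max_points) in RUBRIC.items():
            total += weight * ratings[criterion] / max_points
        return round(100 * total, 1)

    # Example: a peer rates an essay 3, 4, 3, 2 on the four criteria.
    print(rubric_score({"thesis clarity": 3, "use of evidence": 4,
                        "organization": 3, "mechanics": 2}))  # 82.5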

Examples of objectives

Group work

One area in which self- and peer-assessment is applied is group projects. Teachers can give a project a final grade but also need to determine what grade each individual in the group deserves. Students can grade their peers, and individual grades can be based on these assessments. Nevertheless, this grading method has problems: if students grade each other unfairly, the overall grades skew in different directions. [31]

Overgenerosity

Some students may give all the other students remarkably high grades, which causes their own scores to be lower in comparison. This can be addressed by having students also grade themselves, so that their generosity extends to their own grade and raises it by the same amount. However, this does not compensate for students who feel they did not work at their best and grade themselves too harshly. [32]

Creative accounting

Some students will award everybody else low marks and themselves exceedingly high marks in order to bias the results. This can be countered by checking each student's grades and making sure they are consistent with where their peers placed them within the group. [33]

Individual penalization

If all of the students turn against one student because they feel that individual did little work, that student will receive an exceptionally low grade. This is acceptable if the student in question really did do very little work, but it may require the instructor's intervention before it becomes the final result. [33]
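
The checks described above can be made mechanical. The sketch below shows one simple way to weight a group grade by peer ratings and to flag raters whose scores diverge sharply from the group consensus. The weighting scheme, the 0-10 scale, and the drift threshold are illustrative assumptions, not the specific refinements proposed by Li.

    # A minimal sketch of deriving individual grades from peer ratings and of
    # flagging suspicious raters. The weighting scheme and threshold are
    # illustrative assumptions, not the exact refinements proposed by Li.

    def individual_grades(group_grade, ratings):
        """ratings[rater][ratee] is the rater's 0-10 score for each member,
        including themselves; self-ratings offset across-the-board generosity."""
        members = list(ratings)
        received = {m: sum(ratings[r][m] for r in members) / len(members)
                    for m in members}
        mean = sum(received.values()) / len(members)
        # Scale the group grade up or down by each member's relative rating.
        return {m: round(group_grade * received[m] / mean, 1) for m in members}

    def flag_outlier_raters(ratings, threshold=3.0):
        """Flag raters whose scores differ sharply from the group consensus,
        e.g. marking every peer low and themselves high."""
        members = list(ratings)
        consensus = {m: sum(ratings[r][m] for r in members) / len(members)
                     for m in members}
        return [r for r in members
                if sum(abs(ratings[r][m] - consensus[m]) for m in members)
                / len(members) > threshold]

    # Example: three group members rate everyone, including themselves.
    ratings = {"Ana": {"Ana": 8, "Ben": 7, "Cho": 6},
               "Ben": {"Ana": 9, "Ben": 8, "Cho": 6},
               "Cho": {"Ana": 8, "Ben": 7, "Cho": 7}}
    print(individual_grades(75, ratings))  # Ana above 75, Ben at 75, Cho below
    print(flag_outlier_raters(ratings))    # [] -- no rater far from consensus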

Classroom participation

While it is difficult to grade students on classroom participation because of its subjective nature, one method is to use self- and peer-assessment. Professors Ryan, Marshall, Porter, and Jia conducted an experiment to see whether using students to grade participation was effective. They found a difference between the teacher's evaluation of participation and the students', but the difference was not academically significant: students' final grades were not affected by it. They concluded that self- and peer-assessment is an effective way to grade classroom participation. [34]

At scale

Peer assessment is also the gold standard in many creative tasks, ranging from reviewing the quality of scholarly articles or grant proposals to critiques in design studios. However, as the number of assessments to be done increases, challenges arise. One is that because no individual assessor has a global view of the entire pool of submissions, local biases in judgment may be introduced (e.g., the range of the scale an assessor uses may be affected by the particular pool of submissions they review), and noise may be added to the ranking aggregated from individual peer assessments. On the other hand, because the ranked outcome is often of utmost interest (e.g., when allocating research grants to proposals or assigning letter grades to students), ways to systematically aggregate peer assessments to recover the ranked order of submissions have many practical implications.

To tackle this, researchers have studied (1) evaluation schemes such as ordinal grading, [35] (2) algorithms that aggregate pairwise evaluations to more robustly estimate the global ranking of submissions, [36] (3) methods for producing better pairings for exchanging feedback by considering conflicts of interest, [37] and (4) frameworks that reduce the error between individual- and community-level judgments of a scholarly article's value. [38]
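
As a simple illustration of aggregating peer assessments into a global ranking, the sketch below applies Borda-style counting to the ordinal rankings that individual reviewers produce over their own small batches of submissions. It is an illustrative baseline only, not the specific algorithms proposed in the cited work.

    from collections import defaultdict

    # A minimal sketch of aggregating ordinal peer assessments with Borda-style
    # counting. Illustrative baseline, not the algorithms in the cited work.

    def aggregate_rankings(peer_rankings):
        """Each reviewer ranks only the batch they reviewed (best first); a
        submission at position p in a batch of size k earns k - 1 - p points.
        Scores are averaged over how often a submission was reviewed, so items
        assigned to more reviewers are not favored simply for appearing more."""
        points = defaultdict(float)
        reviews = defaultdict(int)
        for ranking in peer_rankings:
            k = len(ranking)
            for position, submission in enumerate(ranking):
                points[submission] += k - 1 - position
                reviews[submission] += 1
        return sorted(points, key=lambda s: points[s] / reviews[s], reverse=True)

    # Example: three reviewers each ranked a different overlapping batch.
    print(aggregate_rankings([["A", "B", "C"], ["B", "D"], ["A", "D", "C"]]))
    # ['A', 'B', 'D', 'C']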

Legality

The legality of self- and peer-assessment was challenged in the United States Supreme Court case Owasso Independent School District v. Falvo. Kristja Falvo sued the school district her son attended because it used peer-assessment and he was teased about a low score. The court upheld teachers' right to use self- and peer-assessment. [39]

Notes

  1. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 2.
  2. Malehorn, Hal. Ten Measures Better than Grading, p. 323.
  3. Searby, Mike, and Tim Ewers. An Evaluation of the Use of Peer Assessment in Higher Education: A Case Study in the School of Music, p. 371.
  4. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 2.
  5. Kulkarni, Chinmay E., Michael S. Bernstein, and Scott R. Klemmer. "PeerStudio: Rapid Peer Feedback Emphasizes Revision and Improves Performance." Proceedings of the Second (2015) ACM Conference on Learning @ Scale. ACM, 2015.
  6. Armstrong, J. Scott (2012). "Natural Learning in Higher Education". Encyclopedia of the Sciences of Learning.
  7. Liu, Ngar-Fun, and David Carless. Peer Feedback: The Learning Element of Peer Assessment, p. 281.
  8. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 24.
  9. Fan, Yumei; Xu, Jinfen (December 2020). "Exploring Student Engagement with Peer Feedback on L2 Writing". Journal of Second Language Writing. 50: 100775. doi:10.1016/j.jslw.2020.100775. ISSN 1060-3743.
  10. Turpin, Kristen M. (2019-03-26). "Training Foreign Language Learners to be Peer Responders: A Multiliteracies Approach". L2 Journal. 11 (1). doi:10.5070/l211140673. ISSN 1945-0222.
  11. Lin-Agler, Lin Miao, DeWayne Moore, and Karen M. Zabrucky. Effects of Personality on Metacognitive Self-Assessments, p. 461.
  12. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 3.
  13. Christensen, Vibeke; Hobel, Peter (2020-12-18). "Disciplinary Writing Tutors at Work: A Study of the Character of the Feedback Provided on Academic Writing at the BA Programmes at the Humanities Department". Journal of Academic Writing. 10 (1): 113–127. doi:10.18552/joaw.v10i1.613. ISSN 2225-8973.
  14. Kristanto, Yosep Dwi (2018). "Technology-Enhanced Pre-Instructional Peer Assessment: Exploring Students' Perceptions in a Statistical Methods Course". Research and Evaluation in Education. 4 (2): 105–116. arXiv:2002.04916. doi:10.21831/reid.v4i2.20951. S2CID 149711864.
  15. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 1.
  16. Yallop, R. M. A.; Taremaa, P.; Leijen, D. A. J. (February 2021). "The Affect and Effect of Asynchronous Written Feedback Comments on the Peer Feedback Process: An Ethnographic Case-Study Approach within One L2 English Doctorate Writing Group". Journal of Writing Research. 12 (3): 531–600. doi:10.17239/jowr-2021.12.03.02. ISSN 2030-1006.
  17. Ozkul, Altay. Peer Response in L1 Writing: Impact on Revisions and Student Perceptions (Thesis). Iowa State University.
  18. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 16.
  19. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 23.
  20. Strong, Brent, Mark Davis, and Val Hawks. Self-Grading in Large General Education Classes, p. 52.
  21. Dannels, Deanna P., and Kelly Norris Martin. "Critiquing Critiques: A Genre Analysis of Feedback across Novice to Expert Design Studios." Journal of Business and Technical Communication 22.2 (2008): 135–159.
  22. Goldschmidt, Gabriela, Hagay Hochman, and Itay Dafni. "The Design Studio 'Crit': Teacher–Student Communication." AI EDAM 24.3 (2010): 285–302.
  23. Schwartz, Daniel L., Jessica M. Tsang, and Kristen P. Blair. The ABCs of How We Learn: 26 Scientifically Proven Approaches, How They Work, and When to Use Them. WW Norton & Company, 2016.
  24. Herring, Scarlett R., et al. "Getting Inspired!: Understanding How and Why Examples Are Used in Creative Design Practice." Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. ACM, 2009.
  25. Newman, Mark W., and James A. Landay. "Sitemaps, Storyboards, and Specifications: A Sketch of Web Site Design Practice." Proceedings of the 3rd Conference on Designing Interactive Systems: Processes, Practices, Methods, and Techniques. ACM, 2000.
  26. Cambre, Julia, Scott Klemmer, and Chinmay Kulkarni. "Juxtapeer: Comparative Peer Review Yields Higher Quality Feedback and Promotes Deeper Reflection." Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 2018.
  27. Kang, Hyeonsu B., et al. "Paragon: An Online Gallery for Enhancing Design Feedback with Visual Examples." Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 2018.
  28. Potter, Tiffany, et al. "ComPAIR: A New Online Tool Using Adaptive Comparative Judgement to Support Learning with Peer Feedback." Teaching & Learning Inquiry 5.2 (2017): 89–113.
  29. Sadler, D. Royce. "Formative Assessment and the Design of Instructional Systems." Instructional Science 18.2 (1989): 119–144.
  30. Andrade, Heidi, and Ying Du. Student Responses to Criteria-Referenced Self-Assessment, p. 287.
  31. Li, Lawrence K. Y. Some Refinements on Peer Assessment of Group Projects, p. 5.
  32. Li, Lawrence K. Y. Some Refinements on Peer Assessment of Group Projects, p. 8.
  33. Li, Lawrence K. Y. Some Refinements on Peer Assessment of Group Projects, p. 9.
  34. Ryan, Gina J., et al. Peer, Professor and Self-Evaluation of Class Participation, p. 56.
  35. Raman, Karthik, and Thorsten Joachims. "Methods for Ordinal Peer Grading." Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 2014.
  36. Chen, Xi, et al. "Pairwise Ranking Aggregation in a Crowdsourced Setting." Proceedings of the Sixth ACM International Conference on Web Search and Data Mining. ACM, 2013.
  37. Kotturi, Yasmine, et al. "Rising above Conflicts of Interest: Algorithms and Interfaces to Assess Peers Impartially." 2013.
  38. Noothigattu, Ritesh, Nihar B. Shah, and Ariel D. Procaccia. "Choosing How to Choose Papers." arXiv preprint arXiv:1808.09057 (2018).
  39. Sadler, Philip M., and Eddie Good. The Impact of Self- and Peer-Grading on Student Learning, p. 9.
