How Kritik Ensures Accountability in Peer Assessment

How Peer Assessment Benefits Student Learning

On average, students’ grading power increases by 255% over the course of a semester in Kritik. Grading power, a score out of 6, measures how effectively a student evaluates their peers; the Kritik platform adjusts it automatically after each activity. In other words, students on average become better evaluators over time, learning to identify and communicate both critical and motivational feedback to their peers.

Peer assessment benefits student learning through increased student engagement, improved motivation to learn, and a more efficient grading workflow. However, many instructors refrain from implementing peer assessment due to a lack of understanding of how to manage it, particularly with larger class sizes, and how to ensure the reliability and validity of the process (Falchikov & Goldfinch, 2000). Ultimately, this deprives students of the benefits of peer assessment, but there is a way forward.

Dr. Karen Freberg, Professor of Marketing and Communications at the University of Louisville and West Virginia University, notes that she has seen a positive change in her classroom’s responsiveness and attitude towards her courses because her students felt more confident sharing ideas and demonstrating class concepts.

“I’ve seen a huge difference in writing and strategic thinking and concepts based on utilizing Kritik in my classes.”

Peer Evaluations vs. Instructor Evaluations

In 2000, a meta-analysis of 52 studies examined how peer evaluations compare to instructor evaluations. The research found a mean correlation of 0.69, indicating substantial agreement between peer and instructor grading (Falchikov & Goldfinch, 2000). Level of education and subject area did not affect this comparability overall (Falchikov & Goldfinch, 2000). Notably, well-designed studies reported stronger agreement between peer and teacher grading, and clear instructions and grading criteria also improved the quality of peer evaluations.

More recently, a 2020 meta-analysis found that peer assessment enhanced student learning in ways that professor assessment could not. Beyond the high comparability between peer grading and instructor grading, peer assessment had a stronger positive effect on student academic performance because students felt more motivated to learn and apply their knowledge when assessing their peers’ work (Double et al., 2020).

Diving into the intuitive peer assessment features in Kritik

The peer assessment process in Kritik follows three stages: the Create stage, the Evaluate stage, and the Feedback stage. To illustrate the Kritik grading system, let’s introduce Jessica, a first-year English student taking part in this three-stage process.


Through this process, she will receive three scores that make up the overall activity score, along with multiple points of feedback from her peers.

  • Creation score: the weighted average of the evaluations Jessica receives, where each evaluator’s rubric grade is weighted by that evaluator’s grading power
  • Evaluation score: composed of two parts, the grading score and the written evaluation score. Jessica’s grading score is determined by how closely her rubric grades match the other evaluations of the same work; her written evaluation score is determined by the quality of her written evaluations, based on peer feedback.
  • Feedback score: a participation score. As long as Jessica provides feedback on all of the evaluations she has received, she will receive 100%. The feedback she provides, however, impacts her peers’ evaluation scores.
  • Overall score: the grade Jessica achieves on the activity, calculated by combining the three scores above according to their weights

In short, Jessica’s Creation score is determined by how her peers evaluate her work; her Evaluation score is determined by the quality of her own evaluations; and her Feedback score is determined by the number of evaluations she provides feedback on. A minimal sketch of these calculations follows below.
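To make the arithmetic concrete, here is a small Python sketch of how these scores could fit together. The helper functions and the 70/20/10 weighting are assumptions for illustration, not Kritik’s actual implementation:

def creation_score(evaluations):
    """Weighted average of rubric grades, each weighted by the
    evaluator's grading power. `evaluations` is a list of
    (grade, grading_power) pairs."""
    total_power = sum(power for _, power in evaluations)
    return sum(grade * power for grade, power in evaluations) / total_power

def feedback_score(feedback_given, evaluations_received):
    """Participation score: 100% once feedback is left on every
    evaluation received."""
    return 100.0 * min(feedback_given, evaluations_received) / evaluations_received

def overall_score(creation, evaluation, feedback, weights=(0.7, 0.2, 0.1)):
    """Combine the three component scores by their activity weights
    (the 70/20/10 split here is purely illustrative)."""
    w_create, w_evaluate, w_feedback = weights
    return w_create * creation + w_evaluate * evaluation + w_feedback * feedback

# Jessica receives three evaluations as (rubric grade, grading power):
jessica = [(80, 3.0), (90, 5.0), (70, 2.0)]
print(creation_score(jessica))  # (240 + 450 + 140) / 10 = 83.0

Note how the evaluator with grading power 5.0 pulls the result above the plain average of 80: that is the weighting mechanism at work.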

If an instructor manually grades the Creation (e.g. resolving a grade dispute or grading a late submission), each evaluator’s grading score is determined by comparison to the instructor’s grade instead of to their peers’ grades.


Grading score vs. Grading power

Kritik uses a Grading Score and a Grading Power to keep the peer assessment process accountable and meaningful while giving students and professors a measurable outcome along the way. To differentiate the two:

  • Grading score: as mentioned previously, the grading score is a component of the student's evaluation score.
  • Grading power: A student’s grading power changes over the course of the semester based on how well they evaluate their peers from activity to activity. Not only is this score an indication of their progress, but it has a real effect on the grading process: students with a higher grading power, having proven their competence as evaluators, carry more weight in the evaluations than their peers do.

What happens at the start of the semester, before students have built up their grading power?

At the beginning of the course, the professor releases a calibration activity to set each student’s initial grading power and to introduce students to the peer assessment process. Multiple calibration activities can be scheduled throughout the term to adjust students’ grading power over time.

For example, suppose Jessica’s grades in the first calibration activity, scheduled at the beginning of the term, track her professor’s very closely. She started the course with the default grading power of Beginner, but because she marked so closely to her professor, she levels up to Beginner 2. This means that when she evaluates her peers, she will have more impact on their Creation scores than students who are still at the Beginner level.


A good analogy for grading power is weighted assignments. If Essay A is worth 50% of your overall mark and Essay B is only worth 30%, then Essay A has more impact on your overall mark. Likewise, if Student A has a higher grading power (50%) and Student B has a lower grading power (30%), then how Student A grades you will impact your Creation score more than how Student B grades you, as the sketch below shows.
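A minimal sketch of that analogy in Python, treating the 50% and 30% from the example as grading-power weights (the grades here are illustrative only):

def weighted_grade(grades, powers):
    """Average of grades, weighted by each grader's grading power."""
    return sum(g * p for g, p in zip(grades, powers)) / sum(powers)

plain_mean = (90 + 60) / 2                       # 75.0: both graders count equally
weighted = weighted_grade([90, 60], [0.5, 0.3])  # 78.75
# Student A's grade of 90 (power 0.5) pulls the result up further
# than Student B's grade of 60 (power 0.3) can pull it down.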

Calibration does not stop there: Kritik uses AI to conduct micro-calibrations after every activity, adjusting students’ grading power based on how closely they assess one another. Moreover, when instructors manually regrade or adjust students’ scores, the affected students’ grading power changes accordingly.
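One simple way such a micro-calibration could work is a bounded step update. The rule below is a hypothetical illustration, not Kritik’s actual algorithm:

MAX_POWER = 6.0  # grading power is a score out of 6

def update_grading_power(power, student_grade, reference_grade,
                         tolerance=5.0, step=0.25):
    """Nudge grading power after an activity. `reference_grade` is the
    peer consensus, or the instructor's grade after a manual regrade."""
    if abs(student_grade - reference_grade) <= tolerance:
        power += step   # graded close to the reference: level up
    else:
        power -= step   # diverged from the reference: level down
    return min(max(power, 0.0), MAX_POWER)  # stay within the 0-6 scale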

The role of AI in Kritik’s grading power

With a better understanding of the scoring system in Kritik and how grading power works, let’s get into why these things matter… and how this design inherently protects the validity and accuracy of peer grading.

Calibration activities

As mentioned before, Kritik has calibration activities that instructors can set up throughout the term. Calibration activities are a unique component of Kritik peer assessment activities that keep peer grading meaningfully and validly comparable to instructor grading. The Kritik AI compares student evaluations to the baseline created by the professor, which:

  1. Ensures accuracy in peer grading
  2. Prevents gaming the system (where students give one another the highest potential score)
  3. Requires students to refer back to the activity objectives and rubric to properly and meaningfully evaluate the assignment

The calibration feature ultimately saves grading time and streamlines the workflow, as multiple students evaluate one another’s work against the model and expectations set by the professor. Setting a calibration activity also discourages students from colluding to grade one another highly, as they understand that their grading power is calibrated against their professor’s evaluations.

Quantity AND quality: conducting multiple peer assessments

Kritik assigns a default of 5 evaluations per student and distributes evaluations evenly across all students (a sketch of one possible distribution scheme follows below). Having multiple assessments introduces dynamic feedback, and the weighted average of these evaluations keeps each final grade close to what the instructor would have assigned individually.
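For illustration, here is one way an even distribution could be implemented with a shuffled, rotated roster. The rotation scheme is an assumption, not necessarily how Kritik matches evaluators:

import random

def assign_evaluations(students, per_student=5):
    """Rotate a shuffled roster so every student evaluates
    `per_student` distinct peers (never themselves) and every
    submission receives exactly `per_student` evaluations.
    Requires len(students) > per_student."""
    roster = students[:]
    random.shuffle(roster)
    n = len(roster)
    assignments = {s: [] for s in roster}
    for offset in range(1, per_student + 1):
        for i, evaluator in enumerate(roster):
            assignments[evaluator].append(roster[(i + offset) % n])
    return assignments

# Example: 8 students, each evaluating 5 peers.
print(assign_evaluations(["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]))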

Worried about students evaluating others first, seeing their peers’ work, and only then submitting their own assignment? Don’t worry: students can only evaluate their peers after the Create stage, meaning they are guided through the process in a controlled manner and can properly reflect and take the time to provide meaningful evaluations for their peers.

Professor Elliot Currie from the University of Guelph notes how Kritik has improved his students’ quality of work and feedback with multiple evaluations:

“The students put a fair amount of time and effort into their assessments. They did want to receive customized feedback, so they felt the need to put effort into their assessments. The Evaluation score tracks how the students perform in their assessment, and they got better at providing feedback throughout the term. Kritik's calibration and grade dispute features allow me to ensure students are on the right track.”


Important Feature in Kritik: Anonymity

Assignments are double-blind, meaning students will not see whose work they are evaluating, nor will they know who evaluated them. Double-blind peer assessment also leads to improved feedback quality and a positive student experience, as students feel less pressure when grading and submitting work anonymously. As Professor Michael Jones, Kritik user and professor of Communications at Sheridan College, notes:

“Kritik has this level of anonymity so they don’t know who they’re evaluating which we like because it removes that assessment bias and it makes them more comfortable.”


Rubrics

Peer assessment activities require students to evaluate their peers using criteria provided by instructors. Rubrics with clear criteria allow students to understand course expectations and demonstrate their knowledge by both creating and assessing work. Moreover, clear criteria guide assessors in deciding what qualifies as good work and can be applied consistently across the class.

The written evaluation portion of the Evaluate stage allows for specific feedback on strengths and weaknesses, and the Feedback stage lets students respond in turn to the evaluations they received, introducing an honest, dynamic dialogue that helps students better understand the course.

Check out our community of practice article on crafting detailed rubrics for higher education.

Keeping students accountable through peer-to-peer learning

The goal of Kritik is to empower students to take control of their learning. Professor Heidi Engelhardt from the University of Waterloo describes how the integrated peer assessment in Kritik improved her students’ academic performance overall and increased her grading efficiency.

“[Students are] coming from a culture of grade inflation: ‘justify why I did not get 100’ tends to be what the mindset is. So, sure, you provide a rubric, but they are expecting 100. So I made sure they knew that, at least to the criteria, it wasn’t just adequate that got you four stars out of four. It was knocking that ball out of the park. I said, ‘Look, if you dispute a grade, it’s not because that guy didn’t like your colour scheme and you want to get back at him. You could have had an 89. If you dispute that, the mark is thrown away, and I’m evaluating it—  and I don’t give 89’s lightly! Once they got that, it was really good. So the assignment that just finished: zero disputes!”

Peer assessment, when delivered effectively, offers dynamic benefits over traditional grading by introducing new perspectives and immersing students more deeply in the coursework through their roles as evaluators. Kritik takes the guesswork out of peer assessment and ensures a seamless process: one that is not only easier for the professor to manage, but consistent and appropriately structured throughout the semester, whether the activity is individual or group-based.


References

Double, K. S., McGrane, J. A., & Hopfenbeck, T. N. (2020). The impact of peer assessment on academic performance: A meta-analysis of control group studies. Educational Psychology Review, 32(2), 481–509. https://doi.org/10.1007/s10648-019-09510-3

Falchikov, N., & Goldfinch, J. (2000). Student peer assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research, 70(3), 287–322. https://doi.org/10.2307/1170785

Virginia Li
Education Researcher
