However, it is frequently the case that each item is rated by only a subset of the reviewers. If the reviewers differ in the way they assign their ratings, such that some reviewers are more generous than others, simply averaging an item's ratings may result in a biased score. If an item is only reviewed by a small number of reviewers and some of them happen to be sticklers, the item may receive a lower average rating than it deserves. In theory, this bias could be greatly reduced by estimating how generous each of the reviewers is, relative to the other reviewers, and adjusting his or her ratings accordingly.
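As a rough illustration of that idea, the sketch below assumes a simple additive model in which each rating is roughly the item's score plus the reviewer's generosity. This is an assumption made for the example and not necessarily the model used by the analysis itself. The function name adjusted_scores, the iteration count, and the sample ratings are all hypothetical.

```python
from collections import defaultdict

def adjusted_scores(ratings, n_iter=200):
    """Fit rating ~ score[item] + generosity[reviewer] by alternating averages.

    ratings: list of (item, reviewer, rating) tuples.
    Returns (score, generosity) dictionaries. Generosities are centered
    at zero so that they are measured relative to the other reviewers.
    """
    by_item = defaultdict(list)
    by_reviewer = defaultdict(list)
    for item, reviewer, r in ratings:
        by_item[item].append((reviewer, r))
        by_reviewer[reviewer].append((item, r))

    score = defaultdict(float)
    generosity = defaultdict(float)

    def update_scores():
        # An item's score is its average rating after removing generosity.
        for item, rs in by_item.items():
            score[item] = sum(r - generosity[rev] for rev, r in rs) / len(rs)

    for _ in range(n_iter):
        update_scores()
        # A reviewer's generosity is the average amount by which their
        # ratings exceed the current scores of the items they rated.
        for rev, rs in by_reviewer.items():
            generosity[rev] = sum(r - score[it] for it, r in rs) / len(rs)
        # Center generosities so they are relative to the other reviewers.
        mean_g = sum(generosity.values()) / len(generosity)
        for rev in generosity:
            generosity[rev] -= mean_g
    update_scores()  # make the scores consistent with the final generosities
    return dict(score), dict(generosity)

# Hypothetical data: item A is seen only by a stickler, item B by both reviewers.
ratings = [("A", "stickler", 4), ("B", "stickler", 4), ("B", "generous", 6)]
raw_average = {"A": 4.0, "B": 5.0}  # simple per-item means of the ratings above
score, generosity = adjusted_scores(ratings)
print(score, generosity)
```

In this toy data set the simple averages rank B above A, even though the stickler gave both items the same rating. Under the additive assumption, the adjusted scores for A and B come out essentially equal, because the stickler's ratings are shifted up by the estimated generosity gap.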
The following technical report describes a model that simultaneously determines the generosity of each reviewer and a score for each item that is adjusted to account for the varying generosities of its reviewers.
This form allows you to try out the analysis methods described in the report on your own data. Just copy and paste the data into the box below, set the parameters, and press the "Analyze It" button.
Each line of the data should include three fields, separated by white space.
The first field is the name of the item, the second is the name of the
reviewer, and the third is the rating. The item and reviewer names can be
strings or numbers, but they cannot contain white space. For example:
paper1 fred 4
paper2 fred 6
paper2 jill 3
Also, be sure to set the minimum and maximum possible ratings to the end-points of your scale and the increment to the smallest possible difference between two ratings. If the scale is continuous and some ratings use the absolute maximum or minimum value, the increment should be set to a small value (like 1e-6), rather than to 0.
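If you want to check your data before pasting it in, the input format and scale settings described above can be validated with a short script. The following is only a sketch: the function parse_ratings and its parameter names are hypothetical, the 1-to-7 scale is an assumed example, and the form itself may apply different or additional checks.

```python
def parse_ratings(text, minimum, maximum, increment):
    """Parse 'item reviewer rating' lines and check them against the scale.

    Returns a list of (item, reviewer, rating) tuples.
    Raises ValueError on malformed lines or out-of-range ratings.
    """
    if not minimum < maximum:
        raise ValueError("minimum must be smaller than maximum")
    if increment <= 0:
        # For a continuous scale use a small positive value such as 1e-6.
        raise ValueError("increment must be greater than 0")

    rows = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # splits on any run of white space
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        item, reviewer, raw = fields
        try:
            rating = float(raw)
        except ValueError:
            raise ValueError(f"line {lineno}: rating {raw!r} is not a number") from None
        if not minimum <= rating <= maximum:
            raise ValueError(f"line {lineno}: rating {rating} is outside the scale")
        rows.append((item, reviewer, rating))
    return rows

# The example data from above, on an assumed 1-to-7 scale with whole-number steps.
example = "paper1 fred 4\npaper2 fred 6\npaper2 jill 3\n"
rows = parse_ratings(example, minimum=1, maximum=7, increment=1)
print(rows)
```

Because the fields are split on white space, an item or reviewer name that contains a space shows up as a line with more than three fields and is rejected.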