Kappa

The kappa statistic is frequently used to measure agreement between repeated measurements of the same test, particularly when comparing results obtained by different individuals. This is termed inter-rater reliability. Kappa measures agreement beyond what chance alone would produce.

Consider the following. Bill and Steve each toss a coin four times. On the first toss, both get heads; on the second, both get tails; on the third and fourth, they get opposite results. Placed in a contingency table, two of the four tosses fall in the agreement cells (heads/heads and tails/tails) and two fall in the disagreement cells.
If we used these data to calculate kappa, we would obtain a result of 0, indicating no agreement beyond what would be expected by chance alone.
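The coin-toss result can be checked with a short calculation. The sketch below assumes the statistic meant here is Cohen's kappa for two raters, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the proportion expected by chance from each rater's marginal totals. It also assumes the two disagreements run in opposite directions (one heads/tails, one tails/heads), which the text does not state outright. The function name `cohen_kappa_2x2` and its parameter names are illustrative only.

```python
def cohen_kappa_2x2(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa from the four cells of a 2x2 agreement table.

    both_pos   : cases both raters scored positive (here: heads/heads)
    only_a_pos : cases only rater A scored positive
    only_b_pos : cases only rater B scored positive
    both_neg   : cases both raters scored negative (here: tails/tails)
    """
    n = both_pos + only_a_pos + only_b_pos + both_neg
    if n == 0:
        raise ValueError("the table is empty")

    # Observed agreement: proportion of cases on the diagonal.
    observed = (both_pos + both_neg) / n

    # Chance agreement: product of the raters' marginal proportions,
    # summed over the two outcomes.
    a_pos = (both_pos + only_a_pos) / n
    b_pos = (both_pos + only_b_pos) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Coin tosses: 1 heads/heads, 1 tails/tails, and (assumed) one
# disagreement in each direction.
coin_kappa = cohen_kappa_2x2(1, 1, 1, 1)
print(coin_kappa)  # 0.0
```

Observed agreement is 2/4 = 0.5 and chance agreement is also 0.5, so the numerator is zero and kappa is 0.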
Now consider the following. The same individuals read a series of x-rays to determine the presence or absence of pneumonia. They agree on its presence in 75 cases and on its absence in 20 cases, and they disagree a total of 5 times. Here, agreement is clearly better than chance.
These counts give a kappa value of 0.86, indicating excellent agreement.

Definitions vary; however, poor reliability is often defined as a kappa of <0.4, fair reliability as 0.4 to 0.6, good reliability as >0.6 to 0.8, and excellent reliability as >0.8.
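As a rough check of the x-ray example, the sketch below computes kappa from paired readings and maps the result onto the bands listed above. It again assumes Cohen's kappa for two raters. The text gives only the total of 5 disagreements, so the 3/2 split used here is an assumption; any split of the five rounds to the same 0.86. The names `cohen_kappa` and `interpret_kappa` are illustrative, and treating exactly 0.4 and 0.6 as "fair" is one reading of the stated ranges.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters' paired categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Observed agreement: share of cases where the raters match.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement from each rater's marginal label frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Map kappa onto the bands quoted in the text (definitions vary)."""
    if kappa < 0.4:
        return "poor"
    if kappa <= 0.6:
        return "fair"
    if kappa <= 0.8:
        return "good"
    return "excellent"


# 75 agreed "yes", 20 agreed "no", 5 disagreements (3/2 split assumed).
pairs = (
    [("yes", "yes")] * 75
    + [("no", "no")] * 20
    + [("yes", "no")] * 3
    + [("no", "yes")] * 2
)
bill = [first for first, _ in pairs]
steve = [second for _, second in pairs]

xray_kappa = cohen_kappa(bill, steve)
print(round(xray_kappa, 2), interpret_kappa(xray_kappa))  # 0.86 excellent
```

Observed agreement is 95/100 = 0.95, while chance agreement from the marginals is about 0.65, so roughly 86% of the agreement possible beyond chance is achieved.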